2018-06-05 18:08:30

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [GIT PULL 00/46] perf/core fixes and improvements

From: Arnaldo Carvalho de Melo <[email protected]>

Hi Ingo,

Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 7869e5889477e4e32e4024d665431b35e8b7b693:

Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-06-04 10:28:20 -0300)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.18-20180605

for you to fetch changes up to 03ac4e71cd120d2c3411d106d00d266114575f74:

perf intel-pt: Fix "Unexpected indirect branch" error (2018-06-05 12:28:52 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

perf stat:

. Display user and system time for workload targets (Jiri Olsa)

perf record:

. Enable arbitrary event names thru name= modifier (Alexey Budankov)

PowerPC:

. Add a python script for hypervisor call statistics (Ravi Bangoria)

Intel PT: (Adrian Hunter)

. Fix sync_switch INTEL_PT_SS_NOT_TRACING

. Fix decoding to accept CBR between FUP and corresponding TIP

. Fix MTC timing after overflow

. Fix "Unexpected indirect branch" error

perf test:

. record+probe_libc_inet_pton:

. To get the symbol table for dynamic
shared objects on ubuntu we need to pass the -D/--dynamic command line
option, unlike with the fedora distros (Arnaldo Carvalho de Melo)

. code-reading:

. Fix perf_env setup for PTI entry trampolines (Adrian Hunter)

. kmod-path:

. Add tests for vdso32 and vdsox32 (Adrian Hunter)

. Use header file util/debug.h (Thomas Richter)

perf annotate:

. Make the various UI backends (stdio, TUI, gtk) use more consistently
structs with annotation options as specified by the user (Arnaldo Carvalho de Melo)

. Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)

Core:

. Fix misleading error for some unparsable events mentioning PMUs when
those are not involved in the problem (Jiri Olsa)

- Fix symbol and object code resolution for vdso32 and vdsox32 (Adrian Hunter)

. No need to check for null when passing pointers to foo__get() style
refcount grabbing helpers, just like in the kernel and with free(),
its safe to pass a NULL pointer to avoid having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)

. Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)

. Remove some needless globals, making them local (Arnaldo Carvalho de Melo)

. Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific events,
as we evolved this codebase to allow requesting callchains for just
a subset of the monitored events. In time it will help polish
recording and showing mixed sets accross the various tools:

perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions

(Arnaldo Carvalho de Melo)

. Consider PTI entry trampolines in map__rip_2objdump() (Adrian Hunter)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Adrian Hunter (8):
perf tests kmod-path: Add tests for vdso32 and vdsox32
perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
perf test code-reading: Fix perf_env setup for PTI entry trampolines
perf map: Consider PTI entry trampolines in rip_2objdump()
perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
perf intel-pt: Fix MTC timing after overflow
perf intel-pt: Fix "Unexpected indirect branch" error

Alexey Budankov (1):
perf record: Enable arbitrary event names thru name= modifier

Arnaldo Carvalho de Melo (33):
perf tools: Remove dead quote.[ch] code
perf probe: Use return of map__get() to make code more compact
perf cgroup: Make evlist__find_cgroup() more compact
perf tools: No need to check if the argument to __get() function is NULL
perf annotate: Pass perf_evsel instead of just evsel->idx
perf annotate: __symbol__acount_cycles doesn't need notes
perf annotate: Split allocation of annotated_source struct
perf annotate: Introduce constructor/destructor for annotated_source
perf annotate: Introduce annotated_source__alloc_histograms
perf annotate: __symbol__inc_addr_samples() needs just annotated_source
perf annotate: Introduce symbol__hists()
perf annotate: Introduce symbol__cycle_hists()
perf annotate: Stop using symbol_conf.nr_events global in symbol__hists()
perf annotate: Replace symbol__alloc_hists() with symbol__hists()
perf tools: Ditch the symbol_conf.nr_events global
perf annotate: Add comment about annotated_src->nr_histograms
perf annotate stdio: Use annotation_options consistently
perf srcline: Introduce map__srcline() to make code more compact
perf sort: Introduce addr_map_symbol__srcline() to make code more compact
perf srcline: Make hist_entry srcline helper consistent with map's
perf annotate: Pass annotation_options to symbol__annotate()
perf annotate: Adopt anotation options from symbol_conf
perf annotate: Move disassembler_style global to annotation_options
perf hists browser: Pass annotation_options from tool to browser
perf annotate: Move objdump_path to struct annotation_options
perf report: No need to have report_callchain_help as a global
perf evsel: Add has_callchain() helper to make code more compact/clear
perf script: Check if evsel has callchains before trying to use it
perf sched: Use sched->show_callchain where appropriate
perf hists: Do not allocate space for callchains for evsels without them
perf hists: Introduce hist_entry__has_callchain() method
perf hists: Check if a hist_entry has callchains before using them
perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols

Jiri Olsa (2):
perf stat: Display user and system time
perf tools: Fix pmu events parsing rule

Ravi Bangoria (1):
perf script powerpc: Python script for hypervisor call statistics

Thomas Richter (1):
perf test: Use header file util/debug.h

tools/perf/Documentation/perf-list.txt | 6 +-
tools/perf/Documentation/perf-record.txt | 3 +
tools/perf/Documentation/perf-stat.txt | 40 +++--
tools/perf/arch/common.c | 4 +-
tools/perf/arch/common.h | 4 +-
tools/perf/builtin-annotate.c | 36 ++--
tools/perf/builtin-c2c.c | 2 +-
tools/perf/builtin-kvm.c | 2 -
tools/perf/builtin-probe.c | 3 +-
tools/perf/builtin-report.c | 39 ++--
tools/perf/builtin-sched.c | 14 +-
tools/perf/builtin-script.c | 12 +-
tools/perf/builtin-stat.c | 28 ++-
tools/perf/builtin-top.c | 48 +++--
tools/perf/builtin-trace.c | 2 +-
tools/perf/perf.c | 1 -
.../perf/scripts/python/bin/powerpc-hcalls-record | 2 +
.../perf/scripts/python/bin/powerpc-hcalls-report | 2 +
tools/perf/scripts/python/powerpc-hcalls.py | 200 +++++++++++++++++++++
tools/perf/tests/code-reading.c | 1 +
tools/perf/tests/kmod-path.c | 16 ++
tools/perf/tests/parse-events.c | 4 +-
tools/perf/tests/python-use.c | 3 +-
.../tests/shell/record+probe_libc_inet_pton.sh | 2 +-
tools/perf/ui/browsers/annotate.c | 21 ++-
tools/perf/ui/browsers/hists.c | 43 +++--
tools/perf/ui/browsers/hists.h | 3 +
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/ui/gtk/hists.c | 5 +-
tools/perf/ui/hist.c | 2 +-
tools/perf/ui/stdio/hist.c | 4 +-
tools/perf/util/Build | 1 -
tools/perf/util/annotate.c | 160 ++++++++++-------
tools/perf/util/annotate.h | 53 ++++--
tools/perf/util/cgroup.c | 9 +-
tools/perf/util/dso.c | 2 +
tools/perf/util/evsel.c | 4 +-
tools/perf/util/evsel.h | 5 +
tools/perf/util/header.c | 24 ++-
tools/perf/util/hist.c | 23 ++-
tools/perf/util/hist.h | 26 ++-
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 23 ++-
.../perf/util/intel-pt-decoder/intel-pt-decoder.h | 9 +
tools/perf/util/intel-pt.c | 5 +
tools/perf/util/map.c | 26 ++-
tools/perf/util/map.h | 1 +
tools/perf/util/parse-events.l | 18 +-
tools/perf/util/parse-events.y | 14 +-
tools/perf/util/probe-event.c | 3 +-
tools/perf/util/quote.c | 62 -------
tools/perf/util/quote.h | 31 ----
tools/perf/util/session.c | 2 +-
tools/perf/util/sort.c | 81 +++------
tools/perf/util/sort.h | 7 +-
tools/perf/util/symbol.c | 1 -
tools/perf/util/symbol.h | 3 -
tools/perf/util/top.h | 3 +-
57 files changed, 731 insertions(+), 419 deletions(-)
create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-record
create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-report
create mode 100644 tools/perf/scripts/python/powerpc-hcalls.py
delete mode 100644 tools/perf/util/quote.c
delete mode 100644 tools/perf/util/quote.h

Test results:

The first ones are container (docker) based builds of tools/perf with
and without libelf support. Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.

The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.

Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.

Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.

# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0
5 alpine:edge : Ok gcc (Alpine 6.4.0) 6.4.0
6 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
7 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
8 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
9 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
10 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
11 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18)
12 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
13 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
14 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u1) 4.9.2
15 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
16 debian:experimental : Ok gcc (Debian 7.3.0-19) 7.3.0
17 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 7.3.0-19) 7.3.0
18 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 7.3.0-19) 7.3.0
19 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 7.3.0-18) 7.3.0
20 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 7.3.0-19) 7.3.0
21 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
22 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
23 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
24 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
25 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
26 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
27 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
28 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
29 fedora:27 : Ok gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
30 fedora:28 : Ok gcc (GCC) 8.1.1 20180502 (Red Hat 8.1.1-1)
31 fedora:rawhide : Ok gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
32 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 6.4.0-r1 p1.3) 6.4.0
33 mageia:5 : Ok gcc (GCC) 4.9.2
34 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0
35 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
36 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
37 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
38 opensuse:tumbleweed : Ok gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
39 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18.0.7)
40 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28.0.1)
41 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
42 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
43 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
44 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
45 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
46 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
47 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
48 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
49 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
50 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
51 ubuntu:16.10 : Ok gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
52 ubuntu:17.04 : Ok gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
53 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
54 ubuntu:18.04 : Ok gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
#

# perf version
perf version 4.17.rc7.g03ac4e
# git log -1 --oneline
03ac4e71cd12 (HEAD -> perf/core, seventh/perf/core) perf intel-pt: Fix "Unexpected indirect branch" error
# uname -a
Linux jouet 4.17.0-rc5 #21 SMP Mon May 14 15:35:35 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Breakpoint accounting : Ok
22: Number of exit events of a simple workload : Ok
23: Software clock events period values : Ok
24: Object code reading : Ok
25: Sample parsing : Ok
26: Use a dummy software event to keep tracking : Ok
27: Parse with no sample_id_all bit set : Ok
28: Filter hist entries : Ok
29: Lookup mmap thread : Ok
30: Share thread mg : Ok
31: Sort output of hist entries : Ok
32: Cumulate child hist entries : Ok
33: Track with sched_switch : Ok
34: Filter fds with revents mask in a fdarray : Ok
35: Add fd to a fdarray, making it autogrow : Ok
36: kmod_path__parse : Ok
37: Thread map : Ok
38: LLVM search and compile :
38.1: Basic BPF llvm compile : Ok
38.2: kbuild searching : Ok
38.3: Compile source for BPF prologue generation : Ok
38.4: Compile source for BPF relocation : Ok
39: Session topology : Ok
40: BPF filter :
40.1: Basic BPF filtering : Ok
40.2: BPF pinning : Ok
40.3: BPF prologue generation : Ok
40.4: BPF relocation checker : Ok
41: Synthesize thread map : Ok
42: Remove thread map : Ok
43: Synthesize cpu map : Ok
44: Synthesize stat config : Ok
45: Synthesize stat : Ok
46: Synthesize stat round : Ok
47: Synthesize attr update : Ok
48: Event times : Ok
49: Read backward ring buffer : Ok
50: Print cpu map : Ok
51: Probe SDT events : Ok
52: is_printable_array : Ok
53: Print bitmap : Ok
54: perf hooks : Ok
55: builtin clang support : Skip (not compiled in)
56: unit_number__scnprintf : Ok
57: mem2node : Ok
58: x86 rdpmc : Ok
59: Convert perf time to TSC : Ok
60: DWARF unwind : Ok
61: x86 instruction decoder - new instructions : Ok
62: Use vfs_getname probe to get syscall args filenames : Ok
63: Check open filename arg using perf trace + vfs_getname: Ok
64: probe libc's inet_pton & backtrace it with ping : Ok
65: Add vfs_getname probe to get syscall args filenames : Ok
#

$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_help_O: make help
make_pure_O: make
make_no_auxtrace_O: make NO_AUXTRACE=1
make_debug_O: make DEBUG=1
make_clean_all_O: make clean all
make_util_map_o_O: make util/map.o
make_no_libelf_O: make NO_LIBELF=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_newt_O: make NO_NEWT=1
make_tags_O: make tags
make_no_libbionic_O: make NO_LIBBIONIC=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libpython_O: make NO_LIBPYTHON=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_doc_O: make doc
make_no_libaudit_O: make NO_LIBAUDIT=1
make_perf_o_O: make perf.o
make_install_O: make install
make_static_O: make LDFLAGS=-static
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_demangle_O: make NO_DEMANGLE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_bin_O: make install-bin
make_no_slang_O: make NO_SLANG=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_backtrace_O: make NO_BACKTRACE=1
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$


2018-06-05 17:51:43

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 05/46] perf annotate: Pass perf_evsel instead of just evsel->idx

From: Arnaldo Carvalho de Melo <[email protected]>

The code gets shorter and we'll be able to use evsel->evlist in a
followup patch.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-annotate.c | 6 +++---
tools/perf/builtin-report.c | 17 ++++++++---------
tools/perf/builtin-top.c | 6 +++---
tools/perf/util/annotate.c | 12 ++++++------
tools/perf/util/annotate.h | 4 ++--
5 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index da5704240239..2b21bbcd70ea 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -162,12 +162,12 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
hist__account_cycles(sample->branch_stack, al, sample, false);

bi = he->branch_info;
- err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);

if (err)
goto out;

- err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

out:
return err;
@@ -249,7 +249,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
if (he == NULL)
return -ENOMEM;

- ret = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+ ret = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
hists__inc_nr_samples(hists, true);
return ret;
}
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ad978e3ee2b8..7a689c933f04 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -136,26 +136,25 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,

if (sort__mode == SORT_MODE__BRANCH) {
bi = he->branch_info;
- err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
goto out;

- err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

} else if (rep->mem_mode) {
mi = he->mem_info;
- err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel);
if (err)
goto out;

- err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+ err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);

} else if (symbol_conf.cumulate_callchain) {
if (single)
- err = hist_entry__inc_addr_samples(he, sample, evsel->idx,
- al->addr);
+ err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
} else {
- err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+ err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
}

out:
@@ -181,11 +180,11 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
rep->nonany_branch_mode);

bi = he->branch_info;
- err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
goto out;

- err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+ err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

branch_type_count(&rep->brtype_stat, &bi->flags,
bi->from.addr, bi->to.addr);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 7a349fcd3864..bc71e899096d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -188,7 +188,7 @@ static void ui__warn_map_erange(struct map *map, struct symbol *sym, u64 ip)
static void perf_top__record_precise_ip(struct perf_top *top,
struct hist_entry *he,
struct perf_sample *sample,
- int counter, u64 ip)
+ struct perf_evsel *evsel, u64 ip)
{
struct annotation *notes;
struct symbol *sym = he->ms.sym;
@@ -204,7 +204,7 @@ static void perf_top__record_precise_ip(struct perf_top *top,
if (pthread_mutex_trylock(&notes->lock))
return;

- err = hist_entry__inc_addr_samples(he, sample, counter, ip);
+ err = hist_entry__inc_addr_samples(he, sample, evsel, ip);

pthread_mutex_unlock(&notes->lock);

@@ -691,7 +691,7 @@ static int hist_iter__top_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;

if (perf_hpp_list.sym && single)
- perf_top__record_precise_ip(top, he, iter->sample, evsel->idx, al->addr);
+ perf_top__record_precise_ip(top, he, iter->sample, evsel, al->addr);

hist__account_cycles(iter->sample->branch_stack, al, iter->sample,
!(top->record_opts.branch_stack & PERF_SAMPLE_BRANCH_ANY));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 71897689dacf..0f5ed6091e00 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -836,7 +836,7 @@ static struct annotation *symbol__get_annotation(struct symbol *sym, bool cycles
}

static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
- int evidx, u64 addr,
+ struct perf_evsel *evsel, u64 addr,
struct perf_sample *sample)
{
struct annotation *notes;
@@ -846,7 +846,7 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
notes = symbol__get_annotation(sym, false);
if (notes == NULL)
return -ENOMEM;
- return __symbol__inc_addr_samples(sym, map, notes, evidx, addr, sample);
+ return __symbol__inc_addr_samples(sym, map, notes, evsel->idx, addr, sample);
}

static int symbol__account_cycles(u64 addr, u64 start,
@@ -974,15 +974,15 @@ void annotation__compute_ipc(struct annotation *notes, size_t size)
}

int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, struct perf_sample *sample,
- int evidx)
+ struct perf_evsel *evsel)
{
- return symbol__inc_addr_samples(ams->sym, ams->map, evidx, ams->al_addr, sample);
+ return symbol__inc_addr_samples(ams->sym, ams->map, evsel, ams->al_addr, sample);
}

int hist_entry__inc_addr_samples(struct hist_entry *he, struct perf_sample *sample,
- int evidx, u64 ip)
+ struct perf_evsel *evsel, u64 ip)
{
- return symbol__inc_addr_samples(he->ms.sym, he->ms.map, evidx, ip, sample);
+ return symbol__inc_addr_samples(he->ms.sym, he->ms.map, evsel, ip, sample);
}

static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 5080b6dd98b8..aef9eae4f125 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -279,14 +279,14 @@ static inline struct annotation *symbol__annotation(struct symbol *sym)
}

int addr_map_symbol__inc_samples(struct addr_map_symbol *ams, struct perf_sample *sample,
- int evidx);
+ struct perf_evsel *evsel);

int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
struct addr_map_symbol *start,
unsigned cycles);

int hist_entry__inc_addr_samples(struct hist_entry *he, struct perf_sample *sample,
- int evidx, u64 addr);
+ struct perf_evsel *evsel, u64 addr);

int symbol__alloc_hist(struct symbol *sym);
void symbol__annotate_zero_histograms(struct symbol *sym);
--
2.14.3


2018-06-05 17:51:44

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 07/46] perf annotate: Split allocation of annotated_source struct

From: Arnaldo Carvalho de Melo <[email protected]>

So that we can allocate just the notes->src->cyc_hist, that, unlike
notes->src->histograms, is not per event, and in paths where we
need to lazily allocate notes->src->cyc_hist we don't have the
number of events handy to also allocate ->histograms.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 10 +++++++---
tools/perf/util/annotate.h | 6 +++---
2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a7221f9fa504..f0c6941bca6c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -701,13 +701,17 @@ int symbol__alloc_hist(struct symbol *sym)
sizeof_sym_hist = (sizeof(struct sym_hist) + size * sizeof(struct sym_hist_entry));

/* Check for overflow in zalloc argument */
- if (sizeof_sym_hist > (SIZE_MAX - sizeof(*notes->src))
- / symbol_conf.nr_events)
+ if (sizeof_sym_hist > SIZE_MAX / symbol_conf.nr_events)
return -1;

- notes->src = zalloc(sizeof(*notes->src) + symbol_conf.nr_events * sizeof_sym_hist);
+ notes->src = zalloc(sizeof(*notes->src));
if (notes->src == NULL)
return -1;
+ notes->src->histograms = calloc(symbol_conf.nr_events, sizeof_sym_hist);
+ if (notes->src->histograms == NULL) {
+ zfree(&notes->src);
+ return -1;
+ }
notes->src->sizeof_sym_hist = sizeof_sym_hist;
notes->src->nr_histograms = symbol_conf.nr_events;
INIT_LIST_HEAD(&notes->src->source);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index aef9eae4f125..94b60e34c3a7 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -201,7 +201,7 @@ struct cyc_hist {

/** struct annotated_source - symbols with hits have this attached as in sannotation
*
- * @histogram: Array of addr hit histograms per event being monitored
+ * @histograms: Array of addr hit histograms per event being monitored
* @lines: If 'print_lines' is specified, per source code line percentages
* @source: source parsed from a disassembler like objdump -dS
* @cyc_hist: Average cycles per basic block
@@ -217,7 +217,7 @@ struct annotated_source {
int nr_histograms;
size_t sizeof_sym_hist;
struct cyc_hist *cycles_hist;
- struct sym_hist histograms[0];
+ struct sym_hist *histograms;
};

struct annotation {
@@ -269,7 +269,7 @@ void annotation__init_column_widths(struct annotation *notes, struct symbol *sym

static inline struct sym_hist *annotation__histogram(struct annotation *notes, int idx)
{
- return (((void *)&notes->src->histograms) +
+ return (((void *)notes->src->histograms) +
(notes->src->sizeof_sym_hist * idx));
}

--
2.14.3


2018-06-05 17:53:04

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 30/46] perf sched: Use sched->show_callchain where appropriate

From: Arnaldo Carvalho de Melo <[email protected]>

Instead of using symbol_conf.use_callchain, reducing its usage a bit
more.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-sched.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 97f9e755e8e6..cbf39dab19c1 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2143,7 +2143,7 @@ static void save_task_callchain(struct perf_sched *sched,
return;
}

- if (!symbol_conf.use_callchain || sample->callchain == NULL)
+ if (!sched->show_callchain || sample->callchain == NULL)
return;

if (thread__resolve_callchain(thread, cursor, evsel, sample,
@@ -2271,10 +2271,11 @@ static struct thread *get_idle_thread(int cpu)
return idle_threads[cpu];
}

-static void save_idle_callchain(struct idle_thread_runtime *itr,
+static void save_idle_callchain(struct perf_sched *sched,
+ struct idle_thread_runtime *itr,
struct perf_sample *sample)
{
- if (!symbol_conf.use_callchain || sample->callchain == NULL)
+ if (!sched->show_callchain || sample->callchain == NULL)
return;

callchain_cursor__copy(&itr->cursor, &callchain_cursor);
@@ -2320,7 +2321,7 @@ static struct thread *timehist_get_thread(struct perf_sched *sched,

/* copy task callchain when entering to idle */
if (perf_evsel__intval(evsel, sample, "next_pid") == 0)
- save_idle_callchain(itr, sample);
+ save_idle_callchain(sched, itr, sample);
}
}

@@ -2849,7 +2850,7 @@ static void timehist_print_summary(struct perf_sched *sched,
printf(" CPU %2d idle entire time window\n", i);
}

- if (sched->idle_hist && symbol_conf.use_callchain) {
+ if (sched->idle_hist && sched->show_callchain) {
callchain_param.mode = CHAIN_FOLDED;
callchain_param.value = CCVAL_PERIOD;

--
2.14.3


2018-06-05 17:53:55

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 45/46] perf intel-pt: Fix MTC timing after overflow

From: Adrian Hunter <[email protected]>

On some platforms, overflows will clear before MTC wraparound, and there
is no following TSC/TMA packet. In that case the previous TMA is valid.
Since there will be a valid TMA either way, stop setting 'have_tma' to
false upon overflow.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index e5eb91777383..881d7c5e5e2a 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1376,7 +1376,6 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
{
intel_pt_log("ERROR: Buffer overflow\n");
intel_pt_clear_tx_flags(decoder);
- decoder->have_tma = false;
decoder->cbr = 0;
decoder->timestamp_insn_cnt = 0;
decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC;
--
2.14.3


2018-06-05 17:54:06

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 44/46] perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP

From: Adrian Hunter <[email protected]>

It is possible to have a CBR packet between a FUP packet and
corresponding TIP packet. Stop treating it as an error.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index f9157aed1289..e5eb91777383 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1604,7 +1604,6 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
case INTEL_PT_PSB:
case INTEL_PT_TSC:
case INTEL_PT_TMA:
- case INTEL_PT_CBR:
case INTEL_PT_MODE_TSX:
case INTEL_PT_BAD:
case INTEL_PT_PSBEND:
@@ -1620,6 +1619,10 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
decoder->pkt_step = 0;
return -ENOENT;

+ case INTEL_PT_CBR:
+ intel_pt_calc_cbr(decoder);
+ break;
+
case INTEL_PT_OVF:
return intel_pt_overflow(decoder);

--
2.14.3


2018-06-05 17:54:12

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 46/46] perf intel-pt: Fix "Unexpected indirect branch" error

From: Adrian Hunter <[email protected]>

Some Atom CPUs can produce FUP packets that contain NLIP (next linear
instruction pointer) instead of CLIP (current linear instruction
pointer). That will result in "Unexpected indirect branch" errors. Fix
by comparing IP to NLIP in that case.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 17 +++++++++++++++--
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 9 +++++++++
tools/perf/util/intel-pt.c | 4 ++++
3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 881d7c5e5e2a..d404bed7003a 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -113,6 +113,7 @@ struct intel_pt_decoder {
bool have_cyc;
bool fixup_last_mtc;
bool have_last_ip;
+ enum intel_pt_param_flags flags;
uint64_t pos;
uint64_t last_ip;
uint64_t ip;
@@ -226,6 +227,8 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
decoder->return_compression = params->return_compression;
decoder->branch_enable = params->branch_enable;

+ decoder->flags = params->flags;
+
decoder->period = params->period;
decoder->period_type = params->period_type;

@@ -1097,6 +1100,15 @@ static bool intel_pt_fup_event(struct intel_pt_decoder *decoder)
return ret;
}

+static inline bool intel_pt_fup_with_nlip(struct intel_pt_decoder *decoder,
+ struct intel_pt_insn *intel_pt_insn,
+ uint64_t ip, int err)
+{
+ return decoder->flags & INTEL_PT_FUP_WITH_NLIP && !err &&
+ intel_pt_insn->branch == INTEL_PT_BR_INDIRECT &&
+ ip == decoder->ip + intel_pt_insn->length;
+}
+
static int intel_pt_walk_fup(struct intel_pt_decoder *decoder)
{
struct intel_pt_insn intel_pt_insn;
@@ -1109,10 +1121,11 @@ static int intel_pt_walk_fup(struct intel_pt_decoder *decoder)
err = intel_pt_walk_insn(decoder, &intel_pt_insn, ip);
if (err == INTEL_PT_RETURN)
return 0;
- if (err == -EAGAIN) {
+ if (err == -EAGAIN ||
+ intel_pt_fup_with_nlip(decoder, &intel_pt_insn, ip, err)) {
if (intel_pt_fup_event(decoder))
return 0;
- return err;
+ return -EAGAIN;
}
decoder->set_fup_tx_flags = false;
if (err)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index fc1752d50019..51c18d67f4ca 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -60,6 +60,14 @@ enum {
INTEL_PT_ERR_MAX,
};

+enum intel_pt_param_flags {
+ /*
+ * FUP packet can contain next linear instruction pointer instead of
+ * current linear instruction pointer.
+ */
+ INTEL_PT_FUP_WITH_NLIP = 1 << 0,
+};
+
struct intel_pt_state {
enum intel_pt_sample_type type;
int err;
@@ -106,6 +114,7 @@ struct intel_pt_params {
unsigned int mtc_period;
uint32_t tsc_ctc_ratio_n;
uint32_t tsc_ctc_ratio_d;
+ enum intel_pt_param_flags flags;
};

struct intel_pt_decoder;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 3db7f0ee52a8..aec68908d604 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -749,6 +749,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
unsigned int queue_nr)
{
struct intel_pt_params params = { .get_trace = 0, };
+ struct perf_env *env = pt->machine->env;
struct intel_pt_queue *ptq;

ptq = zalloc(sizeof(struct intel_pt_queue));
@@ -830,6 +831,9 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
}
}

+ if (env->cpuid && !strncmp(env->cpuid, "GenuineIntel,6,92,", 18))
+ params.flags |= INTEL_PT_FUP_WITH_NLIP;
+
ptq->decoder = intel_pt_decoder_new(&params);
if (!ptq->decoder)
goto out_free;
--
2.14.3


2018-06-05 17:54:53

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 42/46] perf script powerpc: Python script for hypervisor call statistics

From: Ravi Bangoria <[email protected]>

Add python script to show hypervisor call statistics. Ex,

# perf record -a -e "{powerpc:hcall_entry,powerpc:hcall_exit}"
# perf script -s scripts/python/powerpc-hcalls.py
hcall count min(ns) max(ns) avg(ns)
--------------------------------------------------------------------
H_RANDOM 82 838 1164 904
H_PUT_TCE 47 1078 5928 2003
H_EOI 266 1336 3546 1654
H_ENTER 28 1646 4038 1952
H_PUT_TCE_INDIRECT 230 2166 18168 6109
H_IPI 238 1072 3232 1688
H_SEND_LOGICAL_LAN 42 5488 21366 7694
H_STUFF_TCE 294 986 6210 3591
H_XIRR 266 2286 6990 3783
H_PROTECT 10 2196 3556 2555
H_VIO_SIGNAL 294 1028 2784 1311
H_ADD_LOGICAL_LAN_BUFFER 53 1978 3450 2600
H_SEND_CRQ 77 1762 7240 2447

Signed-off-by: Ravi Bangoria <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Naveen N. Rao <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixup typo: table_loockup -> table_lookup ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
.../perf/scripts/python/bin/powerpc-hcalls-record | 2 +
.../perf/scripts/python/bin/powerpc-hcalls-report | 2 +
tools/perf/scripts/python/powerpc-hcalls.py | 200 +++++++++++++++++++++
3 files changed, 204 insertions(+)
create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-record
create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-report
create mode 100644 tools/perf/scripts/python/powerpc-hcalls.py

diff --git a/tools/perf/scripts/python/bin/powerpc-hcalls-record b/tools/perf/scripts/python/bin/powerpc-hcalls-record
new file mode 100644
index 000000000000..b7402aa9147d
--- /dev/null
+++ b/tools/perf/scripts/python/bin/powerpc-hcalls-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -e "{powerpc:hcall_entry,powerpc:hcall_exit}" $@
diff --git a/tools/perf/scripts/python/bin/powerpc-hcalls-report b/tools/perf/scripts/python/bin/powerpc-hcalls-report
new file mode 100644
index 000000000000..dd32ad7465f6
--- /dev/null
+++ b/tools/perf/scripts/python/bin/powerpc-hcalls-report
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/powerpc-hcalls.py
diff --git a/tools/perf/scripts/python/powerpc-hcalls.py b/tools/perf/scripts/python/powerpc-hcalls.py
new file mode 100644
index 000000000000..00e0e7476e55
--- /dev/null
+++ b/tools/perf/scripts/python/powerpc-hcalls.py
@@ -0,0 +1,200 @@
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Copyright (C) 2018 Ravi Bangoria, IBM Corporation
+#
+# Hypervisor call statisics
+
+import os
+import sys
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+ '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from Util import *
+
+# output: {
+# opcode: {
+# 'min': minimum time nsec
+# 'max': maximum time nsec
+# 'time': average time nsec
+# 'cnt': counter
+# } ...
+# }
+output = {}
+
+# d_enter: {
+# cpu: {
+# opcode: nsec
+# } ...
+# }
+d_enter = {}
+
+hcall_table = {
+ 4: 'H_REMOVE',
+ 8: 'H_ENTER',
+ 12: 'H_READ',
+ 16: 'H_CLEAR_MOD',
+ 20: 'H_CLEAR_REF',
+ 24: 'H_PROTECT',
+ 28: 'H_GET_TCE',
+ 32: 'H_PUT_TCE',
+ 36: 'H_SET_SPRG0',
+ 40: 'H_SET_DABR',
+ 44: 'H_PAGE_INIT',
+ 48: 'H_SET_ASR',
+ 52: 'H_ASR_ON',
+ 56: 'H_ASR_OFF',
+ 60: 'H_LOGICAL_CI_LOAD',
+ 64: 'H_LOGICAL_CI_STORE',
+ 68: 'H_LOGICAL_CACHE_LOAD',
+ 72: 'H_LOGICAL_CACHE_STORE',
+ 76: 'H_LOGICAL_ICBI',
+ 80: 'H_LOGICAL_DCBF',
+ 84: 'H_GET_TERM_CHAR',
+ 88: 'H_PUT_TERM_CHAR',
+ 92: 'H_REAL_TO_LOGICAL',
+ 96: 'H_HYPERVISOR_DATA',
+ 100: 'H_EOI',
+ 104: 'H_CPPR',
+ 108: 'H_IPI',
+ 112: 'H_IPOLL',
+ 116: 'H_XIRR',
+ 120: 'H_MIGRATE_DMA',
+ 124: 'H_PERFMON',
+ 220: 'H_REGISTER_VPA',
+ 224: 'H_CEDE',
+ 228: 'H_CONFER',
+ 232: 'H_PROD',
+ 236: 'H_GET_PPP',
+ 240: 'H_SET_PPP',
+ 244: 'H_PURR',
+ 248: 'H_PIC',
+ 252: 'H_REG_CRQ',
+ 256: 'H_FREE_CRQ',
+ 260: 'H_VIO_SIGNAL',
+ 264: 'H_SEND_CRQ',
+ 272: 'H_COPY_RDMA',
+ 276: 'H_REGISTER_LOGICAL_LAN',
+ 280: 'H_FREE_LOGICAL_LAN',
+ 284: 'H_ADD_LOGICAL_LAN_BUFFER',
+ 288: 'H_SEND_LOGICAL_LAN',
+ 292: 'H_BULK_REMOVE',
+ 304: 'H_MULTICAST_CTRL',
+ 308: 'H_SET_XDABR',
+ 312: 'H_STUFF_TCE',
+ 316: 'H_PUT_TCE_INDIRECT',
+ 332: 'H_CHANGE_LOGICAL_LAN_MAC',
+ 336: 'H_VTERM_PARTNER_INFO',
+ 340: 'H_REGISTER_VTERM',
+ 344: 'H_FREE_VTERM',
+ 348: 'H_RESET_EVENTS',
+ 352: 'H_ALLOC_RESOURCE',
+ 356: 'H_FREE_RESOURCE',
+ 360: 'H_MODIFY_QP',
+ 364: 'H_QUERY_QP',
+ 368: 'H_REREGISTER_PMR',
+ 372: 'H_REGISTER_SMR',
+ 376: 'H_QUERY_MR',
+ 380: 'H_QUERY_MW',
+ 384: 'H_QUERY_HCA',
+ 388: 'H_QUERY_PORT',
+ 392: 'H_MODIFY_PORT',
+ 396: 'H_DEFINE_AQP1',
+ 400: 'H_GET_TRACE_BUFFER',
+ 404: 'H_DEFINE_AQP0',
+ 408: 'H_RESIZE_MR',
+ 412: 'H_ATTACH_MCQP',
+ 416: 'H_DETACH_MCQP',
+ 420: 'H_CREATE_RPT',
+ 424: 'H_REMOVE_RPT',
+ 428: 'H_REGISTER_RPAGES',
+ 432: 'H_DISABLE_AND_GETC',
+ 436: 'H_ERROR_DATA',
+ 440: 'H_GET_HCA_INFO',
+ 444: 'H_GET_PERF_COUNT',
+ 448: 'H_MANAGE_TRACE',
+ 468: 'H_FREE_LOGICAL_LAN_BUFFER',
+ 472: 'H_POLL_PENDING',
+ 484: 'H_QUERY_INT_STATE',
+ 580: 'H_ILLAN_ATTRIBUTES',
+ 592: 'H_MODIFY_HEA_QP',
+ 596: 'H_QUERY_HEA_QP',
+ 600: 'H_QUERY_HEA',
+ 604: 'H_QUERY_HEA_PORT',
+ 608: 'H_MODIFY_HEA_PORT',
+ 612: 'H_REG_BCMC',
+ 616: 'H_DEREG_BCMC',
+ 620: 'H_REGISTER_HEA_RPAGES',
+ 624: 'H_DISABLE_AND_GET_HEA',
+ 628: 'H_GET_HEA_INFO',
+ 632: 'H_ALLOC_HEA_RESOURCE',
+ 644: 'H_ADD_CONN',
+ 648: 'H_DEL_CONN',
+ 664: 'H_JOIN',
+ 676: 'H_VASI_STATE',
+ 688: 'H_ENABLE_CRQ',
+ 696: 'H_GET_EM_PARMS',
+ 720: 'H_SET_MPP',
+ 724: 'H_GET_MPP',
+ 748: 'H_HOME_NODE_ASSOCIATIVITY',
+ 756: 'H_BEST_ENERGY',
+ 764: 'H_XIRR_X',
+ 768: 'H_RANDOM',
+ 772: 'H_COP',
+ 788: 'H_GET_MPP_X',
+ 796: 'H_SET_MODE',
+ 61440: 'H_RTAS',
+}
+
+def hcall_table_lookup(opcode):
+ if (hcall_table.has_key(opcode)):
+ return hcall_table[opcode]
+ else:
+ return opcode
+
+print_ptrn = '%-28s%10s%10s%10s%10s'
+
+def trace_end():
+ print print_ptrn % ('hcall', 'count', 'min(ns)', 'max(ns)', 'avg(ns)')
+ print '-' * 68
+ for opcode in output:
+ h_name = hcall_table_lookup(opcode)
+ time = output[opcode]['time']
+ cnt = output[opcode]['cnt']
+ min_t = output[opcode]['min']
+ max_t = output[opcode]['max']
+
+ print print_ptrn % (h_name, cnt, min_t, max_t, time/cnt)
+
+def powerpc__hcall_exit(name, context, cpu, sec, nsec, pid, comm, callchain,
+ opcode, retval):
+ if (d_enter.has_key(cpu) and d_enter[cpu].has_key(opcode)):
+ diff = nsecs(sec, nsec) - d_enter[cpu][opcode]
+
+ if (output.has_key(opcode)):
+ output[opcode]['time'] += diff
+ output[opcode]['cnt'] += 1
+ if (output[opcode]['min'] > diff):
+ output[opcode]['min'] = diff
+ if (output[opcode]['max'] < diff):
+ output[opcode]['max'] = diff
+ else:
+ output[opcode] = {
+ 'time': diff,
+ 'cnt': 1,
+ 'min': diff,
+ 'max': diff,
+ }
+
+ del d_enter[cpu][opcode]
+# else:
+# print "Can't find matching hcall_enter event. Ignoring sample"
+
+def powerpc__hcall_entry(event_name, context, cpu, sec, nsec, pid, comm,
+ callchain, opcode):
+ if (d_enter.has_key(cpu)):
+ d_enter[cpu][opcode] = nsecs(sec, nsec)
+ else:
+ d_enter[cpu] = {opcode: nsecs(sec, nsec)}
--
2.14.3


2018-06-05 17:55:35

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 41/46] perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols

From: Arnaldo Carvalho de Melo <[email protected]>

Adrian reported that this test fails in his system where:

probe libc's inet_pton & backtrace it with ping: FAILED!
root@kbl04:~/git/linux-perf# nm -g /lib/x86_64-linux-gnu/libc-2.19.so | grep inet_pton
nm: /lib/x86_64-linux-gnu/libc-2.19.so: no symbols

This fails on ubuntu systems, with Adrian's being kubuntu 14.04, I
tested with ubuntu 14.04.4 and 18.04, and there we need to use the
-D/--dynamic 'nm' option to have this test working. And it works as well
with that on fedora 27, so use it.

Reported-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Naveen N. Rao <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
index 650b208f700f..263057039693 100755
--- a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
+++ b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
@@ -11,7 +11,7 @@
. $(dirname $0)/lib/probe.sh

libc=$(grep -w libc /proc/self/maps | head -1 | sed -r 's/.*[[:space:]](\/.*)/\1/g')
-nm -g $libc 2>/dev/null | fgrep -q inet_pton || exit 254
+nm -Dg $libc 2>/dev/null | fgrep -q inet_pton || exit 254

trace_libc_inet_pton_backtrace() {
idx=0
--
2.14.3


2018-06-05 17:55:39

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 40/46] perf map: Consider PTI entry trampolines in rip_2objdump()

From: Adrian Hunter <[email protected]>

perf tools uses map__rip_2objdump() to calculate objdump virtual addresses.
map__rip_2objdump() needs to be amended to deal with PTI entry trampolines.

Signed-off-by: Adrian Hunter <[email protected]>
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/map.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 92abc8e248c5..89ac5b5dc218 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -449,6 +449,20 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
*/
u64 map__rip_2objdump(struct map *map, u64 rip)
{
+ struct kmap *kmap = __map__kmap(map);
+
+ /*
+ * vmlinux does not have program headers for PTI entry trampolines and
+ * kcore may not either. However the trampoline object code is on the
+ * main kernel map, so just use that instead.
+ */
+ if (kmap && is_entry_trampoline(kmap->name) && kmap->kmaps && kmap->kmaps->machine) {
+ struct map *kernel_map = machine__kernel_map(kmap->kmaps->machine);
+
+ if (kernel_map)
+ map = kernel_map;
+ }
+
if (!map->dso->adjust_symbols)
return rip;

--
2.14.3


2018-06-05 17:56:01

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 43/46] perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING

From: Adrian Hunter <[email protected]>

sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.

In one case, INTEL_PT_SS_NOT_TRACING state was not correctly
transitioning to INTEL_PT_SS_TRACING state due to a missing case clause.
Add it.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 492986a25ef6..3db7f0ee52a8 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1521,6 +1521,7 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)

if (intel_pt_is_switch_ip(ptq, state->to_ip)) {
switch (ptq->switch_state) {
+ case INTEL_PT_SS_NOT_TRACING:
case INTEL_PT_SS_UNKNOWN:
case INTEL_PT_SS_EXPECTING_SWITCH_IP:
err = intel_pt_next_tid(pt, ptq);
--
2.14.3


2018-06-05 17:58:23

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 39/46] perf test code-reading: Fix perf_env setup for PTI entry trampolines

From: Adrian Hunter <[email protected]>

The "Object code reading" test will not create maps for the PTI entry
trampolines unless the machine environment exists to show that the arch is
x86_64.

Signed-off-by: Adrian Hunter <[email protected]>
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/code-reading.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index afa4ce21ba7c..4892bd2dc33e 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -560,6 +560,7 @@ static int do_test_code_reading(bool try_kcore)
pid = getpid();

machine = machine__new_host();
+ machine->env = &perf_env;

ret = machine__create_kernel_maps(machine);
if (ret < 0) {
--
2.14.3


2018-06-05 17:58:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 37/46] perf stat: Display user and system time

From: Jiri Olsa <[email protected]>

Adding the support to read rusage data once the workload is finished and
display the system/user time values:

$ perf stat --null perf bench sched pipe
...

Performance counter stats for 'perf bench sched pipe':

5.342599256 seconds time elapsed

2.544434000 seconds user
4.549691000 seconds sys

It works only in non -r mode and only for workload target.

So as of now, for workload targets, we display 3 types of timings. The
time we meassure in perf stat from enable to disable+period:

5.342599256 seconds time elapsed

The time spent in user and system lands, displayed only for workload
session/target:

2.544434000 seconds user
4.549691000 seconds sys

Those times are the very same displayed by 'time' tool. They are
returned by wait4 call via the getrusage struct interface.

Committer notes:

Had to rename some variables to avoid this on older systems such as
centos:6:

builtin-stat.c: In function 'print_footer':
builtin-stat.c:1831: warning: declaration of 'stime' shadows a global declaration
/usr/include/time.h:297: warning: shadowed declaration is here

Committer testing:

# perf stat --null time perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

Total time: 5.526 [sec]

5.526534 usecs/op
180945 ops/sec
1.00user 6.25system 0:05.52elapsed 131%CPU (0avgtext+0avgdata 8056maxresident)k
0inputs+0outputs (0major+606minor)pagefaults 0swaps

Performance counter stats for 'time perf bench sched pipe':

5.530978744 seconds time elapsed

1.004037000 seconds user
6.259937000 seconds sys

#

Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 40 ++++++++++++++++++++++++----------
tools/perf/builtin-stat.c | 28 +++++++++++++++++++++++-
2 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3a822f308e6d..5dfe102fb5b5 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -310,20 +310,38 @@ Users who wants to get the actual value can apply --no-metric-only.
EXAMPLES
--------

-$ perf stat -- make -j
+$ perf stat -- make

- Performance counter stats for 'make -j':
+ Performance counter stats for 'make':

- 8117.370256 task clock ticks # 11.281 CPU utilization factor
- 678 context switches # 0.000 M/sec
- 133 CPU migrations # 0.000 M/sec
- 235724 pagefaults # 0.029 M/sec
- 24821162526 CPU cycles # 3057.784 M/sec
- 18687303457 instructions # 2302.138 M/sec
- 172158895 cache references # 21.209 M/sec
- 27075259 cache misses # 3.335 M/sec
+ 83723.452481 task-clock:u (msec) # 1.004 CPUs utilized
+ 0 context-switches:u # 0.000 K/sec
+ 0 cpu-migrations:u # 0.000 K/sec
+ 3,228,188 page-faults:u # 0.039 M/sec
+ 229,570,665,834 cycles:u # 2.742 GHz
+ 313,163,853,778 instructions:u # 1.36 insn per cycle
+ 69,704,684,856 branches:u # 832.559 M/sec
+ 2,078,861,393 branch-misses:u # 2.98% of all branches

- Wall-clock time elapsed: 719.554352 msecs
+ 83.409183620 seconds time elapsed
+
+ 74.684747000 seconds user
+ 8.739217000 seconds sys
+
+TIMINGS
+-------
+As displayed in the example above we can display 3 types of timings.
+We always display the time the counters were enabled/alive:
+
+ 83.409183620 seconds time elapsed
+
+For workload sessions we also display time the workloads spent in
+user/system lands:
+
+ 74.684747000 seconds user
+ 8.739217000 seconds sys
+
+Those times are the very same as displayed by the 'time' tool.

CSV FORMAT
----------
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a4f662a462c6..096ccb25c11f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -80,6 +80,9 @@
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/wait.h>

#include "sane_ctype.h"

@@ -175,6 +178,8 @@ static int output_fd;
static int print_free_counters_hint;
static int print_mixed_hw_group_error;
static u64 *walltime_run;
+static bool ru_display = false;
+static struct rusage ru_data;

struct perf_stat {
bool record;
@@ -726,7 +731,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
break;
}
}
- waitpid(child_pid, &status, 0);
+ wait4(child_pid, &status, 0, &ru_data);

if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
@@ -1804,6 +1809,11 @@ static void print_table(FILE *output, int precision, double avg)
fprintf(output, "\n%*s# Final result:\n", indent, "");
}

+static double timeval2double(struct timeval *t)
+{
+ return t->tv_sec + (double) t->tv_usec/USEC_PER_SEC;
+}
+
static void print_footer(void)
{
double avg = avg_stats(&walltime_nsecs_stats) / NSEC_PER_SEC;
@@ -1815,6 +1825,15 @@ static void print_footer(void)

if (run_count == 1) {
fprintf(output, " %17.9f seconds time elapsed", avg);
+
+ if (ru_display) {
+ double ru_utime = timeval2double(&ru_data.ru_utime);
+ double ru_stime = timeval2double(&ru_data.ru_stime);
+
+ fprintf(output, "\n\n");
+ fprintf(output, " %17.9f seconds user\n", ru_utime);
+ fprintf(output, " %17.9f seconds sys\n", ru_stime);
+ }
} else {
double sd = stddev_stats(&walltime_nsecs_stats) / NSEC_PER_SEC;
/*
@@ -2950,6 +2969,13 @@ int cmd_stat(int argc, const char **argv)

setup_system_wide(argc);

+ /*
+ * Display user/system times only for single
+ * run and when there's specified tracee.
+ */
+ if ((run_count == 1) && target__none(&target))
+ ru_display = true;
+
if (run_count < 0) {
pr_err("Run count must be a positive number\n");
parse_options_usage(stat_usage, stat_options, "r", 1);
--
2.14.3


2018-06-05 17:59:19

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 32/46] perf hists: Introduce hist_entry__has_callchain() method

From: Arnaldo Carvalho de Melo <[email protected]>

We'll use this helper more frequently when reworking
symbol_conf.use_callchain logic, where knowing if a hist_entry has
callchains is the important bit, so make going from hist_entry to hists
to evsel easier, compact.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/hist.c | 3 +--
tools/perf/util/sort.h | 6 ++++++
2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 496913eeed75..6173dd489f39 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -461,7 +461,6 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
bool sample_self)
{
struct hist_entry_ops *ops = template->ops;
- struct perf_evsel *evsel = hists_to_evsel(template->hists);
size_t callchain_size = 0;
struct hist_entry *he;
int err = 0;
@@ -473,7 +472,7 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
* e.g.: 'perf record -e cycles,cache-misses/max-stack=10/', so lets
* not waste space for that.
*/
- if (symbol_conf.use_callchain && evsel__has_callchain(evsel))
+ if (hist_entry__has_callchains(template) && symbol_conf.use_callchain)
callchain_size = sizeof(struct callchain_root);

he = ops->new(callchain_size);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f007a26d6f6d..1a046157bfef 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -151,6 +151,12 @@ struct hist_entry {
struct callchain_root callchain[0]; /* must be last member */
};

+static __pure inline bool hist_entry__has_callchains(struct hist_entry *he)
+{
+ const struct perf_evsel *evsel = hists_to_evsel(he->hists);
+ return evsel__has_callchain(evsel);
+}
+
static inline bool hist_entry__has_pairs(struct hist_entry *he)
{
return !list_empty(&he->pairs.node);
--
2.14.3


2018-06-05 17:59:23

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 35/46] perf tools: Fix symbol and object code resolution for vdso32 and vdsox32

From: Adrian Hunter <[email protected]>

Fix __kmod_path__parse() so that perf tools does not treat vdso32 and
vdsox32 as kernel modules and fail to find the object.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: [email protected]
Fixes: 1f121b03d058 ("perf tools: Deal with kernel module names in '[]' correctly")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/dso.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index cdfc2e5f55f5..51cf82cf1882 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -354,6 +354,8 @@ int __kmod_path__parse(struct kmod_path *m, const char *path,
if ((strncmp(name, "[kernel.kallsyms]", 17) == 0) ||
(strncmp(name, "[guest.kernel.kallsyms", 22) == 0) ||
(strncmp(name, "[vdso]", 6) == 0) ||
+ (strncmp(name, "[vdso32]", 8) == 0) ||
+ (strncmp(name, "[vdsox32]", 9) == 0) ||
(strncmp(name, "[vsyscall]", 10) == 0)) {
m->kmod = false;

--
2.14.3


2018-06-05 17:59:27

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 38/46] perf tools: Fix pmu events parsing rule

From: Jiri Olsa <[email protected]>

Currently all the event parsing fails end up
in the event_pmu rule, and display misleading
help like:

$ perf stat -e inst kill
event syntax error: 'inst'
\___ Cannot find PMU `inst'. Missing kernel support?
...

The reason is that the event_pmu is too strong
and match also single string. Changing it to
force the '/' separators to be part of the rule,
and getting the proper error now:

$ perf stat -e inst kill
event syntax error: 'inst'
\___ parser error
Run 'perf list' for a list of valid events
...

Suggested-by: Adrian Hunter <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/parse-events.y | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index e37608a87dba..155d2570274f 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -73,6 +73,7 @@ static void inc_group_count(struct list_head *list,
%type <num> value_sym
%type <head> event_config
%type <head> opt_event_config
+%type <head> opt_pmu_config
%type <term> event_term
%type <head> event_pmu
%type <head> event_legacy_symbol
@@ -224,7 +225,7 @@ event_def: event_pmu |
event_bpf_file

event_pmu:
-PE_NAME opt_event_config
+PE_NAME opt_pmu_config
{
struct list_head *list, *orig_terms, *terms;

@@ -496,6 +497,17 @@ opt_event_config:
$$ = NULL;
}

+opt_pmu_config:
+'/' event_config '/'
+{
+ $$ = $2;
+}
+|
+'/' '/'
+{
+ $$ = NULL;
+}
+
start_terms: event_config
{
struct parse_events_state *parse_state = _parse_state;
--
2.14.3


2018-06-05 17:59:31

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 34/46] perf tests kmod-path: Add tests for vdso32 and vdsox32

From: Adrian Hunter <[email protected]>

Add tests for vdso32 and vdsox32. This will cause the overall test to
fail because __kmod_path__parse() does not handle vdso32 or vdsox32.

Fixes: 1f121b03d058 ("perf tools: Deal with kernel module names in '[]' correctly")
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/kmod-path.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/tools/perf/tests/kmod-path.c b/tools/perf/tests/kmod-path.c
index 8e57d46109de..148dd31cc201 100644
--- a/tools/perf/tests/kmod-path.c
+++ b/tools/perf/tests/kmod-path.c
@@ -127,6 +127,22 @@ int test__kmod_path__parse(struct test *t __maybe_unused, int subtest __maybe_un
M("[vdso]", PERF_RECORD_MISC_KERNEL, false);
M("[vdso]", PERF_RECORD_MISC_USER, false);

+ T("[vdso32]", true , true , false, false, "[vdso32]", NULL);
+ T("[vdso32]", false , true , false, false, NULL , NULL);
+ T("[vdso32]", true , false , false, false, "[vdso32]", NULL);
+ T("[vdso32]", false , false , false, false, NULL , NULL);
+ M("[vdso32]", PERF_RECORD_MISC_CPUMODE_UNKNOWN, false);
+ M("[vdso32]", PERF_RECORD_MISC_KERNEL, false);
+ M("[vdso32]", PERF_RECORD_MISC_USER, false);
+
+ T("[vdsox32]", true , true , false, false, "[vdsox32]", NULL);
+ T("[vdsox32]", false , true , false, false, NULL , NULL);
+ T("[vdsox32]", true , false , false, false, "[vdsox32]", NULL);
+ T("[vdsox32]", false , false , false, false, NULL , NULL);
+ M("[vdsox32]", PERF_RECORD_MISC_CPUMODE_UNKNOWN, false);
+ M("[vdsox32]", PERF_RECORD_MISC_KERNEL, false);
+ M("[vdsox32]", PERF_RECORD_MISC_USER, false);
+
/* path alloc_name alloc_ext kmod comp name ext */
T("[vsyscall]", true , true , false, false, "[vsyscall]", NULL);
T("[vsyscall]", false , true , false, false, NULL , NULL);
--
2.14.3


2018-06-05 17:59:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 36/46] perf record: Enable arbitrary event names thru name= modifier

From: Alexey Budankov <[email protected]>

Enable complex event names containing [.:=,] symbols to be encoded into Perf
trace using name= modifier e.g. like this:

perf record -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\
period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex

Below is how it looks like in the report output. Please note explicit escaped
quoting at cmdline string in the header so that thestring can be directly reused
for another collection in shell:

perf report --header

# ========
...
# cmdline : /root/abudanko/kernel/tip/tools/perf/perf record -v -e cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
# event : name = OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM, , type = 4, size = 112, config = 0x100003c, { sample_period, sample_freq } = 3500000, sample_type = IP|TID|TIME, disabled = 1, inh
...
# ========
#
#
# Total Lost Samples: 0
#
# Samples: 24K of event 'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM'
# Event count (approx.): 86492000000
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ..............................................
#
14.75% futex [kernel.vmlinux] [k] __entry_trampoline_start
...

perf stat -e cpu/name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\',period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex

10000000 process context switches in 16678890291ns (1667.9ns/ctxsw)

Performance counter stats for './futex':

88,095,770,571 CPU_CLK_UNHALTED.THREAD:cmask=0x1

16.679542407 seconds time elapsed

Signed-off-by: Alexey Budankov <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-list.txt | 6 +++++-
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/util/header.c | 20 ++++++++++++++++++--
tools/perf/util/parse-events.l | 18 +++++++++++++++++-
4 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 2549c34a7895..11300dbe35c5 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -124,7 +124,11 @@ The available PMUs and their raw parameters can be listed with
For example the raw event "LSD.UOPS" core pmu event above could
be specified as

- perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...
+ perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...
+
+ or using extended name syntax
+
+ perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

PER SOCKET PMUS
---------------
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index cc37b3a4be76..04168da4268e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -57,6 +57,9 @@ OPTIONS
FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
"no" for disable callgraph.
- 'stack-size': user stack size for dwarf mode
+ - 'name' : User defined event name. Single quotes (') may be used to
+ escape symbols in the name from parsing by shell and tool
+ like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.

See the linkperf:perf-list[1] man page for more parameters.

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 2625cc38a0d6..540cd2dcd3e7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1459,8 +1459,24 @@ static void print_cmdline(struct feat_fd *ff, FILE *fp)

fprintf(fp, "# cmdline : ");

- for (i = 0; i < nr; i++)
- fprintf(fp, "%s ", ff->ph->env.cmdline_argv[i]);
+ for (i = 0; i < nr; i++) {
+ char *argv_i = strdup(ff->ph->env.cmdline_argv[i]);
+ if (!argv_i) {
+ fprintf(fp, "%s ", ff->ph->env.cmdline_argv[i]);
+ } else {
+ char *mem = argv_i;
+ do {
+ char *quote = strchr(argv_i, '\'');
+ if (!quote)
+ break;
+ *quote++ = '\0';
+ fprintf(fp, "%s\\\'", argv_i);
+ argv_i = quote;
+ } while (1);
+ fprintf(fp, "%s ", argv_i);
+ free(mem);
+ }
+ }
fputc('\n', fp);
}

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index a1a01b1ac8b8..5f761f3ed0f3 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -53,7 +53,21 @@ static int str(yyscan_t scanner, int token)
YYSTYPE *yylval = parse_events_get_lval(scanner);
char *text = parse_events_get_text(scanner);

- yylval->str = strdup(text);
+ if (text[0] != '\'') {
+ yylval->str = strdup(text);
+ } else {
+ /*
+ * If a text tag specified on the command line
+ * contains opening single quite ' then it is
+ * expected that the tag ends with single quote
+ * as well, like this:
+ * name=\'CPU_CLK_UNHALTED.THREAD:cmask=1\'
+ * quotes need to be escaped to bypass shell
+ * processing.
+ */
+ yylval->str = strndup(&text[1], strlen(text) - 2);
+ }
+
return token;
}

@@ -176,6 +190,7 @@ num_dec [0-9]+
num_hex 0x[a-fA-F0-9]+
num_raw_hex [a-fA-F0-9]+
name [a-zA-Z_*?\[\]][a-zA-Z0-9_*?.\[\]]*
+name_tag [\'][a-zA-Z_*?\[\]][a-zA-Z0-9_*?\-,\.\[\]:=]*[\']
name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
drv_cfg_term [a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
/* If you add a modifier you need to update check_modifier() */
@@ -344,6 +359,7 @@ r{num_raw_hex} { return raw(yyscanner); }
{bpf_object} { if (!isbpf(yyscanner)) { USER_REJECT }; return str(yyscanner, PE_BPF_OBJECT); }
{bpf_source} { if (!isbpf(yyscanner)) { USER_REJECT }; return str(yyscanner, PE_BPF_SOURCE); }
{name} { return pmu_str_check(yyscanner); }
+{name_tag} { return str(yyscanner, PE_NAME); }
"/" { BEGIN(config); return '/'; }
- { return '-'; }
, { BEGIN(event); return ','; }
--
2.14.3


2018-06-05 17:59:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 33/46] perf hists: Check if a hist_entry has callchains before using them

From: Arnaldo Carvalho de Melo <[email protected]>

So far if we use 'perf record -g' this will make
symbol_conf.use_callchain 'true' and logic will assume that all events
have callchains enabled, but ever since we added the possibility of
setting up callchains for some events (e.g.: -e
cycles/call-graph=dwarf/) while not for others, we limit usage scenarios
by looking at that symbol_conf.use_callchain global boolean, we better
look at each event attributes.

On the road to that we need to look if a hist_entry has callchains, that
is, to go from hist_entry->hists to the evsel that contains it, to then
look at evsel->sample_type for PERF_SAMPLE_CALLCHAIN.

The next step is to add a symbol_conf.ignore_callchains global, to use
in the places where what we really want to know is if callchains should
be ignored, even if present.

Then -g will mean just to select a callchain mode to be applied to all
events not explicitely setting some other callchain mode, i.e. a default
callchain mode, and --no-call-graph will set
symbol_conf.ignore_callchains with that clear intention.

That too will at some point become a per evsel thing, that tools can set
for all or just a few of its evsels.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/browsers/hists.c | 11 ++++++-----
tools/perf/ui/gtk/hists.c | 5 +++--
tools/perf/ui/hist.c | 2 +-
tools/perf/ui/stdio/hist.c | 4 ++--
tools/perf/util/hist.c | 11 ++++++-----
tools/perf/util/hist.h | 6 ++++++
tools/perf/util/sort.h | 3 +--
7 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 22054107f1af..a96f62ca984a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1231,6 +1231,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
int width = browser->b.width;
char folded_sign = ' ';
bool current_entry = ui_browser__is_current_entry(&browser->b, row);
+ bool use_callchain = hist_entry__has_callchains(entry) && symbol_conf.use_callchain;
off_t row_offset = entry->row_offset;
bool first = true;
struct perf_hpp_fmt *fmt;
@@ -1240,7 +1241,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
browser->selection = &entry->ms;
}

- if (symbol_conf.use_callchain) {
+ if (use_callchain) {
hist_entry__init_have_children(entry);
folded_sign = hist_entry__folded(entry);
}
@@ -1276,7 +1277,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
}

if (first) {
- if (symbol_conf.use_callchain) {
+ if (use_callchain) {
ui_browser__printf(&browser->b, "%c ", folded_sign);
width -= 2;
}
@@ -1583,7 +1584,7 @@ hists_browser__scnprintf_headers(struct hist_browser *browser, char *buf,
int column = 0;
int span = 0;

- if (symbol_conf.use_callchain) {
+ if (hists__has_callchains(hists) && symbol_conf.use_callchain) {
ret = scnprintf(buf, size, " ");
if (advance_hpp_check(&dummy_hpp, ret))
return ret;
@@ -1987,7 +1988,7 @@ static int hist_browser__fprintf_entry(struct hist_browser *browser,
bool first = true;
int ret;

- if (symbol_conf.use_callchain) {
+ if (hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
folded_sign = hist_entry__folded(he);
printed += fprintf(fp, "%c ", folded_sign);
}
@@ -2671,7 +2672,7 @@ static void hist_browser__update_percent_limit(struct hist_browser *hb,
he->nr_rows = 0;
}

- if (!he->leaf || !symbol_conf.use_callchain)
+ if (!he->leaf || !hist_entry__has_callchains(he) || !symbol_conf.use_callchain)
goto next;

if (callchain_param.mode == CHAIN_GRAPH_REL) {
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 24e1ec201ffd..b085f1b3e34d 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -382,7 +382,8 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
gtk_tree_store_set(store, &iter, col_idx++, s, -1);
}

- if (symbol_conf.use_callchain && hists__has(hists, sym)) {
+ if (hists__has_callchains(hists) &&
+ symbol_conf.use_callchain && hists__has(hists, sym)) {
if (callchain_param.mode == CHAIN_GRAPH_REL)
total = symbol_conf.cumulate_callchain ?
h->stat_acc->period : h->stat.period;
@@ -479,7 +480,7 @@ static void perf_gtk__add_hierarchy_entries(struct hists *hists,
}
}

- if (symbol_conf.use_callchain && he->leaf) {
+ if (he->leaf && hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
if (callchain_param.mode == CHAIN_GRAPH_REL)
total = symbol_conf.cumulate_callchain ?
he->stat_acc->period : he->stat.period;
diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 706f6f1e9c7d..fe3dfaa64a91 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -207,7 +207,7 @@ static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
if (ret)
return ret;

- if (a->thread != b->thread || !symbol_conf.use_callchain)
+ if (a->thread != b->thread || !hist_entry__has_callchains(a) || !symbol_conf.use_callchain)
return 0;

ret = b->callchain->max_depth - a->callchain->max_depth;
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index c1eb476da91b..69b7a28f7a1c 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -516,7 +516,7 @@ static int hist_entry__hierarchy_fprintf(struct hist_entry *he,
}
printed += putc('\n', fp);

- if (symbol_conf.use_callchain && he->leaf) {
+ if (he->leaf && hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
u64 total = hists__total_period(hists);

printed += hist_entry_callchain__fprintf(he, total, 0, fp);
@@ -550,7 +550,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,

ret = fprintf(fp, "%s\n", bf);

- if (use_callchain)
+ if (hist_entry__has_callchains(he) && use_callchain)
callchain_ret = hist_entry_callchain__fprintf(he, total_period,
0, fp);

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6173dd489f39..b4c8e3a46370 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -410,7 +410,7 @@ static int hist_entry__init(struct hist_entry *he,
map__get(he->mem_info->daddr.map);
}

- if (symbol_conf.use_callchain)
+ if (hist_entry__has_callchains(he) && symbol_conf.use_callchain)
callchain_init(he->callchain);

if (he->raw_data) {
@@ -496,7 +496,7 @@ static u8 symbol__parent_filter(const struct symbol *parent)

static void hist_entry__add_callchain_period(struct hist_entry *he, u64 period)
{
- if (!symbol_conf.use_callchain)
+ if (!hist_entry__has_callchains(he) || !symbol_conf.use_callchain)
return;

he->hists->callchain_period += period;
@@ -990,7 +990,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
iter->he = he;
he_cache[iter->curr++] = he;

- if (symbol_conf.use_callchain)
+ if (hist_entry__has_callchains(he) && symbol_conf.use_callchain)
callchain_append(he->callchain, &cursor, sample->period);
return 0;
}
@@ -1377,7 +1377,8 @@ static int hists__hierarchy_insert_entry(struct hists *hists,
if (new_he) {
new_he->leaf = true;

- if (symbol_conf.use_callchain) {
+ if (hist_entry__has_callchains(new_he) &&
+ symbol_conf.use_callchain) {
callchain_cursor_reset(&callchain_cursor);
if (callchain_merge(&callchain_cursor,
new_he->callchain,
@@ -1418,7 +1419,7 @@ static int hists__collapse_insert_entry(struct hists *hists,
if (symbol_conf.cumulate_callchain)
he_stat__add_stat(iter->stat_acc, he->stat_acc);

- if (symbol_conf.use_callchain) {
+ if (hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
callchain_cursor_reset(&callchain_cursor);
if (callchain_merge(&callchain_cursor,
iter->callchain,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index cafafbf2aa9f..06607c434949 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -220,6 +220,12 @@ static inline struct hists *evsel__hists(struct perf_evsel *evsel)
return &hevsel->hists;
}

+static __pure inline bool hists__has_callchains(struct hists *hists)
+{
+ const struct perf_evsel *evsel = hists_to_evsel(hists);
+ return evsel__has_callchain(evsel);
+}
+
int hists__init(void);
int __hists__init(struct hists *hists, struct perf_hpp_list *hpp_list);

diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 1a046157bfef..7cf2d5cc038e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -153,8 +153,7 @@ struct hist_entry {

static __pure inline bool hist_entry__has_callchains(struct hist_entry *he)
{
- const struct perf_evsel *evsel = hists_to_evsel(he->hists);
- return evsel__has_callchain(evsel);
+ return hists__has_callchains(he->hists);
}

static inline bool hist_entry__has_pairs(struct hist_entry *he)
--
2.14.3


2018-06-05 17:59:58

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 31/46] perf hists: Do not allocate space for callchains for evsels without them

From: Arnaldo Carvalho de Melo <[email protected]>

This is another case were a big switch for callchains,
symbol_conf.use_callchain, is inconvenient, as we can have both events
with and without callchains, no point in checking a single switch to
decide to allocate space for callchains.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/hist.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 34864c87cd3c..496913eeed75 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -461,14 +461,19 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
bool sample_self)
{
struct hist_entry_ops *ops = template->ops;
+ struct perf_evsel *evsel = hists_to_evsel(template->hists);
size_t callchain_size = 0;
struct hist_entry *he;
int err = 0;

if (!ops)
ops = template->ops = &default_ops;
-
- if (symbol_conf.use_callchain)
+ /*
+ * It is possible to have callchains for some evsels but not for others,
+ * e.g.: 'perf record -e cycles,cache-misses/max-stack=10/', so lets
+ * not waste space for that.
+ */
+ if (symbol_conf.use_callchain && evsel__has_callchain(evsel))
callchain_size = sizeof(struct callchain_root);

he = ops->new(callchain_size);
--
2.14.3


2018-06-05 18:00:10

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 28/46] perf evsel: Add has_callchain() helper to make code more compact/clear

From: Arnaldo Carvalho de Melo <[email protected]>

Its common to have the (evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN),
so add an evsel__has_callchain(evsel) helper.

This will actually get more uses as we check that instead of
symbol_conf.use_callchain in places where that produces the same result
but makes this decision to be more fine grained, per evsel.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-sched.c | 3 +--
tools/perf/builtin-script.c | 10 +++-------
tools/perf/builtin-trace.c | 2 +-
tools/perf/tests/parse-events.c | 4 ++--
tools/perf/util/evsel.c | 4 ++--
tools/perf/util/evsel.h | 5 +++++
tools/perf/util/hist.c | 2 +-
tools/perf/util/session.c | 2 +-
8 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 4dfdee668b0c..97f9e755e8e6 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2933,8 +2933,7 @@ static int timehist_check_attr(struct perf_sched *sched,
return -1;
}

- if (sched->show_callchain &&
- !(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ if (sched->show_callchain && !evsel__has_callchain(evsel)) {
pr_info("Samples do not have callchains.\n");
sched->show_callchain = 0;
symbol_conf.use_callchain = 0;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index cefc8813e91e..48e940efb3cb 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -517,7 +517,7 @@ static int perf_session__check_output_opt(struct perf_session *session)

evlist__for_each_entry(session->evlist, evsel) {
not_pipe = true;
- if (evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
+ if (evsel__has_callchain(evsel)) {
use_callchain = true;
break;
}
@@ -532,22 +532,18 @@ static int perf_session__check_output_opt(struct perf_session *session)
*/
if (symbol_conf.use_callchain &&
!output[PERF_TYPE_TRACEPOINT].user_set) {
- struct perf_event_attr *attr;
-
j = PERF_TYPE_TRACEPOINT;

evlist__for_each_entry(session->evlist, evsel) {
if (evsel->attr.type != j)
continue;

- attr = &evsel->attr;
-
- if (attr->sample_type & PERF_SAMPLE_CALLCHAIN) {
+ if (evsel__has_callchain(evsel)) {
output[j].fields |= PERF_OUTPUT_IP;
output[j].fields |= PERF_OUTPUT_SYM;
output[j].fields |= PERF_OUTPUT_SYMOFFSET;
output[j].fields |= PERF_OUTPUT_DSO;
- set_print_ip_opts(attr);
+ set_print_ip_opts(&evsel->attr);
goto out;
}
}
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 560aed7da36a..6a748eca2edb 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2491,7 +2491,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
* to override an explicitely set --max-stack global setting.
*/
evlist__for_each_entry(evlist, evsel) {
- if ((evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
+ if (evsel__has_callchain(evsel) &&
evsel->attr.sample_max_stack == 0)
evsel->attr.sample_max_stack = trace->max_stack;
}
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index b9ebe15afb13..7d4077068454 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -499,7 +499,7 @@ static int test__checkevent_pmu_partial_time_callgraph(struct perf_evlist *evlis
* while this test executes only parse events method.
*/
TEST_ASSERT_VAL("wrong period", 0 == evsel->attr.sample_period);
- TEST_ASSERT_VAL("wrong callgraph", !(PERF_SAMPLE_CALLCHAIN & evsel->attr.sample_type));
+ TEST_ASSERT_VAL("wrong callgraph", !evsel__has_callchain(evsel));
TEST_ASSERT_VAL("wrong time", !(PERF_SAMPLE_TIME & evsel->attr.sample_type));

/* cpu/config=2,call-graph=no,time=0,period=2000/ */
@@ -512,7 +512,7 @@ static int test__checkevent_pmu_partial_time_callgraph(struct perf_evlist *evlis
* while this test executes only parse events method.
*/
TEST_ASSERT_VAL("wrong period", 0 == evsel->attr.sample_period);
- TEST_ASSERT_VAL("wrong callgraph", !(PERF_SAMPLE_CALLCHAIN & evsel->attr.sample_type));
+ TEST_ASSERT_VAL("wrong callgraph", !evsel__has_callchain(evsel));
TEST_ASSERT_VAL("wrong time", !(PERF_SAMPLE_TIME & evsel->attr.sample_type));

return 0;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 150db5ed7400..94fce4f537e9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2197,7 +2197,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}
}

- if (type & PERF_SAMPLE_CALLCHAIN) {
+ if (evsel__has_callchain(evsel)) {
const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);

OVERFLOW_CHECK_u64(array);
@@ -2857,7 +2857,7 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
"Hint: Try again after reducing the number of events.\n"
"Hint: Try increasing the limit with 'ulimit -n <limit>'");
case ENOMEM:
- if ((evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0 &&
+ if (evsel__has_callchain(evsel) &&
access("/proc/sys/kernel/perf_event_max_stack", F_OK) == 0)
return scnprintf(msg, size,
"Not enough memory to setup event with callchain.\n"
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index b13f5f234c8f..d277930b19a1 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -459,6 +459,11 @@ static inline bool perf_evsel__has_branch_callstack(const struct perf_evsel *evs
return evsel->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
}

+static inline bool evsel__has_callchain(const struct perf_evsel *evsel)
+{
+ return (evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
+}
+
typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);

int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 95333b068109..34864c87cd3c 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1757,7 +1757,7 @@ void perf_evsel__output_resort(struct perf_evsel *evsel, struct ui_progress *pro
bool use_callchain;

if (evsel && symbol_conf.use_callchain && !symbol_conf.show_ref_callgraph)
- use_callchain = evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN;
+ use_callchain = evsel__has_callchain(evsel);
else
use_callchain = symbol_conf.use_callchain;

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b998bb475589..8b9369303561 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1094,7 +1094,7 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,

sample_type = evsel->attr.sample_type;

- if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ if (evsel__has_callchain(evsel))
callchain__printf(evsel, sample);

if ((sample_type & PERF_SAMPLE_BRANCH_STACK) && !perf_evsel__has_branch_callstack(evsel))
--
2.14.3


2018-06-05 18:00:11

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 29/46] perf script: Check if evsel has callchains before trying to use it

From: Arnaldo Carvalho de Melo <[email protected]>

We were checking just if callchain processing was asked for by the
user, not if the evsel itself has callchains, and since we can have
some evsels with callchains and others without, check that.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-script.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 48e940efb3cb..b3bf35512d21 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -606,7 +606,7 @@ static int perf_sample__fprintf_start(struct perf_sample *sample,
if (PRINT_FIELD(COMM)) {
if (latency_format)
printed += fprintf(fp, "%8.8s ", thread__comm_str(thread));
- else if (PRINT_FIELD(IP) && symbol_conf.use_callchain)
+ else if (PRINT_FIELD(IP) && evsel__has_callchain(evsel) && symbol_conf.use_callchain)
printed += fprintf(fp, "%s ", thread__comm_str(thread));
else
printed += fprintf(fp, "%16s ", thread__comm_str(thread));
--
2.14.3


2018-06-05 18:01:08

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 27/46] perf report: No need to have report_callchain_help as a global

From: Arnaldo Carvalho de Melo <[email protected]>

It is used in a single place, move the declaration to that function.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-report.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bc133e7a7ac2..cdb5b6949832 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -946,12 +946,6 @@ parse_percent_limit(const struct option *opt, const char *str,
return 0;
}

-#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
-
-const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
- CALLCHAIN_REPORT_HELP
- "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
-
int cmd_report(int argc, const char **argv)
{
struct perf_session *session;
@@ -960,6 +954,10 @@ int cmd_report(int argc, const char **argv)
bool has_br_stack = false;
int branch_mode = -1;
bool branch_call_mode = false;
+#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
+ const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
+ CALLCHAIN_REPORT_HELP
+ "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
const char * const report_usage[] = {
"perf report [<options>]",
--
2.14.3


2018-06-05 18:01:28

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 22/46] perf annotate: Adopt anotation options from symbol_conf

From: Arnaldo Carvalho de Melo <[email protected]>

Continuing to group annotation options in an annotation specific struct.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-annotate.c | 4 ++--
tools/perf/builtin-report.c | 6 ++++--
tools/perf/builtin-top.c | 4 ++--
tools/perf/util/annotate.c | 6 ++++--
tools/perf/util/annotate.h | 4 +++-
tools/perf/util/symbol.c | 1 -
tools/perf/util/symbol.h | 2 --
7 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 7238010f28d4..2ca7172f0780 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -515,9 +515,9 @@ int cmd_annotate(int argc, const char **argv)
OPT_CALLBACK(0, "symfs", NULL, "directory",
"Look for files with symbols relative to this directory",
symbol__config_symfs),
- OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
+ OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src,
"Interleave source code with assembly code (default)"),
- OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
+ OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7a689c933f04..bee6dbfbf11e 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -71,6 +71,7 @@ struct report {
bool group_set;
int max_stack;
struct perf_read_values show_threads_values;
+ struct annotation_options annotation_opts;
const char *pretty_printing_style;
const char *cpu_list;
const char *symbol_filter_str;
@@ -988,6 +989,7 @@ int cmd_report(int argc, const char **argv)
.max_stack = PERF_MAX_STACK_DEPTH,
.pretty_printing_style = "normal",
.socket_filter = -1,
+ .annotation_opts = annotation__default_options,
};
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
@@ -1077,9 +1079,9 @@ int cmd_report(int argc, const char **argv)
"list of cpus to profile"),
OPT_BOOLEAN('I', "show-info", &report.show_full_info,
"Display extended information about perf.data file"),
- OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
+ OPT_BOOLEAN(0, "source", &report.annotation_opts.annotate_src,
"Interleave source code with assembly code (default)"),
- OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
+ OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2c14ca61c657..e65e72c06a01 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1340,9 +1340,9 @@ int cmd_top(int argc, const char **argv)
"only consider symbols in these comms"),
OPT_STRING(0, "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
"only consider these symbols"),
- OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
+ OPT_BOOLEAN(0, "source", &top.annotation_opts.annotate_src,
"Interleave source code with assembly code (default)"),
- OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
+ OPT_BOOLEAN(0, "asm-raw", &top.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 502f9d124a44..ff8f4f474b22 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -47,6 +47,7 @@
struct annotation_options annotation__default_options = {
.use_offset = true,
.jump_arrows = true,
+ .annotate_src = true,
.offset_level = ANNOTATION__OFFSET_JUMP_TARGETS,
};

@@ -1609,6 +1610,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil

static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
{
+ struct annotation_options *opts = args->options;
struct map *map = args->ms.map;
struct dso *dso = map->dso;
char *command;
@@ -1661,8 +1663,8 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
disassembler_style ? disassembler_style : "",
map__rip_2objdump(map, sym->start),
map__rip_2objdump(map, sym->end),
- symbol_conf.annotate_asm_raw ? "" : "--no-show-raw",
- symbol_conf.annotate_src ? "-S" : "",
+ opts->show_asm_raw ? "" : "--no-show-raw",
+ opts->annotate_src ? "-S" : "",
symfs_filename, symfs_filename);

if (err < 0) {
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 013d414b0e57..476ea2a25649 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -73,7 +73,9 @@ struct annotation_options {
show_nr_jumps,
show_nr_samples,
show_total_period,
- show_minmax_cycle;
+ show_minmax_cycle,
+ show_asm_raw,
+ annotate_src;
u8 offset_level;
int min_pcnt;
int max_lines;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 8c84437f2a10..3f632c60888f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -40,7 +40,6 @@ char **vmlinux_path;
struct symbol_conf symbol_conf = {
.use_modules = true,
.try_vmlinux_path = true,
- .annotate_src = true,
.demangle = true,
.demangle_kernel = false,
.cumulate_callchain = true,
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1be9a6bad967..f25fae4b5743 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -108,8 +108,6 @@ struct symbol_conf {
show_cpu_utilization,
initialized,
kptr_restrict,
- annotate_asm_raw,
- annotate_src,
event_group,
demangle,
demangle_kernel,
--
2.14.3


2018-06-05 18:01:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 20/46] perf srcline: Make hist_entry srcline helper consistent with map's

From: Arnaldo Carvalho de Melo <[email protected]>

No need to have "get_srcline", plain hist_entry__srcline() is enough and
shorter.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-c2c.c | 2 +-
tools/perf/util/sort.c | 8 ++++----
tools/perf/util/sort.h | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 2126bfbcb385..307b3594525f 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1976,7 +1976,7 @@ static int filter_cb(struct hist_entry *he)
c2c_he = container_of(he, struct c2c_hist_entry, he);

if (c2c.show_src && !he->srcline)
- he->srcline = hist_entry__get_srcline(he);
+ he->srcline = hist_entry__srcline(he);

calc_width(c2c_he);

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 4ab0b4ab24e4..fed2952ab45a 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -331,7 +331,7 @@ struct sort_entry sort_sym = {

/* --sort srcline */

-char *hist_entry__get_srcline(struct hist_entry *he)
+char *hist_entry__srcline(struct hist_entry *he)
{
return map__srcline(he->ms.map, he->ip, he->ms.sym);
}
@@ -340,9 +340,9 @@ static int64_t
sort__srcline_cmp(struct hist_entry *left, struct hist_entry *right)
{
if (!left->srcline)
- left->srcline = hist_entry__get_srcline(left);
+ left->srcline = hist_entry__srcline(left);
if (!right->srcline)
- right->srcline = hist_entry__get_srcline(right);
+ right->srcline = hist_entry__srcline(right);

return strcmp(right->srcline, left->srcline);
}
@@ -351,7 +351,7 @@ static int hist_entry__srcline_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
if (!he->srcline)
- he->srcline = hist_entry__get_srcline(he);
+ he->srcline = hist_entry__srcline(he);

return repsep_snprintf(bf, size, "%-.*s", width, he->srcline);
}
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 9e6896293bbd..f007a26d6f6d 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -292,5 +292,5 @@ int64_t
sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right);
int64_t
sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
-char *hist_entry__get_srcline(struct hist_entry *he);
+char *hist_entry__srcline(struct hist_entry *he);
#endif /* __PERF_SORT_H */
--
2.14.3


2018-06-05 18:01:55

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 24/46] perf hists browser: Pass annotation_options from tool to browser

From: Arnaldo Carvalho de Melo <[email protected]>

So that things changed in the command line may percolate to the browser
code without using globals.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
[ Merged fix for NO_SLANG=1 build provided by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 3 ++-
tools/perf/ui/browsers/annotate.c | 19 ++++++++++++-------
tools/perf/ui/browsers/hists.c | 29 ++++++++++++++++++++---------
tools/perf/ui/browsers/hists.h | 3 +++
tools/perf/util/annotate.h | 7 ++++---
tools/perf/util/hist.h | 20 ++++++++++++++------
8 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 3ee063598364..2339ae719e1d 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -341,7 +341,7 @@ static void hists__find_annotations(struct hists *hists,
/* skip missing symbols */
nd = rb_next(nd);
} else if (use_browser == 1) {
- key = hist_entry__tui_annotate(he, evsel, NULL);
+ key = hist_entry__tui_annotate(he, evsel, NULL, &ann->opts);

switch (key) {
case -1:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c74f9a219ad1..14b516a3a0de 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -561,7 +561,7 @@ static int report__browse_hists(struct report *rep)
ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
rep->min_percent,
&session->header.env,
- true);
+ true, &rep->annotation_opts);
/*
* Usually "ret" is the last pressed key, and we only
* care if the key notifies us to switch data file.
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 739c158fb39e..bd60a631a481 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -606,7 +606,8 @@ static void *display_thread_tui(void *arg)
perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
top->min_percent,
&top->session->header.env,
- !top->record_opts.overwrite);
+ !top->record_opts.overwrite,
+ &top->annotation_opts);

done = 1;
return NULL;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 3bfe17e176fe..3b4f1c10ff57 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -29,6 +29,7 @@ struct annotate_browser {
struct rb_node *curr_hot;
struct annotation_line *selection;
struct arch *arch;
+ struct annotation_options *opts;
bool searching_backwards;
char search_bf[128];
};
@@ -418,7 +419,7 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
}

pthread_mutex_unlock(&notes->lock);
- symbol__tui_annotate(dl->ops.target.sym, ms->map, evsel, hbt);
+ symbol__tui_annotate(dl->ops.target.sym, ms->map, evsel, hbt, browser->opts);
sym_title(ms->sym, ms->map, title, sizeof(title));
ui_browser__show_title(&browser->b, title);
return true;
@@ -817,24 +818,27 @@ static int annotate_browser__run(struct annotate_browser *browser,
}

int map_symbol__tui_annotate(struct map_symbol *ms, struct perf_evsel *evsel,
- struct hist_browser_timer *hbt)
+ struct hist_browser_timer *hbt,
+ struct annotation_options *opts)
{
- return symbol__tui_annotate(ms->sym, ms->map, evsel, hbt);
+ return symbol__tui_annotate(ms->sym, ms->map, evsel, hbt, opts);
}

int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
- struct hist_browser_timer *hbt)
+ struct hist_browser_timer *hbt,
+ struct annotation_options *opts)
{
/* reset abort key so that it can get Ctrl-C as a key */
SLang_reset_tty();
SLang_init_tty(0, 0, 0);

- return map_symbol__tui_annotate(&he->ms, evsel, hbt);
+ return map_symbol__tui_annotate(&he->ms, evsel, hbt, opts);
}

int symbol__tui_annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel,
- struct hist_browser_timer *hbt)
+ struct hist_browser_timer *hbt,
+ struct annotation_options *opts)
{
struct annotation *notes = symbol__annotation(sym);
struct map_symbol ms = {
@@ -851,6 +855,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
.priv = &ms,
.use_navkeypressed = true,
},
+ .opts = opts,
};
int ret = -1, err;

@@ -860,7 +865,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;

- err = symbol__annotate2(sym, map, evsel, &annotation__default_options, &browser.arch);
+ err = symbol__annotate2(sym, map, evsel, opts, &browser.arch);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index e5f247247daa..3af1b74608ab 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2175,7 +2175,8 @@ struct hist_browser *hist_browser__new(struct hists *hists)
static struct hist_browser *
perf_evsel_browser__new(struct perf_evsel *evsel,
struct hist_browser_timer *hbt,
- struct perf_env *env)
+ struct perf_env *env,
+ struct annotation_options *annotation_opts)
{
struct hist_browser *browser = hist_browser__new(evsel__hists(evsel));

@@ -2183,6 +2184,7 @@ perf_evsel_browser__new(struct perf_evsel *evsel,
browser->hbt = hbt;
browser->env = env;
browser->title = hists_browser__scnprintf_title;
+ browser->annotation_opts = annotation_opts;
}
return browser;
}
@@ -2344,7 +2346,8 @@ do_annotate(struct hist_browser *browser, struct popup_action *act)
return 0;

evsel = hists_to_evsel(browser->hists);
- err = map_symbol__tui_annotate(&act->ms, evsel, browser->hbt);
+ err = map_symbol__tui_annotate(&act->ms, evsel, browser->hbt,
+ browser->annotation_opts);
he = hist_browser__selected_entry(browser);
/*
* offer option to annotate the other branch source or target
@@ -2697,10 +2700,11 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
struct hist_browser_timer *hbt,
float min_pcnt,
struct perf_env *env,
- bool warn_lost_event)
+ bool warn_lost_event,
+ struct annotation_options *annotation_opts)
{
struct hists *hists = evsel__hists(evsel);
- struct hist_browser *browser = perf_evsel_browser__new(evsel, hbt, env);
+ struct hist_browser *browser = perf_evsel_browser__new(evsel, hbt, env, annotation_opts);
struct branch_info *bi;
#define MAX_OPTIONS 16
char *options[MAX_OPTIONS];
@@ -3062,6 +3066,7 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
struct perf_evsel_menu {
struct ui_browser b;
struct perf_evsel *selection;
+ struct annotation_options *annotation_opts;
bool lost_events, lost_events_warned;
float min_pcnt;
struct perf_env *env;
@@ -3163,7 +3168,8 @@ static int perf_evsel_menu__run(struct perf_evsel_menu *menu,
true, hbt,
menu->min_pcnt,
menu->env,
- warn_lost_event);
+ warn_lost_event,
+ menu->annotation_opts);
ui_browser__show_title(&menu->b, title);
switch (key) {
case K_TAB:
@@ -3222,7 +3228,8 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
struct hist_browser_timer *hbt,
float min_pcnt,
struct perf_env *env,
- bool warn_lost_event)
+ bool warn_lost_event,
+ struct annotation_options *annotation_opts)
{
struct perf_evsel *pos;
struct perf_evsel_menu menu = {
@@ -3237,6 +3244,7 @@ static int __perf_evlist__tui_browse_hists(struct perf_evlist *evlist,
},
.min_pcnt = min_pcnt,
.env = env,
+ .annotation_opts = annotation_opts,
};

ui_helpline__push("Press ESC to exit");
@@ -3257,7 +3265,8 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct hist_browser_timer *hbt,
float min_pcnt,
struct perf_env *env,
- bool warn_lost_event)
+ bool warn_lost_event,
+ struct annotation_options *annotation_opts)
{
int nr_entries = evlist->nr_entries;

@@ -3267,7 +3276,8 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,

return perf_evsel__hists_browse(first, nr_entries, help,
false, hbt, min_pcnt,
- env, warn_lost_event);
+ env, warn_lost_event,
+ annotation_opts);
}

if (symbol_conf.event_group) {
@@ -3285,5 +3295,6 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,

return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
hbt, min_pcnt, env,
- warn_lost_event);
+ warn_lost_event,
+ annotation_opts);
}
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index 9428bee076f2..91d3e18b50aa 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -4,6 +4,8 @@

#include "ui/browser.h"

+struct annotation_options;
+
struct hist_browser {
struct ui_browser b;
struct hists *hists;
@@ -12,6 +14,7 @@ struct hist_browser {
struct hist_browser_timer *hbt;
struct pstack *pstack;
struct perf_env *env;
+ struct annotation_options *annotation_opts;
int print_seq;
bool show_dso;
bool show_headers;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 71a734b86873..6e6e2a571928 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -357,13 +357,14 @@ int symbol__tty_annotate2(struct symbol *sym, struct map *map,
#ifdef HAVE_SLANG_SUPPORT
int symbol__tui_annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel,
- struct hist_browser_timer *hbt);
+ struct hist_browser_timer *hbt,
+ struct annotation_options *opts);
#else
static inline int symbol__tui_annotate(struct symbol *sym __maybe_unused,
struct map *map __maybe_unused,
struct perf_evsel *evsel __maybe_unused,
- struct hist_browser_timer *hbt
- __maybe_unused)
+ struct hist_browser_timer *hbt __maybe_unused,
+ struct annotation_options *opts __maybe_unused)
{
return 0;
}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index fbabfd8a215d..cafafbf2aa9f 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -419,19 +419,24 @@ struct hist_browser_timer {
int refresh;
};

+struct annotation_options;
+
#ifdef HAVE_SLANG_SUPPORT
#include "../ui/keysyms.h"
int map_symbol__tui_annotate(struct map_symbol *ms, struct perf_evsel *evsel,
- struct hist_browser_timer *hbt);
+ struct hist_browser_timer *hbt,
+ struct annotation_options *annotation_opts);

int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
- struct hist_browser_timer *hbt);
+ struct hist_browser_timer *hbt,
+ struct annotation_options *annotation_opts);

int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct hist_browser_timer *hbt,
float min_pcnt,
struct perf_env *env,
- bool warn_lost_event);
+ bool warn_lost_event,
+ struct annotation_options *annotation_options);
int script_browse(const char *script_opt);
#else
static inline
@@ -440,20 +445,23 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist __maybe_unused,
struct hist_browser_timer *hbt __maybe_unused,
float min_pcnt __maybe_unused,
struct perf_env *env __maybe_unused,
- bool warn_lost_event __maybe_unused)
+ bool warn_lost_event __maybe_unused,
+ struct annotation_options *annotation_options __maybe_unused)
{
return 0;
}
static inline int map_symbol__tui_annotate(struct map_symbol *ms __maybe_unused,
struct perf_evsel *evsel __maybe_unused,
- struct hist_browser_timer *hbt __maybe_unused)
+ struct hist_browser_timer *hbt __maybe_unused,
+ struct annotation_options *annotation_options __maybe_unused)
{
return 0;
}

static inline int hist_entry__tui_annotate(struct hist_entry *he __maybe_unused,
struct perf_evsel *evsel __maybe_unused,
- struct hist_browser_timer *hbt __maybe_unused)
+ struct hist_browser_timer *hbt __maybe_unused,
+ struct annotation_options *annotation_opts __maybe_unused)
{
return 0;
}
--
2.14.3


2018-06-05 18:01:56

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 26/46] perf test: Use header file util/debug.h

From: Thomas Richter <[email protected]>

Use the header file util/debug.h instead of declaration of verbose
variable.

Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/python-use.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/tests/python-use.c b/tools/perf/tests/python-use.c
index 5d2df65ada6a..40ab72149ce1 100644
--- a/tools/perf/tests/python-use.c
+++ b/tools/perf/tests/python-use.c
@@ -7,8 +7,7 @@
#include <stdlib.h>
#include <linux/compiler.h>
#include "tests.h"
-
-extern int verbose;
+#include "util/debug.h"

int test__python_use(struct test *test __maybe_unused, int subtest __maybe_unused)
{
--
2.14.3


2018-06-05 18:02:16

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 21/46] perf annotate: Pass annotation_options to symbol__annotate()

From: Arnaldo Carvalho de Melo <[email protected]>

Now all callers to symbol__disassemble() can hand it the per-tool
annotation_options, which will allow us to remove lots of stuff
from symbol_options, the kitchen sink of perf configs, reducing its
size and getting annotation specific stuff grouped together.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-top.c | 2 +-
tools/perf/ui/gtk/annotate.c | 2 +-
tools/perf/util/annotate.c | 7 +++++--
tools/perf/util/annotate.h | 1 +
4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5e58cd4de90b..2c14ca61c657 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -133,7 +133,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}

- err = symbol__annotate(sym, map, evsel, 0, NULL);
+ err = symbol__annotate(sym, map, evsel, 0, &top->annotation_opts, NULL);
if (err == 0) {
top->sym_filter_entry = he;
} else {
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index aeeaf15029f0..48428c9acd89 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -169,7 +169,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;

- err = symbol__annotate(sym, map, evsel, 0, NULL);
+ err = symbol__annotate(sym, map, evsel, 0, &annotation__default_options, NULL);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index abcc7e24c365..502f9d124a44 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1067,6 +1067,7 @@ struct annotate_args {
struct arch *arch;
struct map_symbol ms;
struct perf_evsel *evsel;
+ struct annotation_options *options;
s64 offset;
char *line;
int line_nr;
@@ -1803,11 +1804,13 @@ void symbol__calc_percent(struct symbol *sym, struct perf_evsel *evsel)

int symbol__annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, size_t privsize,
+ struct annotation_options *options,
struct arch **parch)
{
struct annotate_args args = {
.privsize = privsize,
.evsel = evsel,
+ .options = options,
};
struct perf_env *env = perf_evsel__env(evsel);
const char *arch_name = perf_env__arch(env);
@@ -2409,7 +2412,7 @@ int symbol__tty_annotate(struct symbol *sym, struct map *map,
struct dso *dso = map->dso;
struct rb_root source_line = RB_ROOT;

- if (symbol__annotate(sym, map, evsel, 0, NULL) < 0)
+ if (symbol__annotate(sym, map, evsel, 0, opts, NULL) < 0)
return -1;

symbol__calc_percent(sym, evsel);
@@ -2655,7 +2658,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
if (perf_evsel__is_group_event(evsel))
nr_pcnt = evsel->nr_members;

- err = symbol__annotate(sym, map, evsel, 0, parch);
+ err = symbol__annotate(sym, map, evsel, 0, options, parch);
if (err)
goto out_free_offsets;

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 20f3326cc640..013d414b0e57 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -306,6 +306,7 @@ void symbol__annotate_zero_histograms(struct symbol *sym);

int symbol__annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, size_t privsize,
+ struct annotation_options *options,
struct arch **parch);
int symbol__annotate2(struct symbol *sym, struct map *map,
struct perf_evsel *evsel,
--
2.14.3


2018-06-05 18:02:18

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 25/46] perf annotate: Move objdump_path to struct annotation_options

From: Arnaldo Carvalho de Melo <[email protected]>

One more step in grouping annotation options.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/arch/common.c | 4 ++--
tools/perf/arch/common.h | 4 +---
tools/perf/builtin-annotate.c | 7 ++++---
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 7 ++++---
tools/perf/ui/browsers/hists.c | 3 ++-
tools/perf/util/annotate.c | 3 +--
tools/perf/util/annotate.h | 1 +
8 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index c6f373508a4f..82657c01a3b8 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -189,7 +189,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
return -1;
}

-int perf_env__lookup_objdump(struct perf_env *env)
+int perf_env__lookup_objdump(struct perf_env *env, const char **path)
{
/*
* For live mode, env->arch will be NULL and we can use
@@ -198,5 +198,5 @@ int perf_env__lookup_objdump(struct perf_env *env)
if (env->arch == NULL)
return 0;

- return perf_env__lookup_binutils_path(env, "objdump", &objdump_path);
+ return perf_env__lookup_binutils_path(env, "objdump", path);
}
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 2d875baa92e6..2167001b18c5 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -4,8 +4,6 @@

#include "../util/env.h"

-extern const char *objdump_path;
-
-int perf_env__lookup_objdump(struct perf_env *env);
+int perf_env__lookup_objdump(struct perf_env *env, const char **path);

#endif /* ARCH_PERF_COMMON_H */
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2339ae719e1d..5eb22cc56363 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -388,8 +388,9 @@ static int __cmd_annotate(struct perf_annotate *ann)
goto out;
}

- if (!objdump_path) {
- ret = perf_env__lookup_objdump(&session->header.env);
+ if (!ann->opts.objdump_path) {
+ ret = perf_env__lookup_objdump(&session->header.env,
+ &ann->opts.objdump_path);
if (ret)
goto out;
}
@@ -521,7 +522,7 @@ int cmd_annotate(int argc, const char **argv)
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
- OPT_STRING(0, "objdump", &objdump_path, "path",
+ OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
"Show event group information together"),
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 14b516a3a0de..bc133e7a7ac2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1094,7 +1094,7 @@ int cmd_report(int argc, const char **argv)
parse_branch_mode),
OPT_BOOLEAN(0, "branch-history", &branch_call_mode,
"add last branch records to call history"),
- OPT_STRING(0, "objdump", &objdump_path, "path",
+ OPT_STRING(0, "objdump", &report.annotation_opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
"Disable symbol demangling"),
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index bd60a631a481..ffdc2769ff9f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1077,8 +1077,9 @@ static int __cmd_top(struct perf_top *top)
if (top->session == NULL)
return -1;

- if (!objdump_path) {
- ret = perf_env__lookup_objdump(&top->session->header.env);
+ if (!top->annotation_opts.objdump_path) {
+ ret = perf_env__lookup_objdump(&top->session->header.env,
+ &top->annotation_opts.objdump_path);
if (ret)
goto out_delete;
}
@@ -1347,7 +1348,7 @@ int cmd_top(int argc, const char **argv)
"Display raw encoding of assembly instructions (default)"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
- OPT_STRING(0, "objdump", &objdump_path, "path",
+ OPT_STRING(0, "objdump", &top.annotation_opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING('M', "disassembler-style", &top.annotation_opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 3af1b74608ab..22054107f1af 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2338,7 +2338,8 @@ do_annotate(struct hist_browser *browser, struct popup_action *act)
struct hist_entry *he;
int err;

- if (!objdump_path && perf_env__lookup_objdump(browser->env))
+ if (!browser->annotation_opts->objdump_path &&
+ perf_env__lookup_objdump(browser->env, &browser->annotation_opts->objdump_path))
return 0;

notes = symbol__annotation(act->ms.sym);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a90777717b60..2baa22933b0e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -51,7 +51,6 @@ struct annotation_options annotation__default_options = {
.offset_level = ANNOTATION__OFFSET_JUMP_TARGETS,
};

-const char *objdump_path;
static regex_t file_lineno;

static struct ins_ops *ins__find(struct arch *arch, const char *name);
@@ -1657,7 +1656,7 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
"%s %s%s --start-address=0x%016" PRIx64
" --stop-address=0x%016" PRIx64
" -l -d %s %s -C \"%s\" 2>/dev/null|grep -v \"%s:\"|expand",
- objdump_path ? objdump_path : "objdump",
+ opts->objdump_path ?: "objdump",
opts->disassembler_style ? "-M " : "",
opts->disassembler_style ?: "",
map__rip_2objdump(map, sym->start),
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 6e6e2a571928..a4c0d91907e6 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -80,6 +80,7 @@ struct annotation_options {
int min_pcnt;
int max_lines;
int context;
+ const char *objdump_path;
const char *disassembler_style;
};

--
2.14.3


2018-06-05 18:02:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 23/46] perf annotate: Move disassembler_style global to annotation_options

From: Arnaldo Carvalho de Melo <[email protected]>

Continuing to group annotation specific stuff into a struct.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 2 +-
tools/perf/util/annotate.c | 5 ++---
tools/perf/util/annotate.h | 3 +--
5 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2ca7172f0780..3ee063598364 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -519,7 +519,7 @@ int cmd_annotate(int argc, const char **argv)
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
- OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
+ OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING(0, "objdump", &objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index bee6dbfbf11e..c74f9a219ad1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1083,7 +1083,7 @@ int cmd_report(int argc, const char **argv)
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
- OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
+ OPT_STRING('M', "disassembler-style", &report.annotation_opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
"Show a column with the sum of periods"),
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index e65e72c06a01..739c158fb39e 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1348,7 +1348,7 @@ int cmd_top(int argc, const char **argv)
"Enable kernel symbol demangling"),
OPT_STRING(0, "objdump", &objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
- OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
+ OPT_STRING('M', "disassembler-style", &top.annotation_opts.disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING('u', "uid", &target->uid_str, "user", "user to profile"),
OPT_CALLBACK(0, "percent-limit", &top, "percent",
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ff8f4f474b22..a90777717b60 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -51,7 +51,6 @@ struct annotation_options annotation__default_options = {
.offset_level = ANNOTATION__OFFSET_JUMP_TARGETS,
};

-const char *disassembler_style;
const char *objdump_path;
static regex_t file_lineno;

@@ -1659,8 +1658,8 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
" --stop-address=0x%016" PRIx64
" -l -d %s %s -C \"%s\" 2>/dev/null|grep -v \"%s:\"|expand",
objdump_path ? objdump_path : "objdump",
- disassembler_style ? "-M " : "",
- disassembler_style ? disassembler_style : "",
+ opts->disassembler_style ? "-M " : "",
+ opts->disassembler_style ?: "",
map__rip_2objdump(map, sym->start),
map__rip_2objdump(map, sym->end),
opts->show_asm_raw ? "" : "--no-show-raw",
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 476ea2a25649..71a734b86873 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -80,6 +80,7 @@ struct annotation_options {
int min_pcnt;
int max_lines;
int context;
+ const char *disassembler_style;
};

enum {
@@ -368,8 +369,6 @@ static inline int symbol__tui_annotate(struct symbol *sym __maybe_unused,
}
#endif

-extern const char *disassembler_style;
-
void annotation_config__init(void);

#endif /* __PERF_ANNOTATE_H */
--
2.14.3


2018-06-05 18:03:31

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 18/46] perf srcline: Introduce map__srcline() to make code more compact

From: Arnaldo Carvalho de Melo <[email protected]>

Replacing a common open coded sequence.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/map.c | 12 ++++++----
tools/perf/util/map.h | 1 +
tools/perf/util/sort.c | 60 +++++++++++---------------------------------------
3 files changed, 22 insertions(+), 51 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6ae97eda370b..92abc8e248c5 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -415,16 +415,20 @@ size_t map__fprintf_dsoname(struct map *map, FILE *fp)
return fprintf(fp, "%s", dsoname);
}

+char *map__srcline(struct map *map, u64 addr, struct symbol *sym)
+{
+ if (map == NULL)
+ return SRCLINE_UNKNOWN;
+ return get_srcline(map->dso, map__rip_2objdump(map, addr), sym, true, true, addr);
+}
+
int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
FILE *fp)
{
- char *srcline;
int ret = 0;

if (map && map->dso) {
- srcline = get_srcline(map->dso,
- map__rip_2objdump(map, addr), NULL,
- true, true, addr);
+ char *srcline = map__srcline(map, addr, NULL);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 97e2a063bd65..4cb90f242bed 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -169,6 +169,7 @@ static inline void __map__zput(struct map **map)
int map__overlap(struct map *l, struct map *r);
size_t map__fprintf(struct map *map, FILE *fp);
size_t map__fprintf_dsoname(struct map *map, FILE *fp);
+char *map__srcline(struct map *map, u64 addr, struct symbol *sym);
int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
FILE *fp);

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 4058ade352a5..71096dbfeb88 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -333,13 +333,7 @@ struct sort_entry sort_sym = {

char *hist_entry__get_srcline(struct hist_entry *he)
{
- struct map *map = he->ms.map;
-
- if (!map)
- return SRCLINE_UNKNOWN;
-
- return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
- he->ms.sym, true, true, he->ip);
+ return map__srcline(he->ms.map, he->ip, he->ms.sym);
}

static int64_t
@@ -375,28 +369,14 @@ static int64_t
sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
{
if (!left->branch_info->srcline_from) {
- struct map *map = left->branch_info->from.map;
- if (!map)
- left->branch_info->srcline_from = SRCLINE_UNKNOWN;
- else
- left->branch_info->srcline_from = get_srcline(map->dso,
- map__rip_2objdump(map,
- left->branch_info->from.al_addr),
- left->branch_info->from.sym,
- true, true,
- left->branch_info->from.al_addr);
+ left->branch_info->srcline_from = map__srcline(left->branch_info->from.map,
+ left->branch_info->from.al_addr,
+ left->branch_info->from.sym);
}
if (!right->branch_info->srcline_from) {
- struct map *map = right->branch_info->from.map;
- if (!map)
- right->branch_info->srcline_from = SRCLINE_UNKNOWN;
- else
- right->branch_info->srcline_from = get_srcline(map->dso,
- map__rip_2objdump(map,
- right->branch_info->from.al_addr),
- right->branch_info->from.sym,
- true, true,
- right->branch_info->from.al_addr);
+ right->branch_info->srcline_from = map__srcline(right->branch_info->from.map,
+ right->branch_info->from.al_addr,
+ right->branch_info->from.sym);
}
return strcmp(right->branch_info->srcline_from, left->branch_info->srcline_from);
}
@@ -420,28 +400,14 @@ static int64_t
sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
{
if (!left->branch_info->srcline_to) {
- struct map *map = left->branch_info->to.map;
- if (!map)
- left->branch_info->srcline_to = SRCLINE_UNKNOWN;
- else
- left->branch_info->srcline_to = get_srcline(map->dso,
- map__rip_2objdump(map,
- left->branch_info->to.al_addr),
- left->branch_info->from.sym,
- true, true,
- left->branch_info->to.al_addr);
+ left->branch_info->srcline_to = map__srcline(left->branch_info->to.map,
+ left->branch_info->to.al_addr,
+ left->branch_info->to.sym);
}
if (!right->branch_info->srcline_to) {
- struct map *map = right->branch_info->to.map;
- if (!map)
- right->branch_info->srcline_to = SRCLINE_UNKNOWN;
- else
- right->branch_info->srcline_to = get_srcline(map->dso,
- map__rip_2objdump(map,
- right->branch_info->to.al_addr),
- right->branch_info->to.sym,
- true, true,
- right->branch_info->to.al_addr);
+ right->branch_info->srcline_to = map__srcline(right->branch_info->to.map,
+ right->branch_info->to.al_addr,
+ right->branch_info->to.sym);
}
return strcmp(right->branch_info->srcline_to, left->branch_info->srcline_to);
}
--
2.14.3


2018-06-05 18:03:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 19/46] perf sort: Introduce addr_map_symbol__srcline() to make code more compact

From: Arnaldo Carvalho de Melo <[email protected]>

Since we have 'struct addr_map_symbol' and the srcline sort order keys
all operate on those, make the code more compact by introducing a
function that receives a pointer to such struct and expands the
arguments to map__srcline().

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/sort.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 71096dbfeb88..4ab0b4ab24e4 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -365,19 +365,20 @@ struct sort_entry sort_srcline = {

/* --sort srcline_from */

+static char *addr_map_symbol__srcline(struct addr_map_symbol *ams)
+{
+ return map__srcline(ams->map, ams->al_addr, ams->sym);
+}
+
static int64_t
sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
{
- if (!left->branch_info->srcline_from) {
- left->branch_info->srcline_from = map__srcline(left->branch_info->from.map,
- left->branch_info->from.al_addr,
- left->branch_info->from.sym);
- }
- if (!right->branch_info->srcline_from) {
- right->branch_info->srcline_from = map__srcline(right->branch_info->from.map,
- right->branch_info->from.al_addr,
- right->branch_info->from.sym);
- }
+ if (!left->branch_info->srcline_from)
+ left->branch_info->srcline_from = addr_map_symbol__srcline(&left->branch_info->from);
+
+ if (!right->branch_info->srcline_from)
+ right->branch_info->srcline_from = addr_map_symbol__srcline(&right->branch_info->from);
+
return strcmp(right->branch_info->srcline_from, left->branch_info->srcline_from);
}

@@ -399,16 +400,12 @@ struct sort_entry sort_srcline_from = {
static int64_t
sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
{
- if (!left->branch_info->srcline_to) {
- left->branch_info->srcline_to = map__srcline(left->branch_info->to.map,
- left->branch_info->to.al_addr,
- left->branch_info->to.sym);
- }
- if (!right->branch_info->srcline_to) {
- right->branch_info->srcline_to = map__srcline(right->branch_info->to.map,
- right->branch_info->to.al_addr,
- right->branch_info->to.sym);
- }
+ if (!left->branch_info->srcline_to)
+ left->branch_info->srcline_to = addr_map_symbol__srcline(&left->branch_info->to);
+
+ if (!right->branch_info->srcline_to)
+ right->branch_info->srcline_to = addr_map_symbol__srcline(&right->branch_info->to);
+
return strcmp(right->branch_info->srcline_to, left->branch_info->srcline_to);
}

--
2.14.3


2018-06-05 18:04:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 15/46] perf tools: Ditch the symbol_conf.nr_events global

From: Arnaldo Carvalho de Melo <[email protected]>

Since over time the places where we need to pass this got reduced
because we can obtain it from evsel->evlist->nr_entries, no need to have
this global anymore.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-kvm.c | 2 --
tools/perf/builtin-top.c | 2 --
tools/perf/util/header.c | 4 ----
tools/perf/util/symbol.h | 1 -
4 files changed, 9 deletions(-)

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 72e2ca096bf5..2b1ef704169f 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1438,8 +1438,6 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
goto out;
}

- symbol_conf.nr_events = kvm->evlist->nr_entries;
-
if (perf_evlist__create_maps(kvm->evlist, &kvm->opts.target) < 0)
usage_with_options(live_usage, live_options);

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 04fe04885e99..4284840022a3 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1462,8 +1462,6 @@ int cmd_top(int argc, const char **argv)
goto out_delete_evlist;
}

- symbol_conf.nr_events = top.evlist->nr_entries;
-
if (top.delay_secs < 1)
top.delay_secs = 1;

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index a8bff2178fbc..2625cc38a0d6 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3312,8 +3312,6 @@ int perf_session__read_header(struct perf_session *session)
lseek(fd, tmp, SEEK_SET);
}

- symbol_conf.nr_events = nr_attrs;
-
perf_header__process_sections(header, fd, &session->tevent,
perf_file_section__process);

@@ -3739,8 +3737,6 @@ int perf_event__process_attr(struct perf_tool *tool __maybe_unused,
perf_evlist__id_add(evlist, evsel, 0, i, event->attr.id[i]);
}

- symbol_conf.nr_events = evlist->nr_entries;
-
return 0;
}

diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1a16438eb3ce..1be9a6bad967 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -90,7 +90,6 @@ struct intlist;

struct symbol_conf {
unsigned short priv_size;
- unsigned short nr_events;
bool try_vmlinux_path,
init_annotation,
force,
--
2.14.3


2018-06-05 18:04:22

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 14/46] perf annotate: Replace symbol__alloc_hists() with symbol__hists()

From: Arnaldo Carvalho de Melo <[email protected]>

Its a bit shorter, so ditch the old symbol__alloc_hists() function.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-top.c | 8 +-------
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/util/annotate.c | 21 ++-------------------
tools/perf/util/annotate.h | 2 +-
4 files changed, 5 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index bc71e899096d..04fe04885e99 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -123,14 +123,9 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
}

notes = symbol__annotation(sym);
- if (notes->src != NULL) {
- pthread_mutex_lock(&notes->lock);
- goto out_assign;
- }
-
pthread_mutex_lock(&notes->lock);

- if (symbol__alloc_hist(sym) < 0) {
+ if (!symbol__hists(sym, top->evlist->nr_entries)) {
pthread_mutex_unlock(&notes->lock);
pr_err("Not enough memory for annotating '%s' symbol!\n",
sym->name);
@@ -140,7 +135,6 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)

err = symbol__annotate(sym, map, evsel, 0, NULL);
if (err == 0) {
-out_assign:
top->sym_filter_entry = he;
} else {
char msg[BUFSIZ];
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 8be40fa903aa..3bfe17e176fe 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -410,7 +410,7 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
notes = symbol__annotation(dl->ops.target.sym);
pthread_mutex_lock(&notes->lock);

- if (notes->src == NULL && symbol__alloc_hist(dl->ops.target.sym) < 0) {
+ if (!symbol__hists(dl->ops.target.sym, evsel->evlist->nr_entries)) {
pthread_mutex_unlock(&notes->lock);
ui__warning("Not enough memory for annotating '%s' symbol!\n",
dl->ops.target.sym->name);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7c194b04a2da..bcd5d3e17b85 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -689,7 +689,7 @@ static struct annotated_source *annotated_source__new(void)
return src;
}

-static void annotated_source__delete(struct annotated_source *src)
+static __maybe_unused void annotated_source__delete(struct annotated_source *src)
{
if (src == NULL)
return;
@@ -729,23 +729,6 @@ static int annotated_source__alloc_histograms(struct annotated_source *src,
return src->histograms ? 0 : -1;
}

-int symbol__alloc_hist(struct symbol *sym)
-{
- size_t size = symbol__size(sym);
- struct annotation *notes = symbol__annotation(sym);
-
- notes->src = annotated_source__new();
- if (notes->src == NULL)
- return -1;
-
- if (annotated_source__alloc_histograms(notes->src, size, symbol_conf.nr_events) < 0) {
- annotated_source__delete(notes->src);
- notes->src = NULL;
- return -1;
- }
- return 0;
-}
-
/* The cycles histogram is lazily allocated. */
static int symbol__alloc_hist_cycles(struct symbol *sym)
{
@@ -868,7 +851,7 @@ static struct cyc_hist *symbol__cycles_hist(struct symbol *sym)
return notes->src->cycles_hist;
}

-static struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists)
+struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists)
{
struct annotation *notes = symbol__annotation(sym);

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 2a73f9084930..7ad503fbff74 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -292,7 +292,7 @@ int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
int hist_entry__inc_addr_samples(struct hist_entry *he, struct perf_sample *sample,
struct perf_evsel *evsel, u64 addr);

-int symbol__alloc_hist(struct symbol *sym);
+struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists);
void symbol__annotate_zero_histograms(struct symbol *sym);

int symbol__annotate(struct symbol *sym, struct map *map,
--
2.14.3


2018-06-05 18:04:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 17/46] perf annotate stdio: Use annotation_options consistently

From: Arnaldo Carvalho de Melo <[email protected]>

Accross all the routines, this way we can have eventually have a
consistent set of defaults for all UIs.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-annotate.c | 15 +++++++--------
tools/perf/builtin-top.c | 14 ++++++++------
tools/perf/util/annotate.c | 31 +++++++++++++++----------------
tools/perf/util/annotate.h | 15 +++++++++------
tools/perf/util/top.h | 3 ++-
5 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2b21bbcd70ea..7238010f28d4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -40,9 +40,8 @@
struct perf_annotate {
struct perf_tool tool;
struct perf_session *session;
+ struct annotation_options opts;
bool use_tui, use_stdio, use_stdio2, use_gtk;
- bool full_paths;
- bool print_line;
bool skip_missing;
bool has_br_stack;
bool group_set;
@@ -289,10 +288,9 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
struct perf_annotate *ann)
{
if (!ann->use_stdio2)
- return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel,
- ann->print_line, ann->full_paths, 0, 0);
- return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel,
- ann->print_line, ann->full_paths);
+ return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel, &ann->opts);
+
+ return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel, &ann->opts);
}

static void hists__find_annotations(struct hists *hists,
@@ -476,6 +474,7 @@ int cmd_annotate(int argc, const char **argv)
.ordered_events = true,
.ordering_requires_timestamps = true,
},
+ .opts = annotation__default_options,
};
struct perf_data data = {
.mode = PERF_DATA_MODE_READ,
@@ -503,9 +502,9 @@ int cmd_annotate(int argc, const char **argv)
"file", "vmlinux pathname"),
OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules,
"load module symbols - WARNING: use only with -k and LIVE kernel"),
- OPT_BOOLEAN('l', "print-line", &annotate.print_line,
+ OPT_BOOLEAN('l', "print-line", &annotate.opts.print_lines,
"print matching source lines (may be slow)"),
- OPT_BOOLEAN('P', "full-paths", &annotate.full_paths,
+ OPT_BOOLEAN('P', "full-paths", &annotate.opts.full_path,
"Don't shorten the displayed pathnames"),
OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing,
"Skip symbols that cannot be annotated"),
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 4284840022a3..5e58cd4de90b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -243,10 +243,9 @@ static void perf_top__show_details(struct perf_top *top)
goto out_unlock;

printf("Showing %s for %s\n", perf_evsel__name(top->sym_evsel), symbol->name);
- printf(" Events Pcnt (>=%d%%)\n", top->sym_pcnt_filter);
+ printf(" Events Pcnt (>=%d%%)\n", top->annotation_opts.min_pcnt);

- more = symbol__annotate_printf(symbol, he->ms.map, top->sym_evsel,
- 0, top->sym_pcnt_filter, top->print_entries, 4);
+ more = symbol__annotate_printf(symbol, he->ms.map, top->sym_evsel, &top->annotation_opts);

if (top->evlist->enabled) {
if (top->zero)
@@ -406,7 +405,7 @@ static void perf_top__print_mapped_keys(struct perf_top *top)

fprintf(stdout, "\t[f] profile display filter (count). \t(%d)\n", top->count_filter);

- fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", top->sym_pcnt_filter);
+ fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", top->annotation_opts.min_pcnt);
fprintf(stdout, "\t[s] annotate symbol. \t(%s)\n", name?: "NULL");
fprintf(stdout, "\t[S] stop annotation.\n");

@@ -509,7 +508,7 @@ static bool perf_top__handle_keypress(struct perf_top *top, int c)
prompt_integer(&top->count_filter, "Enter display event count filter");
break;
case 'F':
- prompt_percent(&top->sym_pcnt_filter,
+ prompt_percent(&top->annotation_opts.min_pcnt,
"Enter details display event filter (percent)");
break;
case 'K':
@@ -1259,7 +1258,7 @@ int cmd_top(int argc, const char **argv)
.overwrite = 1,
},
.max_stack = sysctl__max_stack(),
- .sym_pcnt_filter = 5,
+ .annotation_opts = annotation__default_options,
.nr_threads_synthesize = UINT_MAX,
};
struct record_opts *opts = &top.record_opts;
@@ -1385,6 +1384,9 @@ int cmd_top(int argc, const char **argv)
if (status < 0)
return status;

+ top.annotation_opts.min_pcnt = 5;
+ top.annotation_opts.context = 4;
+
top.evlist = perf_evlist__new();
if (top.evlist == NULL)
return -ENOMEM;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index bcd5d3e17b85..abcc7e24c365 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1985,8 +1985,8 @@ static int annotated_source__addr_fmt_width(struct list_head *lines, u64 start)
}

int symbol__annotate_printf(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool full_paths,
- int min_pcnt, int max_lines, int context)
+ struct perf_evsel *evsel,
+ struct annotation_options *opts)
{
struct dso *dso = map->dso;
char *filename;
@@ -1998,6 +1998,7 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
u64 start = map__rip_2objdump(map, sym->start);
int printed = 2, queue_len = 0, addr_fmt_width;
int more = 0;
+ bool context = opts->context;
u64 len;
int width = symbol_conf.show_total_period ? 12 : 8;
int graph_dotted_len;
@@ -2007,7 +2008,7 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
if (!filename)
return -ENOMEM;

- if (full_paths)
+ if (opts->full_path)
d_filename = filename;
else
d_filename = basename(filename);
@@ -2042,7 +2043,7 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
}

err = annotation_line__print(pos, sym, start, evsel, len,
- min_pcnt, printed, max_lines,
+ opts->min_pcnt, printed, opts->max_lines,
queue, addr_fmt_width);

switch (err) {
@@ -2375,20 +2376,19 @@ static void symbol__calc_lines(struct symbol *sym, struct map *map,
}

int symbol__tty_annotate2(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool print_lines,
- bool full_paths)
+ struct perf_evsel *evsel,
+ struct annotation_options *opts)
{
struct dso *dso = map->dso;
struct rb_root source_line = RB_ROOT;
- struct annotation_options opts = annotation__default_options;
struct annotation *notes = symbol__annotation(sym);
char buf[1024];

- if (symbol__annotate2(sym, map, evsel, &opts, NULL) < 0)
+ if (symbol__annotate2(sym, map, evsel, opts, NULL) < 0)
return -1;

- if (print_lines) {
- srcline_full_filename = full_paths;
+ if (opts->print_lines) {
+ srcline_full_filename = opts->full_path;
symbol__calc_lines(sym, map, &source_line);
print_summary(&source_line, dso->long_name);
}
@@ -2403,8 +2403,8 @@ int symbol__tty_annotate2(struct symbol *sym, struct map *map,
}

int symbol__tty_annotate(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool print_lines,
- bool full_paths, int min_pcnt, int max_lines)
+ struct perf_evsel *evsel,
+ struct annotation_options *opts)
{
struct dso *dso = map->dso;
struct rb_root source_line = RB_ROOT;
@@ -2414,14 +2414,13 @@ int symbol__tty_annotate(struct symbol *sym, struct map *map,

symbol__calc_percent(sym, evsel);

- if (print_lines) {
- srcline_full_filename = full_paths;
+ if (opts->print_lines) {
+ srcline_full_filename = opts->full_path;
symbol__calc_lines(sym, map, &source_line);
print_summary(&source_line, dso->long_name);
}

- symbol__annotate_printf(sym, map, evsel, full_paths,
- min_pcnt, max_lines, 0);
+ symbol__annotate_printf(sym, map, evsel, opts);

annotated_source__purge(symbol__annotation(sym)->src);

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 3dc4ca1d6c08..20f3326cc640 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -67,12 +67,17 @@ struct annotation_options {
bool hide_src_code,
use_offset,
jump_arrows,
+ print_lines,
+ full_path,
show_linenr,
show_nr_jumps,
show_nr_samples,
show_total_period,
show_minmax_cycle;
u8 offset_level;
+ int min_pcnt;
+ int max_lines;
+ int context;
};

enum {
@@ -328,8 +333,8 @@ int symbol__strerror_disassemble(struct symbol *sym, struct map *map,
int errnum, char *buf, size_t buflen);

int symbol__annotate_printf(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool full_paths,
- int min_pcnt, int max_lines, int context);
+ struct perf_evsel *evsel,
+ struct annotation_options *options);
int symbol__annotate_fprintf2(struct symbol *sym, FILE *fp);
void symbol__annotate_zero_histogram(struct symbol *sym, int evidx);
void symbol__annotate_decay_histogram(struct symbol *sym, int evidx);
@@ -340,12 +345,10 @@ int map_symbol__annotation_dump(struct map_symbol *ms, struct perf_evsel *evsel)
bool ui__has_annotation(void);

int symbol__tty_annotate(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool print_lines,
- bool full_paths, int min_pcnt, int max_lines);
+ struct perf_evsel *evsel, struct annotation_options *opts);

int symbol__tty_annotate2(struct symbol *sym, struct map *map,
- struct perf_evsel *evsel, bool print_lines,
- bool full_paths);
+ struct perf_evsel *evsel, struct annotation_options *opts);

#ifdef HAVE_SLANG_SUPPORT
int symbol__tui_annotate(struct symbol *sym, struct map *map,
diff --git a/tools/perf/util/top.h b/tools/perf/util/top.h
index 9892323cdd7c..9add1f72ce95 100644
--- a/tools/perf/util/top.h
+++ b/tools/perf/util/top.h
@@ -3,6 +3,7 @@
#define __PERF_TOP_H 1

#include "tool.h"
+#include "annotate.h"
#include <linux/types.h>
#include <stddef.h>
#include <stdbool.h>
@@ -16,6 +17,7 @@ struct perf_top {
struct perf_tool tool;
struct perf_evlist *evlist;
struct record_opts record_opts;
+ struct annotation_options annotation_opts;
/*
* Symbols will be added here in perf_event__process_sample and will
* get out after decayed.
@@ -35,7 +37,6 @@ struct perf_top {
struct perf_session *session;
struct winsize winsize;
int realtime_prio;
- int sym_pcnt_filter;
const char *sym_filter;
float min_percent;
unsigned int nr_threads_synthesize;
--
2.14.3


2018-06-05 18:05:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 13/46] perf annotate: Stop using symbol_conf.nr_events global in symbol__hists()

From: Arnaldo Carvalho de Melo <[email protected]>

Since now we have evsel->evlist->nr_entries in the single place calling
this function, use it.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f11199f0be27..7c194b04a2da 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -21,6 +21,7 @@
#include "debug.h"
#include "annotate.h"
#include "evsel.h"
+#include "evlist.h"
#include "block-range.h"
#include "string2.h"
#include "arch/common.h"
@@ -867,7 +868,7 @@ static struct cyc_hist *symbol__cycles_hist(struct symbol *sym)
return notes->src->cycles_hist;
}

-static struct annotated_source *symbol__hists(struct symbol *sym)
+static struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists)
{
struct annotation *notes = symbol__annotation(sym);

@@ -881,7 +882,7 @@ static struct annotated_source *symbol__hists(struct symbol *sym)
if (notes->src->histograms == NULL) {
alloc_histograms:
annotated_source__alloc_histograms(notes->src, symbol__size(sym),
- symbol_conf.nr_events);
+ nr_hists);
}

return notes->src;
@@ -895,7 +896,7 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,

if (sym == NULL)
return 0;
- src = symbol__hists(sym);
+ src = symbol__hists(sym, evsel->evlist->nr_entries);
if (src == NULL)
return -ENOMEM;
return __symbol__inc_addr_samples(sym, map, src, evsel->idx, addr, sample);
--
2.14.3


2018-06-05 18:05:04

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 11/46] perf annotate: Introduce symbol__hists()

From: Arnaldo Carvalho de Melo <[email protected]>

In this case we're wanting just notes->src->histograms, allocating it if needed.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a5a6d686004e..467bae0279ce 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -863,18 +863,38 @@ static struct annotation *symbol__get_annotation(struct symbol *sym, bool cycles
return notes;
}

+static struct annotated_source *symbol__hists(struct symbol *sym)
+{
+ struct annotation *notes = symbol__annotation(sym);
+
+ if (notes->src == NULL) {
+ notes->src = annotated_source__new();
+ if (notes->src == NULL)
+ return NULL;
+ goto alloc_histograms;
+ }
+
+ if (notes->src->histograms == NULL) {
+alloc_histograms:
+ annotated_source__alloc_histograms(notes->src, symbol__size(sym),
+ symbol_conf.nr_events);
+ }
+
+ return notes->src;
+}
+
static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, u64 addr,
struct perf_sample *sample)
{
- struct annotation *notes;
+ struct annotated_source *src;

if (sym == NULL)
return 0;
- notes = symbol__get_annotation(sym, false);
- if (notes == NULL)
+ src = symbol__hists(sym);
+ if (src == NULL)
return -ENOMEM;
- return __symbol__inc_addr_samples(sym, map, notes->src, evsel->idx, addr, sample);
+ return __symbol__inc_addr_samples(sym, map, src, evsel->idx, addr, sample);
}

static int symbol__account_cycles(u64 addr, u64 start,
--
2.14.3


2018-06-05 18:05:20

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 12/46] perf annotate: Introduce symbol__cycle_hists()

From: Arnaldo Carvalho de Melo <[email protected]>

In this case we're wanting just notes->src->cycles_hist, allocating it if needed.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 467bae0279ce..f11199f0be27 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -848,19 +848,23 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
return 0;
}

-static struct annotation *symbol__get_annotation(struct symbol *sym, bool cycles)
+static struct cyc_hist *symbol__cycles_hist(struct symbol *sym)
{
struct annotation *notes = symbol__annotation(sym);

if (notes->src == NULL) {
- if (symbol__alloc_hist(sym) < 0)
+ notes->src = annotated_source__new();
+ if (notes->src == NULL)
return NULL;
+ goto alloc_cycles_hist;
}
- if (!notes->src->cycles_hist && cycles) {
- if (symbol__alloc_hist_cycles(sym) < 0)
- return NULL;
+
+ if (!notes->src->cycles_hist) {
+alloc_cycles_hist:
+ symbol__alloc_hist_cycles(sym);
}
- return notes;
+
+ return notes->src->cycles_hist;
}

static struct annotated_source *symbol__hists(struct symbol *sym)
@@ -900,13 +904,13 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
static int symbol__account_cycles(u64 addr, u64 start,
struct symbol *sym, unsigned cycles)
{
- struct annotation *notes;
+ struct cyc_hist *cycles_hist;
unsigned offset;

if (sym == NULL)
return 0;
- notes = symbol__get_annotation(sym, true);
- if (notes == NULL)
+ cycles_hist = symbol__cycles_hist(sym);
+ if (cycles_hist == NULL)
return -ENOMEM;
if (addr < sym->start || addr >= sym->end)
return -ERANGE;
@@ -918,7 +922,7 @@ static int symbol__account_cycles(u64 addr, u64 start,
start = 0;
}
offset = addr - sym->start;
- return __symbol__account_cycles(notes->src->cycles_hist,
+ return __symbol__account_cycles(cycles_hist,
start ? start - sym->start : 0,
offset, cycles,
!!start);
--
2.14.3


2018-06-05 18:05:36

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 16/46] perf annotate: Add comment about annotated_src->nr_histograms

From: Arnaldo Carvalho de Melo <[email protected]>

When we have multiple groups in an evlist, say:

$ perf stat -e '{cycles,instructions},{cache-references,cache-misses}' sleep 1

Performance counter stats for 'sleep 1':

343,134 cycles:u
249,292 instructions:u # 0.73 insn per cycle
15,556 cache-references:u
8,925 cache-misses:u # 57.373 % of all cache refs

1.000957550 seconds time elapsed

$

Then the perf_evsel instances for the two group leaders ("cycles" and
"cache-references") will have evsel->nr_members set to 2, while all the
evsel->evlist->nr_entries will be set to 4, so we can't use
evsel->evlist->nr_entries everywhere, as event groups need to be taken
into account.

But this probably requires us to audit at least the forced-group code,
where we want all of the events to be in a "group", to see them all in
the screen, one column for each, even knowing that they were not
necessarily scheduled to count at the same time by the kernel perf
subsystem.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 7ad503fbff74..3dc4ca1d6c08 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -202,6 +202,10 @@ struct cyc_hist {
/** struct annotated_source - symbols with hits have this attached as in sannotation
*
* @histograms: Array of addr hit histograms per event being monitored
+ * nr_histograms: This may not be the same as evsel->evlist->nr_entries if
+ * we have more than a group in a evlist, where we will want
+ * to see each group separately, that is why symbol__annotate2()
+ * sets src->nr_histograms to evsel->nr_members.
* @lines: If 'print_lines' is specified, per source code line percentages
* @source: source parsed from a disassembler like objdump -dS
* @cyc_hist: Average cycles per basic block
--
2.14.3


2018-06-05 18:05:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 08/46] perf annotate: Introduce constructor/destructor for annotated_source

From: Arnaldo Carvalho de Melo <[email protected]>

More stuff will go in there, all the parts that are not needed when a
symbol had no samples and that were mistakenly added to 'struct
annotation'.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f0c6941bca6c..f6c9bb29ac84 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -678,6 +678,25 @@ static struct arch *arch__find(const char *name)
return bsearch(name, architectures, nmemb, sizeof(struct arch), arch__key_cmp);
}

+static struct annotated_source *annotated_source__new(void)
+{
+ struct annotated_source *src = zalloc(sizeof(*src));
+
+ if (src != NULL)
+ INIT_LIST_HEAD(&src->source);
+
+ return src;
+}
+
+static void annotated_source__delete(struct annotated_source *src)
+{
+ if (src == NULL)
+ return;
+ zfree(&src->histograms);
+ zfree(&src->cycles_hist);
+ free(src);
+}
+
int symbol__alloc_hist(struct symbol *sym)
{
struct annotation *notes = symbol__annotation(sym);
@@ -704,17 +723,17 @@ int symbol__alloc_hist(struct symbol *sym)
if (sizeof_sym_hist > SIZE_MAX / symbol_conf.nr_events)
return -1;

- notes->src = zalloc(sizeof(*notes->src));
+ notes->src = annotated_source__new();
if (notes->src == NULL)
return -1;
notes->src->histograms = calloc(symbol_conf.nr_events, sizeof_sym_hist);
if (notes->src->histograms == NULL) {
- zfree(&notes->src);
+ annotated_source__delete(notes->src);
+ notes->src = NULL;
return -1;
}
notes->src->sizeof_sym_hist = sizeof_sym_hist;
notes->src->nr_histograms = symbol_conf.nr_events;
- INIT_LIST_HEAD(&notes->src->source);
return 0;
}

--
2.14.3


2018-06-05 18:06:09

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 09/46] perf annotate: Introduce annotated_source__alloc_histograms

From: Arnaldo Carvalho de Melo <[email protected]>

So that we can call it independently, in contexts were we know we
already have notes->src allocated.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f6c9bb29ac84..a6fa49bf879b 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -697,10 +697,9 @@ static void annotated_source__delete(struct annotated_source *src)
free(src);
}

-int symbol__alloc_hist(struct symbol *sym)
+static int annotated_source__alloc_histograms(struct annotated_source *src,
+ size_t size, int nr_hists)
{
- struct annotation *notes = symbol__annotation(sym);
- size_t size = symbol__size(sym);
size_t sizeof_sym_hist;

/*
@@ -720,20 +719,29 @@ int symbol__alloc_hist(struct symbol *sym)
sizeof_sym_hist = (sizeof(struct sym_hist) + size * sizeof(struct sym_hist_entry));

/* Check for overflow in zalloc argument */
- if (sizeof_sym_hist > SIZE_MAX / symbol_conf.nr_events)
+ if (sizeof_sym_hist > SIZE_MAX / nr_hists)
return -1;

+ src->sizeof_sym_hist = sizeof_sym_hist;
+ src->nr_histograms = nr_hists;
+ src->histograms = calloc(nr_hists, sizeof_sym_hist) ;
+ return src->histograms ? 0 : -1;
+}
+
+int symbol__alloc_hist(struct symbol *sym)
+{
+ size_t size = symbol__size(sym);
+ struct annotation *notes = symbol__annotation(sym);
+
notes->src = annotated_source__new();
if (notes->src == NULL)
return -1;
- notes->src->histograms = calloc(symbol_conf.nr_events, sizeof_sym_hist);
- if (notes->src->histograms == NULL) {
+
+ if (annotated_source__alloc_histograms(notes->src, size, symbol_conf.nr_events) < 0) {
annotated_source__delete(notes->src);
notes->src = NULL;
return -1;
}
- notes->src->sizeof_sym_hist = sizeof_sym_hist;
- notes->src->nr_histograms = symbol_conf.nr_events;
return 0;
}

--
2.14.3


2018-06-05 18:06:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 06/46] perf annotate: __symbol__acount_cycles doesn't need notes

From: Arnaldo Carvalho de Melo <[email protected]>

It only operates on the notes->src->cyc_hist, just pass that to it.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0f5ed6091e00..a7221f9fa504 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -741,14 +741,11 @@ void symbol__annotate_zero_histograms(struct symbol *sym)
pthread_mutex_unlock(&notes->lock);
}

-static int __symbol__account_cycles(struct annotation *notes,
+static int __symbol__account_cycles(struct cyc_hist *ch,
u64 start,
unsigned offset, unsigned cycles,
unsigned have_start)
{
- struct cyc_hist *ch;
-
- ch = notes->src->cycles_hist;
/*
* For now we can only account one basic block per
* final jump. But multiple could be overlapping.
@@ -870,7 +867,7 @@ static int symbol__account_cycles(u64 addr, u64 start,
start = 0;
}
offset = addr - sym->start;
- return __symbol__account_cycles(notes,
+ return __symbol__account_cycles(notes->src->cycles_hist,
start ? start - sym->start : 0,
offset, cycles,
!!start);
--
2.14.3


2018-06-05 18:06:42

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 10/46] perf annotate: __symbol__inc_addr_samples() needs just annotated_source

From: Arnaldo Carvalho de Melo <[email protected]>

It only operates on the histograms, so no need for the encompassing
'struct annotation'.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/annotate.c | 6 +++---
tools/perf/util/annotate.h | 8 ++++++--
2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a6fa49bf879b..a5a6d686004e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -819,7 +819,7 @@ static int __symbol__account_cycles(struct cyc_hist *ch,
}

static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
- struct annotation *notes, int evidx, u64 addr,
+ struct annotated_source *src, int evidx, u64 addr,
struct perf_sample *sample)
{
unsigned offset;
@@ -835,7 +835,7 @@ static int __symbol__inc_addr_samples(struct symbol *sym, struct map *map,
}

offset = addr - sym->start;
- h = annotation__histogram(notes, evidx);
+ h = annotated_source__histogram(src, evidx);
h->nr_samples++;
h->addr[offset].nr_samples++;
h->period += sample->period;
@@ -874,7 +874,7 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
notes = symbol__get_annotation(sym, false);
if (notes == NULL)
return -ENOMEM;
- return __symbol__inc_addr_samples(sym, map, notes, evsel->idx, addr, sample);
+ return __symbol__inc_addr_samples(sym, map, notes->src, evsel->idx, addr, sample);
}

static int symbol__account_cycles(u64 addr, u64 start,
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 94b60e34c3a7..2a73f9084930 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -267,10 +267,14 @@ void annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
void annotation__update_column_widths(struct annotation *notes);
void annotation__init_column_widths(struct annotation *notes, struct symbol *sym);

+static inline struct sym_hist *annotated_source__histogram(struct annotated_source *src, int idx)
+{
+ return ((void *)src->histograms) + (src->sizeof_sym_hist * idx);
+}
+
static inline struct sym_hist *annotation__histogram(struct annotation *notes, int idx)
{
- return (((void *)notes->src->histograms) +
- (notes->src->sizeof_sym_hist * idx));
+ return annotated_source__histogram(notes->src, idx);
}

static inline struct annotation *symbol__annotation(struct symbol *sym)
--
2.14.3


2018-06-05 18:07:00

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 02/46] perf probe: Use return of map__get() to make code more compact

From: Arnaldo Carvalho de Melo <[email protected]>

The __get() idiom returns a reference count for the object passed, i.e.
all functions of this type return the object passed, so take advantage
of that to make the code more compact.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/probe-event.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 3094f11e7d81..f119eb628dbb 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -165,8 +165,7 @@ static struct map *kernel_get_module_map(const char *module)
if (strncmp(pos->dso->short_name + 1, module,
pos->dso->short_name_len - 2) == 0 &&
module[pos->dso->short_name_len - 2] == '\0') {
- map__get(pos);
- return pos;
+ return map__get(pos);
}
}
return NULL;
--
2.14.3


2018-06-05 18:07:22

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 03/46] perf cgroup: Make evlist__find_cgroup() more compact

From: Arnaldo Carvalho de Melo <[email protected]>

By taking advantage that __get() routines return the pointer to the
object for which a reference count is being get.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/cgroup.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index decb91f9da82..ccd02634a616 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -93,20 +93,17 @@ static int open_cgroup(const char *name)
static struct cgroup *evlist__find_cgroup(struct perf_evlist *evlist, const char *str)
{
struct perf_evsel *counter;
- struct cgroup *cgrp = NULL;
/*
* check if cgrp is already defined, if so we reuse it
*/
evlist__for_each_entry(evlist, counter) {
if (!counter->cgrp)
continue;
- if (!strcmp(counter->cgrp->name, str)) {
- cgrp = cgroup__get(counter->cgrp);
- break;
- }
+ if (!strcmp(counter->cgrp->name, str))
+ return cgroup__get(counter->cgrp);
}

- return cgrp;
+ return NULL;
}

static struct cgroup *cgroup__new(const char *name)
--
2.14.3


2018-06-05 18:07:22

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 01/46] perf tools: Remove dead quote.[ch] code

From: Arnaldo Carvalho de Melo <[email protected]>

In c68677014bac ("perf tools: Remove support for command aliases") we
removed the only remaining use of a function provided by these files, so
ditch it.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/perf.c | 1 -
tools/perf/util/Build | 1 -
tools/perf/util/quote.c | 62 -------------------------------------------------
tools/perf/util/quote.h | 31 -------------------------
4 files changed, 95 deletions(-)
delete mode 100644 tools/perf/util/quote.c
delete mode 100644 tools/perf/util/quote.h

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 51c81509a315..a11cb006f968 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -12,7 +12,6 @@
#include "util/env.h"
#include <subcmd/exec-cmd.h>
#include "util/config.h"
-#include "util/quote.h"
#include <subcmd/run-command.h>
#include "util/parse-events.h"
#include <subcmd/parse-options.h>
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 5d4c45b76895..b604ef334dc9 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -24,7 +24,6 @@ libperf-y += libstring.o
libperf-y += bitmap.o
libperf-y += hweight.o
libperf-y += smt.o
-libperf-y += quote.o
libperf-y += strbuf.o
libperf-y += string.o
libperf-y += strlist.o
diff --git a/tools/perf/util/quote.c b/tools/perf/util/quote.c
deleted file mode 100644
index 22eaa201aa27..000000000000
--- a/tools/perf/util/quote.c
+++ /dev/null
@@ -1,62 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <errno.h>
-#include <stdlib.h>
-#include "strbuf.h"
-#include "quote.h"
-#include "util.h"
-
-/* Help to copy the thing properly quoted for the shell safety.
- * any single quote is replaced with '\'', any exclamation point
- * is replaced with '\!', and the whole thing is enclosed in a
- *
- * E.g.
- * original sq_quote result
- * name ==> name ==> 'name'
- * a b ==> a b ==> 'a b'
- * a'b ==> a'\''b ==> 'a'\''b'
- * a!b ==> a'\!'b ==> 'a'\!'b'
- */
-static inline int need_bs_quote(char c)
-{
- return (c == '\'' || c == '!');
-}
-
-static int sq_quote_buf(struct strbuf *dst, const char *src)
-{
- char *to_free = NULL;
- int ret;
-
- if (dst->buf == src)
- to_free = strbuf_detach(dst, NULL);
-
- ret = strbuf_addch(dst, '\'');
- while (!ret && *src) {
- size_t len = strcspn(src, "'!");
- ret = strbuf_add(dst, src, len);
- src += len;
- while (!ret && need_bs_quote(*src))
- ret = strbuf_addf(dst, "'\\%c\'", *src++);
- }
- if (!ret)
- ret = strbuf_addch(dst, '\'');
- free(to_free);
-
- return ret;
-}
-
-int sq_quote_argv(struct strbuf *dst, const char** argv, size_t maxlen)
-{
- int i, ret;
-
- /* Copy into destination buffer. */
- ret = strbuf_grow(dst, 255);
- for (i = 0; !ret && argv[i]; ++i) {
- ret = strbuf_addch(dst, ' ');
- if (ret)
- break;
- ret = sq_quote_buf(dst, argv[i]);
- if (maxlen && dst->len > maxlen)
- return -ENOSPC;
- }
- return ret;
-}
diff --git a/tools/perf/util/quote.h b/tools/perf/util/quote.h
deleted file mode 100644
index 274bf26d3511..000000000000
--- a/tools/perf/util/quote.h
+++ /dev/null
@@ -1,31 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __PERF_QUOTE_H
-#define __PERF_QUOTE_H
-
-#include <stddef.h>
-
-/* Help to copy the thing properly quoted for the shell safety.
- * any single quote is replaced with '\'', any exclamation point
- * is replaced with '\!', and the whole thing is enclosed in a
- * single quote pair.
- *
- * For example, if you are passing the result to system() as an
- * argument:
- *
- * sprintf(cmd, "foobar %s %s", sq_quote(arg0), sq_quote(arg1))
- *
- * would be appropriate. If the system() is going to call ssh to
- * run the command on the other side:
- *
- * sprintf(cmd, "git-diff-tree %s %s", sq_quote(arg0), sq_quote(arg1));
- * sprintf(rcmd, "ssh %s %s", sq_util/quote.host), sq_quote(cmd));
- *
- * Note that the above examples leak memory! Remember to free result from
- * sq_quote() in a real application.
- */
-
-struct strbuf;
-
-int sq_quote_argv(struct strbuf *, const char **argv, size_t maxlen);
-
-#endif /* __PERF_QUOTE_H */
--
2.14.3


2018-06-05 18:07:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 04/46] perf tools: No need to check if the argument to __get() function is NULL

From: Arnaldo Carvalho de Melo <[email protected]>

Those functions always check if the argument is NULL before trying to
grab a reference count, and also will return the received object, so, to
make code more compact, no need to check for NULL.

Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Krister Johansen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-probe.c | 3 +--
tools/perf/util/hist.c | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index c0065923a525..99de91698de1 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -81,8 +81,7 @@ static int parse_probe_event(const char *str)
params.target_used = true;
}

- if (params.nsi)
- pev->nsi = nsinfo__get(params.nsi);
+ pev->nsi = nsinfo__get(params.nsi);

/* Parse a perf-probe command into event */
ret = parse_perf_probe_command(str, pev);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4d602fba40b2..95333b068109 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1039,7 +1039,7 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
int err, err2;
struct map *alm = NULL;

- if (al && al->map)
+ if (al)
alm = map__get(al->map);

err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
--
2.14.3


2018-06-06 14:54:17

by Paul A. Clarke

[permalink] [raw]
Subject: Re: [PATCH 42/46] perf script powerpc: Python script for hypervisor call statistics

On 06/05/2018 12:50 PM, Arnaldo Carvalho de Melo wrote:
> From: Ravi Bangoria <[email protected]>
>
> Add python script to show hypervisor call statistics. Ex,
>
> # perf record -a -e "{powerpc:hcall_entry,powerpc:hcall_exit}"
> # perf script -s scripts/python/powerpc-hcalls.py
> hcall count min(ns) max(ns) avg(ns)
> --------------------------------------------------------------------
> H_RANDOM 82 838 1164 904
> H_PUT_TCE 47 1078 5928 2003
> H_EOI 266 1336 3546 1654
> H_ENTER 28 1646 4038 1952
> H_PUT_TCE_INDIRECT 230 2166 18168 6109
> H_IPI 238 1072 3232 1688
> H_SEND_LOGICAL_LAN 42 5488 21366 7694
> H_STUFF_TCE 294 986 6210 3591
> H_XIRR 266 2286 6990 3783
> H_PROTECT 10 2196 3556 2555
> H_VIO_SIGNAL 294 1028 2784 1311
> H_ADD_LOGICAL_LAN_BUFFER 53 1978 3450 2600
> H_SEND_CRQ 77 1762 7240 2447

This translation from HCALL code to mnemonic is more generally useful. Is there a good way to make the "hcall_table_lookup" function more generally available, like "syscall_name" in scripts/python/Perf-Trace-Util/lib/Perf/Trace/Util.py ? It's even simpler than the syscall ID-to-name mapping, because the HCALL codes are constant, unlike syscall IDs which vary between arches.

> diff --git a/tools/perf/scripts/python/powerpc-hcalls.py b/tools/perf/scripts/python/powerpc-hcalls.py
> new file mode 100644
> index 000000000000..00e0e7476e55
> --- /dev/null
> +++ b/tools/perf/scripts/python/powerpc-hcalls.py
> @@ -0,0 +1,200 @@
> +# SPDX-License-Identifier: GPL-2.0+
> +#
> +# Copyright (C) 2018 Ravi Bangoria, IBM Corporation
> +#
> +# Hypervisor call statisics
> +
> +import os
> +import sys
> +
> +sys.path.append(os.environ['PERF_EXEC_PATH'] + \
> + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
> +
> +from perf_trace_context import *
> +from Core import *
> +from Util import *
> +
> +# output: {
> +# opcode: {
> +# 'min': minimum time nsec
> +# 'max': maximum time nsec
> +# 'time': average time nsec
> +# 'cnt': counter
> +# } ...
> +# }
> +output = {}
> +
> +# d_enter: {
> +# cpu: {
> +# opcode: nsec
> +# } ...
> +# }
> +d_enter = {}
> +
> +hcall_table = {
> + 4: 'H_REMOVE',
> + 8: 'H_ENTER',
> + 12: 'H_READ',
> + 16: 'H_CLEAR_MOD',
> + 20: 'H_CLEAR_REF',
> + 24: 'H_PROTECT',
> + 28: 'H_GET_TCE',
> + 32: 'H_PUT_TCE',
> + 36: 'H_SET_SPRG0',
> + 40: 'H_SET_DABR',
> + 44: 'H_PAGE_INIT',
> + 48: 'H_SET_ASR',
> + 52: 'H_ASR_ON',
> + 56: 'H_ASR_OFF',
> + 60: 'H_LOGICAL_CI_LOAD',
> + 64: 'H_LOGICAL_CI_STORE',
> + 68: 'H_LOGICAL_CACHE_LOAD',
> + 72: 'H_LOGICAL_CACHE_STORE',
> + 76: 'H_LOGICAL_ICBI',
> + 80: 'H_LOGICAL_DCBF',
> + 84: 'H_GET_TERM_CHAR',
> + 88: 'H_PUT_TERM_CHAR',
> + 92: 'H_REAL_TO_LOGICAL',
> + 96: 'H_HYPERVISOR_DATA',
> + 100: 'H_EOI',
> + 104: 'H_CPPR',
> + 108: 'H_IPI',
> + 112: 'H_IPOLL',
> + 116: 'H_XIRR',
> + 120: 'H_MIGRATE_DMA',
> + 124: 'H_PERFMON',
> + 220: 'H_REGISTER_VPA',
> + 224: 'H_CEDE',
> + 228: 'H_CONFER',
> + 232: 'H_PROD',
> + 236: 'H_GET_PPP',
> + 240: 'H_SET_PPP',
> + 244: 'H_PURR',
> + 248: 'H_PIC',
> + 252: 'H_REG_CRQ',
> + 256: 'H_FREE_CRQ',
> + 260: 'H_VIO_SIGNAL',
> + 264: 'H_SEND_CRQ',
> + 272: 'H_COPY_RDMA',
> + 276: 'H_REGISTER_LOGICAL_LAN',
> + 280: 'H_FREE_LOGICAL_LAN',
> + 284: 'H_ADD_LOGICAL_LAN_BUFFER',
> + 288: 'H_SEND_LOGICAL_LAN',
> + 292: 'H_BULK_REMOVE',
> + 304: 'H_MULTICAST_CTRL',
> + 308: 'H_SET_XDABR',
> + 312: 'H_STUFF_TCE',
> + 316: 'H_PUT_TCE_INDIRECT',
> + 332: 'H_CHANGE_LOGICAL_LAN_MAC',
> + 336: 'H_VTERM_PARTNER_INFO',
> + 340: 'H_REGISTER_VTERM',
> + 344: 'H_FREE_VTERM',
> + 348: 'H_RESET_EVENTS',
> + 352: 'H_ALLOC_RESOURCE',
> + 356: 'H_FREE_RESOURCE',
> + 360: 'H_MODIFY_QP',
> + 364: 'H_QUERY_QP',
> + 368: 'H_REREGISTER_PMR',
> + 372: 'H_REGISTER_SMR',
> + 376: 'H_QUERY_MR',
> + 380: 'H_QUERY_MW',
> + 384: 'H_QUERY_HCA',
> + 388: 'H_QUERY_PORT',
> + 392: 'H_MODIFY_PORT',
> + 396: 'H_DEFINE_AQP1',
> + 400: 'H_GET_TRACE_BUFFER',
> + 404: 'H_DEFINE_AQP0',
> + 408: 'H_RESIZE_MR',
> + 412: 'H_ATTACH_MCQP',
> + 416: 'H_DETACH_MCQP',
> + 420: 'H_CREATE_RPT',
> + 424: 'H_REMOVE_RPT',
> + 428: 'H_REGISTER_RPAGES',
> + 432: 'H_DISABLE_AND_GETC',
> + 436: 'H_ERROR_DATA',
> + 440: 'H_GET_HCA_INFO',
> + 444: 'H_GET_PERF_COUNT',
> + 448: 'H_MANAGE_TRACE',
> + 468: 'H_FREE_LOGICAL_LAN_BUFFER',
> + 472: 'H_POLL_PENDING',
> + 484: 'H_QUERY_INT_STATE',
> + 580: 'H_ILLAN_ATTRIBUTES',
> + 592: 'H_MODIFY_HEA_QP',
> + 596: 'H_QUERY_HEA_QP',
> + 600: 'H_QUERY_HEA',
> + 604: 'H_QUERY_HEA_PORT',
> + 608: 'H_MODIFY_HEA_PORT',
> + 612: 'H_REG_BCMC',
> + 616: 'H_DEREG_BCMC',
> + 620: 'H_REGISTER_HEA_RPAGES',
> + 624: 'H_DISABLE_AND_GET_HEA',
> + 628: 'H_GET_HEA_INFO',
> + 632: 'H_ALLOC_HEA_RESOURCE',
> + 644: 'H_ADD_CONN',
> + 648: 'H_DEL_CONN',
> + 664: 'H_JOIN',
> + 676: 'H_VASI_STATE',
> + 688: 'H_ENABLE_CRQ',
> + 696: 'H_GET_EM_PARMS',
> + 720: 'H_SET_MPP',
> + 724: 'H_GET_MPP',
> + 748: 'H_HOME_NODE_ASSOCIATIVITY',
> + 756: 'H_BEST_ENERGY',
> + 764: 'H_XIRR_X',
> + 768: 'H_RANDOM',
> + 772: 'H_COP',
> + 788: 'H_GET_MPP_X',
> + 796: 'H_SET_MODE',
> + 61440: 'H_RTAS',
> +}
> +
> +def hcall_table_lookup(opcode):
> + if (hcall_table.has_key(opcode)):
> + return hcall_table[opcode]
> + else:
> + return opcode
> +
> +print_ptrn = '%-28s%10s%10s%10s%10s'
> +
> +def trace_end():
> + print print_ptrn % ('hcall', 'count', 'min(ns)', 'max(ns)', 'avg(ns)')
> + print '-' * 68
> + for opcode in output:
> + h_name = hcall_table_lookup(opcode)
> + time = output[opcode]['time']
> + cnt = output[opcode]['cnt']
> + min_t = output[opcode]['min']
> + max_t = output[opcode]['max']
> +
> + print print_ptrn % (h_name, cnt, min_t, max_t, time/cnt)
> +
> +def powerpc__hcall_exit(name, context, cpu, sec, nsec, pid, comm, callchain,
> + opcode, retval):
> + if (d_enter.has_key(cpu) and d_enter[cpu].has_key(opcode)):
> + diff = nsecs(sec, nsec) - d_enter[cpu][opcode]
> +
> + if (output.has_key(opcode)):
> + output[opcode]['time'] += diff
> + output[opcode]['cnt'] += 1
> + if (output[opcode]['min'] > diff):
> + output[opcode]['min'] = diff
> + if (output[opcode]['max'] < diff):
> + output[opcode]['max'] = diff
> + else:
> + output[opcode] = {
> + 'time': diff,
> + 'cnt': 1,
> + 'min': diff,
> + 'max': diff,
> + }
> +
> + del d_enter[cpu][opcode]
> +# else:
> +# print "Can't find matching hcall_enter event. Ignoring sample"
> +
> +def powerpc__hcall_entry(event_name, context, cpu, sec, nsec, pid, comm,
> + callchain, opcode):
> + if (d_enter.has_key(cpu)):
> + d_enter[cpu][opcode] = nsecs(sec, nsec)
> + else:
> + d_enter[cpu] = {opcode: nsecs(sec, nsec)}
>

PC


2018-06-07 05:29:54

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL 00/46] perf/core fixes and improvements


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> From: Arnaldo Carvalho de Melo <[email protected]>
>
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 7869e5889477e4e32e4024d665431b35e8b7b693:
>
> Merge remote-tracking branch 'tip/perf/urgent' into perf/core (2018-06-04 10:28:20 -0300)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.18-20180605
>
> for you to fetch changes up to 03ac4e71cd120d2c3411d106d00d266114575f74:
>
> perf intel-pt: Fix "Unexpected indirect branch" error (2018-06-05 12:28:52 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> perf stat:
>
> . Display user and system time for workload targets (Jiri Olsa)
>
> perf record:
>
> . Enable arbitrary event names thru name= modifier (Alexey Budankov)
>
> PowerPC:
>
> . Add a python script for hypervisor call statistics (Ravi Bangoria)
>
> Intel PT: (Adrian Hunter)
>
> . Fix sync_switch INTEL_PT_SS_NOT_TRACING
>
> . Fix decoding to accept CBR between FUP and corresponding TIP
>
> . Fix MTC timing after overflow
>
> . Fix "Unexpected indirect branch" error
>
> perf test:
>
> . record+probe_libc_inet_pton:
>
> . To get the symbol table for dynamic
> shared objects on ubuntu we need to pass the -D/--dynamic command line
> option, unlike with the fedora distros (Arnaldo Carvalho de Melo)
>
> . code-reading:
>
> . Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
>
> . kmod-path:
>
> . Add tests for vdso32 and vdsox32 (Adrian Hunter)
>
> . Use header file util/debug.h (Thomas Richter)
>
> perf annotate:
>
> . Make the various UI backends (stdio, TUI, gtk) use more consistently
> structs with annotation options as specified by the user (Arnaldo Carvalho de Melo)
>
> . Move annotation specific knobs from the symbol_conf global kitchen
> sink to the annotation option structs (Arnaldo Carvalho de Melo)
>
> Core:
>
> . Fix misleading error for some unparsable events mentioning PMUs when
> those are not involved in the problem (Jiri Olsa)
>
> - Fix symbol and object code resolution for vdso32 and vdsox32 (Adrian Hunter)
>
> . No need to check for null when passing pointers to foo__get() style
> refcount grabbing helpers, just like in the kernel and with free(),
> its safe to pass a NULL pointer to avoid having to check it before
> each and every foo__get() call (Arnaldo Carvalho de Melo)
>
> . Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
>
> . Remove some needless globals, making them local (Arnaldo Carvalho de Melo)
>
> . Reduce usage of symbol_conf.use_callchain, using other means of
> finding out if callchains are in use or available for specific events,
> as we evolved this codebase to allow requesting callchains for just
> a subset of the monitored events. In time it will help polish
> recording and showing mixed sets accross the various tools:
>
> perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions
>
> (Arnaldo Carvalho de Melo)
>
> . Consider PTI entry trampolines in map__rip_2objdump() (Adrian Hunter)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> ----------------------------------------------------------------
> Adrian Hunter (8):
> perf tests kmod-path: Add tests for vdso32 and vdsox32
> perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
> perf test code-reading: Fix perf_env setup for PTI entry trampolines
> perf map: Consider PTI entry trampolines in rip_2objdump()
> perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
> perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
> perf intel-pt: Fix MTC timing after overflow
> perf intel-pt: Fix "Unexpected indirect branch" error
>
> Alexey Budankov (1):
> perf record: Enable arbitrary event names thru name= modifier
>
> Arnaldo Carvalho de Melo (33):
> perf tools: Remove dead quote.[ch] code
> perf probe: Use return of map__get() to make code more compact
> perf cgroup: Make evlist__find_cgroup() more compact
> perf tools: No need to check if the argument to __get() function is NULL
> perf annotate: Pass perf_evsel instead of just evsel->idx
> perf annotate: __symbol__acount_cycles doesn't need notes
> perf annotate: Split allocation of annotated_source struct
> perf annotate: Introduce constructor/destructor for annotated_source
> perf annotate: Introduce annotated_source__alloc_histograms
> perf annotate: __symbol__inc_addr_samples() needs just annotated_source
> perf annotate: Introduce symbol__hists()
> perf annotate: Introduce symbol__cycle_hists()
> perf annotate: Stop using symbol_conf.nr_events global in symbol__hists()
> perf annotate: Replace symbol__alloc_hists() with symbol__hists()
> perf tools: Ditch the symbol_conf.nr_events global
> perf annotate: Add comment about annotated_src->nr_histograms
> perf annotate stdio: Use annotation_options consistently
> perf srcline: Introduce map__srcline() to make code more compact
> perf sort: Introduce addr_map_symbol__srcline() to make code more compact
> perf srcline: Make hist_entry srcline helper consistent with map's
> perf annotate: Pass annotation_options to symbol__annotate()
> perf annotate: Adopt anotation options from symbol_conf
> perf annotate: Move disassembler_style global to annotation_options
> perf hists browser: Pass annotation_options from tool to browser
> perf annotate: Move objdump_path to struct annotation_options
> perf report: No need to have report_callchain_help as a global
> perf evsel: Add has_callchain() helper to make code more compact/clear
> perf script: Check if evsel has callchains before trying to use it
> perf sched: Use sched->show_callchain where appropriate
> perf hists: Do not allocate space for callchains for evsels without them
> perf hists: Introduce hist_entry__has_callchain() method
> perf hists: Check if a hist_entry has callchains before using them
> perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
>
> Jiri Olsa (2):
> perf stat: Display user and system time
> perf tools: Fix pmu events parsing rule
>
> Ravi Bangoria (1):
> perf script powerpc: Python script for hypervisor call statistics
>
> Thomas Richter (1):
> perf test: Use header file util/debug.h
>
> tools/perf/Documentation/perf-list.txt | 6 +-
> tools/perf/Documentation/perf-record.txt | 3 +
> tools/perf/Documentation/perf-stat.txt | 40 +++--
> tools/perf/arch/common.c | 4 +-
> tools/perf/arch/common.h | 4 +-
> tools/perf/builtin-annotate.c | 36 ++--
> tools/perf/builtin-c2c.c | 2 +-
> tools/perf/builtin-kvm.c | 2 -
> tools/perf/builtin-probe.c | 3 +-
> tools/perf/builtin-report.c | 39 ++--
> tools/perf/builtin-sched.c | 14 +-
> tools/perf/builtin-script.c | 12 +-
> tools/perf/builtin-stat.c | 28 ++-
> tools/perf/builtin-top.c | 48 +++--
> tools/perf/builtin-trace.c | 2 +-
> tools/perf/perf.c | 1 -
> .../perf/scripts/python/bin/powerpc-hcalls-record | 2 +
> .../perf/scripts/python/bin/powerpc-hcalls-report | 2 +
> tools/perf/scripts/python/powerpc-hcalls.py | 200 +++++++++++++++++++++
> tools/perf/tests/code-reading.c | 1 +
> tools/perf/tests/kmod-path.c | 16 ++
> tools/perf/tests/parse-events.c | 4 +-
> tools/perf/tests/python-use.c | 3 +-
> .../tests/shell/record+probe_libc_inet_pton.sh | 2 +-
> tools/perf/ui/browsers/annotate.c | 21 ++-
> tools/perf/ui/browsers/hists.c | 43 +++--
> tools/perf/ui/browsers/hists.h | 3 +
> tools/perf/ui/gtk/annotate.c | 2 +-
> tools/perf/ui/gtk/hists.c | 5 +-
> tools/perf/ui/hist.c | 2 +-
> tools/perf/ui/stdio/hist.c | 4 +-
> tools/perf/util/Build | 1 -
> tools/perf/util/annotate.c | 160 ++++++++++-------
> tools/perf/util/annotate.h | 53 ++++--
> tools/perf/util/cgroup.c | 9 +-
> tools/perf/util/dso.c | 2 +
> tools/perf/util/evsel.c | 4 +-
> tools/perf/util/evsel.h | 5 +
> tools/perf/util/header.c | 24 ++-
> tools/perf/util/hist.c | 23 ++-
> tools/perf/util/hist.h | 26 ++-
> .../perf/util/intel-pt-decoder/intel-pt-decoder.c | 23 ++-
> .../perf/util/intel-pt-decoder/intel-pt-decoder.h | 9 +
> tools/perf/util/intel-pt.c | 5 +
> tools/perf/util/map.c | 26 ++-
> tools/perf/util/map.h | 1 +
> tools/perf/util/parse-events.l | 18 +-
> tools/perf/util/parse-events.y | 14 +-
> tools/perf/util/probe-event.c | 3 +-
> tools/perf/util/quote.c | 62 -------
> tools/perf/util/quote.h | 31 ----
> tools/perf/util/session.c | 2 +-
> tools/perf/util/sort.c | 81 +++------
> tools/perf/util/sort.h | 7 +-
> tools/perf/util/symbol.c | 1 -
> tools/perf/util/symbol.h | 3 -
> tools/perf/util/top.h | 3 +-
> 57 files changed, 731 insertions(+), 419 deletions(-)
> create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-record
> create mode 100644 tools/perf/scripts/python/bin/powerpc-hcalls-report
> create mode 100644 tools/perf/scripts/python/powerpc-hcalls.py
> delete mode 100644 tools/perf/util/quote.c
> delete mode 100644 tools/perf/util/quote.h

Pulled, thanks a lot Arnaldo!

Ingo

2018-06-07 05:36:27

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH 42/46] perf script powerpc: Python script for hypervisor call statistics

Hi Paul,

On 06/06/2018 08:23 PM, Paul Clarke wrote:
> On 06/05/2018 12:50 PM, Arnaldo Carvalho de Melo wrote:
>> From: Ravi Bangoria <[email protected]>
>>
>> Add python script to show hypervisor call statistics. Ex,
>>
>> # perf record -a -e "{powerpc:hcall_entry,powerpc:hcall_exit}"
>> # perf script -s scripts/python/powerpc-hcalls.py
>> hcall count min(ns) max(ns) avg(ns)
>> --------------------------------------------------------------------
>> H_RANDOM 82 838 1164 904
>> H_PUT_TCE 47 1078 5928 2003
>> H_EOI 266 1336 3546 1654
>> H_ENTER 28 1646 4038 1952
>> H_PUT_TCE_INDIRECT 230 2166 18168 6109
>> H_IPI 238 1072 3232 1688
>> H_SEND_LOGICAL_LAN 42 5488 21366 7694
>> H_STUFF_TCE 294 986 6210 3591
>> H_XIRR 266 2286 6990 3783
>> H_PROTECT 10 2196 3556 2555
>> H_VIO_SIGNAL 294 1028 2784 1311
>> H_ADD_LOGICAL_LAN_BUFFER 53 1978 3450 2600
>> H_SEND_CRQ 77 1762 7240 2447
>
> This translation from HCALL code to mnemonic is more generally useful. Is there a good way to make the "hcall_table_lookup" function more generally available, like "syscall_name" in scripts/python/Perf-Trace-Util/lib/Perf/Trace/Util.py ? It's even simpler than the syscall ID-to-name mapping, because the HCALL codes are constant, unlike syscall IDs which vary between arches.
>

Right, but I don't see any other python script for hcalls. So, I thought
to add it in the script itself. Let me know if you are planning to add
any :) We will move it in a generic module.

Ravi


2018-06-07 13:44:01

by Paul A. Clarke

[permalink] [raw]
Subject: Re: [PATCH 42/46] perf script powerpc: Python script for hypervisor call statistics

On 06/07/2018 12:34 AM, Ravi Bangoria wrote:
> On 06/06/2018 08:23 PM, Paul Clarke wrote:
>> On 06/05/2018 12:50 PM, Arnaldo Carvalho de Melo wrote:
>>> From: Ravi Bangoria <[email protected]>
>>>
>>> Add python script to show hypervisor call statistics. Ex,
>>>
>>> # perf record -a -e "{powerpc:hcall_entry,powerpc:hcall_exit}"
>>> # perf script -s scripts/python/powerpc-hcalls.py
>>> hcall count min(ns) max(ns) avg(ns)
>>> --------------------------------------------------------------------
>>> H_RANDOM 82 838 1164 904
>>> H_PUT_TCE 47 1078 5928 2003
>>> H_EOI 266 1336 3546 1654
>>> H_ENTER 28 1646 4038 1952
>>> H_PUT_TCE_INDIRECT 230 2166 18168 6109
>>> H_IPI 238 1072 3232 1688
>>> H_SEND_LOGICAL_LAN 42 5488 21366 7694
>>> H_STUFF_TCE 294 986 6210 3591
>>> H_XIRR 266 2286 6990 3783
>>> H_PROTECT 10 2196 3556 2555
>>> H_VIO_SIGNAL 294 1028 2784 1311
>>> H_ADD_LOGICAL_LAN_BUFFER 53 1978 3450 2600
>>> H_SEND_CRQ 77 1762 7240 2447
>>
>> This translation from HCALL code to mnemonic is more generally useful. Is there a good way to make the "hcall_table_lookup" function more generally available, like "syscall_name" in scripts/python/Perf-Trace-Util/lib/Perf/Trace/Util.py ? It's even simpler than the syscall ID-to-name mapping, because the HCALL codes are constant, unlike syscall IDs which vary between arches.
>>
>
> Right, but I don't see any other python script for hcalls. So, I thought
> to add it in the script itself. Let me know if you are planning to add
> any :) We will move it in a generic module.

We're already doing the same thing in an external script. (https://github.com/open-power-sdk/curt/blob/master/curt.py#L264 revision 107c97c3aadba3177bc8cf35b06ba165e6ac557f.)

...which reminds me that I should consider sending that script to linux-perf-users to see if it's interesting enough to be included upstream.

PC


2018-06-07 13:48:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 42/46] perf script powerpc: Python script for hypervisor call statistics

Em Thu, Jun 07, 2018 at 08:41:46AM -0500, Paul Clarke escreveu:
> On 06/07/2018 12:34 AM, Ravi Bangoria wrote:
> > On 06/06/2018 08:23 PM, Paul Clarke wrote:
> >> On 06/05/2018 12:50 PM, Arnaldo Carvalho de Melo wrote:
> >>> From: Ravi Bangoria <[email protected]>
> >>>
> >>> Add python script to show hypervisor call statistics. Ex,
> >>>
> >>> # perf record -a -e "{powerpc:hcall_entry,powerpc:hcall_exit}"
> >>> # perf script -s scripts/python/powerpc-hcalls.py
> >>> hcall count min(ns) max(ns) avg(ns)
> >>> --------------------------------------------------------------------
> >>> H_RANDOM 82 838 1164 904
> >>> H_PUT_TCE 47 1078 5928 2003
> >>> H_EOI 266 1336 3546 1654
> >>> H_ENTER 28 1646 4038 1952
> >>> H_PUT_TCE_INDIRECT 230 2166 18168 6109
> >>> H_IPI 238 1072 3232 1688
> >>> H_SEND_LOGICAL_LAN 42 5488 21366 7694
> >>> H_STUFF_TCE 294 986 6210 3591
> >>> H_XIRR 266 2286 6990 3783
> >>> H_PROTECT 10 2196 3556 2555
> >>> H_VIO_SIGNAL 294 1028 2784 1311
> >>> H_ADD_LOGICAL_LAN_BUFFER 53 1978 3450 2600
> >>> H_SEND_CRQ 77 1762 7240 2447
> >>
> >> This translation from HCALL code to mnemonic is more generally useful. Is there a good way to make the "hcall_table_lookup" function more generally available, like "syscall_name" in scripts/python/Perf-Trace-Util/lib/Perf/Trace/Util.py ? It's even simpler than the syscall ID-to-name mapping, because the HCALL codes are constant, unlike syscall IDs which vary between arches.
> >>
> >
> > Right, but I don't see any other python script for hcalls. So, I thought
> > to add it in the script itself. Let me know if you are planning to add
> > any :) We will move it in a generic module.

> We're already doing the same thing in an external script.
> (https://github.com/open-power-sdk/curt/blob/master/curt.py#L264
> revision 107c97c3aadba3177bc8cf35b06ba165e6ac557f.)

> ...which reminds me that I should consider sending that script to
> linux-perf-users to see if it's interesting enough to be included
> upstream.

If you don't have any restriction, then its interesting to have it
included, I think, this way we can find these kinds of needed common
infrastructure more easily, and also when doing changes take into
account existing scripts.

- Arnaldo