LinuxLists.cc - [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v8)

2014-02-07 01:35:41

Subject: [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v8)

Hello,

This is a new attempt to implement a cumulative hist period report.
This work started from Arun's SORT_INCLUSIVE patch [1], but I rewrote
it from scratch.

In this version, I separated out the --percentage patchset so that this
series does not depend on it, since it requires further work.

Please see patch 01/21. I refactored the functions that add hist
entries to use struct hist_entry_iter. While I converted all functions
carefully, it would be good if someone could test and confirm that I
didn't break anything - especially for the branch stack and mem code.

This patchset basically adds the period of a sample to every node in
its callchain. A hist_entry now has additional fields to keep the
cumulative period if the --children option is given to perf report.
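
The accumulation described above can be sketched roughly as follows.
This is only an illustration, not the code in this series: struct entry,
find_or_add() and add_sample() are hypothetical names, the table is a
fixed-size array, and recursion (the same symbol appearing twice in one
callchain) is deliberately ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a hist entry (not perf's struct). */
struct entry {
	const char *sym;
	unsigned long long self;     /* periods of samples taken in this symbol */
	unsigned long long children; /* self plus periods of samples below it */
};

#define MAX_ENTRIES 16

static struct entry entries[MAX_ENTRIES];
static int nr_entries;

/* Look up an entry by symbol name, creating it on first use. */
static struct entry *find_or_add(const char *sym)
{
	int i;

	for (i = 0; i < nr_entries; i++)
		if (!strcmp(entries[i].sym, sym))
			return &entries[i];

	if (nr_entries == MAX_ENTRIES)
		return NULL;

	entries[nr_entries].sym = sym;
	return &entries[nr_entries++];
}

/*
 * chain[0] is the sampled symbol, followed by its callers (innermost
 * first).  The period goes to 'self' only for chain[0], but to
 * 'children' for every node in the chain.
 */
static void add_sample(const char *const *chain, int depth,
		       unsigned long long period)
{
	int i;

	for (i = 0; i < depth; i++) {
		struct entry *e = find_or_add(chain[i]);

		if (!e)
			continue;
		if (i == 0)
			e->self += period;
		e->children += period;
	}
}
```

Feeding it a single sample with the chain a, b, c, main gives 'a' equal
self and children values, while b, c and main get a self of zero and the
same children value, which is the shape of the --children output in the
example further down.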

I changed the option to a separate --children option and added a new
"Children" column (and renamed the default "Overhead" column to
"Self"). The output is sorted by children (cumulative) overhead for
now. The reason I changed to --children is that I still think it's
quite different from the other --call-graph options. The --call-graph
option still takes care of the callchain output even when --children
is given.
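
As a rough illustration of that sort order (not the comparison code in
this series), sorting rows by their children percentage could look like
the sketch below. struct row and sort_by_children() are hypothetical
names, and the order among rows with equal children values is left
unspecified here.

```c
#include <stdlib.h>

/* Hypothetical row of the report: one symbol with its two percentages. */
struct row {
	const char *sym;
	double self_pct;
	double children_pct;
};

/* qsort() comparator: larger children (cumulative) overhead comes first. */
static int cmp_children_desc(const void *pa, const void *pb)
{
	const struct row *a = pa;
	const struct row *b = pb;

	if (a->children_pct < b->children_pct)
		return 1;
	if (a->children_pct > b->children_pct)
		return -1;
	return 0;
}

static void sort_by_children(struct row *rows, size_t n)
{
	qsort(rows, n, sizeof(*rows), cmp_children_desc);
}
```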

I know that the UI should also be changed to be more flexible, as Ingo
requested, but I'd like to finish this first and then move on to that
work. I also added a new config option to enable it by default.
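
For reference, a perfconfig fragment using the option names from the
patch titles in this series (report.children and top.children) might
look like this. The boolean value syntax is an assumption based on the
usual git-style perfconfig format:

```ini
# ~/.perfconfig (sketch; value syntax assumed)
[report]
	children = true

[top]
	children = true
```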

* changes in v8:
- no longer depends on the --percentage patchkit
- fix callchain resolving bug (Jiri)
- convert to sample__resolve_{mem,bstack}
- eliminate 'event' field from hist_entry_iter

* changes in v7:
- add Tested-by tags from Arun
- rebase onto current acme/perf/core

* changes in v6:
- separate struct hist_iter_ops (Jiri)
- check iter->he before calling ->add_entry_cb (Jiri)
- fix locking issue on perf top (Jiri)

* changes in v5:
- support both --children and --call-graph (Arun)
- refactor hist_entry_iter to share with perf top (Jiri)
- various cleanups and fixes (Jiri)
- add ack's from Jiri

* changes in v4:
- change to --children option (Ingo)
- rebased on new annotation change (Arnaldo)
- support perf top also
- enable --children option by default (Ingo)

* changes in v3:
- change to --cumulate option
- fix a couple of bugs (Jiri, Rodrigo)
- rename some help functions (Arnaldo)
- cache previous hist entries rather than just symbol and dso
- add some preparatory cleanups
- add report.cumulate config option

Let me show you an example:

$ cat abc.c
#define barrier() asm volatile("" ::: "memory")

void a(void)
{
	int i;
	for (i = 0; i < 1000000; i++)
		barrier();
}
void b(void)
{
	a();
}
void c(void)
{
	b();
}
int main(void)
{
	c();
	return 0;
}

With this simple program I ran perf record and report:

$ perf record -g -e cycles:u ./abc

Case 1.

$ perf report --stdio --no-call-graph --no-children

# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ..............
#
    91.50%  abc      abc                [.] a
     8.18%  abc      ld-2.17.so         [.] strlen
     0.31%  abc      [kernel.kallsyms]  [k] page_fault
     0.01%  abc      ld-2.17.so         [.] _start

Case 2. (current default behavior)

$ perf report --stdio --call-graph --no-children

# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ..............
#
    91.50%  abc      abc                [.] a
            |
            --- a
                b
                c
                main
                __libc_start_main

     8.18%  abc      ld-2.17.so         [.] strlen
            |
            --- strlen
                _dl_sysdep_start

     0.31%  abc      [kernel.kallsyms]  [k] page_fault
            |
            --- page_fault
                _start

     0.01%  abc      ld-2.17.so         [.] _start
            |
            --- _start

Case 3.

$ perf report --no-call-graph --children --stdio

#     Self  Children  Command  Shared Object      Symbol
# ........  ........  .......  .................  .....................
#
     0.00%    91.50%  abc      libc-2.17.so       [.] __libc_start_main
     0.00%    91.50%  abc      abc                [.] main
     0.00%    91.50%  abc      abc                [.] c
     0.00%    91.50%  abc      abc                [.] b
    91.50%    91.50%  abc      abc                [.] a
     0.00%     8.18%  abc      ld-2.17.so         [.] _dl_sysdep_start
     8.18%     8.18%  abc      ld-2.17.so         [.] strlen
     0.01%     0.33%  abc      ld-2.17.so         [.] _start
     0.31%     0.31%  abc      [kernel.kallsyms]  [k] page_fault

As you can see, the __libc_start_main -> main -> c -> b -> a callchain
shows up in the output.

Finally, it looks like below with both options enabled:

Case 4. (default behavior?)

$ perf report --call-graph --children --stdio

#     Self  Children  Command  Shared Object      Symbol
# ........  ........  .......  .................  .....................
#
     0.00%    91.50%  abc      libc-2.17.so       [.] __libc_start_main
                      |
                      --- __libc_start_main

     0.00%    91.50%  abc      abc                [.] main
                      |
                      --- main
                          __libc_start_main

     0.00%    91.50%  abc      abc                [.] c
                      |
                      --- c
                          main
                          __libc_start_main

     0.00%    91.50%  abc      abc                [.] b
                      |
                      --- b
                          c
                          main
                          __libc_start_main

    91.50%    91.50%  abc      abc                [.] a
                      |
                      --- a
                          b
                          c
                          main
                          __libc_start_main
...

Currently perf enables both --call-graph and --children when it finds
callchains in the samples. While this is useful for the TUI or GTK,
I'm not sure about stdio, as it would consume so many lines.

It does not handle all kinds of cases yet, such as event groups and
annotations, but I really want to release it and get reviews.

You can also get this series on the 'perf/cumulate-v8' branch in my tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks.
Namhyung

Cc: Arun Sharma <[email protected]>
Cc: Frederic Weisbecker <[email protected]>

[1] https://lkml.org/lkml/2012/3/31/6

Namhyung Kim (21):
  perf tools: Introduce struct hist_entry_iter
  perf hists: Add support for accumulated stat of hist entry
  perf hists: Check if accumulated when adding a hist entry
  perf hists: Accumulate hist entry stat based on the callchain
  perf tools: Update cpumode for each cumulative entry
  perf report: Cache cumulative callchains
  perf callchain: Add callchain_cursor_snapshot()
  perf tools: Save callchain info for each cumulative entry
  perf hists: Sort hist entries by accumulated period
  perf ui/hist: Add support to accumulated hist stat
  perf ui/browser: Add support to accumulated hist stat
  perf ui/gtk: Add support to accumulated hist stat
  perf tools: Apply percent-limit to cumulative percentage
  perf tools: Add more hpp helper functions
  perf report: Add --children option
  perf report: Add report.children config option
  perf tools: Add callback function to hist_entry_iter
  perf top: Convert to hist_entry_iter
  perf top: Add --children option
  perf top: Add top.children config option
  perf tools: Enable --children option by default

 tools/perf/Documentation/perf-report.txt |   5 +
 tools/perf/Documentation/perf-top.txt    |   6 +
 tools/perf/builtin-annotate.c            |   3 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-report.c              | 178 +++--------
 tools/perf/builtin-top.c                 |  89 +++---
 tools/perf/tests/hists_link.c            |   4 +-
 tools/perf/ui/browsers/hists.c           |  50 ++--
 tools/perf/ui/gtk/hists.c                |  20 +-
 tools/perf/ui/hist.c                     |  62 ++++
 tools/perf/ui/stdio/hist.c               |   4 +-
 tools/perf/util/callchain.c              |  45 ++-
 tools/perf/util/callchain.h              |  11 +
 tools/perf/util/hist.c                   | 499 ++++++++++++++++++++++++++++++-
 tools/perf/util/hist.h                   |  45 ++-
 tools/perf/util/sort.h                   |  18 +-
 tools/perf/util/symbol.c                 |  11 +-
 tools/perf/util/symbol.h                 |   1 +
 18 files changed, 836 insertions(+), 217 deletions(-)

--
1.7.11.7
