2014-01-08 08:46:46

Subject: [PATCHSET 00/28] perf tools: Add support to accumulate hist periods (v5)

Hello,

This is my third attempt at implementing a cumulative hist period report.
This work started from Arun's SORT_INCLUSIVE patch [1], but I rewrote it
completely from scratch.

Patches 01 to 03 are independent cleanups and can be applied separately.

Please see patch 04/28. I refactored the functions that add hist
entries to use struct hist_entry_iter. While I converted all functions
carefully, it'd be great if someone could test and confirm that I didn't
mess something up - especially for the branch stack and mem stuff.

This patchset basically adds the period of a sample to every node in its
callchain. A hist_entry now has additional fields to keep the
cumulative period if the --children option is given to perf report.

I changed the option to a separate --children and added a new
"Children" column (and renamed the default "Overhead" column to
"Self"). The output will be sorted by children (cumulative) overhead
for now. The reason I changed to --children is that I still think
it's quite different from the other --call-graph options. The --call-graph
option still works as before when combined with the --children option.

I know that the UI should also be changed to be more flexible, as Ingo
requested, but I'd like to do this first and then move on to that.
I also added a new config option to enable it by default.

* changes in v5:
- support both of --children and --call-graph (Arun)
- refactor hist_entry_iter to share with perf top (Jiri)
- various cleanups and fixes (Jiri)
- add ack's from Jiri

* changes in v4:
- change to --children option (Ingo)
- rebased on new annotation change (Arnaldo)
- support perf top also
- enable --children option by default (Ingo)

* changes in v3:
- change to --cumulate option
- fix a couple of bugs (Jiri, Rodrigo)
- rename some help functions (Arnaldo)
- cache previous hist entries rather than just symbol and dso
- add some preparatory cleanups
- add report.cumulate config option

Let me show you an example:

$ cat abc.c
#define barrier() asm volatile("" ::: "memory")

void a(void)
{
	int i;

	for (i = 0; i < 1000000; i++)
		barrier();
}
void b(void)
{
	a();
}
void c(void)
{
	b();
}
int main(void)
{
	c();
	return 0;
}

With this simple program I ran perf record and report:

$ perf record -g -e cycles:u ./abc

Case 1.

$ perf report --stdio --no-call-graph --no-children

# Overhead  Command      Shared Object          Symbol
# ........  .......  .................  ..............
#
    91.50%      abc  abc                [.] a
     8.18%      abc  ld-2.17.so         [.] strlen
     0.31%      abc  [kernel.kallsyms]  [k] page_fault
     0.01%      abc  ld-2.17.so         [.] _start

Case 2. (current default behavior)

$ perf report --stdio --call-graph --no-children

# Overhead  Command      Shared Object          Symbol
# ........  .......  .................  ..............
#
    91.50%      abc  abc                [.] a
                |
                --- a
                    b
                    c
                    main
                    __libc_start_main

     8.18%      abc  ld-2.17.so         [.] strlen
                |
                --- strlen
                    _dl_sysdep_start

     0.31%      abc  [kernel.kallsyms]  [k] page_fault
                |
                --- page_fault
                    _start

     0.01%      abc  ld-2.17.so         [.] _start
                |
                --- _start

Case 3.

$ perf report --no-call-graph --children --stdio

#     Self  Children  Command      Shared Object                 Symbol
# ........  ........  .......  .................  .....................
#
     0.00%    91.50%      abc  libc-2.17.so       [.] __libc_start_main
     0.00%    91.50%      abc  abc                [.] main
     0.00%    91.50%      abc  abc                [.] c
     0.00%    91.50%      abc  abc                [.] b
    91.50%    91.50%      abc  abc                [.] a
     0.00%     8.18%      abc  ld-2.17.so         [.] _dl_sysdep_start
     8.18%     8.18%      abc  ld-2.17.so         [.] strlen
     0.01%     0.33%      abc  ld-2.17.so         [.] _start
     0.31%     0.31%      abc  [kernel.kallsyms]  [k] page_fault

As you can see, every function in the __libc_start_main -> main -> c -> b -> a
callchain now shows up in the output.

Finally, it looks like below with both options enabled:

Case 4. (default behavior?)

$ perf report --call-graph --children --stdio

#     Self  Children  Command      Shared Object                 Symbol
# ........  ........  .......  .................  .....................
#
     0.00%    91.50%      abc  libc-2.17.so       [.] __libc_start_main
                |
                --- __libc_start_main

     0.00%    91.50%      abc  abc                [.] main
                |
                --- main
                    __libc_start_main

     0.00%    91.50%      abc  abc                [.] c
                |
                --- c
                    main
                    __libc_start_main

     0.00%    91.50%      abc  abc                [.] b
                |
                --- b
                    c
                    main
                    __libc_start_main

    91.50%    91.50%      abc  abc                [.] a
                |
                --- a
                    b
                    c
                    main
                    __libc_start_main
...

Currently perf enables both --call-graph and --children when it
finds callchains in the samples. While this is useful for TUI or GTK,
I'm not sure about stdio, as it would take up so many lines.

I know it has some rough edges or even bugs, but I really want to
release it and get reviews. It does not handle event groups and
annotations yet.

You can also get this series on 'perf/cumulate-v5' branch in my tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks.
Namhyung

Cc: Arun Sharma <[email protected]>
Cc: Frederic Weisbecker <[email protected]>

[1] https://lkml.org/lkml/2012/3/31/6

Namhyung Kim (28):
perf tools: Insert filtered entries to hists also
perf tools: Do not update total period of a hists when filtering
perf tools: Remove symbol_conf.use_callchain check
perf tools: Introduce struct hist_entry_iter
perf hists: Convert hist entry functions to use struct he_stat
perf hists: Add support for accumulated stat of hist entry
perf hists: Check if accumulated when adding a hist entry
perf hists: Accumulate hist entry stat based on the callchain
perf tools: Update cpumode for each cumulative entry
perf report: Cache cumulative callchains
perf callchain: Add callchain_cursor_snapshot()
perf tools: Save callchain info for each cumulative entry
perf hists: Sort hist entries by accumulated period
perf ui/hist: Add support to accumulated hist stat
perf ui/browser: Add support to accumulated hist stat
perf ui/gtk: Add support to accumulated hist stat
perf tools: Apply percent-limit to cumulative percentage
perf tools: Add more hpp helper functions
perf report: Add --children option
perf report: Add report.children config option
perf tools: Factor out sample__resolve_callchain()
perf tools: Factor out fill_callchain_info()
perf tools: Factor out hist_entry_iter code
perf tools: Add callback function to hist_entry_iter
perf top: Convert to hist_entry_iter
perf top: Add --children option
perf top: Add top.children config option
perf tools: Enable --children option by default

tools/perf/Documentation/perf-report.txt | 5 +
tools/perf/Documentation/perf-top.txt | 6 +
tools/perf/builtin-annotate.c | 3 +-
tools/perf/builtin-diff.c | 2 +-
tools/perf/builtin-report.c | 202 +++---------
tools/perf/builtin-top.c | 104 +++---
tools/perf/tests/hists_link.c | 4 +-
tools/perf/ui/browsers/hists.c | 50 +--
tools/perf/ui/gtk/hists.c | 20 +-
tools/perf/ui/hist.c | 62 ++++
tools/perf/ui/stdio/hist.c | 4 +-
tools/perf/util/callchain.c | 66 ++++
tools/perf/util/callchain.h | 17 +
tools/perf/util/event.c | 18 +-
tools/perf/util/hist.c | 542 +++++++++++++++++++++++++++++--
tools/perf/util/hist.h | 49 ++-
tools/perf/util/machine.c | 2 -
tools/perf/util/sort.h | 18 +-
tools/perf/util/symbol.c | 11 +-
tools/perf/util/symbol.h | 3 +-
20 files changed, 899 insertions(+), 289 deletions(-)

--
1.7.11.7
