2020-03-25 16:42:44

by Tony Jones

Subject: [PATCH] perf tools: update docs regarding kernel/user space unwinding

The method of unwinding for kernel space is defined by the kernel config,
not by the value of --call-graph. Improve the documentation to reflect
this.

Signed-off-by: Tony Jones <[email protected]>

---
tools/perf/Documentation/perf-config.txt | 14 ++++++++------
tools/perf/Documentation/perf-record.txt | 18 ++++++++++++------
2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 8ead55593984..88cf35fbedc5 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -405,14 +405,16 @@ ui.*::
This option is only applied to TUI.

call-graph.*::
- When sub-commands 'top' and 'report' work with -g/—-children
- there're options in control of call-graph.
+ The following controls the handling of call-graphs (obtained via the
+ -g/--callgraph options).

call-graph.record-mode::
- The record-mode can be 'fp' (frame pointer), 'dwarf' and 'lbr'.
- The value of 'dwarf' is effective only if perf detect needed library
- (libunwind or a recent version of libdw).
- 'lbr' only work for cpus that support it.
+ The mode for user space can be 'fp' (frame pointer), 'dwarf'
+ and 'lbr'. The value 'dwarf' is effective only if libunwind
+ (or a recent version of libdw) is present on the system;
+ the value 'lbr' only works for certain cpus. The method for
+ kernel space is controlled not by this option but by the
+ kernel config (CONFIG_UNWINDER_*).

call-graph.dump-size::
The size of stack to dump in order to do post-unwinding. Default is 8192 (byte).
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 7f4db7592467..b25e028458e2 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -237,16 +237,22 @@ OPTIONS
option and remains only for backward compatibility. See --event.

-g::
- Enables call-graph (stack chain/backtrace) recording.
+ Enables call-graph (stack chain/backtrace) recording for both
+ kernel space and user space.

--call-graph::
Setup and enable call-graph (stack chain/backtrace) recording,
- implies -g. Default is "fp".
+ implies -g. Default is "fp" (for user space).

- Allows specifying "fp" (frame pointer) or "dwarf"
- (DWARF's CFI - Call Frame Information) or "lbr"
- (Hardware Last Branch Record facility) as the method to collect
- the information used to show the call graphs.
+ The unwinding method used for kernel space is dependent on the
+ unwinder used by the active kernel configuration, i.e
+ CONFIG_UNWINDER_FRAME_POINTER (fp) or CONFIG_UNWINDER_ORC (orc)
+
+ Any option specified here controls the method used for user space.
+
+ Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI -
+ Call Frame Information) or "lbr" (Hardware Last Branch Record
+ facility).

In some systems, where binaries are build with gcc
--fomit-frame-pointer, using the "fp" method will produce bogus
--
2.25.0
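
As a rough illustration of the kernel-space half of the patch above: the unwinder a kernel was built with can be read from its config. The following is only a minimal sketch under stated assumptions. The helper name kernel_unwinders() is hypothetical and not part of perf, and it just scans config text you hand it for enabled CONFIG_UNWINDER_* options (for example the contents of /boot/config-$(uname -r), on distributions that ship that file).

```python
import re
from typing import List

# Hypothetical helper, not part of perf: find enabled CONFIG_UNWINDER_*
# options in kernel config text. Per the patch, these (and not
# --call-graph) decide how kernel-space call chains are unwound.
_UNWINDER_RE = re.compile(r"^CONFIG_UNWINDER_([A-Z0-9_]+)=y[ \t]*$", re.MULTILINE)


def kernel_unwinders(config_text: str) -> List[str]:
    """Return the enabled unwinder names, lower-cased, in file order."""
    return [name.lower() for name in _UNWINDER_RE.findall(config_text)]


# Made-up config fragment for illustration only.
sample_config = """\
# CONFIG_UNWINDER_FRAME_POINTER is not set
CONFIG_UNWINDER_ORC=y
CONFIG_PERF_EVENTS=y
"""

print(kernel_unwinders(sample_config))
```

Whatever this reports applies to kernel-space frames only; the --call-graph value still selects the user-space method.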


2020-03-25 19:19:37

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding

On Wed, Mar 25, 2020 at 09:40:53AM -0700, Tony Jones wrote:
> The method of unwinding for kernel space is defined by the kernel config,
> not by the value of --call-graph. Improve the documentation to reflect
> this.

Fixed the callgraph -> call-graph bit, as you pointed out privately,
applied.

About your question, answering here so that it ends up in a public
location as documentation about perf usage:

> As an aside, for record path, do you know where PERF_SAMPLE_CALLCHAIN
> is actually set before being passed to kernel space?

So, I think it is somewhere down from perf_evsel__config()... it's in:

perf_evsel__set_sample_bit(evsel, CALLCHAIN);

which is set at:

$ perf probe -x ~/bin/perf -L __perf_evsel__config_callchain
<__perf_evsel__config_callchain@/home/acme/git/perf/tools/perf/util/evsel.c:0>
0 static void __perf_evsel__config_callchain(struct evsel *evsel,
struct record_opts *opts,
struct callchain_param *param)
3 {
4 bool function = perf_evsel__is_function_event(evsel);
5 struct perf_event_attr *attr = &evsel->core.attr;

7 perf_evsel__set_sample_bit(evsel, CALLCHAIN);

9 attr->sample_max_stack = param->max_stack;

11 if (opts->kernel_callchains)
12 attr->exclude_callchain_user = 1;
13 if (opts->user_callchains)
14 attr->exclude_callchain_kernel = 1;
15 if (param->record_mode == CALLCHAIN_LBR) {


Line 7 of __perf_evsel__config_callchain(), so let's use perf probe +
perf trace + perf callchains to see where callchains are requested from
the kernel:

[root@seventh ~]# perf probe -x ~/bin/perf __perf_evsel__config_callchain:7
Added new event:
probe_perf:__perf_evsel__config_callchain_L7 (on __perf_evsel__config_callchain:7 in /home/acme/bin/perf)

You can now use it in all perf tools, such as:

perf record -e probe_perf:__perf_evsel__config_callchain_L7 -aR sleep 1

[root@seventh ~]#
[root@seventh ~]# perf trace -e probe_perf:*callchain*/max-stack=16/ perf record -g sleep 1
0.000 perf/14860 probe_perf:__perf_evsel__config_callchain_L7(__probe_ip: 5263069)
__perf_evsel__config_callchain (/home/acme/bin/perf)
perf_evsel__config_callchain (/home/acme/bin/perf)
perf_evsel__config (/home/acme/bin/perf)
perf_evlist__config (/home/acme/bin/perf)
record__open (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
run_argv (/home/acme/bin/perf)
main (/home/acme/bin/perf)
__libc_start_main (/usr/lib64/libc-2.29.so)
[0] ([unknown])
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (8 samples) ]
[root@seventh ~]#

That [0] is the ugly part here; I have seen it before and need to nail it
down. Unsee it and all the rest seems ok, right?

- Arnaldo
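
For readers who want the gist of that listing without the perf tree at hand, here is a toy model of the lines shown above. It is a sketch under assumptions: Attr and config_callchain() are hypothetical stand-ins for struct perf_event_attr and __perf_evsel__config_callchain(), only the fields visible in the listing are modelled, and it stops before the CALLCHAIN_LBR branch where the listing is cut off. The bit value for PERF_SAMPLE_CALLCHAIN is the one from include/uapi/linux/perf_event.h.

```python
from dataclasses import dataclass

PERF_SAMPLE_CALLCHAIN = 1 << 5  # from include/uapi/linux/perf_event.h


@dataclass
class Attr:
    """Hypothetical stand-in for the perf_event_attr fields in the listing."""
    sample_type: int = 0
    sample_max_stack: int = 0
    exclude_callchain_user: int = 0
    exclude_callchain_kernel: int = 0


def config_callchain(attr: Attr, max_stack: int,
                     kernel_callchains: bool = False,
                     user_callchains: bool = False) -> Attr:
    """Mirror lines 7-14 of the listing: request callchains, then narrow them."""
    attr.sample_type |= PERF_SAMPLE_CALLCHAIN  # listing line 7
    attr.sample_max_stack = max_stack          # listing line 9
    if kernel_callchains:                      # kernel only: drop user frames
        attr.exclude_callchain_user = 1
    if user_callchains:                        # user only: drop kernel frames
        attr.exclude_callchain_kernel = 1
    return attr


# 127 is an arbitrary example depth, not a value taken from this thread.
both = config_callchain(Attr(), max_stack=127)
user_only = config_callchain(Attr(), max_stack=127, user_callchains=True)
print(both)
print(user_only)
```

With neither flag set, both kernel and user frames are requested, which matches the updated -g wording in the patch.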

2020-03-25 19:31:45

by Tony Jones

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding

On Wed, Mar 25, 2020 at 04:17:57PM -0300, Arnaldo Carvalho de Melo wrote:

> > As an aside, for record path, do you know where PERF_SAMPLE_CALLCHAIN
> > is actually set before being passed to kernel space?
>
> So, I think it is somewhere down from perf_evsel__config()... it's in:
>
> perf_evsel__set_sample_bit(evsel, CALLCHAIN);

Thanks for the pointer. I was having a hard time finding it, being too specific.

Tony

2020-03-26 21:33:26

by Paul A. Clarke

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding

On 3/25/20 11:40 AM, Tony Jones wrote:
> The method of unwinding for kernel space is defined by the kernel config,
> not by the value of --call-graph. Improve the documentation to reflect
> this.

> diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
> index 8ead55593984..88cf35fbedc5 100644
> --- a/tools/perf/Documentation/perf-config.txt
> +++ b/tools/perf/Documentation/perf-config.txt
> @@ -405,14 +405,16 @@ ui.*::
> This option is only applied to TUI.
>
> call-graph.*::
> - When sub-commands 'top' and 'report' work with -g/—-children
> - there're options in control of call-graph.
> + The following controls the handling of call-graphs (obtained via the
> + -g/--callgraph options).
>
> call-graph.record-mode::
> - The record-mode can be 'fp' (frame pointer), 'dwarf' and 'lbr'.
> - The value of 'dwarf' is effective only if perf detect needed library
> - (libunwind or a recent version of libdw).
> - 'lbr' only work for cpus that support it.
> + The mode for user space can be 'fp' (frame pointer), 'dwarf'
> + and 'lbr'. The value 'dwarf' is effective only if libunwind
> + (or a recent version of libdw) is present on the system;
> + the value 'lbr' only works for certain cpus. The method for
> + kernel space is controlled not by this option but by the
> + kernel config (CONFIG_UNWINDER_*).

Your changes are just copying the old text, so this isn't a criticism of your patches.

Do we have information to replace "a recent version of libdw", which will quickly get stale?

PC

2020-03-27 20:11:10

by Tony Jones

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding

On Thu, Mar 26, 2020 at 04:32:26PM -0500, Paul Clarke wrote:
> > + and 'lbr'. The value 'dwarf' is effective only if libunwind
> > + (or a recent version of libdw) is present on the system;
> > + the value 'lbr' only works for certain cpus. The method for
> > + kernel space is controlled not by this option but by the
> > + kernel config (CONFIG_UNWINDER_*).
>
> Your changes are just copying the old text, so this isn't a criticism of your patches.
>
> Do we have information to replace "a recent version of libdw", which will quickly get stale?

Hi Paul.

The original "(libunwind or a recent version of libdw)" text was from Feb 2016. So a while ago.

bd0419e2a5a9f requires >= 0.157, but this is for probing. 0a4f2b6a3ba50 specifies >= 0.158, but I see no mention of
why in the commit. Since it's from 2014 and elfutils is now at 0.178, I think it's safe to just remove the
reference.

As an aside, there is a lot of detail in perf-config.txt that's also available in some of the other subcommands' help files.
That seems a good way for things to get stale. It could also do with some grammatical cleanup.

Tony

2020-03-27 20:19:25

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding



On March 27, 2020 5:09:34 PM GMT-03:00, Tony Jones <[email protected]> wrote:
>On Thu, Mar 26, 2020 at 04:32:26PM -0500, Paul Clarke wrote:
>> > + and 'lbr'. The value 'dwarf' is effective only if libunwind
>> > + (or a recent version of libdw) is present on the system;
>> > + the value 'lbr' only works for certain cpus. The method for
>> > + kernel space is controlled not by this option but by the
>> > + kernel config (CONFIG_UNWINDER_*).
>>
>> Your changes are just copying the old text, so this isn't a criticism
>of your patches.
>>
>> Do we have information to replace "a recent version of libdw", which
>will quickly get stale?
>
>Hi Paul.
>
>The original "(libunwind or a recent version of libdw)" text was from
>Feb 2016. So a while ago.

Unfortunate wording; it would be better to state the version where the required feature was added to libdw.

>
>bd0419e2a5a9f requires >= 0.157 but this is for probing. 0a4f2b6a3ba50
>specifies >= 0.158 but I see no mention of
>why in the commit but since it's from 2014 and elfutils is now at
>0.178, I think it's safe to just remove the
>reference.
>
>As an aside, there is a lot of detail in perf-config.txt that's
>available in some of the other subcommands' help files.
>Seems a good way for things to get stale. It could also do with some
>grammatical cleanup.

English is a second language for many contributors; please consider sending fixes, they would be really appreciated.

Thanks,

- Arnaldo
>
>Tony


2020-03-27 20:35:31

by Tony Jones

Subject: Re: [PATCH] perf tools: update docs regarding kernel/user space unwinding

On Fri, Mar 27, 2020 at 05:17:59PM -0300, Arnaldo Melo wrote:

> English is a second language for many contributors; please consider sending fixes, they would be really appreciated.

Understood and I was planning on doing so.

Tony

2020-04-04 08:45:05

by tip-bot2 for Jacob Pan

Subject: [tip: perf/urgent] perf callchain: Update docs regarding kernel/user space unwinding

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: eadcaa3dfd706bbf46682c8b8b5979262443c3c3
Gitweb: https://git.kernel.org/tip/eadcaa3dfd706bbf46682c8b8b5979262443c3c3
Author: Tony Jones <[email protected]>
AuthorDate: Wed, 25 Mar 2020 09:40:53 -07:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Wed, 25 Mar 2020 16:13:21 -03:00

perf callchain: Update docs regarding kernel/user space unwinding

The method of unwinding for kernel space is defined by the kernel
config, not by the value of --call-graph. Improve the documentation to
reflect this.

Signed-off-by: Tony Jones <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-config.txt | 14 ++++++++------
tools/perf/Documentation/perf-record.txt | 18 ++++++++++++------
2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 8ead555..f16d8a7 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -405,14 +405,16 @@ ui.*::
This option is only applied to TUI.

call-graph.*::
- When sub-commands 'top' and 'report' work with -g/—-children
- there're options in control of call-graph.
+ The following controls the handling of call-graphs (obtained via the
+ -g/--call-graph options).

call-graph.record-mode::
- The record-mode can be 'fp' (frame pointer), 'dwarf' and 'lbr'.
- The value of 'dwarf' is effective only if perf detect needed library
- (libunwind or a recent version of libdw).
- 'lbr' only work for cpus that support it.
+ The mode for user space can be 'fp' (frame pointer), 'dwarf'
+ and 'lbr'. The value 'dwarf' is effective only if libunwind
+ (or a recent version of libdw) is present on the system;
+ the value 'lbr' only works for certain cpus. The method for
+ kernel space is controlled not by this option but by the
+ kernel config (CONFIG_UNWINDER_*).

call-graph.dump-size::
The size of stack to dump in order to do post-unwinding. Default is 8192 (byte).
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 7f4db75..b25e028 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -237,16 +237,22 @@ OPTIONS
option and remains only for backward compatibility. See --event.

-g::
- Enables call-graph (stack chain/backtrace) recording.
+ Enables call-graph (stack chain/backtrace) recording for both
+ kernel space and user space.

--call-graph::
Setup and enable call-graph (stack chain/backtrace) recording,
- implies -g. Default is "fp".
+ implies -g. Default is "fp" (for user space).

- Allows specifying "fp" (frame pointer) or "dwarf"
- (DWARF's CFI - Call Frame Information) or "lbr"
- (Hardware Last Branch Record facility) as the method to collect
- the information used to show the call graphs.
+ The unwinding method used for kernel space is dependent on the
+ unwinder used by the active kernel configuration, i.e
+ CONFIG_UNWINDER_FRAME_POINTER (fp) or CONFIG_UNWINDER_ORC (orc)
+
+ Any option specified here controls the method used for user space.
+
+ Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI -
+ Call Frame Information) or "lbr" (Hardware Last Branch Record
+ facility).

In some systems, where binaries are build with gcc
--fomit-frame-pointer, using the "fp" method will produce bogus