I found the hierarchy mode useful, but it's easy to make a typo when
using it. Let's add a short option for that.
Also update the documentation. :)
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 29 ++++++++++++++++++++-
tools/perf/Documentation/perf-top.txt | 32 +++++++++++++++++++++++-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 2 +-
4 files changed, 61 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 38f59ac064f7..d8b863e01fe0 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -531,8 +531,35 @@ include::itrace.txt[]
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
+-H::
--hierarchy::
- Enable hierarchical output.
+ Enable hierarchical output. In the hierarchy mode, each sort key groups
+ samples based on its criteria and then sub-divides them using the
+ lower-level sort key.
+
+ For example:
+ In normal output:
+
+ perf report -s dso,sym
+ # Overhead Shared Object Symbol
+ 50.00% [kernel.kallsyms] [k] kfunc1
+ 20.00% perf [.] foo
+ 15.00% [kernel.kallsyms] [k] kfunc2
+ 10.00% perf [.] bar
+ 5.00% libc.so [.] libcall
+
+ In hierarchy output:
+
+ perf report -s dso,sym --hierarchy
+ # Overhead Shared Object / Symbol
+ 65.00% [kernel.kallsyms]
+ 50.00% [k] kfunc1
+ 15.00% [k] kfunc2
+ 30.00% perf
+ 20.00% [.] foo
+ 10.00% [.] bar
+ 5.00% libc.so
+ 5.00% [.] libcall
--inline::
If a callgraph address belongs to an inlined function, the inline stack
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 3c202ec080ba..a754875fa5bb 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -261,8 +261,38 @@ Default is to monitor all CPUS.
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
+-H::
--hierarchy::
- Enable hierarchy output.
+ Enable hierarchical output. In the hierarchy mode, each sort key groups
+ samples based on its criteria and then sub-divides them using the
+ lower-level sort key.
+
+ For example, in normal output:
+
+ perf report -s dso,sym
+ #
+ # Overhead Shared Object Symbol
+ # ........ ................. ...........
+ 50.00% [kernel.kallsyms] [k] kfunc1
+ 20.00% perf [.] foo
+ 15.00% [kernel.kallsyms] [k] kfunc2
+ 10.00% perf [.] bar
+ 5.00% libc.so [.] libcall
+
+ In hierarchy output:
+
+ perf report -s dso,sym --hierarchy
+ #
+ # Overhead Shared Object / Symbol
+ # .......... ......................
+ 65.00% [kernel.kallsyms]
+ 50.00% [k] kfunc1
+ 15.00% [k] kfunc2
+ 30.00% perf
+ 20.00% [.] foo
+ 10.00% [.] bar
+ 5.00% libc.so
+ 5.00% [.] libcall
--overwrite::
Enable this to use just the most recent records, which helps in high core count
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index f2ed2b7e80a3..ccb91fe6b876 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1410,7 +1410,7 @@ int cmd_report(int argc, const char **argv)
"only show processor socket that match with this filter"),
OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
"Show raw trace event output (do not use print fmt or plugins)"),
- OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+ OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
"'always' (default), 'never' or 'auto' only applicable to --stdio mode",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index baf1ab083436..03cf45088fd8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1573,7 +1573,7 @@ int cmd_top(int argc, const char **argv)
"add last branch records to call history"),
OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
"Show raw trace event output (do not use print fmt or plugins)"),
- OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+ OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_BOOLEAN(0, "overwrite", &top.record_opts.overwrite,
"Use a backward ring buffer, default: no"),
--
2.43.0.429.g432eaa2c6b-goog
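The grouping described in the documentation hunks above can be sketched in a few lines of C. This is only an illustration of the idea, not perf's implementation: the struct and function names (`flat_entry`, `dso_group`, `group_by_dso`, `print_hierarchy`) are hypothetical, the data is the example from the documentation, and groups are kept in first-seen order rather than re-sorted by overhead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One flat row of "perf report -s dso,sym" style output (hypothetical type). */
struct flat_entry {
	const char *dso;
	const char *sym;
	double overhead;	/* percent */
};

/* One top-level group: a DSO and the summed overhead of its symbols. */
struct dso_group {
	const char *dso;
	double overhead;
};

#define MAX_GROUPS 16

/*
 * Group flat entries by the first sort key (dso), summing their overhead.
 * Returns the number of groups written to 'groups' (at most MAX_GROUPS).
 * Groups keep first-seen order; this sketch does not re-sort them.
 */
int group_by_dso(const struct flat_entry *e, int n, struct dso_group *groups)
{
	int ngroups = 0;

	assert(e != NULL && groups != NULL);

	for (int i = 0; i < n; i++) {
		int g;

		for (g = 0; g < ngroups; g++)
			if (strcmp(groups[g].dso, e[i].dso) == 0)
				break;

		if (g == ngroups) {
			if (ngroups == MAX_GROUPS)
				continue;	/* sketch: ignore overflow */
			groups[ngroups].dso = e[i].dso;
			groups[ngroups].overhead = 0.0;
			ngroups++;
		}
		groups[g].overhead += e[i].overhead;
	}
	return ngroups;
}

/* Print each group, then its symbols indented below it (second sort key). */
void print_hierarchy(const struct flat_entry *e, int n)
{
	struct dso_group groups[MAX_GROUPS];
	int ngroups = group_by_dso(e, n, groups);

	for (int g = 0; g < ngroups; g++) {
		printf("%7.2f%%  %s\n", groups[g].overhead, groups[g].dso);
		for (int i = 0; i < n; i++)
			if (strcmp(e[i].dso, groups[g].dso) == 0)
				printf("%10.2f%%  %s\n", e[i].overhead, e[i].sym);
	}
}
```

Feeding the five rows of the "normal output" example into `group_by_dso()` yields three groups with overheads 65, 30 and 5, matching the top-level lines of the hierarchy example.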
On Wed, Jan 24, 2024 at 09:51:24PM -0800, Namhyung Kim wrote:
> I found the hierarchy mode useful, but it's easy to make a typo when
> using it. Let's add a short option for that.
Fair enough, but:
[root@quaco ~]# perf report --hi | head -15
Error: Ambiguous option: hi (could be --hide-unresolved or --hierarchy)
Usage: perf report [<options>]
-U, --hide-unresolved
Only display entries resolved to a symbol
--hierarchy Show entries in a hierarchy
[root@quaco ~]# perf report --hie | head -15
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 56 of event 'cycles:P'
# Event count (approx.): 13456952
#
# Overhead Command / Shared Object / Symbol
# .............. ........................................
#
72.56% swapper
72.56% [kernel.kallsyms]
72.56% [k] intel_idle_ibrs
18.53% perf
[root@quaco ~]#
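The session above shows unique-prefix matching of long options: `--hi` is rejected because two options start with it, while `--hie` matches only `--hierarchy`. A minimal sketch of that rule follows. It is not perf's parse-options code; `resolve_long_option` and the two result codes are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for resolve_long_option(). */
enum { OPT_NOT_FOUND = -1, OPT_AMBIGUOUS = -2 };

/*
 * Resolve 'arg' (given without the leading "--") against a NULL-terminated
 * list of long option names. An exact match always wins. Otherwise a prefix
 * is accepted only if exactly one name starts with it.
 * Returns the index of the matching name, or one of the codes above.
 */
int resolve_long_option(const char *arg, const char *const names[])
{
	size_t len;
	int match = OPT_NOT_FOUND;

	assert(arg != NULL && names != NULL);
	len = strlen(arg);

	for (int i = 0; names[i] != NULL; i++) {
		if (strncmp(names[i], arg, len) != 0)
			continue;		/* not a prefix of this name */
		if (names[i][len] == '\0')
			return i;		/* exact match */
		if (match == OPT_NOT_FOUND)
			match = i;		/* first prefix match */
		else
			match = OPT_AMBIGUOUS;	/* second or later prefix match */
	}
	return match;
}
```

With the names `hide-unresolved` and `hierarchy` in the list, `"hi"` resolves to `OPT_AMBIGUOUS` and `"hie"` resolves to the index of `hierarchy`, which mirrors the two commands in the session.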
> Also update the documentation. :)
Thanks, as a suggestion maybe we should have a:
$ perf config ui.hierarchy
as we have:
[root@quaco ~]# perf config ui.show-headers=true
[root@quaco ~]# perf config ui.show-headers
ui.show-headers=true
[root@quaco ~]#
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
- Arnaldo
Hi Arnaldo,
On Thu, Jan 25, 2024 at 6:45 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> On Wed, Jan 24, 2024 at 09:51:24PM -0800, Namhyung Kim wrote:
> > I found the hierarchy mode useful, but it's easy to make a typo when
> > using it. Let's add a short option for that.
>
> Fair enough, but:
>
> [root@quaco ~]# perf report --hi | head -15
> Error: Ambiguous option: hi (could be --hide-unresolved or --hierarchy)
>
> Usage: perf report [<options>]
>
> -U, --hide-unresolved
> Only display entries resolved to a symbol
> --hierarchy Show entries in a hierarchy
> [root@quaco ~]# perf report --hie | head -15
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 56 of event 'cycles:P'
> # Event count (approx.): 13456952
> #
> # Overhead Command / Shared Object / Symbol
> # .............. ........................................
> #
> 72.56% swapper
> 72.56% [kernel.kallsyms]
> 72.56% [k] intel_idle_ibrs
> 18.53% perf
> [root@quaco ~]#
>
> > Also update the documentation. :)
>
> Thanks, as a suggestion maybe we should have a:
>
> $ perf config ui.hierarchy
>
> as we have:
>
> [root@quaco ~]# perf config ui.show-headers=true
> [root@quaco ~]# perf config ui.show-headers
> ui.show-headers=true
> [root@quaco ~]#
Yep, I'll think about the config option later. Right now hierarchy
mode cannot work together with children mode, which can also be
enabled by a config option.
>
>
> Acked-by: Arnaldo Carvalho de Melo <[email protected]>
Thanks,
Namhyung
On Wed, 24 Jan 2024 21:51:24 -0800, Namhyung Kim wrote:
> I found the hierarchy mode useful, but it's easy to make a typo when
> using it. Let's add a short option for that.
>
> Also update the documentation. :)
>
>
Applied to perf-tools-next, thanks!
Best regards,
--
Namhyung Kim <[email protected]>