"acpi_idle_do_entry", "acpi_processor_ffh_cstate_enter", and "idle_cpu"
appear in 'perf top' output, at least on AMD systems.

Add them to perf's idle_symbols list, so they don't dominate 'perf top'
output.
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kim Phillips <[email protected]>
---
tools/perf/util/symbol.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3b379b1296f1..f3120c4f47ad 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -635,9 +635,12 @@ int modules__parse(const char *filename, void *arg,
 static bool symbol__is_idle(const char *name)
 {
 	const char * const idle_symbols[] = {
+		"acpi_idle_do_entry",
+		"acpi_processor_ffh_cstate_enter",
 		"arch_cpu_idle",
 		"cpu_idle",
 		"cpu_startup_entry",
+		"idle_cpu",
 		"intel_idle",
 		"default_idle",
 		"native_safe_halt",
--
2.24.1
For data collected on machines with front end stalled cycles supported,
such as found on modern AMD CPU families, commit 146540fb545b ("perf
stat: Always separate stalled cycles per insn") introduces a new line
in CSV output with a leading comma that upsets some automated scripts.
Scripts have to use "-e ex_ret_instr" to work around this issue, after
upgrading to a version of perf with that commit.

We could add "if (have_frontend_stalled && !config->csv_sep)"
to the not (total && avg) else clause, to emphasize that CSV users
are usually scripts, and are written to do only what is needed, i.e.,
they wouldn't typically invoke "perf stat" without specifying an
explicit event list.

But - let alone CSV output - why should users now tolerate a constant
0-reporting extra line in regular terminal output?:

BEFORE:

$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

 Performance counter stats for 'system wide':

       181,110,981      instructions              #    0.58  insn per cycle
                                                  #    0.00  stalled cycles per insn
       309,876,469      cycles

       1.002202582 seconds time elapsed

The user would not like to see the now permanent
"0.00 stalled cycles per insn" line fixture, as it gives
no useful information.

So this patch removes the printing of the zeroed stalled cycles
line altogether, almost reverting the very original commit fb4605ba47e7
("perf stat: Check for frontend stalled for metrics"), which seems
like it was written to normalize --metric-only column output
of common Intel machines at the time: modern Intel machines
have ceased to support the genericised frontend stalled metrics AFAICT.

AFTER:

$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

 Performance counter stats for 'system wide':

       244,071,432      instructions              #    0.69  insn per cycle
       355,353,490      cycles

       1.001862516 seconds time elapsed

Output behaviour when stalled cycles is indeed measured is not affected
(BEFORE == AFTER):

$ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1

 Performance counter stats for 'system wide':

       247,227,799      instructions              #    0.63  insn per cycle
                                                  #    0.26  stalled cycles per insn
       394,745,636      cycles
        63,194,485      stalled-cycles-frontend   #   16.01% frontend cycles idle

       1.002079770 seconds time elapsed

Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 146540fb545b ("perf stat: Always separate stalled cycles per insn")
Signed-off-by: Kim Phillips <[email protected]>
---
tools/perf/util/stat-shadow.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 2c41d47f6f83..90d23cc3c8d4 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -18,7 +18,6 @@
  * AGGR_NONE: Use matching CPU
  * AGGR_THREAD: Not supported?
  */
-static bool have_frontend_stalled;
 
 struct runtime_stat rt_stat;
 struct stats walltime_nsecs_stats;
@@ -144,7 +143,6 @@ void runtime_stat__exit(struct runtime_stat *st)
 
 void perf_stat__init_shadow_stats(void)
 {
-	have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
 	runtime_stat__init(&rt_stat);
 }
 
@@ -853,10 +851,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			print_metric(config, ctxp, NULL, "%7.2f ",
 					"stalled cycles per insn",
 					ratio);
-		} else if (have_frontend_stalled) {
-			out->new_line(config, ctxp);
-			print_metric(config, ctxp, NULL, "%7.2f ",
-				     "stalled cycles per insn", 0);
 		}
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
 		if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
--
2.24.1
On Wed, Jan 15, 2020 at 04:29:48PM -0600, Kim Phillips wrote:
> "acpi_idle_do_entry", "acpi_processor_ffh_cstate_enter", and "idle_cpu"
> appear in 'perf top' output, at least on AMD systems.
>
> Add them to perf's idle_symbols list, so they don't dominate 'perf top'
> output.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Cong Wang <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Kim Phillips <[email protected]>
> Cc: Song Liu <[email protected]>
> Cc: Davidlohr Bueso <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Kim Phillips <[email protected]>
> ---
> tools/perf/util/symbol.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 3b379b1296f1..f3120c4f47ad 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -635,9 +635,12 @@ int modules__parse(const char *filename, void *arg,
>  static bool symbol__is_idle(const char *name)
>  {
>  	const char * const idle_symbols[] = {
> +		"acpi_idle_do_entry",
> +		"acpi_processor_ffh_cstate_enter",
>  		"arch_cpu_idle",
>  		"cpu_idle",
>  		"cpu_startup_entry",
> +		"idle_cpu",
>  		"intel_idle",
>  		"default_idle",
>  		"native_safe_halt",
ok, at some point we should put this in strlist ;-)
Acked-by: Jiri Olsa <[email protected]>
thanks,
jirka
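
For illustration only, here is a minimal standalone sketch of what a list-based lookup could look like. It is not perf's strlist API and not the code in tools/perf/util/symbol.c; it uses the C library's bsearch() over a sorted copy of the names shown in the hunk above. The names idle_symbols_sorted, cmp_idle_name and is_idle_symbol are hypothetical, and the real idle_symbols array contains entries beyond the ones visible in the diff context.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the idle symbol list. Only the names visible
 * in the patch context are included, and they are kept in strcmp() order
 * so that bsearch() can be used.
 */
static const char * const idle_symbols_sorted[] = {
	"acpi_idle_do_entry",
	"acpi_processor_ffh_cstate_enter",
	"arch_cpu_idle",
	"cpu_idle",
	"cpu_startup_entry",
	"default_idle",
	"idle_cpu",
	"intel_idle",
	"native_safe_halt",
};

/* bsearch() comparator: key is the name, elem points at an array slot. */
static int cmp_idle_name(const void *key, const void *elem)
{
	const char *name = key;
	const char * const *sym = elem;

	return strcmp(name, *sym);
}

/* Returns true if name is one of the listed idle symbols. */
static bool is_idle_symbol(const char *name)
{
	size_t n = sizeof(idle_symbols_sorted) / sizeof(idle_symbols_sorted[0]);

	return bsearch(name, idle_symbols_sorted, n,
		       sizeof(idle_symbols_sorted[0]), cmp_idle_name) != NULL;
}
```

A binary search needs the array to stay sorted when entries are added, which is the kind of bookkeeping a dedicated string-list structure would take care of.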
On Wed, Jan 15, 2020 at 04:29:49PM -0600, Kim Phillips wrote:
> For data collected on machines with front end stalled cycles supported,
> such as found on modern AMD CPU families, commit 146540fb545b ("perf
> stat: Always separate stalled cycles per insn") introduces a new line
> in CSV output with a leading comma that upsets some automated scripts.
> Scripts have to use "-e ex_ret_instr" to work around this issue, after
> upgrading to a version of perf with that commit.
>
> We could add "if (have_frontend_stalled && !config->csv_sep)"
> to the not (total && avg) else clause, to emphasize that CSV users
> are usually scripts, and are written to do only what is needed, i.e.,
> they wouldn't typically invoke "perf stat" without specifying an
> explicit event list.
>
> But - let alone CSV output - why should users now tolerate a constant
> 0-reporting extra line in regular terminal output?:
>
> BEFORE:
>
> $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
>
> Performance counter stats for 'system wide':
>
> 181,110,981 instructions # 0.58 insn per cycle
> # 0.00 stalled cycles per insn
> 309,876,469 cycles
>
> 1.002202582 seconds time elapsed
>
> The user would not like to see the now permanent
> "0.00 stalled cycles per insn" line fixture, as it gives
> no useful information.
>
> So this patch removes the printing of the zeroed stalled cycles
> line altogether, almost reverting the very original commit fb4605ba47e7
> ("perf stat: Check for frontend stalled for metrics"), which seems
> like it was written to normalize --metric-only column output
> of common Intel machines at the time: modern Intel machines
> have ceased to support the genericised frontend stalled metrics AFAICT.
>
> AFTER:
>
> $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1
>
> Performance counter stats for 'system wide':
>
> 244,071,432 instructions # 0.69 insn per cycle
> 355,353,490 cycles
>
> 1.001862516 seconds time elapsed
>
> Output behaviour when stalled cycles is indeed measured is not affected
> (BEFORE == AFTER):
>
> $ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1
>
> Performance counter stats for 'system wide':
>
> 247,227,799 instructions # 0.63 insn per cycle
> # 0.26 stalled cycles per insn
> 394,745,636 cycles
> 63,194,485 stalled-cycles-frontend # 16.01% frontend cycles idle
>
> 1.002079770 seconds time elapsed
looks reasonable to me, Andi, are you ok with this?
Acked-by: Jiri Olsa <[email protected]>
thanks,
jirka
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Cong Wang <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Kim Phillips <[email protected]>
> Cc: Song Liu <[email protected]>
> Cc: Davidlohr Bueso <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Fixes: 146540fb545b ("perf stat: Always separate stalled cycles per insn")
> Signed-off-by: Kim Phillips <[email protected]>
> ---
> tools/perf/util/stat-shadow.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 2c41d47f6f83..90d23cc3c8d4 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -18,7 +18,6 @@
>   * AGGR_NONE: Use matching CPU
>   * AGGR_THREAD: Not supported?
>   */
> -static bool have_frontend_stalled;
>
>  struct runtime_stat rt_stat;
>  struct stats walltime_nsecs_stats;
> @@ -144,7 +143,6 @@ void runtime_stat__exit(struct runtime_stat *st)
>
>  void perf_stat__init_shadow_stats(void)
>  {
> -	have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
>  	runtime_stat__init(&rt_stat);
>  }
>
> @@ -853,10 +851,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
>  			print_metric(config, ctxp, NULL, "%7.2f ",
>  					"stalled cycles per insn",
>  					ratio);
> -		} else if (have_frontend_stalled) {
> -			out->new_line(config, ctxp);
> -			print_metric(config, ctxp, NULL, "%7.2f ",
> -				     "stalled cycles per insn", 0);
>  		}
>  	} else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
>  		if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
> --
> 2.24.1
>
> looks reasonable to me, Andi, are you ok with this?
>
> Acked-by: Jiri Olsa <[email protected]>
Yes.
Acked-by: Andi Kleen <[email protected]>
-Andi
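
To summarise the three behaviours discussed in the stat-shadow patch, the following standalone sketch models only the decision of whether a "stalled cycles per insn" line is printed. It is not the perf source: the enum, the function prints_stalled_line() and its parameters are hypothetical names, with "measured" standing in for the "total && avg" condition mentioned in the commit message and csv_sep for config->csv_sep.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three policies discussed in the thread. */
enum zero_line_policy {
	ZERO_LINE_ALWAYS,	/* before the patch: zero line whenever the PMU has the event */
	ZERO_LINE_NON_CSV,	/* alternative considered: zero line only without a CSV separator */
	ZERO_LINE_NEVER,	/* after the patch: no zero line at all */
};

/*
 * Returns true if a "stalled cycles per insn" line would be printed.
 * measured: stalled cycles were actually counted (total && avg).
 * have_frontend_stalled: the PMU advertises stalled-cycles-frontend.
 * csv_sep: CSV field separator, or NULL for regular terminal output.
 */
static bool prints_stalled_line(enum zero_line_policy policy, bool measured,
				bool have_frontend_stalled, const char *csv_sep)
{
	/* A real ratio is printed under every policy. */
	if (measured)
		return true;

	switch (policy) {
	case ZERO_LINE_ALWAYS:
		return have_frontend_stalled;
	case ZERO_LINE_NON_CSV:
		return have_frontend_stalled && csv_sep == NULL;
	case ZERO_LINE_NEVER:
		return false;
	}
	return false;
}
```

Under ZERO_LINE_NEVER the result depends only on whether stalled cycles were measured, which matches the BEFORE == AFTER case shown in the commit message.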