2016-03-02 22:17:22

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [GIT PULL 00/11] perf/core improvements and fixes

Hi Ingo,

Please consider pulling, this is on top of the outstanding
perf-core-for-mingo-20160229 signed tag.

- Arnaldo

The following changes since commit 1d6c9407d45dd622b277ca9f725da3cc9e95b5de:

perf trace: Print content of bpf-output event (2016-02-26 19:57:07 -0300)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160302

for you to fetch changes up to 575197b405c45959ca2f71da8c65b6f8d9693140:

perf stat: Check for frontend stalled for metrics (2016-03-02 11:27:00 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Implement CSV metrics output in 'perf stat' (Andi Kleen)

- Support metrics in 'perf stat' --per-core/socket mode (Andi Kleen)

- Check for frontend stalled for metrics (Andi Kleen)

- Fix segfault in 'perf test' hists related entries (Arnaldo Carvalho de Melo)

- Fix output of %llu for 64 bit values read on 32 bit machines in libtraceevent (Steven Rostedt)

- Fix time stamp rounding issue in libtraceevent (Chaos.Chen)

Infrastructure:

- Fix double free on 'command_line' in a error path in 'perf script' (Colin Ian King)

- Initialize struct sigaction 'sa_flags' field in a 'perf test' entri (Colin Ian King)

- Fix various build warnings in turbostat, detected with gcc6 (Colin Ian King)

- Use .s extension for preprocessed assembler code (Masahiro Yamada)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------

The following changes since commit 575a02e00b11eecbbabcb1eb22eab4c68e91ae77:

perf record: Ensure return non-zero rc when mmap fail (2016-02-29 12:44:15 -0300)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160302

for you to fetch changes up to 575197b405c45959ca2f71da8c65b6f8d9693140:

perf stat: Check for frontend stalled for metrics (2016-03-02 11:27:00 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Implement CSV metrics output in 'perf stat' (Andi Kleen)

- Support metrics in 'perf stat' --per-core/socket mode (Andi Kleen)

- Check for frontend stalled for metrics (Andi Kleen)

- Fix segfault in 'perf test' hists related entries (Arnaldo Carvalho de Melo)

- Fix output of %llu for 64 bit values read on 32 bit machines in libtraceevent (Steven Rostedt)

- Fix time stamp rounding issue in libtraceevent (Chaos.Chen)

Infrastructure:

- Fix double free on 'command_line' in a error path in 'perf script' (Colin Ian King)

- Initialize struct sigaction 'sa_flags' field in a 'perf test' entri (Colin Ian King)

- Fix various build warnings in turbostat, detected with gcc6 (Colin Ian King)

- Use .s extension for preprocessed assembler code (Masahiro Yamada)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Andi Kleen (3):
perf stat: Implement CSV metrics output
perf stat: Support metrics in --per-core/socket mode
perf stat: Check for frontend stalled for metrics

Arnaldo Carvalho de Melo (1):
perf test: Fix hists related entries

Chaos.Chen (1):
tools lib traceevent: Fix time stamp rounding issue

Colin Ian King (3):
perf script: Fix double free on command_line
perf tests: Initialize sa.sa_flags
tools/power turbostat: fix various build warnings

Masahiro Yamada (1):
tools build: Use .s extension for preprocessed assembler code

Steven Rostedt (Red Hat) (2):
tools lib traceevent: Set int_array fields to NULL if freeing from error
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines

tools/build/Makefile.build | 2 +-
tools/lib/traceevent/event-parse.c | 10 +-
tools/perf/arch/x86/tests/rdpmc.c | 1 +
tools/perf/builtin-stat.c | 136 +++++++++++++++++++--
.../util/scripting-engines/trace-event-python.c | 4 +-
tools/perf/util/sort.c | 37 +++---
tools/perf/util/stat-shadow.c | 18 ++-
tools/perf/util/stat.h | 1 +
tools/power/x86/turbostat/turbostat.c | 8 +-
9 files changed, 181 insertions(+), 36 deletions(-)


2016-03-02 22:17:17

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 04/11] perf script: Fix double free on command_line

From: Colin Ian King <[email protected]>

The 'command_line' variable is free'd twice if db_export__branch_types()
fails. To avoid this, defer the free'ing of 'command_line' to after this
call so that the error return path will just free 'command_line' once.

Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Javi Merino <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/scripting-engines/trace-event-python.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 309d90fa7698..fbd05242b4e5 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1094,8 +1094,6 @@ static int python_start_script(const char *script, int argc, const char **argv)
goto error;
}

- free(command_line);
-
set_table_handlers(tables);

if (tables->db_export_mode) {
@@ -1104,6 +1102,8 @@ static int python_start_script(const char *script, int argc, const char **argv)
goto error;
}

+ free(command_line);
+
return err;
error:
Py_Finalize();
--
2.5.0

2016-03-02 22:17:13

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 08/11] perf test: Fix hists related entries

From: Arnaldo Carvalho de Melo <[email protected]>

That got broken by d3a72fd8187b ("perf report: Fix indentation of
dynamic entries in hierarchy"), by using the evlist in setup_sorting()
without checking if it is NULL, as done in some 'perf test' entries:

$ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);'
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
$

Fix it.

Before:

[root@jouet ~]# perf test
<SNIP>
15: Test matching and linking multiple hists : FAILED!
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : FAILED!
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : FAILED!
29: Test cumulation of child hist entries : FAILED!
<SNIP>

After the patch the above failed tests complete successfully.

Acked-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: d3a72fd8187b ("perf report: Fix indentation of dynamic entries in hierarchy")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/sort.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 5888bfe9a193..4380a2858802 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2635,25 +2635,14 @@ out:
return ret;
}

-int setup_sorting(struct perf_evlist *evlist)
+static void evlist__set_hists_nr_sort_keys(struct perf_evlist *evlist)
{
- int err;
- struct hists *hists;
struct perf_evsel *evsel;
- struct perf_hpp_fmt *fmt;
-
- err = __setup_sorting(evlist);
- if (err < 0)
- return err;
-
- if (parent_pattern != default_parent_pattern) {
- err = sort_dimension__add("parent", evlist);
- if (err < 0)
- return err;
- }

evlist__for_each(evlist, evsel) {
- hists = evsel__hists(evsel);
+ struct perf_hpp_fmt *fmt;
+ struct hists *hists = evsel__hists(evsel);
+
hists->nr_sort_keys = perf_hpp_list.nr_sort_keys;

/*
@@ -2667,6 +2656,24 @@ int setup_sorting(struct perf_evlist *evlist)
hists->nr_sort_keys--;
}
}
+}
+
+int setup_sorting(struct perf_evlist *evlist)
+{
+ int err;
+
+ err = __setup_sorting(evlist);
+ if (err < 0)
+ return err;
+
+ if (parent_pattern != default_parent_pattern) {
+ err = sort_dimension__add("parent", evlist);
+ if (err < 0)
+ return err;
+ }
+
+ if (evlist != NULL)
+ evlist__set_hists_nr_sort_keys(evlist);

reset_dimensions();

--
2.5.0

2016-03-02 22:17:11

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 11/11] perf stat: Check for frontend stalled for metrics

From: Andi Kleen <[email protected]>

Add an extra check for frontend stalled in the metrics. This avoids an
extra column for the --metric-only case when the CPU does not support
frontend stalled.

v2: Add separate init function

Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 1 +
tools/perf/util/stat-shadow.c | 9 ++++++++-
tools/perf/util/stat.h | 1 +
3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9b5089c5dffe..baa82078c148 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1966,6 +1966,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
(const char **) stat_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
+ perf_stat__init_shadow_stats();

if (csv_sep) {
csv_output = true;
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 5e2d2e34e1bc..b33ffb2af2cf 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -2,6 +2,7 @@
#include "evsel.h"
#include "stat.h"
#include "color.h"
+#include "pmu.h"

enum {
CTX_BIT_USER = 1 << 0,
@@ -35,9 +36,15 @@ static struct stats runtime_dtlb_cache_stats[NUM_CTX][MAX_NR_CPUS];
static struct stats runtime_cycles_in_tx_stats[NUM_CTX][MAX_NR_CPUS];
static struct stats runtime_transaction_stats[NUM_CTX][MAX_NR_CPUS];
static struct stats runtime_elision_stats[NUM_CTX][MAX_NR_CPUS];
+static bool have_frontend_stalled;

struct stats walltime_nsecs_stats;

+void perf_stat__init_shadow_stats(void)
+{
+ have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
+}
+
static int evsel_context(struct perf_evsel *evsel)
{
int ctx = 0;
@@ -323,7 +330,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
print_metric(ctxp, NULL, "%7.2f ",
"stalled cycles per insn",
ratio);
- } else {
+ } else if (have_frontend_stalled) {
print_metric(ctxp, NULL, NULL,
"stalled cycles per insn", 0);
}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index f02af68adc04..0150e786ccc7 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -72,6 +72,7 @@ typedef void (*print_metric_t)(void *ctx, const char *color, const char *unit,
const char *fmt, double val);
typedef void (*new_line_t )(void *ctx);

+void perf_stat__init_shadow_stats(void);
void perf_stat__reset_shadow_stats(void);
void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
int cpu);
--
2.5.0

2016-03-02 22:18:09

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 06/11] tools lib traceevent: Set int_array fields to NULL if freeing from error

From: "Steven Rostedt (Red Hat)" <[email protected]>

Had a bug where on error of parsing __print_array() where the fields are
freed after they were allocated, but since they were not set to NULL,
the freeing of the arg also tried to free the already freed fields
causing a double free.

Fix process_hex() while at it.

Signed-off-by: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/lib/traceevent/event-parse.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index ce59f4891fa2..fb790aa757eb 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2635,6 +2635,7 @@ process_hex(struct event_format *event, struct print_arg *arg, char **tok)

free_field:
free_arg(arg->hex.field);
+ arg->hex.field = NULL;
out:
*tok = NULL;
return EVENT_ERROR;
@@ -2659,8 +2660,10 @@ process_int_array(struct event_format *event, struct print_arg *arg, char **tok)

free_size:
free_arg(arg->int_array.count);
+ arg->int_array.count = NULL;
free_field:
free_arg(arg->int_array.field);
+ arg->int_array.field = NULL;
out:
*tok = NULL;
return EVENT_ERROR;
--
2.5.0

2016-03-02 22:18:33

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 10/11] tools/power turbostat: fix various build warnings

From: Colin Ian King <[email protected]>

When building with gcc 6 we're getting various build warnings that just
require some trivial function declaration and call fixes:

turbostat.c: In function ‘dump_cstate_pstate_config_info’:
turbostat.c:1973:1: warning: type of ‘family’ defaults to ‘int’
dump_cstate_pstate_config_info(family, model)
turbostat.c:1973:1: warning: type of ‘model’ defaults to ‘int’
turbostat.c: In function ‘get_tdp’:
turbostat.c:2145:8: warning: type of ‘model’ defaults to ‘int’
double get_tdp(model)
turbostat.c: In function ‘perf_limit_reasons_probe’:
turbostat.c:2259:6: warning: type of ‘family’ defaults to ‘int’
void perf_limit_reasons_probe(family, model)
turbostat.c:2259:6: warning: type of ‘model’ defaults to ‘int’

Signed-off-by: Colin Ian King <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/power/x86/turbostat/turbostat.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 0dac7e05a6ac..3fa94e291d16 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -1970,7 +1970,7 @@ int has_config_tdp(unsigned int family, unsigned int model)
}

static void
-dump_cstate_pstate_config_info(family, model)
+dump_cstate_pstate_config_info(unsigned int family, unsigned int model)
{
if (!do_nhm_platform_info)
return;
@@ -2142,7 +2142,7 @@ int print_perf_limit(struct thread_data *t, struct core_data *c, struct pkg_data
#define RAPL_POWER_GRANULARITY 0x7FFF /* 15 bit power granularity */
#define RAPL_TIME_GRANULARITY 0x3F /* 6 bit time granularity */

-double get_tdp(model)
+double get_tdp(unsigned int model)
{
unsigned long long msr;

@@ -2256,7 +2256,7 @@ void rapl_probe(unsigned int family, unsigned int model)
return;
}

-void perf_limit_reasons_probe(family, model)
+void perf_limit_reasons_probe(unsigned int family, unsigned int model)
{
if (!genuine_intel)
return;
@@ -2792,7 +2792,7 @@ void process_cpuid()
perf_limit_reasons_probe(family, model);

if (debug)
- dump_cstate_pstate_config_info();
+ dump_cstate_pstate_config_info(family, model);

if (has_skl_msrs(family, model))
calculate_tsc_tweak();
--
2.5.0

2016-03-02 22:17:09

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 01/11] perf stat: Implement CSV metrics output

From: Andi Kleen <[email protected]>

Now support CSV output for metrics. With the new output callbacks this
is relatively straight forward by creating new callbacks.

This allows to easily plot metrics from CSV files.

The new line callback needs to know the number of fields to skip them
correctly

Example output before:

% perf stat -x, true
0.200687,,task-clock,200687,100.00
0,,context-switches,200687,100.00
0,,cpu-migrations,200687,100.00
40,,page-faults,200687,100.00
730871,,cycles,203601,100.00
551056,,stalled-cycles-frontend,203601,100.00
<not supported>,,stalled-cycles-backend,0,100.00
385523,,instructions,203601,100.00
78028,,branches,203601,100.00
3946,,branch-misses,203601,100.00

After:

% perf stat -x, true
.502457,,task-clock,502457,100.00,0.485,CPUs utilized
0,,context-switches,502457,100.00,0.000,K/sec
0,,cpu-migrations,502457,100.00,0.000,K/sec
45,,page-faults,502457,100.00,0.090,M/sec
644692,,cycles,509102,100.00,1.283,GHz
423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
97767,,branches,509102,100.00,194.578,M/sec
4788,,branch-misses,509102,100.00,4.90,of all branches

or easier readable

$ perf stat -x, -o x.csv true
$ column -s, -t x.csv
0.490635 task-clock 490635 100.00 0.489 CPUs utilized
0 context-switches 490635 100.00 0.000 K/sec
0 cpu-migrations 490635 100.00 0.000 K/sec
45 page-faults 490635 100.00 0.092 M/sec
629080 cycles 497698 100.00 1.282 GHz
409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle
<not supported> stalled-cycles-backend 0 100.00
491424 instructions 497698 100.00 0.78 insn per cycle
0.83 stalled cycles per insn
97278 branches 497698 100.00 198.270 M/sec
4569 branch-misses 497698 100.00 4.70 of all branches

Two new fields are added: metric value and metric name.

v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
v8: Remove extra run, ena fields
v9: Avoid extra new line for unsupported counters

Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 73 ++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/stat-shadow.c | 2 +-
2 files changed, 70 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 24f222dd2a8a..2ffb8221917a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -739,6 +739,7 @@ struct outstate {
FILE *fh;
bool newline;
const char *prefix;
+ int nfields;
};

#define METRIC_LEN 35
@@ -789,6 +790,43 @@ static void print_metric_std(void *ctx, const char *color, const char *fmt,
fprintf(out, " %-*s", METRIC_LEN - n - 1, unit);
}

+static void new_line_csv(void *ctx)
+{
+ struct outstate *os = ctx;
+ int i;
+
+ fputc('\n', os->fh);
+ if (os->prefix)
+ fprintf(os->fh, "%s%s", os->prefix, csv_sep);
+ for (i = 0; i < os->nfields; i++)
+ fputs(csv_sep, os->fh);
+}
+
+static void print_metric_csv(void *ctx,
+ const char *color __maybe_unused,
+ const char *fmt, const char *unit, double val)
+{
+ struct outstate *os = ctx;
+ FILE *out = os->fh;
+ char buf[64], *vals, *ends;
+
+ if (unit == NULL || fmt == NULL) {
+ fprintf(out, "%s%s%s%s", csv_sep, csv_sep, csv_sep, csv_sep);
+ return;
+ }
+ snprintf(buf, sizeof(buf), fmt, val);
+ vals = buf;
+ while (isspace(*vals))
+ vals++;
+ ends = vals;
+ while (isdigit(*ends) || *ends == '.')
+ ends++;
+ *ends = 0;
+ while (isspace(*unit))
+ unit++;
+ fprintf(out, "%s%s%s%s", csv_sep, vals, csv_sep, unit);
+}
+
static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
{
FILE *output = stat_config.output;
@@ -860,6 +898,22 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,

nl = new_line_std;

+ if (csv_output) {
+ static int aggr_fields[] = {
+ [AGGR_GLOBAL] = 0,
+ [AGGR_THREAD] = 1,
+ [AGGR_NONE] = 1,
+ [AGGR_SOCKET] = 2,
+ [AGGR_CORE] = 2,
+ };
+
+ pm = print_metric_csv;
+ nl = new_line_csv;
+ os.nfields = 3;
+ os.nfields += aggr_fields[stat_config.aggr_mode];
+ if (counter->cgrp)
+ os.nfields++;
+ }
if (run == 0 || ena == 0 || counter->counts->scaled == -1) {
aggr_printout(counter, id, nr);

@@ -880,7 +934,12 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
fprintf(stat_config.output, "%s%s",
csv_sep, counter->cgrp->name);

+ if (!csv_output)
+ pm(&os, NULL, NULL, "", 0);
+ print_noise(counter, noise);
print_running(run, ena);
+ if (csv_output)
+ pm(&os, NULL, NULL, "", 0);
return;
}

@@ -893,14 +952,20 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
out.new_line = nl;
out.ctx = &os;

- if (!csv_output)
- perf_stat__print_shadow_stats(counter, uval,
+ if (csv_output) {
+ print_noise(counter, noise);
+ print_running(run, ena);
+ }
+
+ perf_stat__print_shadow_stats(counter, uval,
stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
cpu_map__id_to_cpu(id),
&out);

- print_noise(counter, noise);
- print_running(run, ena);
+ if (!csv_output) {
+ print_noise(counter, noise);
+ print_running(run, ena);
+ }
}

static void print_aggr(char *prefix)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 4d8f18581b9b..367e220e93d5 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -310,8 +310,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
total = avg_stats(&runtime_stalled_cycles_front_stats[ctx][cpu]);
total = max(total, avg_stats(&runtime_stalled_cycles_back_stats[ctx][cpu]));

- out->new_line(ctxp);
if (total && avg) {
+ out->new_line(ctxp);
ratio = total / avg;
print_metric(ctxp, NULL, "%7.2f ",
"stalled cycles per insn",
--
2.5.0

2016-03-02 22:18:58

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 09/11] perf tests: Initialize sa.sa_flags

From: Colin Ian King <[email protected]>

The sa_flags field is not being initialized, so a garbage value is being
passed to sigaction. Initialize it to zero.

Signed-off-by: Colin Ian King <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/arch/x86/tests/rdpmc.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/arch/x86/tests/rdpmc.c b/tools/perf/arch/x86/tests/rdpmc.c
index 7bb0d13c235f..7945462851a4 100644
--- a/tools/perf/arch/x86/tests/rdpmc.c
+++ b/tools/perf/arch/x86/tests/rdpmc.c
@@ -103,6 +103,7 @@ static int __test__rdpmc(void)

sigfillset(&sa.sa_mask);
sa.sa_sigaction = segfault_handler;
+ sa.sa_flags = 0;
sigaction(SIGSEGV, &sa, NULL);

fd = sys_perf_event_open(&attr, 0, -1, -1,
--
2.5.0

2016-03-02 22:17:08

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 02/11] perf stat: Support metrics in --per-core/socket mode

From: Andi Kleen <[email protected]>

Enable metrics printing in --per-core / --per-socket mode. We need to
save the shadow metrics in a unique place. Always use the first CPU in
the aggregation. Then use the same CPU to retrieve the shadow value
later.

Example output:

% perf stat --per-core -a ./BC1s

Performance counter stats for 'system wide':

S0-C0 2 2966.020381 task-clock (msec) # 2.004 CPUs utilized (100.00%)
S0-C0 2 49 context-switches # 0.017 K/sec (100.00%)
S0-C0 2 4 cpu-migrations # 0.001 K/sec (100.00%)
S0-C0 2 467 page-faults # 0.157 K/sec
S0-C0 2 4,599,061,773 cycles # 1.551 GHz (100.00%)
S0-C0 2 9,755,886,883 instructions # 2.12 insn per cycle (100.00%)
S0-C0 2 1,906,272,125 branches # 642.704 M/sec (100.00%)
S0-C0 2 81,180,867 branch-misses # 4.26% of all branches
S0-C1 2 2965.995373 task-clock (msec) # 2.003 CPUs utilized (100.00%)
S0-C1 2 62 context-switches # 0.021 K/sec (100.00%)
S0-C1 2 8 cpu-migrations # 0.003 K/sec (100.00%)
S0-C1 2 281 page-faults # 0.095 K/sec
S0-C1 2 6,347,290 cycles # 0.002 GHz (100.00%)
S0-C1 2 4,654,156 instructions # 0.73 insn per cycle (100.00%)
S0-C1 2 947,121 branches # 0.319 M/sec (100.00%)
S0-C1 2 37,322 branch-misses # 3.94% of all branches

1.480409747 seconds time elapsed

v2: Rebase to older patches
v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri)

Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 64 +++++++++++++++++++++++++++++++++++++------
tools/perf/util/stat-shadow.c | 7 +++++
2 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2ffb8221917a..9b5089c5dffe 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -740,6 +740,8 @@ struct outstate {
bool newline;
const char *prefix;
int nfields;
+ int id, nr;
+ struct perf_evsel *evsel;
};

#define METRIC_LEN 35
@@ -755,12 +757,9 @@ static void do_new_line_std(struct outstate *os)
{
fputc('\n', os->fh);
fputs(os->prefix, os->fh);
+ aggr_printout(os->evsel, os->id, os->nr);
if (stat_config.aggr_mode == AGGR_NONE)
fprintf(os->fh, " ");
- if (stat_config.aggr_mode == AGGR_CORE)
- fprintf(os->fh, " ");
- if (stat_config.aggr_mode == AGGR_SOCKET)
- fprintf(os->fh, " ");
fprintf(os->fh, " ");
}

@@ -798,6 +797,7 @@ static void new_line_csv(void *ctx)
fputc('\n', os->fh);
if (os->prefix)
fprintf(os->fh, "%s%s", os->prefix, csv_sep);
+ aggr_printout(os->evsel, os->id, os->nr);
for (i = 0; i < os->nfields; i++)
fputs(csv_sep, os->fh);
}
@@ -855,6 +855,28 @@ static void nsec_printout(int id, int nr, struct perf_evsel *evsel, double avg)
fprintf(output, "%s%s", csv_sep, evsel->cgrp->name);
}

+static int first_shadow_cpu(struct perf_evsel *evsel, int id)
+{
+ int i;
+
+ if (!aggr_get_id)
+ return 0;
+
+ if (stat_config.aggr_mode == AGGR_NONE)
+ return id;
+
+ if (stat_config.aggr_mode == AGGR_GLOBAL)
+ return 0;
+
+ for (i = 0; i < perf_evsel__nr_cpus(evsel); i++) {
+ int cpu2 = perf_evsel__cpus(evsel)->map[i];
+
+ if (aggr_get_id(evsel_list->cpus, cpu2) == id)
+ return cpu2;
+ }
+ return 0;
+}
+
static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
{
FILE *output = stat_config.output;
@@ -891,7 +913,10 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
struct perf_stat_output_ctx out;
struct outstate os = {
.fh = stat_config.output,
- .prefix = prefix ? prefix : ""
+ .prefix = prefix ? prefix : "",
+ .id = id,
+ .nr = nr,
+ .evsel = counter,
};
print_metric_t pm = print_metric_std;
void (*nl)(void *);
@@ -958,16 +983,37 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
}

perf_stat__print_shadow_stats(counter, uval,
- stat_config.aggr_mode == AGGR_GLOBAL ? 0 :
- cpu_map__id_to_cpu(id),
+ first_shadow_cpu(counter, id),
&out);
-
if (!csv_output) {
print_noise(counter, noise);
print_running(run, ena);
}
}

+static void aggr_update_shadow(void)
+{
+ int cpu, s2, id, s;
+ u64 val;
+ struct perf_evsel *counter;
+
+ for (s = 0; s < aggr_map->nr; s++) {
+ id = aggr_map->map[s];
+ evlist__for_each(evsel_list, counter) {
+ val = 0;
+ for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+ s2 = aggr_get_id(evsel_list->cpus, cpu);
+ if (s2 != id)
+ continue;
+ val += perf_counts(counter->counts, cpu, 0)->val;
+ }
+ val = val * counter->scale;
+ perf_stat__update_shadow_stats(counter, &val,
+ first_shadow_cpu(counter, id));
+ }
+ }
+}
+
static void print_aggr(char *prefix)
{
FILE *output = stat_config.output;
@@ -979,6 +1025,8 @@ static void print_aggr(char *prefix)
if (!(aggr_map || aggr_get_id))
return;

+ aggr_update_shadow();
+
for (s = 0; s < aggr_map->nr; s++) {
id = aggr_map->map[s];
evlist__for_each(evsel_list, counter) {
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 367e220e93d5..5e2d2e34e1bc 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -14,6 +14,13 @@ enum {

#define NUM_CTX CTX_BIT_MAX

+/*
+ * AGGR_GLOBAL: Use CPU 0
+ * AGGR_SOCKET: Use first CPU of socket
+ * AGGR_CORE: Use first CPU of core
+ * AGGR_NONE: Use matching CPU
+ * AGGR_THREAD: Not supported?
+ */
static struct stats runtime_nsecs_stats[MAX_NR_CPUS];
static struct stats runtime_cycles_stats[NUM_CTX][MAX_NR_CPUS];
static struct stats runtime_stalled_cycles_front_stats[NUM_CTX][MAX_NR_CPUS];
--
2.5.0

2016-03-02 22:19:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 05/11] tools lib traceevent: Fix time stamp rounding issue

From: "Chaos.Chen" <[email protected]>

When rounding to microseconds, if the timestamp subsecond is between
.999999500 and .999999999, it is rounded to .1000000, when it should
instead increment the second counter due to the overflow.

For example, if the timestamp is 1234.999999501 instead of seeing:

1235.000000

we see:

1234.1000000

Signed-off-by: Chaos.Chen <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ fixed incrementing "secs" instead of decrementing it ]
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/lib/traceevent/event-parse.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 9a1e48afcf89..ce59f4891fa2 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5429,6 +5429,11 @@ void pevent_print_event_time(struct pevent *pevent, struct trace_seq *s,
p = 9;
} else {
usecs = (nsecs + 500) / NSECS_PER_USEC;
+ /* To avoid usecs larger than 1 sec */
+ if (usecs >= 1000000) {
+ usecs -= 1000000;
+ secs++;
+ }
p = 6;
}

--
2.5.0

2016-03-02 22:19:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 07/11] tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines

From: "Steven Rostedt (Red Hat)" <[email protected]>

When a long value is read on 32 bit machines for 64 bit output, the
parsing needs to change "%lu" into "%llu", as the value is read
natively.

Unfortunately, if "%llu" is already there, the code will add another "l"
to it and fail to parse it properly.

Signed-off-by: Steven Rostedt <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/lib/traceevent/event-parse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index fb790aa757eb..865dea55454b 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4978,7 +4978,7 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct event
break;
}
}
- if (pevent->long_size == 8 && ls &&
+ if (pevent->long_size == 8 && ls == 1 &&
sizeof(long) != 8) {
char *p;

--
2.5.0

2016-03-02 22:19:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: [PATCH 03/11] tools build: Use .s extension for preprocessed assembler code

From: Masahiro Yamada <[email protected]>

The "man gcc" says .i extension represents the file is C source code
that should not be preprocessed. Here, .s should be used.

For clarification,
.c ---(preprocess)---> .i
.S ---(preprocess)---> .s

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Aaro Koskinen <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Lukas Wunner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/build/Makefile.build | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build
index 4a96473b180f..ee566e8bd1cf 100644
--- a/tools/build/Makefile.build
+++ b/tools/build/Makefile.build
@@ -85,7 +85,7 @@ $(OUTPUT)%.i: %.c FORCE
$(call rule_mkdir)
$(call if_changed_dep,cc_i_c)

-$(OUTPUT)%.i: %.S FORCE
+$(OUTPUT)%.s: %.S FORCE
$(call rule_mkdir)
$(call if_changed_dep,cc_i_c)

--
2.5.0

2016-03-03 08:21:37

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1:
>
> perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160229
>
> for you to fetch changes up to 575a02e00b11eecbbabcb1eb22eab4c68e91ae77:
>
> perf record: Ensure return non-zero rc when mmap fail (2016-02-29 12:44:15 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen)
>
> - Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa)
>
> - Rename the tracepoint '/format' field that carries the syscall ID from 'nr',
> that is also the name of some syscalls arguments, to "__syscall_nr", to
> avoid having multiple fields with the same name, that was breaking the
> python script skeleton generator from perf.data files (Taeung Song)
>
> - Support converting data from bpf events in 'perf data' (Wang Nan)
>
> Infrastructure:
>
> - Split libtraceevent's pevent_print_event() (Steven Rostedt)
>
> - Librarize some 'perf record' bits to allow handling multiple perf.data
> files per session (Wang Nan)
>
> - Ensure return non-zero rc when mmap fail in 'perf record' (Wang Nan)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> ----------------------------------------------------------------
> Andi Kleen (1):
> perf stat: Check existence of frontend/backed stalled cycles

>
> Jiri Olsa (1):
> perf tools: Fix python extension build
>
> Steven Rostedt (1):
> tools lib traceevent: Split pevent_print_event() into specific functionality functions
>
> Taeung Song (2):
> perf trace: Check and discard not only 'nr' but also '__syscall_nr'
> tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:
>
> Wang Nan (6):
> perf data: Support converting data from bpf_perf_event_output()
> perf data: Explicitly set byte order for integer types
> perf record: Use WARN_ONCE to replace 'if' condition
> perf record: Extract synthesize code to record__synthesize()
> perf record: Introduce record__finish_output() to finish a perf.data
> perf record: Ensure return non-zero rc when mmap fail
>
> kernel/trace/trace_syscalls.c | 16 ++--
> tools/lib/traceevent/event-parse.c | 136 +++++++++++++++++++++++-------
> tools/lib/traceevent/event-parse.h | 13 +++
> tools/perf/builtin-record.c | 168 ++++++++++++++++++++++---------------
> tools/perf/builtin-stat.c | 22 ++++-
> tools/perf/builtin-trace.c | 8 +-
> tools/perf/util/data-convert-bt.c | 118 +++++++++++++++++++++++++-
> tools/perf/util/setup.py | 4 +
> 8 files changed, 372 insertions(+), 113 deletions(-)

Hm, there's a 'perf stat' regression that I can see:

Before:

triton:~/tip> perf stat -a sleep 1

Performance counter stats for 'system wide':

11990.023100 task-clock (msec) # 11.981 CPUs utilized
8,802 context-switches # 0.734 K/sec
543 cpu-migrations # 0.045 K/sec
97,375 page-faults # 0.008 M/sec
9,854,385,894 cycles # 0.822 GHz
15,274,841,152 stalled-cycles-frontend # 155.01% frontend cycles idle
<not supported> stalled-cycles-backend
9,634,486,137 instructions # 0.98 insn per cycle
# 1.59 stalled cycles per insn
1,818,488,088 branches # 151.667 M/sec
46,365,120 branch-misses # 2.55% of all branches

1.000741599 seconds time elapsed

After:

triton:~/tip> perf stat -a sleep 1

Performance counter stats for 'system wide':

11989.280397 task-clock (msec) # 11.981 CPUs utilized
1299 context-switches # 0.108 K/sec
6 cpu-migrations # 0.001 K/sec
70 page-faults # 0.006 K/sec
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle
# 2.34 stalled cycles per insn
24166678 branches # 2.016 M/sec
505681 branch-misses # 2.09% of all branches

1.000684278 seconds time elapsed


... see how the numbers became human-unreadable, losing the big-number separator?

I suspect it's due to the following commit:

fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles

Thanks,

Ingo

2016-03-03 09:15:37

by Jiri Olsa

[permalink] [raw]
Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes

On Thu, Mar 03, 2016 at 09:21:30AM +0100, Ingo Molnar wrote:
>
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > Hi Ingo,
> >
> > Please consider pulling,
> >
> > - Arnaldo
> >
> > The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1:
> >
> > perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100)
> >
> > are available in the git repository at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160229
> >
> > for you to fetch changes up to 575a02e00b11eecbbabcb1eb22eab4c68e91ae77:
> >
> > perf record: Ensure return non-zero rc when mmap fail (2016-02-29 12:44:15 -0300)
> >
> > ----------------------------------------------------------------
> > perf/core improvements and fixes:
> >
> > User visible:
> >
> > - Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen)
> >
> > - Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa)
> >
> > - Rename the tracepoint '/format' field that carries the syscall ID from 'nr',
> > that is also the name of some syscalls arguments, to "__syscall_nr", to
> > avoid having multiple fields with the same name, that was breaking the
> > python script skeleton generator from perf.data files (Taeung Song)
> >
> > - Support converting data from bpf events in 'perf data' (Wang Nan)
> >
> > Infrastructure:
> >
> > - Split libtraceevent's pevent_print_event() (Steven Rostedt)
> >
> > - Librarize some 'perf record' bits to allow handling multiple perf.data
> > files per session (Wang Nan)
> >
> > - Ensure return non-zero rc when mmap fail in 'perf record' (Wang Nan)
> >
> > Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> >
> > ----------------------------------------------------------------
> > Andi Kleen (1):
> > perf stat: Check existence of frontend/backed stalled cycles
>
> >
> > Jiri Olsa (1):
> > perf tools: Fix python extension build
> >
> > Steven Rostedt (1):
> > tools lib traceevent: Split pevent_print_event() into specific functionality functions
> >
> > Taeung Song (2):
> > perf trace: Check and discard not only 'nr' but also '__syscall_nr'
> > tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:
> >
> > Wang Nan (6):
> > perf data: Support converting data from bpf_perf_event_output()
> > perf data: Explicitly set byte order for integer types
> > perf record: Use WARN_ONCE to replace 'if' condition
> > perf record: Extract synthesize code to record__synthesize()
> > perf record: Introduce record__finish_output() to finish a perf.data
> > perf record: Ensure return non-zero rc when mmap fail
> >
> > kernel/trace/trace_syscalls.c | 16 ++--
> > tools/lib/traceevent/event-parse.c | 136 +++++++++++++++++++++++-------
> > tools/lib/traceevent/event-parse.h | 13 +++
> > tools/perf/builtin-record.c | 168 ++++++++++++++++++++++---------------
> > tools/perf/builtin-stat.c | 22 ++++-
> > tools/perf/builtin-trace.c | 8 +-
> > tools/perf/util/data-convert-bt.c | 118 +++++++++++++++++++++++++-
> > tools/perf/util/setup.py | 4 +
> > 8 files changed, 372 insertions(+), 113 deletions(-)
>
> Hm, there's a 'perf stat' regression that I can see:
>
> Before:
>
> triton:~/tip> perf stat -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 11990.023100 task-clock (msec) # 11.981 CPUs utilized
> 8,802 context-switches # 0.734 K/sec
> 543 cpu-migrations # 0.045 K/sec
> 97,375 page-faults # 0.008 M/sec
> 9,854,385,894 cycles # 0.822 GHz
> 15,274,841,152 stalled-cycles-frontend # 155.01% frontend cycles idle
> <not supported> stalled-cycles-backend
> 9,634,486,137 instructions # 0.98 insn per cycle
> # 1.59 stalled cycles per insn
> 1,818,488,088 branches # 151.667 M/sec
> 46,365,120 branch-misses # 2.55% of all branches
>
> 1.000741599 seconds time elapsed
>
> After:
>
> triton:~/tip> perf stat -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 11989.280397 task-clock (msec) # 11.981 CPUs utilized
> 1299 context-switches # 0.108 K/sec
> 6 cpu-migrations # 0.001 K/sec
> 70 page-faults # 0.006 K/sec
> 127008602 cycles # 0.011 GHz
> 279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
> 119213269 instructions # 0.94 insn per cycle
> # 2.34 stalled cycles per insn
> 24166678 branches # 2.016 M/sec
> 505681 branch-misses # 2.09% of all branches
>
> 1.000684278 seconds time elapsed
>
>
> ... see how the numbers became human-unreadable, losing the big-number separator?
>
> I suspect it's due to the following commit:
>
> fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles

yea, it used the pmu parsing which screwes locales,
following patch fixed that for me..

jirka


---
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ce61f79dbaae..d8cd038baed2 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -124,6 +124,17 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
lc = setlocale(LC_NUMERIC, NULL);

/*
+ * The lc string may be allocated in static storage,
+ * so get a dynamic copy to make it survive setlocale
+ * call below.
+ */
+ lc = strdup(lc);
+ if (!lc) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ /*
* force to C locale to ensure kernel
* scale string is converted correctly.
* kernel uses default C locale.
@@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
/* restore locale */
setlocale(LC_NUMERIC, lc);

+ free((char *) lc);
+
ret = 0;
error:
close(fd);

2016-03-03 09:54:04

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH] perf tools: Fix locale handling in pmu parsing

On Thu, Mar 03, 2016 at 10:15:22AM +0100, Jiri Olsa wrote:

SNIP

> >
> > 11989.280397 task-clock (msec) # 11.981 CPUs utilized
> > 1299 context-switches # 0.108 K/sec
> > 6 cpu-migrations # 0.001 K/sec
> > 70 page-faults # 0.006 K/sec
> > 127008602 cycles # 0.011 GHz
> > 279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
> > 119213269 instructions # 0.94 insn per cycle
> > # 2.34 stalled cycles per insn
> > 24166678 branches # 2.016 M/sec
> > 505681 branch-misses # 2.09% of all branches
> >
> > 1.000684278 seconds time elapsed
> >
> >
> > ... see how the numbers became human-unreadable, losing the big-number separator?
> >
> > I suspect it's due to the following commit:
> >
> > fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles
>
> yea, it used the pmu parsing which screwes locales,
> following patch fixed that for me..
>

resending with full changelog

jirka


---
Ingo reported regression on display format of big numbers,
which is missing separators (in default perf stat output).

triton:~/tip> perf stat -a sleep 1
...
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle

This is caused by recent change:
perf stat: Check existence of frontend/backed stalled cycles

that added call to pmu_have_event, that subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.

The lc string returned from setlocale, that we use to store
old locale value, may be allocated in static storage. Getting
a dynamic copy to make it survive another setlocale call.

[jolsa@krava perf]$ perf stat ls
...
2,360,602 cycles # 3.080 GHz
2,703,090 instructions # 1.15 insn per cycle
546,031 branches # 712.511 M/sec

Reported-by: Ingo Molnar <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/pmu.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ce61f79dbaae..d8cd038baed2 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -124,6 +124,17 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
lc = setlocale(LC_NUMERIC, NULL);

/*
+ * The lc string may be allocated in static storage,
+ * so get a dynamic copy to make it survive setlocale
+ * call below.
+ */
+ lc = strdup(lc);
+ if (!lc) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ /*
* force to C locale to ensure kernel
* scale string is converted correctly.
* kernel uses default C locale.
@@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
/* restore locale */
setlocale(LC_NUMERIC, lc);

+ free((char *) lc);
+
ret = 0;
error:
close(fd);
--
2.4.3

2016-03-03 14:38:17

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes

Em Thu, Mar 03, 2016 at 09:21:30AM +0100, Ingo Molnar escreveu:

> Hm, there's a 'perf stat' regression that I can see:
>
> Before:
> # 1.59 stalled cycles per insn
> 1,818,488,088 branches # 151.667 M/sec
>
> After:
>
> triton:~/tip> perf stat -a sleep 1
>
> 24166678 branches # 2.016 M/sec
>
> ... see how the numbers became human-unreadable, losing the big-number separator?
>
> I suspect it's due to the following commit:
>
> fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles

Ok, I inserted Jiri's patch fixing the problem just before the commit
(fa184776ac27) that triggers it, so that we don't break bisection for
human-readable numbers in 'perf stat'.

Its all in a new signed tag, that combines the two outstanding ones
(perf-core-for-mingo-20160229 + perf-core-for-mingo-20160302), please
consider pulling.

- Arnaldo

The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1:

perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160303

for you to fetch changes up to fb4605ba47e772ff9d62d1d54218a832ec8b3e1d:

perf stat: Check for frontend stalled for metrics (2016-03-03 11:10:40 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen)

- Implement CSV metrics output in 'perf stat' (Andi Kleen)

- Support metrics in 'perf stat' --per-core/socket mode (Andi Kleen)

- Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa)

- Rename the tracepoint '/format' field that carries the syscall ID from 'nr',
that is also the name of some syscalls arguments, to "__syscall_nr", to
avoid having multiple fields with the same name, that was breaking the
python script skeleton generator from perf.data files (Taeung Song)

- Support converting data from bpf events in 'perf data' (Wang Nan)

- Fix segfault in 'perf test' hists related entries (Arnaldo Carvalho de Melo)

- Fix output of %llu for 64 bit values read on 32 bit machines in libtraceevent (Steven Rostedt)

- Fix time stamp rounding issue in libtraceevent (Chaos.Chen)

Infrastructure:

- Fix setlocale() breakage in the pmu parsing code (Jiri Olsa)

- Split libtraceevent's pevent_print_event() (Steven Rostedt)

- Librarize some 'perf record' bits to allow handling multiple perf.data
files per session (Wang Nan)

- Ensure return non-zero rc when mmap fails in 'perf record' (Wang Nan)

- Fix double free on 'command_line' in a error path in 'perf script' (Colin Ian King)

- Initialize struct sigaction 'sa_flags' field in a 'perf test' entry (Colin Ian King)

- Fix various build warnings in turbostat, detected with gcc6 (Colin Ian King)

- Use .s extension for preprocessed assembler code (Masahiro Yamada)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Andi Kleen (4):
perf stat: Check existence of frontend/backed stalled cycles
perf stat: Implement CSV metrics output
perf stat: Support metrics in --per-core/socket mode
perf stat: Check for frontend stalled for metrics

Arnaldo Carvalho de Melo (1):
perf test: Fix hists related entries

Chaos.Chen (1):
tools lib traceevent: Fix time stamp rounding issue

Colin Ian King (3):
perf script: Fix double free on command_line
perf tests: Initialize sa.sa_flags
tools/power turbostat: fix various build warnings

Jiri Olsa (2):
perf tools: Fix python extension build
perf tools: Fix locale handling in pmu parsing

Masahiro Yamada (1):
tools build: Use .s extension for preprocessed assembler code

Steven Rostedt (1):
tools lib traceevent: Split pevent_print_event() into specific functionality functions

Steven Rostedt (Red Hat) (2):
tools lib traceevent: Set int_array fields to NULL if freeing from error
tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines

Taeung Song (2):
perf trace: Check and discard not only 'nr' but also '__syscall_nr'
tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:

Wang Nan (6):
perf data: Support converting data from bpf_perf_event_output()
perf data: Explicitly set byte order for integer types
perf record: Use WARN_ONCE to replace 'if' condition
perf record: Extract synthesize code to record__synthesize()
perf record: Introduce record__finish_output() to finish a perf.data
perf record: Ensure return non-zero rc when mmap fail

kernel/trace/trace_syscalls.c | 16 +-
tools/build/Makefile.build | 2 +-
tools/lib/traceevent/event-parse.c | 146 ++++++++++++++----
tools/lib/traceevent/event-parse.h | 13 ++
tools/perf/arch/x86/tests/rdpmc.c | 1 +
tools/perf/builtin-record.c | 168 ++++++++++++---------
tools/perf/builtin-stat.c | 158 +++++++++++++++++--
tools/perf/builtin-trace.c | 8 +-
tools/perf/util/data-convert-bt.c | 118 ++++++++++++++-
tools/perf/util/pmu.c | 13 ++
.../util/scripting-engines/trace-event-python.c | 4 +-
tools/perf/util/setup.py | 4 +
tools/perf/util/sort.c | 37 +++--
tools/perf/util/stat-shadow.c | 18 ++-
tools/perf/util/stat.h | 1 +
tools/power/x86/turbostat/turbostat.c | 8 +-
16 files changed, 566 insertions(+), 149 deletions(-)

2016-03-03 16:20:45

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] perf tools: Fix locale handling in pmu parsing

> resending with full changelog

Thanks. Looks good.

-Andi

2016-03-05 08:08:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Em Thu, Mar 03, 2016 at 09:21:30AM +0100, Ingo Molnar escreveu:
>
> > Hm, there's a 'perf stat' regression that I can see:
> >
> > Before:
> > # 1.59 stalled cycles per insn
> > 1,818,488,088 branches # 151.667 M/sec
> >
> > After:
> >
> > triton:~/tip> perf stat -a sleep 1
> >
> > 24166678 branches # 2.016 M/sec
> >
> > ... see how the numbers became human-unreadable, losing the big-number separator?
> >
> > I suspect it's due to the following commit:
> >
> > fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles
>
> Ok, I inserted Jiri's patch fixing the problem just before the commit
> (fa184776ac27) that triggers it, so that we don't break bisection for
> human-readable numbers in 'perf stat'.
>
> Its all in a new signed tag, that combines the two outstanding ones
> (perf-core-for-mingo-20160229 + perf-core-for-mingo-20160302), please
> consider pulling.
>
> - Arnaldo
>
> The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1:
>
> perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160303
>
> for you to fetch changes up to fb4605ba47e772ff9d62d1d54218a832ec8b3e1d:
>
> perf stat: Check for frontend stalled for metrics (2016-03-03 11:10:40 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen)
>
> - Implement CSV metrics output in 'perf stat' (Andi Kleen)
>
> - Support metrics in 'perf stat' --per-core/socket mode (Andi Kleen)
>
> - Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa)
>
> - Rename the tracepoint '/format' field that carries the syscall ID from 'nr',
> that is also the name of some syscalls arguments, to "__syscall_nr", to
> avoid having multiple fields with the same name, that was breaking the
> python script skeleton generator from perf.data files (Taeung Song)
>
> - Support converting data from bpf events in 'perf data' (Wang Nan)
>
> - Fix segfault in 'perf test' hists related entries (Arnaldo Carvalho de Melo)
>
> - Fix output of %llu for 64 bit values read on 32 bit machines in libtraceevent (Steven Rostedt)
>
> - Fix time stamp rounding issue in libtraceevent (Chaos.Chen)
>
> Infrastructure:
>
> - Fix setlocale() breakage in the pmu parsing code (Jiri Olsa)
>
> - Split libtraceevent's pevent_print_event() (Steven Rostedt)
>
> - Librarize some 'perf record' bits to allow handling multiple perf.data
> files per session (Wang Nan)
>
> - Ensure return non-zero rc when mmap fails in 'perf record' (Wang Nan)
>
> - Fix double free on 'command_line' in a error path in 'perf script' (Colin Ian King)
>
> - Initialize struct sigaction 'sa_flags' field in a 'perf test' entry (Colin Ian King)
>
> - Fix various build warnings in turbostat, detected with gcc6 (Colin Ian King)
>
> - Use .s extension for preprocessed assembler code (Masahiro Yamada)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> ----------------------------------------------------------------
> Andi Kleen (4):
> perf stat: Check existence of frontend/backed stalled cycles
> perf stat: Implement CSV metrics output
> perf stat: Support metrics in --per-core/socket mode
> perf stat: Check for frontend stalled for metrics
>
> Arnaldo Carvalho de Melo (1):
> perf test: Fix hists related entries
>
> Chaos.Chen (1):
> tools lib traceevent: Fix time stamp rounding issue
>
> Colin Ian King (3):
> perf script: Fix double free on command_line
> perf tests: Initialize sa.sa_flags
> tools/power turbostat: fix various build warnings
>
> Jiri Olsa (2):
> perf tools: Fix python extension build
> perf tools: Fix locale handling in pmu parsing
>
> Masahiro Yamada (1):
> tools build: Use .s extension for preprocessed assembler code
>
> Steven Rostedt (1):
> tools lib traceevent: Split pevent_print_event() into specific functionality functions
>
> Steven Rostedt (Red Hat) (2):
> tools lib traceevent: Set int_array fields to NULL if freeing from error
> tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit machines
>
> Taeung Song (2):
> perf trace: Check and discard not only 'nr' but also '__syscall_nr'
> tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:
>
> Wang Nan (6):
> perf data: Support converting data from bpf_perf_event_output()
> perf data: Explicitly set byte order for integer types
> perf record: Use WARN_ONCE to replace 'if' condition
> perf record: Extract synthesize code to record__synthesize()
> perf record: Introduce record__finish_output() to finish a perf.data
> perf record: Ensure return non-zero rc when mmap fail
>
> kernel/trace/trace_syscalls.c | 16 +-
> tools/build/Makefile.build | 2 +-
> tools/lib/traceevent/event-parse.c | 146 ++++++++++++++----
> tools/lib/traceevent/event-parse.h | 13 ++
> tools/perf/arch/x86/tests/rdpmc.c | 1 +
> tools/perf/builtin-record.c | 168 ++++++++++++---------
> tools/perf/builtin-stat.c | 158 +++++++++++++++++--
> tools/perf/builtin-trace.c | 8 +-
> tools/perf/util/data-convert-bt.c | 118 ++++++++++++++-
> tools/perf/util/pmu.c | 13 ++
> .../util/scripting-engines/trace-event-python.c | 4 +-
> tools/perf/util/setup.py | 4 +
> tools/perf/util/sort.c | 37 +++--
> tools/perf/util/stat-shadow.c | 18 ++-
> tools/perf/util/stat.h | 1 +
> tools/power/x86/turbostat/turbostat.c | 8 +-
> 16 files changed, 566 insertions(+), 149 deletions(-)

Pulled, thanks a lot Arnaldo!

Ingo

Subject: [tip:perf/core] perf tools: Fix locale handling in pmu parsing

Commit-ID: f9a5978ac4ede901fa73d7c28ae1c5d89bc2a46a
Gitweb: http://git.kernel.org/tip/f9a5978ac4ede901fa73d7c28ae1c5d89bc2a46a
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 3 Mar 2016 10:53:48 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 3 Mar 2016 11:04:54 -0300

perf tools: Fix locale handling in pmu parsing

Ingo reported regression on display format of big numbers, which is
missing separators (in default perf stat output).

triton:~/tip> perf stat -a sleep 1
...
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle

This is caused by recent change:

perf stat: Check existence of frontend/backed stalled cycles

that added call to pmu_have_event, that subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.

The lc string returned from setlocale, that we use to store old locale
value, may be allocated in static storage. Getting a dynamic copy to
make it survive another setlocale call.

$ perf stat ls
...
2,360,602 cycles # 3.080 GHz
2,703,090 instructions # 1.15 insn per cycle
546,031 branches # 712.511 M/sec

Committer note:

Since the patch introducing the regression didn't made to perf/core,
move it to just before where the regression was introduced, so that we
don't break bisection for this feature.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/pmu.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ce61f79..d8cd038 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -124,6 +124,17 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
lc = setlocale(LC_NUMERIC, NULL);

/*
+ * The lc string may be allocated in static storage,
+ * so get a dynamic copy to make it survive setlocale
+ * call below.
+ */
+ lc = strdup(lc);
+ if (!lc) {
+ ret = -ENOMEM;
+ goto error;
+ }
+
+ /*
* force to C locale to ensure kernel
* scale string is converted correctly.
* kernel uses default C locale.
@@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
/* restore locale */
setlocale(LC_NUMERIC, lc);

+ free((char *) lc);
+
ret = 0;
error:
close(fd);

2016-03-08 13:23:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:perf/core] perf tools: Fix locale handling in pmu parsing


* tip-bot for Jiri Olsa <[email protected]> wrote:

> @@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
> /* restore locale */
> setlocale(LC_NUMERIC, lc);
>
> + free((char *) lc);
> +

Btw., minor side note: why does 'lc' have to be case to 'char *'?

In the kernel kfree() takes 'const void *':

include/linux/slab.h:void kfree(const void *);

which will accept all pointer types. That avoids unnecessary and fragile type
casts.

Thanks,

Ingo

2016-03-08 18:42:49

by Jiri Olsa

[permalink] [raw]
Subject: [PATCH][perf/core] perf tools: Omit unnecessary cast in perf_pmu__parse_scale

On Tue, Mar 08, 2016 at 02:23:40PM +0100, Ingo Molnar wrote:
>
> * tip-bot for Jiri Olsa <[email protected]> wrote:
>
> > @@ -135,6 +146,8 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
> > /* restore locale */
> > setlocale(LC_NUMERIC, lc);
> >
> > + free((char *) lc);
> > +
>
> Btw., minor side note: why does 'lc' have to be case to 'char *'?
>
> In the kernel kfree() takes 'const void *':
>
> include/linux/slab.h:void kfree(const void *);
>
> which will accept all pointer types. That avoids unnecessary and fragile type
> casts.

libc free takes only non const pointers:

util/pmu.c: In function ‘perf_pmu__parse_scale’:
util/pmu.c:149:7: error: passing argument 1 of ‘free’ discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
free(lc);
^
In file included from /home/jolsa/kernel/linux-perf/tools/include/linux/kernel.h:6:0,
from /home/jolsa/kernel/linux-perf/tools/include/linux/list.h:6,
from util/pmu.c:1:
/usr/include/stdlib.h:483:13: note: expected ‘void *’ but argument is of type ‘const char *’
extern void free (void *__ptr) __THROW;


but we could actually make it char* from the beginning..
for some reason I thought setlocale returns const ptr,
and I did not check.. fix attached

jirka


---
There's no need to used const char pointer,
we can used char pointer from the beginning
and omit unnecessary cast.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/pmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index d8cd038baed2..adef23b1352e 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -98,7 +98,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
char scale[128];
int fd, ret = -1;
char path[PATH_MAX];
- const char *lc;
+ char *lc;

snprintf(path, PATH_MAX, "%s/%s.scale", dir, name);

@@ -146,7 +146,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
/* restore locale */
setlocale(LC_NUMERIC, lc);

- free((char *) lc);
+ free(lc);

ret = 0;
error:
--
2.4.3

2016-03-09 13:43:50

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH][perf/core] perf tools: Omit unnecessary cast in perf_pmu__parse_scale

Em Tue, Mar 08, 2016 at 07:42:30PM +0100, Jiri Olsa escreveu:
> but we could actually make it char* from the beginning..
> for some reason I thought setlocale returns const ptr,
> and I did not check.. fix attached

Thanks, applied.

Subject: [tip:perf/core] perf tools: Omit unnecessary cast in perf_pmu__parse_scale

Commit-ID: ea8f75f981918c5946fc4029acdc86707fa901c1
Gitweb: http://git.kernel.org/tip/ea8f75f981918c5946fc4029acdc86707fa901c1
Author: Jiri Olsa <[email protected]>
AuthorDate: Tue, 8 Mar 2016 19:42:30 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Mar 2016 10:42:22 -0300

perf tools: Omit unnecessary cast in perf_pmu__parse_scale

There's no need to use a const char pointer, we can used char pointer
from the beginning and omit the unnecessary cast.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/pmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index d8cd038..adef23b 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -98,7 +98,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
char scale[128];
int fd, ret = -1;
char path[PATH_MAX];
- const char *lc;
+ char *lc;

snprintf(path, PATH_MAX, "%s/%s.scale", dir, name);

@@ -146,7 +146,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
/* restore locale */
setlocale(LC_NUMERIC, lc);

- free((char *) lc);
+ free(lc);

ret = 0;
error: