Hi Ingo,
Please consider pulling,
- Arnaldo
The following changes since commit eaec12d7f526694f24d581a4ad23de6ce0315cd2:
tools lib traceevent: Fix signature of create_arg_item() (2012-05-24 11:36:05 -0300)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-urgent-for-mingo
for you to fetch changes up to a44fa3fa69d2f5023b85d4d86faa9ab8945df260:
perf top: Fix counter name fixup when fallbacking to cpu-clock (2012-05-25 14:49:51 -0300)
----------------------------------------------------------------
Fixes for perf/urgent.
. Fix build on newer distros where _FORTIFY_SOURCE needs optimization enable.
. Fix event name caching when fallbacking to cpu-clock
. Event name reconstruction from perf_event_attr now include modifiers.
. Elliminate leak on error path when parsing pid/tid list, From Franck Bui-Huu.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
----------------------------------------------------------------
Arnaldo Carvalho de Melo (3):
perf tools: Do not use _FORTIFY_SOURCE when DEBUG=1 is specified
perf evsel: Adopt the hardware event names
perf top: Fix counter name fixup when fallbacking to cpu-clock
Franck Bui-Huu (1):
perf tools: fix thread_map__new_by_pid_str() memory leak in error path
tools/perf/Makefile | 4 ++--
tools/perf/builtin-top.c | 2 +-
tools/perf/util/evsel.c | 21 +++++++++++++++++++++
tools/perf/util/evsel.h | 2 ++
tools/perf/util/parse-events.c | 17 +----------------
tools/perf/util/thread_map.c | 21 ++++++++++-----------
6 files changed, 37 insertions(+), 30 deletions(-)
From: Franck Bui-Huu <[email protected]>
The namelist array (including its content) was not freed if we fail to
realloc a new 'threads' structure.
Signed-off-by: Franck Bui-Huu <[email protected]>
Cc: David Ahern <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/thread_map.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 84d9bd78..9b5f856 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -188,28 +188,27 @@ static struct thread_map *thread_map__new_by_pid_str(const char *pid_str)
nt = realloc(threads, (sizeof(*threads) +
sizeof(pid_t) * total_tasks));
if (nt == NULL)
- goto out_free_threads;
+ goto out_free_namelist;
threads = nt;
- if (threads) {
- for (i = 0; i < items; i++)
- threads->map[j++] = atoi(namelist[i]->d_name);
- threads->nr = total_tasks;
- }
-
- for (i = 0; i < items; i++)
+ for (i = 0; i < items; i++) {
+ threads->map[j++] = atoi(namelist[i]->d_name);
free(namelist[i]);
+ }
+ threads->nr = total_tasks;
free(namelist);
-
- if (!threads)
- break;
}
out:
strlist__delete(slist);
return threads;
+out_free_namelist:
+ for (i = 0; i < items; i++)
+ free(namelist[i]);
+ free(namelist);
+
out_free_threads:
free(threads);
threads = NULL;
--
1.7.9.2.358.g22243
From: Arnaldo Carvalho de Melo <[email protected]>
In 40491eaa "perf top: Update event name when falling back to cpu-clock"
we freed counter->name but didn't reset it to NULL, then when setting it
to the result of event_name(), event_name() would use the cached value,
which by now was overwritten and thus we got garbage or a zero lenght
string.
Fix it by just freeing and setting counter->name to NULL, this way
event_name() when called afterwards, will find the right counter name
and cache it again.
Found while trying 'cycles:pp' on a machine were :pp couldn't be
honoured. Probably the best fallback here is to tell the user that that
level of precision is not available on the PMU and then go removing 'p',
levels of precision till we get to play 'cycles' and if even that fails,
_then_ get to 'cpu-clock'.
But that is the matter for another patch, this one just needs to fix the
caching issue, which in the end will show 'cpu-clock' when tools ask for
the event name being used, which clarifies things for the user, that
will see that 'cycles:pp' or whatever not support event is not being
used, some sort of fallback happened.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-top.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6031dce..d4a5f9b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -953,7 +953,7 @@ try_again:
attr->config = PERF_COUNT_SW_CPU_CLOCK;
if (counter->name) {
free(counter->name);
- counter->name = strdup(event_name(counter));
+ counter->name = NULL;
}
goto try_again;
}
--
1.7.9.2.358.g22243
From: Arnaldo Carvalho de Melo <[email protected]>
The modifiers:
k kernel space
u user space
h hypervisor
G guest
H host
p, pp, ppp precision level (PEBS)
that can be suffixed to an event were lost when tools used event_name()
to reconstruct them from the perf_event_attr entries in a perf.data
file.
Fix it by following the defaults used for these modifiers in the current
codebase, so:
$ perf record -e instructions:u usleep 1 2> /dev/null
$ perf evlist
instructions:u
$ perf record -e cycles:k usleep 1 2> /dev/null
$ perf evlist
cycles:k
$ perf record -e cycles:kh usleep 1 2> /dev/null
$ perf evlist
cycles:kh
$ perf record -e cache-misses:G usleep 1 2> /dev/null
$ perf evlist
cache-misses:G
$ perf record -e cycles:ppk usleep 1 2> /dev/null
$ perf evlist
cycles:kpp
$
Also works with 'top', 'report', etc.
More work needed to cover tracepoints and software events while not
dragging lots of baggage to the python binding, this is a minimal fix
for v3.5.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evsel.c | 90 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/evsel.h | 3 ++
tools/perf/util/parse-events.c | 27 +++++-------
3 files changed, 104 insertions(+), 16 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 57e4ce5..91d1913 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -15,6 +15,7 @@
#include "cpumap.h"
#include "thread_map.h"
#include "target.h"
+#include "../../include/linux/perf_event.h"
#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
#define GROUP_FD(group_fd, cpu) (*(int *)xyarray__entry(group_fd, cpu, 0))
@@ -64,6 +65,95 @@ struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
return evsel;
}
+static const char *perf_evsel__hw_names[PERF_COUNT_HW_MAX] = {
+ "cycles",
+ "instructions",
+ "cache-references",
+ "cache-misses",
+ "branches",
+ "branch-misses",
+ "bus-cycles",
+ "stalled-cycles-frontend",
+ "stalled-cycles-backend",
+ "ref-cycles",
+};
+
+const char *__perf_evsel__hw_name(u64 config)
+{
+ if (config < PERF_COUNT_HW_MAX && perf_evsel__hw_names[config])
+ return perf_evsel__hw_names[config];
+
+ return "unknown-hardware";
+}
+
+static int perf_evsel__hw_name(struct perf_evsel *evsel, char *bf, size_t size)
+{
+ int colon = 0;
+ struct perf_event_attr *attr = &evsel->attr;
+ int r = scnprintf(bf, size, "%s", __perf_evsel__hw_name(attr->config));
+ bool exclude_guest_default = false;
+
+#define MOD_PRINT(context, mod) do { \
+ if (!attr->exclude_##context) { \
+ if (!colon) colon = r++; \
+ r += scnprintf(bf + r, size - r, "%c", mod); \
+ } } while(0)
+
+ if (attr->exclude_kernel || attr->exclude_user || attr->exclude_hv) {
+ MOD_PRINT(kernel, 'k');
+ MOD_PRINT(user, 'u');
+ MOD_PRINT(hv, 'h');
+ exclude_guest_default = true;
+ }
+
+ if (attr->precise_ip) {
+ if (!colon)
+ colon = r++;
+ r += scnprintf(bf + r, size - r, "%.*s", attr->precise_ip, "ppp");
+ exclude_guest_default = true;
+ }
+
+ if (attr->exclude_host || attr->exclude_guest == exclude_guest_default) {
+ MOD_PRINT(host, 'H');
+ MOD_PRINT(guest, 'G');
+ }
+#undef MOD_PRINT
+ if (colon)
+ bf[colon] = ':';
+ return r;
+}
+
+int perf_evsel__name(struct perf_evsel *evsel, char *bf, size_t size)
+{
+ int ret;
+
+ switch (evsel->attr.type) {
+ case PERF_TYPE_RAW:
+ ret = scnprintf(bf, size, "raw 0x%" PRIx64, evsel->attr.config);
+ break;
+
+ case PERF_TYPE_HARDWARE:
+ ret = perf_evsel__hw_name(evsel, bf, size);
+ break;
+ default:
+ /*
+ * FIXME
+ *
+ * This is the minimal perf_evsel__name so that we can
+ * reconstruct event names taking into account event modifiers.
+ *
+ * The old event_name uses it now for raw anr hw events, so that
+ * we don't drag all the parsing stuff into the python binding.
+ *
+ * On the next devel cycle the rest of the event naming will be
+ * brought here.
+ */
+ return 0;
+ }
+
+ return ret;
+}
+
void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts,
struct perf_evsel *first)
{
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3d6b3e4..4ba8b56 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -83,6 +83,9 @@ void perf_evsel__config(struct perf_evsel *evsel,
struct perf_record_opts *opts,
struct perf_evsel *first);
+const char* __perf_evsel__hw_name(u64 config);
+int perf_evsel__name(struct perf_evsel *evsel, char *bf, size_t size);
+
int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads);
int perf_evsel__alloc_counts(struct perf_evsel *evsel, int ncpus);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index fac7d59..05dbc8b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -62,19 +62,6 @@ static struct event_symbol event_symbols[] = {
#define PERF_EVENT_TYPE(config) __PERF_EVENT_FIELD(config, TYPE)
#define PERF_EVENT_ID(config) __PERF_EVENT_FIELD(config, EVENT)
-static const char *hw_event_names[PERF_COUNT_HW_MAX] = {
- "cycles",
- "instructions",
- "cache-references",
- "cache-misses",
- "branches",
- "branch-misses",
- "bus-cycles",
- "stalled-cycles-frontend",
- "stalled-cycles-backend",
- "ref-cycles",
-};
-
static const char *sw_event_names[PERF_COUNT_SW_MAX] = {
"cpu-clock",
"task-clock",
@@ -300,6 +287,16 @@ const char *event_name(struct perf_evsel *evsel)
u64 config = evsel->attr.config;
int type = evsel->attr.type;
+ if (type == PERF_TYPE_RAW || type == PERF_TYPE_HARDWARE) {
+ /*
+ * XXX minimal fix, see comment on perf_evsen__name, this static buffer
+ * will go away together with event_name in the next devel cycle.
+ */
+ static char bf[128];
+ perf_evsel__name(evsel, bf, sizeof(bf));
+ return bf;
+ }
+
if (evsel->name)
return evsel->name;
@@ -317,9 +314,7 @@ const char *__event_name(int type, u64 config)
switch (type) {
case PERF_TYPE_HARDWARE:
- if (config < PERF_COUNT_HW_MAX && hw_event_names[config])
- return hw_event_names[config];
- return "unknown-hardware";
+ return __perf_evsel__hw_name(config);
case PERF_TYPE_HW_CACHE: {
u8 cache_type, cache_op, cache_result;
--
1.7.9.2.358.g22243
From: Arnaldo Carvalho de Melo <[email protected]>
As:
make DEBUG=1 -C tools/perf
disables optimizations and _FORTIFY_SOURCE in recent distros requires
optimizations to be enabled, seen on a Fedora 17 system:
[acme@Fedora17 linux]$ make DEBUG=1 O=/home/acme/git/build/perf/ -C
tools/perf install
In file included from /usr/include/sys/types.h:26:0,
from /usr/include/libelf.h:53,
from /usr/include/gelf.h:53,
from /usr/include/elfutils/libdw.h:53,
from <stdin>:2:
/usr/include/features.h:314:4: error: #warning _FORTIFY_SOURCE requires
compiling with optimization (-O) [-Werror=cpp
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 1d3d513..0eee64c 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -80,7 +80,7 @@ ifeq ("$(origin DEBUG)", "command line")
PERF_DEBUG = $(DEBUG)
endif
ifndef PERF_DEBUG
- CFLAGS_OPTIMIZE = -O6
+ CFLAGS_OPTIMIZE = -O6 -D_FORTIFY_SOURCE=2
endif
ifdef PARSER_DEBUG
@@ -89,7 +89,7 @@ ifdef PARSER_DEBUG
PARSER_DEBUG_CFLAGS := -DPARSER_DEBUG
endif
-CFLAGS = -fno-omit-frame-pointer -ggdb3 -Wall -Wextra -std=gnu99 $(CFLAGS_WERROR) $(CFLAGS_OPTIMIZE) -D_FORTIFY_SOURCE=2 $(EXTRA_WARNINGS) $(EXTRA_CFLAGS) $(PARSER_DEBUG_CFLAGS)
+CFLAGS = -fno-omit-frame-pointer -ggdb3 -Wall -Wextra -std=gnu99 $(CFLAGS_WERROR) $(CFLAGS_OPTIMIZE) $(EXTRA_WARNINGS) $(EXTRA_CFLAGS) $(PARSER_DEBUG_CFLAGS)
EXTLIBS = -lpthread -lrt -lelf -lm
ALL_CFLAGS = $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
ALL_LDFLAGS = $(LDFLAGS)
--
1.7.9.2.358.g22243
On 5/25/12 1:59 PM, Arnaldo Carvalho de Melo wrote:
> From: Arnaldo Carvalho de Melo<[email protected]>
>
> In 40491eaa "perf top: Update event name when falling back to cpu-clock"
> we freed counter->name but didn't reset it to NULL, then when setting it
> to the result of event_name(), event_name() would use the cached value,
> which by now was overwritten and thus we got garbage or a zero lenght
> string.
>
> Fix it by just freeing and setting counter->name to NULL, this way
> event_name() when called afterwards, will find the right counter name
> and cache it again.
>
> Found while trying 'cycles:pp' on a machine were :pp couldn't be
> honoured. Probably the best fallback here is to tell the user that that
> level of precision is not available on the PMU and then go removing 'p',
> levels of precision till we get to play 'cycles' and if even that fails,
> _then_ get to 'cpu-clock'.
>
> But that is the matter for another patch, this one just needs to fix the
> caching issue, which in the end will show 'cpu-clock' when tools ask for
> the event name being used, which clarifies things for the user, that
> will see that 'cycles:pp' or whatever not support event is not being
> used, some sort of fallback happened.
>
> Cc: David Ahern<[email protected]>
> Cc: Frederic Weisbecker<[email protected]>
> Cc: Mike Galbraith<[email protected]>
> Cc: Namhyung Kim<[email protected]>
> Cc: Paul Mackerras<[email protected]>
> Cc: Peter Zijlstra<[email protected]>
> Cc: Stephane Eranian<[email protected]>
> Link: http://lkml.kernel.org/n/[email protected]
> Signed-off-by: Arnaldo Carvalho de Melo<[email protected]>
> ---
> tools/perf/builtin-top.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 6031dce..d4a5f9b 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -953,7 +953,7 @@ try_again:
> attr->config = PERF_COUNT_SW_CPU_CLOCK;
> if (counter->name) {
> free(counter->name);
> - counter->name = strdup(event_name(counter));
> + counter->name = NULL;
> }
> goto try_again;
> }
The patch I sent had:
+ if (counter->name) {
+ free(counter->name);
+ counter->name = NULL;
+ counter->name = strdup(event_name(counter));
+ }
See http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/00390.html
Em Fri, May 25, 2012 at 02:26:34PM -0600, David Ahern escreveu:
> The patch I sent had:
>
> + if (counter->name) {
> + free(counter->name);
> + counter->name = NULL;
> + counter->name = strdup(event_name(counter));
> + }
>
> See http://lkml.indiana.edu/hypermail/linux/kernel/1205.1/00390.html
I probably messed up this one, should've put a committer note for that
stupid simplification, anyway, just setting it to NULL is enough and
less confusing.
- Arnaldo
Em Fri, May 25, 2012 at 04:59:14PM -0300, Arnaldo Carvalho de Melo escreveu:
> Hi Ingo,
>
> Please consider pulling,
Please disregard this one, a new one is coming marked v2,
- Arnaldo