Subject: [PATCH 0/6] perf tools: fix annotate and top->annotate + more

Ingo, All,

While studying perf, I've noticed `perf top` can't annotate userspace. Then I
also discovered that `perf annotate` wrongly annotates non-prelinked libraries.
Please find fixes below.

Kirill

Btw, thanks for nice tools.


Kirill Smelkov (6):
perf top: teach it to autolocate vmlinux
perf top: align help text on keys
perf top: fix code typo in prompt_symbol()
perf annotate: fix it for non-prelinked *.so
perf top: fix annotate for userspace
perf: fix few typos + cosmetics

tools/perf/Documentation/perf-top.txt | 2 +-
tools/perf/Documentation/perf.txt | 2 +-
tools/perf/builtin-annotate.c | 4 +-
tools/perf/builtin-top.c | 67 +++++++++++++++++++--------------
tools/perf/design.txt | 8 ++--
tools/perf/util/map.c | 22 +++++++++++
tools/perf/util/map.h | 10 +++++
7 files changed, 79 insertions(+), 36 deletions(-)


Subject: [PATCH 6/6] perf: fix few typos + cosmetics

Signed-off-by: Kirill Smelkov <[email protected]>
---
tools/perf/Documentation/perf.txt | 2 +-
tools/perf/design.txt | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index 69c8325..0eeb247 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -12,7 +12,7 @@ SYNOPSIS

DESCRIPTION
-----------
-Performance counters for Linux are are a new kernel-based subsystem
+Performance counters for Linux are a new kernel-based subsystem
that provide a framework for all things performance analysis. It
covers hardware level (CPU/PMU, Performance Monitoring Unit) features
and software features (software counters, tracepoints) as well.
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 8d0de51..bd0bb1b 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -101,10 +101,10 @@ enum hw_event_ids {
*/
PERF_COUNT_HW_CPU_CYCLES = 0,
PERF_COUNT_HW_INSTRUCTIONS = 1,
- PERF_COUNT_HW_CACHE_REFERENCES = 2,
+ PERF_COUNT_HW_CACHE_REFERENCES = 2,
PERF_COUNT_HW_CACHE_MISSES = 3,
PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4,
- PERF_COUNT_HW_BRANCH_MISSES = 5,
+ PERF_COUNT_HW_BRANCH_MISSES = 5,
PERF_COUNT_HW_BUS_CYCLES = 6,
};

@@ -131,8 +131,8 @@ software events, selected by 'event_id':
*/
enum sw_event_ids {
PERF_COUNT_SW_CPU_CLOCK = 0,
- PERF_COUNT_SW_TASK_CLOCK = 1,
- PERF_COUNT_SW_PAGE_FAULTS = 2,
+ PERF_COUNT_SW_TASK_CLOCK = 1,
+ PERF_COUNT_SW_PAGE_FAULTS = 2,
PERF_COUNT_SW_CONTEXT_SWITCHES = 3,
PERF_COUNT_SW_CPU_MIGRATIONS = 4,
PERF_COUNT_SW_PAGE_FAULTS_MIN = 5,
--
1.6.6.79.gd514e.dirty

Subject: [PATCH 2/6] perf top: align help text on keys

Mapped keys:
[d] display refresh delay. (2)
[e] display entries (lines). (46)
[f] profile display filter (count). (5)
[F] annotate display filter (percent). (5%)
[s] annotate symbol. (NULL)
[S] stop annotation.
[K] hide kernel_symbols symbols. (no)
[U] hide user symbols. (no)
[z] toggle sample zeroing. (0)
[qQ] quit.

instead of

Mapped keys:
[d] display refresh delay. (2)
[e] display entries (lines). (46)
[f] profile display filter (count). (5)
[F] annotate display filter (percent). (5%)
[s] annotate symbol. (NULL)
[S] stop annotation.
[K] hide kernel_symbols symbols. (no)
[U] hide user symbols. (no)
[z] toggle sample zeroing. (0)
[qQ] quit.

Signed-off-by: Kirill Smelkov <[email protected]>
---
tools/perf/builtin-top.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9c7de93..9dc8070 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -704,7 +704,7 @@ static void print_mapped_keys(void)
fprintf(stdout, "\t[w] toggle display weighted/count[E]r. \t(%d)\n", display_weighted ? 1 : 0);

fprintf(stdout,
- "\t[K] hide kernel_symbols symbols. \t(%s)\n",
+ "\t[K] hide kernel_symbols symbols. \t(%s)\n",
hide_kernel_symbols ? "yes" : "no");
fprintf(stdout,
"\t[U] hide user symbols. \t(%s)\n",
--
1.6.6.79.gd514e.dirty

Subject: [PATCH 1/6] perf top: teach it to autolocate vmlinux

By relying on logic in dso__load_kernel_sym(), we can automatically load
vmlinux.

The only thing which needs to be adjusted, is how --sym-annotate option
is handled - now we can't rely on vmlinux been loaded until full
successful pass of dso__load_vmlinux(), but that's not the case if we'll
do sym_filter_entry setup in symbol_filter().

So move this step right after event__process_sample() where we know the
whole dso__load_kernel_sym() pass is done.

By the way, though conceptually similar `perf top` still can't annotate
userspace - see next patches with fixes.

Signed-off-by: Kirill Smelkov <[email protected]>
Cc: Mike Galbraith <[email protected]>
---
tools/perf/Documentation/perf-top.txt | 2 +-
tools/perf/builtin-top.c | 39 +++++++++++++++++++-------------
2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 4a7d558..785b9fc 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -74,7 +74,7 @@ OPTIONS

-s <symbol>::
--sym-annotate=<symbol>::
- Annotate this symbol. Requires -k option.
+ Annotate this symbol.

-v::
--verbose::
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ddc584b..9c7de93 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -94,6 +94,7 @@ struct source_line {

static char *sym_filter = NULL;
struct sym_entry *sym_filter_entry = NULL;
+struct sym_entry *sym_filter_entry_sched = NULL;
static int sym_pcnt_filter = 5;
static int sym_counter = 0;
static int display_weighted = -1;
@@ -695,11 +696,9 @@ static void print_mapped_keys(void)

fprintf(stdout, "\t[f] profile display filter (count). \t(%d)\n", count_filter);

- if (symbol_conf.vmlinux_name) {
- fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", sym_pcnt_filter);
- fprintf(stdout, "\t[s] annotate symbol. \t(%s)\n", name?: "NULL");
- fprintf(stdout, "\t[S] stop annotation.\n");
- }
+ fprintf(stdout, "\t[F] annotate display filter (percent). \t(%d%%)\n", sym_pcnt_filter);
+ fprintf(stdout, "\t[s] annotate symbol. \t(%s)\n", name?: "NULL");
+ fprintf(stdout, "\t[S] stop annotation.\n");

if (nr_counters > 1)
fprintf(stdout, "\t[w] toggle display weighted/count[E]r. \t(%d)\n", display_weighted ? 1 : 0);
@@ -725,14 +724,13 @@ static int key_mapped(int c)
case 'Q':
case 'K':
case 'U':
+ case 'F':
+ case 's':
+ case 'S':
return 1;
case 'E':
case 'w':
return nr_counters > 1 ? 1 : 0;
- case 'F':
- case 's':
- case 'S':
- return symbol_conf.vmlinux_name ? 1 : 0;
default:
break;
}
@@ -910,8 +908,12 @@ static int symbol_filter(struct map *map, struct symbol *sym)
syme = symbol__priv(sym);
syme->map = map;
syme->src = NULL;
- if (!sym_filter_entry && sym_filter && !strcmp(name, sym_filter))
- sym_filter_entry = syme;
+
+ if (!sym_filter_entry && sym_filter && !strcmp(name, sym_filter)) {
+ /* schedule initial sym_filter_entry setup */
+ sym_filter_entry_sched = syme;
+ sym_filter = NULL;
+ }

for (i = 0; skip_symbols[i]; i++) {
if (!strcmp(skip_symbols[i], name)) {
@@ -951,6 +953,13 @@ static void event__process_sample(const event_t *self,
al.sym == NULL || al.filtered)
return;

+ /* let's see, whether we need to install initial sym_filter_entry */
+ if (sym_filter_entry_sched) {
+ sym_filter_entry = sym_filter_entry_sched;
+ sym_filter_entry_sched = NULL;
+ parse_source(sym_filter_entry);
+ }
+
syme = symbol__priv(al.sym);
if (!syme->skip) {
syme->count[counter]++;
@@ -1244,7 +1253,7 @@ static const struct option options[] = {
OPT_BOOLEAN('i', "inherit", &inherit,
"child tasks inherit counters"),
OPT_STRING('s', "sym-annotate", &sym_filter, "symbol name",
- "symbol to annotate - requires -k option"),
+ "symbol to annotate"),
OPT_BOOLEAN('z', "zero", &zero,
"zero history across updates"),
OPT_INTEGER('F', "freq", &freq,
@@ -1280,16 +1289,14 @@ int cmd_top(int argc, const char **argv, const char *prefix __used)

symbol_conf.priv_size = (sizeof(struct sym_entry) +
(nr_counters + 1) * sizeof(unsigned long));
- if (symbol_conf.vmlinux_name == NULL)
- symbol_conf.try_vmlinux_path = true;
+
+ symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
if (symbol__init() < 0)
return -1;

if (delay_secs < 1)
delay_secs = 1;

- parse_source(sym_filter_entry);
-
/*
* User specified count overrides default frequency.
*/
--
1.6.6.79.gd514e.dirty

Subject: [PATCH 4/6] perf annotate: fix it for non-prelinked *.so

The problem was we were incorrectly calculating objdump addresses for
sym->start and sym->end, look:

For simple ET_DYN type DSO (*.so) with one function, objdump -dS output
is something like this:

000004ac <my_strlen>:
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{

i.e. we have relative-to-dso-mapping IPs (=RIP) there.

For ET_EXEC type and probably for prelinked libs as well (sorry can't
test - I don't use prelink) objdump outputs absolute IPs, e.g.

08048604 <zz_strlen>:
extern "C"
int zz_strlen(const char *s)
8048604: 55 push %ebp
8048605: 89 e5 mov %esp,%ebp
8048607: 83 ec 10 sub $0x10,%esp
{

So, if sym->start is always relative to dso mapping(*), we'll have to
unmap it for ET_EXEC like cases, and leave as is for ET_DYN cases.

(*) and it is - we've explicitely made it relative. Look for
adjust_symbols handling in dso__load_sym()

Previously we were always unmapping sym->start and for ET_DYN dsos
resulting addresses were wrong, and so objdump output was empty.

The end result was that perf annotate output for symbols from
non-prelinked *.so had always 0.00% percents only, which is wrong.

----

To fix it, let's introduce a helper for converting rip to objdump
address, and also let's document what map_ip() and unmap_ip() do -- I
had to study sources for several hours to understand it.

Signed-off-by: Kirill Smelkov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mike Galbraith <[email protected]>
---
tools/perf/builtin-annotate.c | 4 ++--
tools/perf/util/map.c | 14 ++++++++++++++
tools/perf/util/map.h | 9 +++++++++
3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 117bbae..117301a 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -195,7 +195,7 @@ static int parse_line(FILE *file, struct hist_entry *he, u64 len)
line_ip = -1;
}

- start = he->map->unmap_ip(he->map, sym->start);
+ start = map__rip_2objdump(he->map, sym->start);

if (line_ip != -1) {
const char *path = NULL;
@@ -405,7 +405,7 @@ static void annotate_sym(struct hist_entry *he)
dso, dso->long_name, sym, sym->name);

sprintf(command, "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS %s|grep -v %s",
- map->unmap_ip(map, sym->start), map->unmap_ip(map, sym->end),
+ map__rip_2objdump(map, sym->start), map__rip_2objdump(map, sym->end),
filename, filename);

if (verbose >= 3)
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index c4d55a0..d6da969 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -201,3 +201,17 @@ size_t map__fprintf(struct map *self, FILE *fp)
return fprintf(fp, " %Lx-%Lx %Lx %s\n",
self->start, self->end, self->pgoff, self->dso->name);
}
+
+
+
+/*
+ * objdump wants/reports absolute IPs for ET_EXEC, and RIPs for ET_DYN.
+ * map->dso->adjust_symbols==1 for ET_EXEC-like cases.
+ */
+u64 map__rip_2objdump(struct map *map, u64 rip)
+{
+ u64 addr = map->dso->adjust_symbols ?
+ map->unmap_ip(map, rip) : /* RIP -> IP */
+ rip;
+ return addr;
+}
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 72f0b6a..39fa478 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -24,8 +24,12 @@ struct map {
u64 end;
enum map_type type;
u64 pgoff;
+
+ /* ip -> dso rip */
u64 (*map_ip)(struct map *, u64);
+ /* dso rip -> ip */
u64 (*unmap_ip)(struct map *, u64);
+
struct dso *dso;
};

@@ -44,6 +48,11 @@ static inline u64 identity__map_ip(struct map *map __used, u64 ip)
return ip;
}

+
+/* rip -> addr suitable for passing to `objdump --start-address=` */
+u64 map__rip_2objdump(struct map *map, u64 rip);
+
+
struct symbol;
struct mmap_event;

--
1.6.6.79.gd514e.dirty

Subject: [PATCH 3/6] perf top: fix code typo in prompt_symbol()

sym_filter is what was (if ever) passed with -s option. What was typed by
user, and what we were looking for, is in buf.

Signed-off-by: Kirill Smelkov <[email protected]>
---
tools/perf/builtin-top.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9dc8070..4067e4d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -668,7 +668,7 @@ static void prompt_symbol(struct sym_entry **target, const char *msg)
}

if (!found) {
- fprintf(stderr, "Sorry, %s is not active.\n", sym_filter);
+ fprintf(stderr, "Sorry, %s is not active.\n", buf);
sleep(1);
return;
} else
--
1.6.6.79.gd514e.dirty

Subject: [PATCH 5/6] perf top: fix annotate for userspace

First, for programs and prelinked libraries, annotate code was fooled by
objdump output IPs (src->eip in the code) being wrongly converted to
absolute IPs. In such case there were no conversion needed, but in

src->eip = strtoull(src->line, NULL, 16);
src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff

we were reading absolute address from objdump (e.g. 8048604) and then
almost doubling it, because eip & map->start are approximately close for
small programs.

Needless to say, that later, in record_precise_ip() there was no
matching with real runtime IPs.

And second, like with `perf annotate` the problem with non-prelinked
*.so was that we were doing rip -> objdump address conversion wrong.

Also, because unlike `perf annotate`, `perf top` code does annotation based on
absolute IPs for performance reasons(*), new helper for mapping objdump
addresse to IP is introduced.

(*) we get samples info in absolute IPs, and since we do lots of
hit-testing on absolute IPs at runtime in record_precise_ip(), it's
better to convert objdump addresses to IPs once and do no conversion
at runtime.

I also had to fix how objdump output is parsed (with hardcoded 8/16
characters format, which was inappropriate for ET_DYN dsos with small
addresses like '4ac')

Also note, that not all objdump output lines has associtated IPs, e.g.
look at source lines here:

000004ac <my_strlen>:
extern "C"
int my_strlen(const char *s)
4ac: 55 push %ebp
4ad: 89 e5 mov %esp,%ebp
4af: 83 ec 10 sub $0x10,%esp
{
int len = 0;
4b2: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
4b9: eb 08 jmp 4c3 <my_strlen+0x17>

while (*s) {
++len;
4bb: 83 45 fc 01 addl $0x1,-0x4(%ebp)
++s;
4bf: 83 45 08 01 addl $0x1,0x8(%ebp)

So we mark them with eip=0, and ignore such lines in annotate lookup
code.

Signed-off-by: Kirill Smelkov <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mike Galbraith <[email protected]>
---
tools/perf/builtin-top.c | 24 ++++++++++++++----------
tools/perf/util/map.c | 8 ++++++++
tools/perf/util/map.h | 3 ++-
3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 4067e4d..6aa9b02 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -176,6 +176,7 @@ static void parse_source(struct sym_entry *syme)
FILE *file;
char command[PATH_MAX*2];
const char *path;
+ char *tmp;
u64 len;

if (!syme)
@@ -204,8 +205,8 @@ static void parse_source(struct sym_entry *syme)
sprintf(command,
"objdump --start-address=0x%016Lx "
"--stop-address=0x%016Lx -dS %s",
- map->unmap_ip(map, sym->start),
- map->unmap_ip(map, sym->end), path);
+ map__rip_2objdump(map, sym->start),
+ map__rip_2objdump(map, sym->end), path);

file = popen(command, "r");
if (!file)
@@ -235,14 +236,13 @@ static void parse_source(struct sym_entry *syme)
*source->lines_tail = src;
source->lines_tail = &src->next;

- if (strlen(src->line)>8 && src->line[8] == ':') {
- src->eip = strtoull(src->line, NULL, 16);
- src->eip = map->unmap_ip(map, src->eip);
- }
- if (strlen(src->line)>8 && src->line[16] == ':') {
- src->eip = strtoull(src->line, NULL, 16);
- src->eip = map->unmap_ip(map, src->eip);
- }
+
+ src->eip = strtoull(src->line, &tmp, 16);
+ if (*tmp == ':')
+ src->eip = map__objdump_2ip(map, src->eip);
+ else
+ /* this line has no ip info (e.g. source line) */
+ src->eip = 0;
}
pclose(file);
out_assign:
@@ -277,6 +277,10 @@ static void record_precise_ip(struct sym_entry *syme, int counter, u64 ip)
goto out_unlock;

for (line = syme->src->lines; line; line = line->next) {
+ /* skip lines without IP info */
+ if (line->eip == 0)
+ continue;
+
if (line->eip == ip) {
line->count[counter]++;
break;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index d6da969..37bfcc5 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -215,3 +215,11 @@ u64 map__rip_2objdump(struct map *map, u64 rip)
rip;
return addr;
}
+
+u64 map__objdump_2ip(struct map *map, u64 addr)
+{
+ u64 ip = map->dso->adjust_symbols ?
+ addr :
+ map->unmap_ip(map, addr); /* RIP -> IP */
+ return ip;
+}
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 39fa478..2bcd9d4 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -49,8 +49,9 @@ static inline u64 identity__map_ip(struct map *map __used, u64 ip)
}


-/* rip -> addr suitable for passing to `objdump --start-address=` */
+/* rip/ip <-> addr suitable for passing to `objdump --start-address=` */
u64 map__rip_2objdump(struct map *map, u64 rip);
+u64 map__objdump_2ip(struct map *map, u64 addr);


struct symbol;
--
1.6.6.79.gd514e.dirty

2010-01-13 13:40:09

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 1/6] perf top: teach it to autolocate vmlinux

Em Fri, Jan 08, 2010 at 03:23:04PM +0300, Kirill Smelkov escreveu:
> By relying on logic in dso__load_kernel_sym(), we can automatically load
> vmlinux.
>
> The only thing which needs to be adjusted, is how --sym-annotate option
> is handled - now we can't rely on vmlinux been loaded until full
> successful pass of dso__load_vmlinux(), but that's not the case if we'll
> do sym_filter_entry setup in symbol_filter().
>
> So move this step right after event__process_sample() where we know the
> whole dso__load_kernel_sym() pass is done.
>
> By the way, though conceptually similar `perf top` still can't annotate
> userspace - see next patches with fixes.
>
> Signed-off-by: Kirill Smelkov <[email protected]>
> Cc: Mike Galbraith <[email protected]>
> ---

<SNIP>

> @@ -951,6 +953,13 @@ static void event__process_sample(const event_t *self,
> al.sym == NULL || al.filtered)
> return;
>
> + /* let's see, whether we need to install initial sym_filter_entry */
> + if (sym_filter_entry_sched) {
> + sym_filter_entry = sym_filter_entry_sched;
> + sym_filter_entry_sched = NULL;
> + parse_source(sym_filter_entry);
> + }
> +

You're assuming that the first sample is for the kernel, right? It may
be not and then the vmlinux won't be loaded at this point.

I think that the right way is to force it to be loaded by calling:

map__load(session->vmlinux_maps[MAP__FUNCTION], session, filter);

after perf_session__create_kernel_maps and before parse_source(), ok?

You can even create a helper:

int perf_session__load_vmlinux(struct perf_session *self,
symbol_filter_t filter)
{
return map__load(session->vmlinux_maps[MAP__FUNCTION],
session, filter);
}

As this probably will be of interest for tools such as 'perf
probe', etc.

- Arnaldo

2010-01-13 13:41:49

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 6/6] perf: fix few typos + cosmetics

Em Fri, Jan 08, 2010 at 03:23:09PM +0300, Kirill Smelkov escreveu:
> Signed-off-by: Kirill Smelkov <[email protected]>
> ---
> tools/perf/Documentation/perf.txt | 2 +-
> tools/perf/design.txt | 8 ++++----
> 2 files changed, 5 insertions(+), 5 deletions(-)
>

Thanks, applied.

- Arnaldo

2010-01-13 13:47:31

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 3/6] perf top: fix code typo in prompt_symbol()

Em Fri, Jan 08, 2010 at 03:23:06PM +0300, Kirill Smelkov escreveu:
> sym_filter is what was (if ever) passed with -s option. What was typed by
> user, and what we were looking for, is in buf.
>
> Signed-off-by: Kirill Smelkov <[email protected]>
> ---
> tools/perf/builtin-top.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)

Thanks, applied.

- Arnaldo

Subject: Re: [PATCH 1/6] perf top: teach it to autolocate vmlinux

Arnaldo, All,

Thanks for replying, and I'm sorry for the delay in sending my reply
back. Please find my not so thoughtful reply below:

On Wed, Jan 13, 2010 at 11:39:52AM -0200, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jan 08, 2010 at 03:23:04PM +0300, Kirill Smelkov escreveu:
> > By relying on logic in dso__load_kernel_sym(), we can automatically load
> > vmlinux.
> >
> > The only thing which needs to be adjusted, is how --sym-annotate option
> > is handled - now we can't rely on vmlinux been loaded until full
> > successful pass of dso__load_vmlinux(), but that's not the case if we'll
> > do sym_filter_entry setup in symbol_filter().
> >
> > So move this step right after event__process_sample() where we know the
> > whole dso__load_kernel_sym() pass is done.
> >
> > By the way, though conceptually similar `perf top` still can't annotate
> > userspace - see next patches with fixes.
> >
> > Signed-off-by: Kirill Smelkov <[email protected]>
> > Cc: Mike Galbraith <[email protected]>
> > ---
>
> <SNIP>
>
> > @@ -951,6 +953,13 @@ static void event__process_sample(const event_t *self,
> > al.sym == NULL || al.filtered)
> > return;
> >
> > + /* let's see, whether we need to install initial sym_filter_entry */
> > + if (sym_filter_entry_sched) {
> > + sym_filter_entry = sym_filter_entry_sched;
> > + sym_filter_entry_sched = NULL;
> > + parse_source(sym_filter_entry);
> > + }
> > +
>
> You're assuming that the first sample is for the kernel, right? It may

Not quite so.

> be not and then the vmlinux won't be loaded at this point.

I agree, that there is an ambiguity, that e.g. for 'strstr' symbol there
are variants of which strstr to annotate - the kernel one, or the glibc
one (or even some other debug version of strstr preloaded by user
through LD_PRELOAD).


We'll get here on the first sample which hits function with name equal
to sym_filter. Sometimes this will be from vmlinux, sometimes not (but
if a symbol is only from vmlinux and produces sample hits, we'll get
here eventually for sure).

So yes, there is an ambiguity from which DSO we want sym_filter.


>
> I think that the right way is to force it to be loaded by calling:
>
> map__load(session->vmlinux_maps[MAP__FUNCTION], session, filter);
>
> after perf_session__create_kernel_maps and before parse_source(), ok?
>
> You can even create a helper:
>
> int perf_session__load_vmlinux(struct perf_session *self,
> symbol_filter_t filter)
> {
> return map__load(session->vmlinux_maps[MAP__FUNCTION],
> session, filter);
> }
>
> As this probably will be of interest for tools such as 'perf
> probe', etc.

I see your point.

Yes, you kernel people are almost always interested in kernel profile in
the first place :), but won't this be an ad-hock solution? I mean why
kernel (and only) kernel first?

In case of ambiguity, I'd better let users specify something like
vmlinux:strstr or libc.so.6:strstr to precisely define info for which
symbols they are going to see.

Anyway, as I see it, this days perf is used for kernel development
mostly, so I'd agree with ad-hoc kernel rule for now.


The problem is my spare time is very limited this month - I have only
few hours through weekends and this weekend I've already spent them all
:(. Sorry, maybe next week ...


Kirill


P.S. how about patches 4/6 and 5/6? They fix `perf annotate`
(independently for this vmlinux loading thing) and `perf top->annotate`
fix is somewhat orthogonal to the patch we are discussing...

2010-01-17 17:53:28

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 1/6] perf top: teach it to autolocate vmlinux

Em Sun, Jan 17, 2010 at 07:59:36PM +0300, Kirill Smelkov escreveu:
> Arnaldo, All,
>
> Thanks for replying, and I'm sorry for the delay in sending my reply
> back. Please find my not so thoughtful reply below:
|
That is okaish, lets get down (literally) to the bottom v

> On Wed, Jan 13, 2010 at 11:39:52AM -0200, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Jan 08, 2010 at 03:23:04PM +0300, Kirill Smelkov escreveu:
> > > By relying on logic in dso__load_kernel_sym(), we can automatically load
> > > vmlinux.
> > >
> > > The only thing which needs to be adjusted, is how --sym-annotate option
> > > is handled - now we can't rely on vmlinux been loaded until full
> > > successful pass of dso__load_vmlinux(), but that's not the case if we'll
> > > do sym_filter_entry setup in symbol_filter().
> > >
> > > So move this step right after event__process_sample() where we know the
> > > whole dso__load_kernel_sym() pass is done.
> > >
> > > By the way, though conceptually similar `perf top` still can't annotate
> > > userspace - see next patches with fixes.
> > >
> > > Signed-off-by: Kirill Smelkov <[email protected]>
> > > Cc: Mike Galbraith <[email protected]>
> > > ---
> >
> > <SNIP>
> >
> > > @@ -951,6 +953,13 @@ static void event__process_sample(const event_t *self,
> > > al.sym == NULL || al.filtered)
> > > return;
> > >
> > > + /* let's see, whether we need to install initial sym_filter_entry */
> > > + if (sym_filter_entry_sched) {
> > > + sym_filter_entry = sym_filter_entry_sched;
> > > + sym_filter_entry_sched = NULL;
> > > + parse_source(sym_filter_entry);
> > > + }
> > > +
> >
> > You're assuming that the first sample is for the kernel, right? It may
>
> Not quite so.
>
> > be not and then the vmlinux won't be loaded at this point.
>
> I agree, that there is an ambiguity, that e.g. for 'strstr' symbol there
> are variants of which strstr to annotate - the kernel one, or the glibc
> one (or even some other debug version of strstr preloaded by user
> through LD_PRELOAD).

yup

> We'll get here on the first sample which hits function with name equal
> to sym_filter. Sometimes this will be from vmlinux, sometimes not (but
> if a symbol is only from vmlinux and produces sample hits, we'll get
> here eventually for sure).

Definitely, its not the perfect filter, but a good one even then :-)

> So yes, there is an ambiguity from which DSO we want sym_filter.

violent agreement.

> >
> > I think that the right way is to force it to be loaded by calling:
> >
> > map__load(session->vmlinux_maps[MAP__FUNCTION], session, filter);
> >
> > after perf_session__create_kernel_maps and before parse_source(), ok?
> >
> > You can even create a helper:
> >
> > int perf_session__load_vmlinux(struct perf_session *self,
> > symbol_filter_t filter)
> > {
> > return map__load(session->vmlinux_maps[MAP__FUNCTION],
> > session, filter);
> > }
> >
> > As this probably will be of interest for tools such as 'perf
> > probe', etc.
>
> I see your point.
>
> Yes, you kernel people are almost always interested in kernel profile in
> the first place :), but won't this be an ad-hock solution? I mean why
> kernel (and only) kernel first?

Even now I don't think I qualify, ad hoc? Sure the current situation is
at best that.

> In case of ambiguity, I'd better let users specify something like
> vmlinux:strstr or libc.so.6:strstr to precisely define info for which
> symbols they are going to see.

In times of 'perf probe' being something that can and will help
integration with other 'people', yeah, we need to precisely define that
we can as well specify something in the Morton'ish test case domain[1]!

> Anyway, as I see it, this days perf is used for kernel development
> mostly, so I'd agree with ad-hoc kernel rule for now.

Yes, definetely agreed, perf must not sound, feel or be a kernel only
medicine, albeit it started from there (that is a good start,
neverthless, I'd say :-)).

>
> The problem is my spare time is very limited this month - I have only
> few hours through weekends and this weekend I've already spent them all
> :(. Sorry, maybe next week ...

Don't worry, the amount of feedback I got in this message seems enough
for me to come up with some patches tomorrow, thanks a lot and
dasvidanya!

- Arnaldo

[1] userspace