2010-01-26 00:34:44

by john smith

[permalink] [raw]
Subject: perf report for .ko files

How can "perf report" be used to get info on (only) the specific manually loadable modules?
Driver entry points are reported but without full breakdown (on all the calls).
If "--symbols" option is used, most of the symbols are not found.
("--dsos" doesn't seem to help)

John



2010-01-26 19:22:05

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf report for .ko files

On Mon, 2010-01-25 at 16:34 -0800, john smith wrote:
> How can "perf report" be used to get info on (only) the specific manually loadable modules?
> Driver entry points are reported but without full breakdown (on all the calls).

Since its sampling it could be a function is just not hit, another
possibility is inlining of functions, we simply cannot see inlined
functions.

> If "--symbols" option is used, most of the symbols are not found.
> ("--dsos" doesn't seem to help)

I'm not a big module user, but I think --dsos should work, if not feel
free to send a patch to rectify this situation.

--dsos does appear to work on my laptop (where I have iwlagn as a module
since it sometimes needs a reload to start working, which is hard when
build in).

perf report --dsos "[iwlagn]"

does indeed give me all iwlagn hits, albeit without symbol information,
not sure why that is, /proc/kallsyms does seem to include some iwlagn
symbols.

Arnaldo might have a clue.

2010-01-26 21:06:00

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: perf report for .ko files

Em Tue, Jan 26, 2010 at 08:21:55PM +0100, Peter Zijlstra escreveu:
> On Mon, 2010-01-25 at 16:34 -0800, john smith wrote:
> > How can "perf report" be used to get info on (only) the specific manually loadable modules?
> > Driver entry points are reported but without full breakdown (on all the calls).
>
> Since its sampling it could be a function is just not hit, another
> possibility is inlining of functions, we simply cannot see inlined
> functions.

Well, you could reduce the number of samples collected by asking perf record to
include only kernel samples by fiddling with these perf_event_attr fields:

exclude_user : 1, /* don't count user */
exclude_kernel : 1, /* ditto kernel */
exclude_hv : 1, /* ditto hypervisor */
exclude_idle : 1, /* don't count when idle */

I.e. setting exclude_user, exclude_hv and exclude_idle to 1, but this requires
a patch for tools/perf/builtin-record.c as this is not exposed yet.

Something similar to what we have in 'perf top'

-K, --hide_kernel_symbols
hide kernel symbols
-U, --hide_user_symbols
hide user symbols

And then increase the frequency.

> > If "--symbols" option is used, most of the symbols are not found.
> > ("--dsos" doesn't seem to help)
>
> I'm not a big module user, but I think --dsos should work, if not feel
> free to send a patch to rectify this situation.

--dsos does work for kernel modules, yes:

[root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]'
# dso: [e1000e]
# Samples: 110518812
#
# Overhead Command Symbol
# ........ ............... ......
#
22.98% init [k] e1000_clean
21.52% mutt [k] e1000_clean
17.95% mutt [k] e1000_intr_msi
16.60% init [k] e1000_intr_msi
11.82% events/0 [k] e1000e_update_stats
1.80% mutt [k] e1000_put_txbuf
1.79% mutt [k] e1000_clean_tx_irq
1.28% init [k] e1000_clean_tx_irq
1.14% init [k] e1000_clean_rx_irq
1.02% sshd [k] e1000_xmit_frame
0.52% init [k] pci_unmap_single
0.43% openvpn [k] e1000_xmit_frame
0.29% sshd [k] e1000_intr_msi
0.18% init [k] e1000_update_itr
0.15% sshd [k] pci_dma_mapping_error
0.15% init [k] pci_dma_mapping_error
0.14% init [k] e1000_alloc_rx_buffers
0.13% sshd [k] e1000_clean
0.11% init [k] e1000_rx_checksum
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[root@doppio linux-2.6-tip]#

> --dsos does appear to work on my laptop (where I have iwlagn as a module
> since it sometimes needs a reload to start working, which is hard when
> build in).
>
> perf report --dsos "[iwlagn]"

yup, exactly like my example above.

> does indeed give me all iwlagn hits, albeit without symbol information,
> not sure why that is, /proc/kallsyms does seem to include some iwlagn
> symbols.

but not all, in verbose mode we can see from where it gets the module
symbol information:

[root@doppio linux-2.6-tip]# perf report --verbose --dsos '[e1000e]' |
head -10
Looking at the vmlinux_path (5 entries long)
No build_id in vmlinux, ignoring it
No build_id in /boot/vmlinux, ignoring it
No build_id in /boot/vmlinux-2.6.33-rc5-tip+, ignoring it
Using /lib/modules/2.6.33-rc5-tip+/build/vmlinux for symbols
# dso: [e1000e]
# Samples: 110518812
#
# Overhead Command Symbol
# ........ ............... ......
#
22.98% init 0x000000000001088c B [k] e1000_clean
21.52% mutt 0x000000000001088c B [k] e1000_clean
17.95% mutt 0x000000000000db5c B [k] e1000_intr_msi
16.60% init 0x000000000000db5c B [k] e1000_intr_msi
[root@doppio linux-2.6-tip]#

In my case, all came from the buildid-cache, the ' B ' above:

from tools/perf/util/symbol.c (should be in the man page, consider
submitting a patch for that :)):

char dso__symtab_origin(const struct dso *self)
{
static const char origin[] = {
[DSO__ORIG_KERNEL] = 'k',
[DSO__ORIG_JAVA_JIT] = 'j',
[DSO__ORIG_BUILD_ID_CACHE] = 'B',
[DSO__ORIG_FEDORA] = 'f',
[DSO__ORIG_UBUNTU] = 'u',
[DSO__ORIG_BUILDID] = 'b',
[DSO__ORIG_DSO] = 'd',
[DSO__ORIG_KMODULE] = 'K',
};

if (self == NULL || self->origin == DSO__ORIG_NOT_FOUND)
return '!';
return origin[self->origin];
}

For further info:

Use 'perf buildid-list' to get the build-id for the kernel module used at 'perf
record' time.

[root@doppio linux-2.6-tip]# perf buildid-list | grep e1000e
05c45e33bc92bf4bbc71ddde4128a9f5f1463fcb /lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko

Then look it up in the build-id cache:

[root@doppio linux-2.6-tip]# l ~/.debug/.build-id/05/c45e33bc92bf4bbc71ddde4128a9f5f1463fcb
lrwxrwxrwx 1 root root 110 2010-01-22 12:12 /root/.debug/.build-id/05/c45e33bc92bf4bbc71ddde4128a9f5f1463fcb -> ../../lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko/05c45e33bc92bf4bbc71ddde4128a9f5f1463fcb
[root@doppio linux-2.6-tip]#

So indeed, at 'perf record' time we found that the file that matched the loaded
e1000e.ko file was:

/lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko

by looking at its build-id ELF note and at:

/sys/module/e1000e/sections/.note.gnu.build-id

for the build-id in the e1000e.ko loaded at that time.

> Arnaldo might have a clue.

See above :-)

- Arnaldo

2010-01-29 18:55:21

by john smith

[permalink] [raw]
Subject: Re: perf report for .ko files

Problem1:

> Well, you could reduce the number of samples collected by
> asking perf record to
> include only kernel samples by fiddling with these
> perf_event_attr fields:
>
>
> exclude_user : 1, /* don't count user */
> exclude_kernel : 1, /* ditto kernel */
> exclude_hv : 1, /* ditto hypervisor */
> exclude_idle : 1, /* don't count when idle */
>
> I.e. setting exclude_user, exclude_hv and exclude_idle to
> 1, but this requires
> a patch for tools/perf/builtin-record.c as this is not
> exposed yet.

I did setup attr->exclude_hv, exclude_idle, exclude_user = 1 in create_counter() just before "try_again:" label, no difference in results,
on a 2.6.31 (too old? "perf buildid-list" is not available) kernel version:

# perf record -f -e cycles fio fio_script
(for more than a minute, I increased the sample frequency with different -c values, some locked the machine - expectedly)
and
# perf report --verbose --dso=/my_module_path/my_module.ko
(--verbose doesn't provide any more info)

return the same "# Samples: 0":

"
# dso: /my_module_path/my_module.ko
# Samples: 0
#...
"

> [root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]'
> # dso: [e1000e]
> # Samples: 110518812

'e1000e' module may be be sampled more often than the scsi modules but ...

> > not sure why that is, /proc/kallsyms does seem to
> > include some iwlagn symbols.

/proc/kallsyms includes all syms I need, so is a problem not finding the symbols, does this (Samples: 0) make sens?

"perf report --sort comm,dso,symbol" report has my module symbols.

Problem2:

If I try "perf annotate -l -k /my_module_path/my_module.ko my_symbol" has all the percentages "0.00" where I would expect a % breakdown (considering the total function counter 100%), even if the function represents a small overall counter % compared to the others, any ideas?

John



2010-01-29 19:22:06

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: perf report for .ko files

Em Fri, Jan 29, 2010 at 10:55:13AM -0800, john smith escreveu:
> Problem1:
>
> > Well, you could reduce the number of samples collected by
> > asking perf record to
> > include only kernel samples by fiddling with these
> > perf_event_attr fields:
> >
> > exclude_user : 1, /* don't count user */
> > exclude_kernel : 1, /* ditto kernel */
> > exclude_hv : 1, /* ditto hypervisor */
> > exclude_idle : 1, /* don't count when idle */
> >
> > I.e. setting exclude_user, exclude_hv and exclude_idle to
> > 1, but this requires
> > a patch for tools/perf/builtin-record.c as this is not
> > exposed yet.

> I did setup attr->exclude_hv, exclude_idle, exclude_user = 1 in
> create_counter() just before "try_again:" label, no difference in
> results, on a 2.6.31 (too old? "perf buildid-list" is not available)
> kernel version:

> # perf record -f -e cycles fio fio_script

> (for more than a minute, I increased the sample frequency with
> different -c values, some locked the machine - expectedly)
> and
> # perf report --verbose --dso=/my_module_path/my_module.ko
> (--verbose doesn't provide any more info)

This isn't going to work, you need to do:

perf report --verbose --dso='[my_module]'

I.e. as it appear on /proc/modules + brackets. Arguably we need to get
it to work also with both 'my_module.ko' and the full path.

> return the same "# Samples: 0":
>
> "
> # dso: /my_module_path/my_module.ko
> # Samples: 0
> #...
> "
>
> > [root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]'
> > # dso: [e1000e]
> > # Samples: 110518812
>
> 'e1000e' module may be be sampled more often than the scsi modules but ...
>
> > > not sure why that is, /proc/kallsyms does seem to
> > > include some iwlagn symbols.

> /proc/kallsyms includes all syms I need, so is a problem not finding
> the symbols, does this (Samples: 0) make sens?

> "perf report --sort comm,dso,symbol" report has my module symbols.

See above on how to specify the module name, its just that it is finding
no samples for the module name as specified, it seems.

> Problem2:

> If I try "perf annotate -l -k /my_module_path/my_module.ko my_symbol"
> has all the percentages "0.00" where I would expect a % breakdown
> (considering the total function counter 100%), even if the function
> represents a small overall counter % compared to the others, any
> ideas?

I actually never tried doing annotation with a module, will try.

- Arnaldo

2010-01-29 21:59:37

by john smith

[permalink] [raw]
Subject: Re: perf report for .ko files

> > I did setup attr->exclude_hv, exclude_idle,
> exclude_user = 1 in
> > create_counter() just before "try_again:" label, no
> difference in
> > results, on a 2.6.31 (too old? "perf buildid-list" is
> not available)
> > kernel version:
>
> > # perf record -f -e cycles fio fio_script
>
> > (for more than a minute, I increased the sample
> frequency with
> > different -c values, some locked the machine -
> expectedly)
> > and
> > # perf report --verbose
> --dso=/my_module_path/my_module.ko
> > (--verbose doesn't provide any more info)
>
> This isn't going to work, you need to do:
>
> perf report --verbose --dso='[my_module]'
>
> I.e. as it appear on /proc/modules + brackets. Arguably we
> need to get
> it to work also with both 'my_module.ko' and the full
> path.

I've tried:
"perf report --verbose --dso='[my_module]'"
and also:
"perf report --verbose --dso='[sg]'"
same result as before.

New question:

Is this the minimal set of api that I can use from within the kernel (for specific module sections of interest) to read the counters?

{
int perf_event_release_kernel(struct perf_event *event);
struct perf_event *perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu);
u64 perf_event_read_value(struct perf_event *event);
}

Any good example of useage of these api's (struct initialization)?

John




2010-01-30 06:46:06

by Mike Galbraith

[permalink] [raw]
Subject: Re: perf report for .ko files

On Fri, 2010-01-29 at 17:15 -0200, Arnaldo Carvalho de Melo wrote:

> I actually never tried doing annotation with a module, will try.

Seems it went south again in both .git and .tip. 32 works.

ed52ce2e3c33dc7626a40fa2da766d1a6460e543 is the first bad commit
commit ed52ce2e3c33dc7626a40fa2da766d1a6460e543
Author: Arnaldo Carvalho de Melo <[email protected]>
Date: Mon Oct 19 17:17:57 2009 -0200

perf tools: Add ->unmap_ip operation to struct map

We need this because we get section relative addresses when
reading the symtabs, but when a tool like 'perf annotate' needs
to match these address to what 'objdump -dS' produces we need
the address + section back again.

So in annotate now we look at the 'struct hist_entry' instances
(that weren't really being used) so that we iterate only over
the symbols that had some hit and get the map where that
particular hit happened so that we can get the right address to
match with annotate.

Verified that at least:

perf annotate mmap_read_counter # Uses the ~/bin/perf binary
perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all

on a 'perf record perf top' session seems to work.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Mike Galbraith <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

2010-02-03 17:28:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: perf report for .ko files

Em Sat, Jan 30, 2010 at 07:45:54AM +0100, Mike Galbraith escreveu:
> On Fri, 2010-01-29 at 17:15 -0200, Arnaldo Carvalho de Melo wrote:
>
> > I actually never tried doing annotation with a module, will try.
>
> Seems it went south again in both .git and .tip. 32 works.

We need to read the ELF header, notice its a relocatable file and not
apply the unmap_ip functions, doing a patch now.

- Arnaldo