2009-10-28 15:58:34

by K.Prasad

Subject: [RFC Patch 0/4] Enhance perf-events to profile memory accesses using hw-breakpoints - ver II

Hi All,
Please find version II of the patchset that enables perf-events to
place hw-breakpoints over kernel symbols (along with requisite enhancements to
the hw-breakpoint layer).

Changelog version II
---------------------
Version I: http://lkml.org/lkml/2009/10/26/461

- Fixed parsing issues that disallowed other perf events to be invoked
- Fixed user-space breakpoint usage which was broken due to patch 2/4
- Introduced an instance of perf_sample_data for use by do_perf_sw_event()

An edited log of 'perf stat' and 'perf record' output is shown below for your
reference.

Kindly let me know your suggestions/feedback about the same.

Thanks,
K.Prasad

Screen logs
------------
# perf stat -v -i -e breakpoint-readwrite:pid_max -e breakpoint-write:jiffies make kernel/futex.o
CHK include/linux/version.h
CHK include/linux/utsrelease.h
SYMLINK include/asm -> include/asm-x86
CALL scripts/checksyscalls.sh
CC kernel/futex.o
breakpoint-readwrite: 68 298512531 298512531
breakpoint-write: 235 298512531 298512531

Performance counter stats for 'make kernel/futex.o':

68 breakpoint-readwrite # 0.000 M/sec
235 breakpoint-write # 0.000 M/sec

14.571235288 seconds time elapsed

#
#
# perf record -v -i -e breakpoint-readwrite:jiffies top

[Ran 'top' for about 10 seconds]

# perf report -i perf.data
# Samples: 2022950155
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#
99.99% top [kernel] [k] scheduler_tick
0.01% perf [kernel] [k] scheduler_tick
0.00% top [kernel] [k] set_track
0.00% top [kernel] [k] run_timer_softirq
0.00% perf [kernel] [k] set_track
0.00% top [kernel] [k] __call_rcu
0.00% top [kernel] [k] calc_global_load
0.00% top [kernel] [k] do_timer
0.00% top [kernel] [k] __rcu_process_callbacks
#
# (For a higher level overview, try: perf report --sort comm,dso)
#


2009-10-29 08:19:39

by Ingo Molnar

Subject: Re: [RFC Patch 0/4] Enhance perf-events to profile memory accesses using hw-breakpoints - ver II


* K.Prasad <[email protected]> wrote:

> Hi All,
> Please find version II of the patchset that enables perf-events to
> place hw-breakpoints over kernel symbols (along with requisite enhancements to
> the hw-breakpoint layer).
>
> Changelog version II
> ---------------------
> Version I: http://lkml.org/lkml/2009/10/26/461
>
> - Fixed parsing issues that disallowed other perf events to be invoked
> - Fixed user-space breakpoint usage which was broken due to patch 2/4
> - Introduced an instance of perf_sample_data for use by do_perf_sw_event()
>
> An edited log of 'perf stat' and 'perf record' output is shown below for your
> reference.
>
> Kindly let me know your suggestions/feedback about the same.
>
> Thanks,
> K.Prasad
>
> Screen logs
> ------------
> # perf stat -v -i -e breakpoint-readwrite:pid_max -e breakpoint-write:jiffies make kernel/futex.o
> CHK include/linux/version.h
> CHK include/linux/utsrelease.h
> SYMLINK include/asm -> include/asm-x86
> CALL scripts/checksyscalls.sh
> CC kernel/futex.o
> breakpoint-readwrite: 68 298512531 298512531
> breakpoint-write: 235 298512531 298512531
>
> Performance counter stats for 'make kernel/futex.o':
>
> 68 breakpoint-readwrite # 0.000 M/sec
> 235 breakpoint-write # 0.000 M/sec
>
> 14.571235288 seconds time elapsed
>
> #
> #
> # perf record -v -i -e breakpoint-readwrite:jiffies top
>
> [Ran 'top' for about 10 seconds]

btw., you probably want to add the -a/--all option as well when you test
via top, to do system-wide profiling. With this command you profile top
itself (and its child tasks).

>
> # perf report -i perf.data
> # Samples: 2022950155
> #
> # Overhead Command Shared Object Symbol
> # ........ ....... ............. ......
> #
> 99.99% top [kernel] [k] scheduler_tick
> 0.01% perf [kernel] [k] scheduler_tick
> 0.00% top [kernel] [k] set_track
> 0.00% top [kernel] [k] run_timer_softirq
> 0.00% perf [kernel] [k] set_track
> 0.00% top [kernel] [k] __call_rcu
> 0.00% top [kernel] [k] calc_global_load
> 0.00% top [kernel] [k] do_timer
> 0.00% top [kernel] [k] __rcu_process_callbacks
> #
> # (For a higher level overview, try: perf report --sort comm,dso)
> #

That output looks pretty awesome! This way we can map out how frequently
global variables are used in the kernel - in stock distro kernels too.
Previously we could only measure it indirectly (by looking at
high-overhead functions and assembly level annotations), or by running
very costly instrumentation like Valgrind.

I like it how you extended --event with the breakpoint-readwrite:jiffies
method as well.

A few additional shortcuts/aliases would be nice, such as:

perf record -v -i -e readwrite:jiffies top

as breakpoint-readwrite is pretty long - users aren't really interested in
the mechanism (hardware-breakpoints), they are more interested that it's
memory read-write profiling done at a given address.

Maybe even 'rw' would be a useful alias as well. There are alias tables
for events which you can use for this. You can define them via:

{ CHBP(WRITE), "memory-write", "write", "w" },
{ CHBP(RW), "memory-readwrite", "readwrite", "rw" },

Anyway, this looks very good already - Frederic, if you like these
patches too feel free to send it to me in your next hw-breakpoints pull
request.

Thanks,

Ingo

2009-10-29 22:24:05

by K.Prasad

Subject: Re: [RFC Patch 0/4] Enhance perf-events to profile memory accesses using hw-breakpoints - ver II

On Thu, Oct 29, 2009 at 09:19:17AM +0100, Ingo Molnar wrote:
>
> * K.Prasad <[email protected]> wrote:
>
[snipped]
> >
> > #
> > #
> > # perf record -v -i -e breakpoint-readwrite:jiffies top
> >
> > [Ran 'top' for about 10 seconds]
>
> btw., you probably want to add the -a/--all option as well when you test
> via top, to do system-wide profiling. With this command you profile top
> itself (and its child tasks).
>

Okay. Attached output to that effect in ver III of my patchset sent
here: http://lkml.org/lkml/2009/10/29/300

> >
> > # perf report -i perf.data
> > # Samples: 2022950155
> > #
> > # Overhead Command Shared Object Symbol
> > # ........ ....... ............. ......
> > #
> > 99.99% top [kernel] [k] scheduler_tick
> > 0.01% perf [kernel] [k] scheduler_tick
> > 0.00% top [kernel] [k] set_track
> > 0.00% top [kernel] [k] run_timer_softirq
> > 0.00% perf [kernel] [k] set_track
> > 0.00% top [kernel] [k] __call_rcu
> > 0.00% top [kernel] [k] calc_global_load
> > 0.00% top [kernel] [k] do_timer
> > 0.00% top [kernel] [k] __rcu_process_callbacks
> > #
> > # (For a higher level overview, try: perf report --sort comm,dso)
> > #
>
> That output looks pretty awesome! This way we can map out how frequently
> global variables are used in the kernel - in stock distro kernels too.
> Previously we could only measure it indirectly (by looking at
> high-overhead functions and assembly level annotations), or by running
> very costly instrumentation like Valgrind.
>
> I like it how you extended --event with the breakpoint-readwrite:jiffies
> method as well.
>
> A few additional shortcuts/aliases would be nice, such as:
>
> perf record -v -i -e readwrite:jiffies top
>
> as breakpoint-readwrite is pretty long - users aren't really interested in
> the mechanism (hardware-breakpoints), they are more interested that it's
> memory read-write profiling done at a given address.
>
> Maybe even 'rw' would be a useful alias as well. There are alias tables
> for events which you can use for this. You can define them via:
>
> { CHBP(WRITE), "memory-write", "write", "w" },
> { CHBP(RW), "memory-readwrite", "readwrite", "rw" },
>

I've added "memory-write" and "w" as the aliases (and similarly
"memory-readwrite" and "rw", as shown below). "read" and "write" are
already used as hw_cache_op[] aliases. Moreover, defining more than one
alias would require a separate structure (as done by hw_cache[] and
hw_cache_op[]) and further changes in print_events(), hence the
single alias. I'm open to any further suggestions on the renaming front.


+ { CHBP(WRITE), "memory-write", "w" },
+ { CHBP(RW), "memory-readwrite", "rw" },
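
The single-alias table above can be illustrated with a small standalone sketch of how an event string such as "rw:jiffies" might be matched against it. Everything in it is hypothetical: `bp_kind`, `bp_alias`, `bp_aliases` and `parse_bp_event()` are stand-ins and not the structures or functions used by the perf tool; only the name/alias strings come from the table above. It assumes the event string has the form `<name-or-alias>:<symbol>`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical breakpoint kinds, standing in for CHBP(WRITE) and CHBP(RW). */
enum bp_kind { BP_NONE = 0, BP_WRITE, BP_RW };

/* One long name plus a single short alias per event, as in the mail. */
struct bp_alias {
	enum bp_kind kind;
	const char *name;
	const char *alias;
};

static const struct bp_alias bp_aliases[] = {
	{ BP_WRITE, "memory-write",     "w"  },
	{ BP_RW,    "memory-readwrite", "rw" },
};

/*
 * Parse "<name-or-alias>:<symbol>". On success, return the breakpoint kind
 * and point *symbol at the text after the colon (inside the input string).
 * Return BP_NONE if the prefix is unknown or the symbol is missing.
 */
static enum bp_kind parse_bp_event(const char *str, const char **symbol)
{
	const char *colon = strchr(str, ':');
	size_t len, i;

	if (!colon || colon[1] == '\0')
		return BP_NONE;
	len = (size_t)(colon - str);

	for (i = 0; i < sizeof(bp_aliases) / sizeof(bp_aliases[0]); i++) {
		const struct bp_alias *a = &bp_aliases[i];

		/* Require an exact match on the whole prefix, not just a common start. */
		if ((strlen(a->name) == len && !strncmp(str, a->name, len)) ||
		    (strlen(a->alias) == len && !strncmp(str, a->alias, len))) {
			*symbol = colon + 1;
			return a->kind;
		}
	}
	return BP_NONE;
}
```

With this sketch, "rw:jiffies" and "memory-readwrite:jiffies" resolve to the same kind, while "write:jiffies" is rejected because "write" is deliberately left out of the table.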


> Anyway, this looks very good already - Frederic, if you like these
> patches too feel free to send it to me in your next hw-breakpoints pull
> request.
>
> Thanks,
>
> Ingo

I'm glad that you found value in the patchset, and I hope this helps the
feature make its way further into the mainline. Frederic's previous mail
suggests that I owe him more reasoning about the patches' approach
before they are sent out for a git pull!

Thanks,
K.Prasad

2009-10-31 16:19:23

by Frederic Weisbecker

Subject: Re: [RFC Patch 0/4] Enhance perf-events to profile memory accesses using hw-breakpoints - ver II

2009/10/29 Ingo Molnar <[email protected]>:
>
> * K.Prasad <[email protected]> wrote:
>
>> Hi All,
>> Please find version II of the patchset that enables perf-events to
>> place hw-breakpoints over kernel symbols (along with requisite enhancements to
>> the hw-breakpoint layer).
>>
>> Changelog version II
>> ---------------------
>> Version I: http://lkml.org/lkml/2009/10/26/461
>>
>> - Fixed parsing issues that disallowed other perf events to be invoked
>> - Fixed user-space breakpoint usage which was broken due to patch 2/4
>> - Introduced an instance of perf_sample_data for use by do_perf_sw_event()
>>
>> An edited log of 'perf stat' and 'perf record' output is shown below for your
>> reference.
>>
>> Kindly let me know your suggestions/feedback about the same.
>>
>> Thanks,
>> K.Prasad
>>
>> Screen logs
>> ------------
>> # perf stat -v -i -e breakpoint-readwrite:pid_max -e breakpoint-write:jiffies make kernel/futex.o
>>   CHK     include/linux/version.h
>>   CHK     include/linux/utsrelease.h
>>   SYMLINK include/asm -> include/asm-x86
>>   CALL    scripts/checksyscalls.sh
>>   CC      kernel/futex.o
>> breakpoint-readwrite: 68 298512531 298512531
>> breakpoint-write: 235 298512531 298512531
>>
>>  Performance counter stats for 'make kernel/futex.o':
>>
>>              68  breakpoint-readwrite     #      0.000 M/sec
>>             235  breakpoint-write         #      0.000 M/sec
>>
>>    14.571235288  seconds time elapsed
>>
>> #
>> #
>> # perf record -v -i -e breakpoint-readwrite:jiffies top
>>
>> [Ran 'top' for about 10 seconds]
>
> btw., you probably want to add the -a/--all option as well when you test
> via top, to do system-wide profiling. With this command you profile top
> itself (and its child tasks).
>
>>
>> # perf report -i perf.data
>> # Samples: 2022950155
>> #
>> # Overhead  Command  Shared Object  Symbol
>> # ........  .......  .............  ......
>> #
>>     99.99%      top  [kernel]       [k] scheduler_tick
>>      0.01%     perf  [kernel]       [k] scheduler_tick
>>      0.00%      top  [kernel]       [k] set_track
>>      0.00%      top  [kernel]       [k] run_timer_softirq
>>      0.00%     perf  [kernel]       [k] set_track
>>      0.00%      top  [kernel]       [k] __call_rcu
>>      0.00%      top  [kernel]       [k] calc_global_load
>>      0.00%      top  [kernel]       [k] do_timer
>>      0.00%      top  [kernel]       [k] __rcu_process_callbacks
>> #
>> # (For a higher level overview, try: perf report --sort comm,dso)
>> #
>
> That output looks pretty awesome! This way we can map out how frequently
> global variables are used in the kernel - in stock distro kernels too.
> Previously we could only measure it indirectly (by looking at
> high-overhead functions and assembly level annotations), or by running
> very costly instrumentation like Valgrind.
>
> I like it how you extended --event with the breakpoint-readwrite:jiffies
> method as well.
>
> A few additional shortcuts/aliases would be nice, such as:
>
>   perf record -v -i -e readwrite:jiffies top
>
> as breakpoint-readwrite is pretty long - users aren't really interested in
> the mechanism (hardware-breakpoints), they are more interested that it's
> memory read-write profiling done at a given address.
>
> Maybe even 'rw' would be a useful alias as well. There are alias tables
> for events which you can use for this. You can define them via:
>
>  { CHBP(WRITE),               "memory-write",     "write",     "w"  },
>  { CHBP(RW),                  "memory-readwrite", "readwrite", "rw" },
>
> Anyway, this looks very good already - Frederic, if you like these
> patches too feel free to send it to me in your next hw-breakpoints pull
> request.


I can't add these patches to my tree as this is a patchset that implements
another direction. Prasad's patchset is an evolution of the current state
of tip:/tracing/hw-breakpoint that keeps the hardware breakpoints
standalone wrt perf events:

    perf      ftrace   ptrace   kgdb
      |         /        /       /
     pmu       /        /       /
      |       /        /       /
      |      /        /       /
    ----------------------------
         Hw breakpoints api

Whereas my patchset does:

    perf      ftrace   ptrace   kgdb
      |          |        |       |
      |          |        |       |
      |      --------------------
      |        hw breakpoint api
      |              |
      |----------------
      |
      |
    Lower level perf / pmu

Well this ascii art should be a bit more complicated actually. But
anyway. Prasad's patchset is another branch of evolution of
tracing/hw-breakpoints.
I've expressed my opinion about that in a mail yesterday. I basically
think it limits the perf events possibilities and rewrites the context
binding / register allocation that perf already handles.

That said I won't mind if the general opinion is in favour of that
direction and I can zap my patches and send a pull request with
Prasad's patches instead.

2009-11-02 15:05:24

by Ingo Molnar

Subject: Re: [RFC Patch 0/4] Enhance perf-events to profile memory accesses using hw-breakpoints - ver II


* Frederic Weisbecker <[email protected]> wrote:

> Well this ascii art should be a bit more complicated actually. But
> anyway. Prasad's patchset is another branch of evolution of
> tracing/hw-breakpoints.
>
> I've expressed my opinion about that in a mail yesterday. I basically
> think it limits the perf events possibilities and rewrites the context
> binding / register allocation that perf already handles.

Ok, in hindsight I agree with your point of view - we really don't want a
duplicate layer but a single handler/arbiter of hw-breakpoint state.

Ingo
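
The "single arbiter" point above can be made concrete with a toy sketch: one table records which client holds each breakpoint register, so no layer has to keep a second copy of that allocation state. This is only an illustration under stated assumptions, not the kernel's implementation: `bp_arbiter`, `bp_claim()`, `bp_release()` and the owner names are hypothetical, and the slot count of four is assumed from the x86 debug address registers DR0-DR3.

```c
#define BP_SLOTS 4	/* assumption: four address registers, as with x86 DR0-DR3 */

/* Hypothetical clients that all go through the same arbiter. */
enum bp_owner { OWNER_FREE = 0, OWNER_PERF, OWNER_PTRACE, OWNER_KGDB };

/* The one and only record of which slot is in use, by whom, and for what address. */
struct bp_arbiter {
	enum bp_owner owner[BP_SLOTS];
	unsigned long addr[BP_SLOTS];
};

/* Claim the first free slot for 'who'; return its index, or -1 if all are busy. */
static int bp_claim(struct bp_arbiter *arb, enum bp_owner who, unsigned long addr)
{
	int i;

	if (who == OWNER_FREE)
		return -1;

	for (i = 0; i < BP_SLOTS; i++) {
		if (arb->owner[i] == OWNER_FREE) {
			arb->owner[i] = who;
			arb->addr[i] = addr;
			return i;
		}
	}
	return -1;
}

/* Release a slot, but only if 'who' actually owns it; return 0 on success. */
static int bp_release(struct bp_arbiter *arb, enum bp_owner who, int slot)
{
	if (slot < 0 || slot >= BP_SLOTS ||
	    who == OWNER_FREE || arb->owner[slot] != who)
		return -1;

	arb->owner[slot] = OWNER_FREE;
	arb->addr[slot] = 0;
	return 0;
}
```

A zero-initialised `struct bp_arbiter` starts with every slot free. A fifth claim fails once all four slots are taken, and a client cannot release a slot it does not own. A real arbiter would also have to deal with per-task versus per-CPU contexts and locking, which this sketch leaves out.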