2009-11-23 14:42:37

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 1/4] hw-breakpoints: Include only linux/perf_event.h from kernel part of bp headers

As userspace only needs the breakpoints enum types from the breakpoints
headers.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
---
include/linux/hw_breakpoint.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index 4659e0c..76a48ab 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -1,8 +1,6 @@
#ifndef _LINUX_HW_BREAKPOINT_H
#define _LINUX_HW_BREAKPOINT_H

-#include <linux/perf_event.h>
-
enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
@@ -19,6 +17,8 @@ enum {
#ifdef __KERNEL__
#ifdef CONFIG_HAVE_HW_BREAKPOINT

+#include <linux/perf_event.h>
+
static inline unsigned long hw_breakpoint_addr(struct perf_event *bp)
{
return bp->attr.bp_addr;
--
1.6.2.3


2009-11-23 14:42:37

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 2/4] hw-breakpoints: Check the breakpoint params from perf tools

Perf tools create perf events as disabled in the beginning. Breakpoints
are then considered like ptrace temporary breakpoints, only meant to
reserve a breakpoint slot until we get all the necessary informations
from the user.

In this case, we don't check the address that is breakpointed as it is
NULL in the ptrace case.

But perf tools don't have the same purpose, events are created disabled
to wait for all events to be created before enabling all of them.
We want to check the breakpoint parameters in this case.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
---
kernel/hw_breakpoint.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index b6d6fa2..1d8911a 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -263,7 +263,16 @@ int __register_perf_hw_breakpoint(struct perf_event *bp)
if (ret)
return ret;

- if (!bp->attr.disabled)
+ /*
+ * Ptrace breakpoints can be temporary perf events only
+ * meant to reserve a slot. In this case, it is created disabled and
+ * we don't want to check the params right now (as we put a null addr)
+ * But perf tools create events as disabled and we want to check
+ * the params for them.
+ * This is a quick hack that will be removed soon, once we remove
+ * the tmp breakpoints from ptrace
+ */
+ if (!bp->attr.disabled || bp->callback == perf_bp_event)
ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task);

return ret;
--
1.6.2.3

2009-11-23 14:43:04

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 3/4] perf: Add kernel side syscall events support for breakpoints

Add the remaining necessary bits to support breakpoints created through
perf syscall.

We don't use the software counter interface as:

- We don't need to check against recursion, this is already done in
hardware breakpoints arch level.

- We already know the perf event we are dealing with when the event
is to be committed.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
---
kernel/perf_event.c | 34 +++++++++++++++++++++++++---------
1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0b0d5f7..5fbca43 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3826,6 +3826,20 @@ static int perf_swevent_is_counting(struct perf_event *event)
static int perf_tp_event_match(struct perf_event *event,
struct perf_sample_data *data);

+static int perf_exclude_event(struct perf_event *event,
+ struct pt_regs *regs)
+{
+ if (regs) {
+ if (event->attr.exclude_user && user_mode(regs))
+ return 1;
+
+ if (event->attr.exclude_kernel && !user_mode(regs))
+ return 1;
+ }
+
+ return 0;
+}
+
static int perf_swevent_match(struct perf_event *event,
enum perf_type_id type,
u32 event_id,
@@ -3837,16 +3851,12 @@ static int perf_swevent_match(struct perf_event *event,

if (event->attr.type != type)
return 0;
+
if (event->attr.config != event_id)
return 0;

- if (regs) {
- if (event->attr.exclude_user && user_mode(regs))
- return 0;
-
- if (event->attr.exclude_kernel && !user_mode(regs))
- return 0;
- }
+ if (perf_exclude_event(event, regs))
+ return 0;

if (event->attr.type == PERF_TYPE_TRACEPOINT &&
!perf_tp_event_match(event, data))
@@ -4277,9 +4287,15 @@ static const struct pmu *bp_perf_event_init(struct perf_event *bp)
return &perf_ops_bp;
}

-void perf_bp_event(struct perf_event *bp, void *regs)
+void perf_bp_event(struct perf_event *bp, void *data)
{
- /* TODO */
+ struct perf_sample_data sample;
+ struct pt_regs *regs = data;
+
+ sample.addr = bp->attr.bp_addr;
+
+ if (!perf_exclude_event(bp, regs))
+ perf_swevent_add(bp, 1, 1, &sample, regs);
}
#else
static void bp_perf_event_destroy(struct perf_event *event)
--
1.6.2.3

2009-11-23 14:42:43

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

Add the breakpoint events support with this new sysnopsis:

mem:addr[:access]

Where addr is a raw addr value in the kernel and access can be
either [r][w][x]

Example to profile tasklist_lock:

$ grep tasklist_lock /proc/kallsyms
ffffffff8189c000 D tasklist_lock

$ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
$ perf report

# Samples: 62
#
# Overhead Command Shared Object Symbol
# ........ ............... ............. ......
#
29.03% swapper [kernel] [k] _raw_read_trylock
29.03% swapper [kernel] [k] _raw_read_unlock
19.35% init [kernel] [k] _raw_read_trylock
19.35% init [kernel] [k] _raw_read_unlock
1.61% events/0 [kernel] [k] _raw_read_trylock
1.61% events/0 [kernel] [k] _raw_read_unlock

Coming soon:

- support for symbols in the event definition.
- default period to 1 for breakpoint events because these are not
high frequency events. The same thing is needed for trace events.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 16 ++++--
tools/perf/util/parse-events.c | 84 +++++++++++++++++++++++++++++-
2 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 0ff23de..fc46c0b 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -26,11 +26,19 @@ OPTIONS

-e::
--event=::
- Select the PMU event. Selection can be a symbolic event name
- (use 'perf list' to list all events) or a raw PMU
- event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ Select the PMU event. Selection can be:

+ - a symbolic event name (use 'perf list' to list all events)
+
+ - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
+ hexadecimal event descriptor.
+
+ - a hardware breakpoint event in the form of '\mem:addr[:access]'
+ where addr is the address in memory you want to break in.
+ Access is the memory access type (read, write, execute) it can
+ be passed as follows: '\mem:addr[:[r][w][x]]'.
+ If you want to profile read-write accesses in 0x1000, just set
+ 'mem:0x1000:rw'.
-a::
System-wide collection.

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 0faf4f2..0700274 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1,4 +1,4 @@
-
+#include "../../../include/linux/hw_breakpoint.h"
#include "util.h"
#include "../perf.h"
#include "parse-options.h"
@@ -540,6 +540,81 @@ static enum event_result parse_tracepoint_event(const char **strp,
attr, strp);
}

+static enum event_result
+parse_breakpoint_type(const char *type, const char **strp,
+ struct perf_event_attr *attr)
+{
+ int i;
+
+ for (i = 0; i < 3; i++) {
+ if (!type[i])
+ break;
+
+ switch (type[i]) {
+ case 'r':
+ attr->bp_type |= HW_BREAKPOINT_R;
+ break;
+ case 'w':
+ attr->bp_type |= HW_BREAKPOINT_W;
+ break;
+ case 'x':
+ attr->bp_type |= HW_BREAKPOINT_X;
+ break;
+ default:
+ return EVT_FAILED;
+ }
+ }
+ if (!attr->bp_type) /* Default */
+ attr->bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+
+ *strp = type + i;
+
+ return EVT_HANDLED;
+}
+
+static enum event_result
+parse_breakpoint_event(const char **strp, struct perf_event_attr *attr)
+{
+ const char *target;
+ const char *type;
+ char *endaddr;
+ u64 addr;
+ enum event_result err;
+
+ target = strchr(*strp, ':');
+ if (!target)
+ return EVT_FAILED;
+
+ if (strncmp(*strp, "mem", target - *strp) != 0)
+ return EVT_FAILED;
+
+ target++;
+
+ addr = strtoull(target, &endaddr, 0);
+ if (target == endaddr)
+ return EVT_FAILED;
+
+ attr->bp_addr = addr;
+ *strp = endaddr;
+
+ type = strchr(target, ':');
+
+ /* If no type is defined, just rw as default */
+ if (!type) {
+ attr->bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+ } else {
+ err = parse_breakpoint_type(++type, strp, attr);
+ if (err == EVT_FAILED)
+ return EVT_FAILED;
+ }
+
+ /* We should find a nice way to override the access type */
+ attr->bp_len = HW_BREAKPOINT_LEN_4;
+ attr->type = PERF_TYPE_BREAKPOINT;
+
+ return EVT_HANDLED;
+}
+
static int check_events(const char *str, unsigned int i)
{
int n;
@@ -673,6 +748,10 @@ parse_event_symbols(const char **str, struct perf_event_attr *attr)
if (ret != EVT_FAILED)
goto modifier;

+ ret = parse_breakpoint_event(str, attr);
+ if (ret != EVT_FAILED)
+ goto modifier;
+
fprintf(stderr, "invalid or unsupported event: '%s'\n", *str);
fprintf(stderr, "Run 'perf list' for a list of valid events\n");
return EVT_FAILED;
@@ -859,6 +938,9 @@ void print_events(void)
"rNNN");
printf("\n");

+ printf(" %-42s [hardware breakpoint]\n", "mem:<addr>[:access]");
+ printf("\n");
+
print_tracepoint_events();

exit(129);
--
1.6.2.3

2009-11-23 17:36:06

by K.Prasad

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

On Mon, Nov 23, 2009 at 03:42:35PM +0100, Frederic Weisbecker wrote:
> Add the breakpoint events support with this new sysnopsis:
>
> mem:addr[:access]
>
> Where addr is a raw addr value in the kernel and access can be
> either [r][w][x]
>
> Example to profile tasklist_lock:
>
> $ grep tasklist_lock /proc/kallsyms
> ffffffff8189c000 D tasklist_lock
>
> $ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
> $ perf report

The problem in obtaining just the breakpoint address is that there can
be a variety of breakpoint lengths that can be associated with them -
assigning the smallest possible length (1-Byte) may cause loss of
exceptions and a higher length would lead to stray exceptions (such
dilemmas led to the support of symbol-only breakpoint in ksym_tracer and
perf-tools in my patchset...the default 1-Byte breakpoint length being a
temporary fix).

With kernel symbols as input it would be possible to derive the bkpt
length based on the symbol-size, say using
kallsyms_lookup_size_offset() (although the corresponding length may not
be available on the host processor such requests can be failed or
over-ridden by the user using a smaller length), but for addresses I think
it is vital to know what breakpoint length is desired by the user.

This comes at the cost of exposing the user to variations in
breakpoint implementation across architectures and demand processor-specific
knowledge, but specifying a kernel-space address would anyway require the
user to penetrate beyond the normal abstraction provided by the
interface/tool...so I presume it must be acceptable.

On a related note, the supported breakpoint length for PPC64 is a fixed
8-Byte length (which means all extraneous exceptions must be filtered by
the breakpoint architecture) and Book-E Power processors matching
addresses against a bitmask; for S390 it can be practically anything
(bound by a set of start and end addresses)...and you see what a
quandary it leads to!

Thanks,
K.Prasad

2009-11-23 17:38:32

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools


* Frederic Weisbecker <[email protected]> wrote:

> Add the breakpoint events support with this new sysnopsis:
>
> mem:addr[:access]
>
> Where addr is a raw addr value in the kernel and access can be
> either [r][w][x]
>
> Example to profile tasklist_lock:
>
> $ grep tasklist_lock /proc/kallsyms
> ffffffff8189c000 D tasklist_lock
>
> $ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1

would be nice to also accept a plain cut & paste of an address in hexa
without the 0x prefix, i.e.:
$ perf record -e mem:ffffffff8189c000:rw -a -f -c 1

plus perf stat gets surprised by this new event and prints 'unknown':

$ grep tasklist_lock /proc/kallsyms
ffffffff81802000 D tasklist_lock
$ perf stat -a -e mem:0xffffffff81802000 sleep 1

Performance counter stats for 'sleep 1':

64 unknown

1.000833173 seconds time elapsed

Ingo

2009-11-23 17:43:56

by Frederic Weisbecker

[permalink] [raw]
Subject: [tip:perf/core] hw-breakpoints: Include only linux/perf_event.h from kernel part of bp headers

Commit-ID: e6db4876575f3fdd5b1df2cbff826df95ab9af6a
Gitweb: http://git.kernel.org/tip/e6db4876575f3fdd5b1df2cbff826df95ab9af6a
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Mon, 23 Nov 2009 15:42:32 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 Nov 2009 18:18:30 +0100

hw-breakpoints: Include only linux/perf_event.h from kernel part of bp headers

As userspace only needs the breakpoints enum types from the
breakpoints headers.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/hw_breakpoint.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index 4659e0c..76a48ab 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -1,8 +1,6 @@
#ifndef _LINUX_HW_BREAKPOINT_H
#define _LINUX_HW_BREAKPOINT_H

-#include <linux/perf_event.h>
-
enum {
HW_BREAKPOINT_LEN_1 = 1,
HW_BREAKPOINT_LEN_2 = 2,
@@ -19,6 +17,8 @@ enum {
#ifdef __KERNEL__
#ifdef CONFIG_HAVE_HW_BREAKPOINT

+#include <linux/perf_event.h>
+
static inline unsigned long hw_breakpoint_addr(struct perf_event *bp)
{
return bp->attr.bp_addr;

2009-11-23 17:44:30

by Frederic Weisbecker

[permalink] [raw]
Subject: [tip:perf/core] hw-breakpoints: Check the breakpoint params from perf tools

Commit-ID: fdf6bc95229821e3d9405eba28925b76e92b74d0
Gitweb: http://git.kernel.org/tip/fdf6bc95229821e3d9405eba28925b76e92b74d0
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Mon, 23 Nov 2009 15:42:33 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 Nov 2009 18:18:30 +0100

hw-breakpoints: Check the breakpoint params from perf tools

Perf tools create perf events as disabled in the beginning.
Breakpoints are then considered like ptrace temporary
breakpoints, only meant to reserve a breakpoint slot until we
get all the necessary informations from the user.

In this case, we don't check the address that is breakpointed as
it is NULL in the ptrace case.

But perf tools don't have the same purpose, events are created
disabled to wait for all events to be created before enabling
all of them. We want to check the breakpoint parameters in this
case.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/hw_breakpoint.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index c166622..06d372f 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -267,7 +267,16 @@ int __register_perf_hw_breakpoint(struct perf_event *bp)
if (ret)
return ret;

- if (!bp->attr.disabled)
+ /*
+ * Ptrace breakpoints can be temporary perf events only
+ * meant to reserve a slot. In this case, it is created disabled and
+ * we don't want to check the params right now (as we put a null addr)
+ * But perf tools create events as disabled and we want to check
+ * the params for them.
+ * This is a quick hack that will be removed soon, once we remove
+ * the tmp breakpoints from ptrace
+ */
+ if (!bp->attr.disabled || bp->callback == perf_bp_event)
ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task);

return ret;

2009-11-23 17:44:24

by Frederic Weisbecker

[permalink] [raw]
Subject: [tip:perf/core] perf: Add kernel side syscall events support for breakpoints

Commit-ID: f5ffe02e5046003ae7e2ce70d3d1c2a73331268b
Gitweb: http://git.kernel.org/tip/f5ffe02e5046003ae7e2ce70d3d1c2a73331268b
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Mon, 23 Nov 2009 15:42:34 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 Nov 2009 18:18:31 +0100

perf: Add kernel side syscall events support for breakpoints

Add the remaining necessary bits to support breakpoints created
through perf syscall.

We don't use the software counter interface as:

- We don't need to check against recursion, this is already done
in hardware breakpoints arch level.

- We already know the perf event we are dealing with when the
event is to be committed.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 34 +++++++++++++++++++++++++---------
1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0aafe85..9425c96 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3831,6 +3831,20 @@ static int perf_swevent_is_counting(struct perf_event *event)
static int perf_tp_event_match(struct perf_event *event,
struct perf_sample_data *data);

+static int perf_exclude_event(struct perf_event *event,
+ struct pt_regs *regs)
+{
+ if (regs) {
+ if (event->attr.exclude_user && user_mode(regs))
+ return 1;
+
+ if (event->attr.exclude_kernel && !user_mode(regs))
+ return 1;
+ }
+
+ return 0;
+}
+
static int perf_swevent_match(struct perf_event *event,
enum perf_type_id type,
u32 event_id,
@@ -3842,16 +3856,12 @@ static int perf_swevent_match(struct perf_event *event,

if (event->attr.type != type)
return 0;
+
if (event->attr.config != event_id)
return 0;

- if (regs) {
- if (event->attr.exclude_user && user_mode(regs))
- return 0;
-
- if (event->attr.exclude_kernel && !user_mode(regs))
- return 0;
- }
+ if (perf_exclude_event(event, regs))
+ return 0;

if (event->attr.type == PERF_TYPE_TRACEPOINT &&
!perf_tp_event_match(event, data))
@@ -4282,9 +4292,15 @@ static const struct pmu *bp_perf_event_init(struct perf_event *bp)
return &perf_ops_bp;
}

-void perf_bp_event(struct perf_event *bp, void *regs)
+void perf_bp_event(struct perf_event *bp, void *data)
{
- /* TODO */
+ struct perf_sample_data sample;
+ struct pt_regs *regs = data;
+
+ sample.addr = bp->attr.bp_addr;
+
+ if (!perf_exclude_event(bp, regs))
+ perf_swevent_add(bp, 1, 1, &sample, regs);
}
#else
static void bp_perf_event_destroy(struct perf_event *event)

2009-11-23 17:44:40

by Frederic Weisbecker

[permalink] [raw]
Subject: [tip:perf/core] perf tools: Add support for breakpoint events in perf tools

Commit-ID: 1b290d670ffa883b7e062177463a8efd00eaa2c1
Gitweb: http://git.kernel.org/tip/1b290d670ffa883b7e062177463a8efd00eaa2c1
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Mon, 23 Nov 2009 15:42:35 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 23 Nov 2009 18:18:31 +0100

perf tools: Add support for breakpoint events in perf tools

Add the breakpoint events support with this new sysnopsis:

mem:addr[:access]

Where addr is a raw addr value in the kernel and access can be
either [r][w][x]

Example to profile tasklist_lock:

$ grep tasklist_lock /proc/kallsyms
ffffffff8189c000 D tasklist_lock

$ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
$ perf report

# Samples: 62
#
# Overhead Command Shared Object Symbol
# ........ ............... ............. ......
#
29.03% swapper [kernel] [k] _raw_read_trylock
29.03% swapper [kernel] [k] _raw_read_unlock
19.35% init [kernel] [k] _raw_read_trylock
19.35% init [kernel] [k] _raw_read_unlock
1.61% events/0 [kernel] [k] _raw_read_trylock
1.61% events/0 [kernel] [k] _raw_read_unlock

Coming soon:

- Support for symbols in the event definition.

- Default period to 1 for breakpoint events because these are
not high frequency events. The same thing is needed for trace
events.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Prasad <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 16 ++++--
tools/perf/util/parse-events.c | 84 +++++++++++++++++++++++++++++-
2 files changed, 95 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 0ff23de..fc46c0b 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -26,11 +26,19 @@ OPTIONS

-e::
--event=::
- Select the PMU event. Selection can be a symbolic event name
- (use 'perf list' to list all events) or a raw PMU
- event (eventsel+umask) in the form of rNNN where NNN is a
- hexadecimal event descriptor.
+ Select the PMU event. Selection can be:

+ - a symbolic event name (use 'perf list' to list all events)
+
+ - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
+ hexadecimal event descriptor.
+
+ - a hardware breakpoint event in the form of '\mem:addr[:access]'
+ where addr is the address in memory you want to break in.
+ Access is the memory access type (read, write, execute) it can
+ be passed as follows: '\mem:addr[:[r][w][x]]'.
+ If you want to profile read-write accesses in 0x1000, just set
+ 'mem:0x1000:rw'.
-a::
System-wide collection.

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 0faf4f2..0700274 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1,4 +1,4 @@
-
+#include "../../../include/linux/hw_breakpoint.h"
#include "util.h"
#include "../perf.h"
#include "parse-options.h"
@@ -540,6 +540,81 @@ static enum event_result parse_tracepoint_event(const char **strp,
attr, strp);
}

+static enum event_result
+parse_breakpoint_type(const char *type, const char **strp,
+ struct perf_event_attr *attr)
+{
+ int i;
+
+ for (i = 0; i < 3; i++) {
+ if (!type[i])
+ break;
+
+ switch (type[i]) {
+ case 'r':
+ attr->bp_type |= HW_BREAKPOINT_R;
+ break;
+ case 'w':
+ attr->bp_type |= HW_BREAKPOINT_W;
+ break;
+ case 'x':
+ attr->bp_type |= HW_BREAKPOINT_X;
+ break;
+ default:
+ return EVT_FAILED;
+ }
+ }
+ if (!attr->bp_type) /* Default */
+ attr->bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+
+ *strp = type + i;
+
+ return EVT_HANDLED;
+}
+
+static enum event_result
+parse_breakpoint_event(const char **strp, struct perf_event_attr *attr)
+{
+ const char *target;
+ const char *type;
+ char *endaddr;
+ u64 addr;
+ enum event_result err;
+
+ target = strchr(*strp, ':');
+ if (!target)
+ return EVT_FAILED;
+
+ if (strncmp(*strp, "mem", target - *strp) != 0)
+ return EVT_FAILED;
+
+ target++;
+
+ addr = strtoull(target, &endaddr, 0);
+ if (target == endaddr)
+ return EVT_FAILED;
+
+ attr->bp_addr = addr;
+ *strp = endaddr;
+
+ type = strchr(target, ':');
+
+ /* If no type is defined, just rw as default */
+ if (!type) {
+ attr->bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+ } else {
+ err = parse_breakpoint_type(++type, strp, attr);
+ if (err == EVT_FAILED)
+ return EVT_FAILED;
+ }
+
+ /* We should find a nice way to override the access type */
+ attr->bp_len = HW_BREAKPOINT_LEN_4;
+ attr->type = PERF_TYPE_BREAKPOINT;
+
+ return EVT_HANDLED;
+}
+
static int check_events(const char *str, unsigned int i)
{
int n;
@@ -673,6 +748,10 @@ parse_event_symbols(const char **str, struct perf_event_attr *attr)
if (ret != EVT_FAILED)
goto modifier;

+ ret = parse_breakpoint_event(str, attr);
+ if (ret != EVT_FAILED)
+ goto modifier;
+
fprintf(stderr, "invalid or unsupported event: '%s'\n", *str);
fprintf(stderr, "Run 'perf list' for a list of valid events\n");
return EVT_FAILED;
@@ -859,6 +938,9 @@ void print_events(void)
"rNNN");
printf("\n");

+ printf(" %-42s [hardware breakpoint]\n", "mem:<addr>[:access]");
+ printf("\n");
+
print_tracepoint_events();

exit(129);

2009-11-23 20:25:26

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

On Mon, Nov 23, 2009 at 11:06:01PM +0530, K.Prasad wrote:
> On Mon, Nov 23, 2009 at 03:42:35PM +0100, Frederic Weisbecker wrote:
> > Add the breakpoint events support with this new sysnopsis:
> >
> > mem:addr[:access]
> >
> > Where addr is a raw addr value in the kernel and access can be
> > either [r][w][x]
> >
> > Example to profile tasklist_lock:
> >
> > $ grep tasklist_lock /proc/kallsyms
> > ffffffff8189c000 D tasklist_lock
> >
> > $ perf record -e mem:0xffffffff8189c000:rw -a -f -c 1
> > $ perf report
>
> The problem in obtaining just the breakpoint address is that there can
> be a variety of breakpoint lengths that can be associated with them -
> assigning the smallest possible length (1-Byte) may cause loss of
> exceptions and a higher length would lead to stray exceptions (such
> dilemmas led to the support of symbol-only breakpoint in ksym_tracer and
> perf-tools in my patchset...the default 1-Byte breakpoint length being a
> temporary fix).


Right.


> With kernel symbols as input it would be possible to derive the bkpt
> length based on the symbol-size, say using
> kallsyms_lookup_size_offset() (although the corresponding length may not
> be available on the host processor such requests can be failed or
> over-ridden by the user using a smaller length), but for addresses I think
> it is vital to know what breakpoint length is desired by the user.



Yeah. I guess we first need a way to manually add this length, may
be:

mem:addr/len:access

And as you said, finding it automatically for symbols. But still, passing
symbols to perf attr leads to confusion and complexity if we want to profile
in userspace.

I think we should find this symbol length from userspace.
I'm not sure how yet, probably using Dwarf. Arnaldo, do you have an
idea about that?

> This comes at the cost of exposing the user to variations in
> breakpoint implementation across architectures and demand processor-specific
> knowledge, but specifying a kernel-space address would anyway require the
> user to penetrate beyond the normal abstraction provided by the
> interface/tool...so I presume it must be acceptable.



Yeah. We'll probably need to write a quick sum-up about
such variations to facilitate perf uses.


> On a related note, the supported breakpoint length for PPC64 is a fixed
> 8-Byte length (which means all extraneous exceptions must be filtered by
> the breakpoint architecture) and Book-E Power processors matching
> addresses against a bitmask; for S390 it can be practically anything
> (bound by a set of start and end addresses)...and you see what a
> quandary it leads to!
>
> Thanks,
> K.Prasad


Yep :)

Thanks.

2009-11-23 21:10:32

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

Em Mon, Nov 23, 2009 at 09:25:28PM +0100, Frederic Weisbecker escreveu:
> On Mon, Nov 23, 2009 at 11:06:01PM +0530, K.Prasad wrote:
> > With kernel symbols as input it would be possible to derive the bkpt
> > length based on the symbol-size, say using kallsyms_lookup_size_offset()
> > (although the corresponding length may not be available on the host
> > processor such requests can be failed or over-ridden by the user using
> > a smaller length), but for addresses I think it is vital to know what
> > breakpoint length is desired by the user.

> Yeah. I guess we first need a way to manually add this length, may be:

> mem:addr/len:access

> And as you said, finding it automatically for symbols. But still,
> passing symbols to perf attr leads to confusion and complexity if we
> want to profile in userspace.

> I think we should find this symbol length from userspace. I'm not sure
> how yet, probably using Dwarf. Arnaldo, do you have an idea about that?

DWARF has the type for each variable or struct member, getting its size
is straightforward.

Using just /proc/kallsyms all we can do is find the size of a variable
by looking at its address and the address of the next one.

- Arnaldo

2009-11-23 21:19:47

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

On Mon, Nov 23, 2009 at 07:09:39PM -0200, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 23, 2009 at 09:25:28PM +0100, Frederic Weisbecker escreveu:
> > On Mon, Nov 23, 2009 at 11:06:01PM +0530, K.Prasad wrote:
> > > With kernel symbols as input it would be possible to derive the bkpt
> > > length based on the symbol-size, say using kallsyms_lookup_size_offset()
> > > (although the corresponding length may not be available on the host
> > > processor such requests can be failed or over-ridden by the user using
> > > a smaller length), but for addresses I think it is vital to know what
> > > breakpoint length is desired by the user.
>
> > Yeah. I guess we first need a way to manually add this length, may be:
>
> > mem:addr/len:access
>
> > And as you said, finding it automatically for symbols. But still,
> > passing symbols to perf attr leads to confusion and complexity if we
> > want to profile in userspace.
>
> > I think we should find this symbol length from userspace. I'm not sure
> > how yet, probably using Dwarf. Arnaldo, do you have an idea about that?
>
> DWARF has the type for each variable or struct member, getting its size
> is straightforward.


Ok.


> Using just /proc/kallsyms all we can do is find the size of a variable
> by looking at its address and the address of the next one.
>
> - Arnaldo


Hmm, but I worry a bit about alignment which would return us
the wrong size.

May be can we first try to get the address from /proc/kallsyms,
and if we have dwarf, get the size from it, otherwise try some
magic with /proc/kallsysms...

2009-11-23 21:25:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools


* Frederic Weisbecker <[email protected]> wrote:

> > Using just /proc/kallsyms all we can do is find the size of a
> > variable by looking at its address and the address of the next one.
> >
> > - Arnaldo
>
> Hmm, but I worry a bit about alignment which would return us the wrong
> size.
>
> May be can we first try to get the address from /proc/kallsyms, and if
> we have dwarf, get the size from it, otherwise try some magic with
> /proc/kallsysms...

Can we extend /proc/kallsyms (or add /proc/kallsyms) to include a size
field?

Perhaps can we generate some sort of DSO-alike thing in /proc/vmlinux
(via a default-off debug option in .config), that perf could just
interpret the usual ELF way - which happens to be the symbol table of
the kernel? It would use up some RAM, but it would also be quite useful
for debugging purposes.

Ingo

2009-11-23 21:25:33

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: Add support for breakpoint events in perf tools

Em Mon, Nov 23, 2009 at 10:19:47PM +0100, Frederic Weisbecker escreveu:
> On Mon, Nov 23, 2009 at 07:09:39PM -0200, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Nov 23, 2009 at 09:25:28PM +0100, Frederic Weisbecker escreveu:
> > > I think we should find this symbol length from userspace. I'm not sure
> > > how yet, probably using Dwarf. Arnaldo, do you have an idea about that?
> >
> > DWARF has the type for each variable or struct member, getting its size
> > is straightforward.
>
> Ok.
>
> > Using just /proc/kallsyms all we can do is find the size of a variable
> > by looking at its address and the address of the next one.
>
> Hmm, but I worry a bit about alignment which would return us
> the wrong size.
>
> May be can we first try to get the address from /proc/kallsyms,
> and if we have dwarf, get the size from it, otherwise try some
> magic with /proc/kallsysms...

Yeah, I'll get the variables-from-kallsyms bit just after I get a new
mechanism for expressing configuration for the perf symbol resolving
machinery.

Should have this done ASAP.

- Arnaldo