Hi guys,
Does perf support per-task hw_breakpoint events in user space?
perf already supports kernel-space hw_breakpoints, but there is no
example of a user-space hw_breakpoint in the code base (and it is
never mentioned).
From the perf API point of view, it should be easy to support per-task
hw_breakpoints, but I'd still like to confirm that. (Unfortunately I
don't have a Linux machine to test it on right now. :))
Thanks.
.jovi
Hi, Jovi
On Mon, 25 Jun 2012 13:22:01 +0800, Jovi Zhang wrote:
> Hi guys,
>
> Does perf support per-task hw_breakpoint events in user space?
>
> perf already supports kernel-space hw_breakpoints, but there is no
> example of a user-space hw_breakpoint in the code base (and it is
> never mentioned).
> From the perf API point of view, it should be easy to support per-task
> hw_breakpoints, but I'd still like to confirm that. (Unfortunately I
> don't have a Linux machine to test it on right now. :))
>
Here is my simple test:
namhyung@sejong:perf$ nm -nD /usr/bin/ls | grep D
0000000000619ce0 D quoting_style_args
000000000061a530 D ls_mode
000000000061a538 D Version
000000000061a540 D argmatch_die
000000000061a548 D exit_failure
namhyung@sejong:perf$ ./perf stat -e mem:0x61a530 -e mem:0x61a538 -- /usr/bin/ls > /dev/null
Performance counter stats for '/usr/bin/ls':
1 mem:0x61a530:rw
0 mem:0x61a538:rw
0.002213595 seconds time elapsed
So, it should work on user-space hw_breakpoints.
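For anyone who wants to do the same thing without the perf tool, here is a minimal sketch of a per-task user-space breakpoint event opened directly through perf_event_open(2). The helper names bp_attr_init() and bp_open_task() are made up for illustration, and the 4-byte read/write width is an assumption, not something confirmed in this thread.

```c
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe a read/write breakpoint on a user address. */
static void bp_attr_init(struct perf_event_attr *attr, uint64_t addr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_BREAKPOINT;
	attr->bp_type = HW_BREAKPOINT_RW;    /* count reads and writes */
	attr->bp_addr = addr;
	attr->bp_len = HW_BREAKPOINT_LEN_4;  /* assumed access width */
	attr->exclude_kernel = 1;            /* user-space accesses only */
	attr->exclude_hv = 1;
}

/*
 * Hypothetical helper: open one event bound to a task.
 * pid selects the task (0 means the caller); cpu == -1 lets the event
 * follow the task on whatever CPU it runs, so only one event is needed.
 * Returns the event fd, or -1 with errno set.
 */
static int bp_open_task(pid_t pid, uint64_t addr)
{
	struct perf_event_attr attr;

	bp_attr_init(&attr, addr);
	return (int)syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

A caller would read() a 64-bit count from the returned fd after the task has run, and close() the fd when done.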
BTW, when I ran perf record on a hwbp, it failed with ENOSPC.
I guess it's because perf record tries to create one event per task
per CPU, so it asks for more breakpoints than the h/w supports.
strace showed that the fifth call to perf_event_open
failed on my 6-core machine.
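The arithmetic behind that guess can be sketched as below. This is a simplified model and not kernel code: it assumes every perf_event_open() call for a breakpoint on the same task consumes one slot, and that there are four slots, as with the four x86 debug address registers. The function name first_failing_open() is made up for illustration.

```c
#include <stddef.h>

/*
 * Simplified model (assumption): each open of a breakpoint event takes
 * one of nr_slots debug-register slots, and perf opens
 * nr_events * nr_cpus events in total.
 * Returns the 1-based index of the first open that no longer fits
 * (the one that would see ENOSPC), or 0 if every open fits.
 */
static size_t first_failing_open(size_t nr_events, size_t nr_cpus,
				 size_t nr_slots)
{
	size_t total = nr_events * nr_cpus;

	return total > nr_slots ? nr_slots + 1 : 0;
}
```

Under this model, first_failing_open(2, 6, 4) is 5, which matches the strace observation, while a single dummy CPU entry (nr_cpus == 1) needs only two opens and fits.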
Thanks,
Namhyung
Hi
On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
> Hi, Jovi
>
> On Mon, 25 Jun 2012 13:22:01 +0800, Jovi Zhang wrote:
>> Hi guys,
>>
>> Does perf support per-task hw_breakpoint events in user space?
>>
>> perf already supports kernel-space hw_breakpoints, but there is no
>> example of a user-space hw_breakpoint in the code base (and it is
>> never mentioned).
>> From the perf API point of view, it should be easy to support per-task
>> hw_breakpoints, but I'd still like to confirm that. (Unfortunately I
>> don't have a Linux machine to test it on right now. :))
>>
>
> Here is my simple test:
>
> namhyung@sejong:perf$ nm -nD /usr/bin/ls | grep D
> 0000000000619ce0 D quoting_style_args
> 000000000061a530 D ls_mode
> 000000000061a538 D Version
> 000000000061a540 D argmatch_die
> 000000000061a548 D exit_failure
>
> namhyung@sejong:perf$ ./perf stat -e mem:0x61a530 -e mem:0x61a538 -- /usr/bin/ls > /dev/null
>
> Performance counter stats for '/usr/bin/ls':
>
> 1 mem:0x61a530:rw
> 0 mem:0x61a538:rw
>
> 0.002213595 seconds time elapsed
>
>
> So, it should work on user-space hw_breakpoints.
Thanks very much, it works.
>
> BTW, when I ran perf record on a hwbp, it failed with ENOSPC.
> I guess it's because perf record tries to create one event per task
> per CPU, so it asks for more breakpoints than the h/w supports.
> strace showed that the fifth call to perf_event_open
> failed on my 6-core machine.
>
> Thanks,
> Namhyung
I get the same result as you on my Linux box.
This looks like a bug caused by commit d1cb9f (perf target: Add uses_mmap field).
Namhyung, how about the patch below?
From 4b77b99df9ca3b99be4ccf8c4256e622aae9203f Mon Sep 17 00:00:00 2001
From: Jovi Zhang <[email protected]>
Date: Thu, 28 Jun 2012 07:49:41 +0800
Subject: [PATCH] perf: revert commit d1cb9f(perf target: Add uses_mmap field)
On my 4-core x86 Linux machine, recording hw_breakpoint events gives the
following output.
Before the uses_mmap field was added:
[root@jovi perf]# ./perf record -g -e mem:0x080652c8 -e mem:0x1098 -- /usr/bin/ls >/dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB perf.data (~640 samples) ]
After the uses_mmap field was added:
[root@jovi perf]# ./perf record -e mem:0x080652c8 -e mem:0x1098 -- /usr/bin/ls >/dev/null
Error: sys_perf_event_open() syscall returned with 28 (No space left on device). /bin/dmesg may provide additional information.
Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?
The uses_mmap field in the target structure makes perf record create
per-task-per-cpu events for each evsel. This breaks hw_breakpoint events,
because a CPU has only a limited number of debug registers. In the example
above we should create a dummy cpumap for the hw_breakpoint events instead
of per-task-per-cpu events. Fix it by reverting the commit.
Noticed-by: Namhyung Kim <[email protected]>
Signed-off-by: Jovi Zhang <[email protected]>
---
tools/perf/builtin-record.c | 3 ---
tools/perf/builtin-test.c | 1 -
tools/perf/builtin-top.c | 3 ---
tools/perf/util/evlist.c | 4 +---
tools/perf/util/target.h | 1 -
5 files changed, 1 insertion(+), 11 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f95840d..8128213 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -754,9 +754,6 @@ static struct perf_record record = {
.user_freq = UINT_MAX,
.user_interval = ULLONG_MAX,
.freq = 4000,
- .target = {
- .uses_mmap = true,
- },
},
.write_mode = WRITE_FORCE,
.file_new = true,
diff --git a/tools/perf/builtin-test.c b/tools/perf/builtin-test.c
index 5a8727c..338a0cc 100644
--- a/tools/perf/builtin-test.c
+++ b/tools/perf/builtin-test.c
@@ -647,7 +647,6 @@ static int test__PERF_RECORD(void)
struct perf_record_opts opts = {
.target = {
.uid = UINT_MAX,
- .uses_mmap = true,
},
.no_delay = true,
.freq = 10,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6bb0277..cc78e06 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1165,9 +1165,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __used)
.freq = 4000, /* 4 KHz */
.mmap_pages = 128,
.sym_pcnt_filter = 5,
- .target = {
- .uses_mmap = true,
- },
};
char callchain_default_opt[] = "fractal,0.5,callee";
const struct option options[] = {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7400fb3..e791029 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -622,9 +622,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist,
if (evlist->threads == NULL)
return -1;
- if (perf_target__has_task(target))
- evlist->cpus = cpu_map__dummy_new();
- else if (!perf_target__has_cpu(target) && !target->uses_mmap)
+ if (perf_target__has_task(target) || !perf_target__has_cpu(target))
evlist->cpus = cpu_map__dummy_new();
else
evlist->cpus = cpu_map__new(target->cpu_list);
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index a4be857..c43f632 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -11,7 +11,6 @@ struct perf_target {
const char *uid_str;
uid_t uid;
bool system_wide;
- bool uses_mmap;
};
enum perf_target_errno {
--
1.7.9.7
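To make the effect of the evlist.c hunk easier to see, here is a small standalone sketch of the cpu map decision with and without uses_mmap. The struct and function names are stand-ins invented for illustration; the has_task and has_cpu flags represent the results of perf_target__has_task() and perf_target__has_cpu().

```c
#include <stdbool.h>

enum cpu_map_kind { CPU_MAP_DUMMY, CPU_MAP_FULL };

/* Stand-in for the parts of struct perf_target used by the decision. */
struct target_sketch {
	bool has_task;   /* result of perf_target__has_task() */
	bool has_cpu;    /* result of perf_target__has_cpu() */
	bool uses_mmap;
};

/* Decision as in the lines removed by the patch. */
static enum cpu_map_kind choose_with_uses_mmap(const struct target_sketch *t)
{
	if (t->has_task)
		return CPU_MAP_DUMMY;
	if (!t->has_cpu && !t->uses_mmap)
		return CPU_MAP_DUMMY;
	return CPU_MAP_FULL;   /* cpu_map__new(target->cpu_list) */
}

/* Decision as in the line added by the patch. */
static enum cpu_map_kind choose_reverted(const struct target_sketch *t)
{
	if (t->has_task || !t->has_cpu)
		return CPU_MAP_DUMMY;
	return CPU_MAP_FULL;
}
```

For "perf record -- /usr/bin/ls" (no task target, no cpu target, uses_mmap set) the first function picks the full cpu map and the second picks the dummy map, which is the only case where the two differ.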
On Thu, 28 Jun 2012 09:02:02 +0800, Jovi Zhang wrote:
> On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
>> BTW, when I ran perf record on a hwbp, it failed with ENOSPC.
>> I guess it's because perf record tries to create one event per task
>> per CPU, so it asks for more breakpoints than the h/w supports.
>> strace showed that the fifth call to perf_event_open
>> failed on my 6-core machine.
>>
> I get the same result as you on my Linux box.
> This looks like a bug caused by commit d1cb9f (perf target: Add uses_mmap field).
>
> Namhyung, how about the patch below?
>
NAK. The uses_mmap field is needed to set up per-task-per-cpu events for
perf record (mostly). Without it, perf suffered from severe scalability
issues. Maybe we could skip creating per-task-per-cpu events for hwbp
events only, but I'm not sure that's the right thing to do.
Thanks,
Namhyung
On 6/27/12 8:20 PM, Namhyung Kim wrote:
> NAK. The uses_mmap field is needed to set up per-task-per-cpu events for
> perf record (mostly). Without it, perf suffered from severe scalability
> issues. Maybe we could skip creating per-task-per-cpu events for hwbp
> events only, but I'm not sure that's the right thing to do.
>
> Thanks,
> Namhyung
Add -a to the record command. And if you want to dump the samples try
the attached.
David
Hi,
On Thu, Jun 28, 2012 at 10:20 AM, Namhyung Kim <[email protected]> wrote:
> On Thu, 28 Jun 2012 09:02:02 +0800, Jovi Zhang wrote:
>> On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
>>> BTW, when I ran perf record on a hwbp, it failed with ENOSPC.
>>> I guess it's because perf record tries to create one event per task
>>> per CPU, so it asks for more breakpoints than the h/w supports.
>>> strace showed that the fifth call to perf_event_open
>>> failed on my 6-core machine.
>>>
>> I get the same result as you on my Linux box.
>> This looks like a bug caused by commit d1cb9f (perf target: Add uses_mmap field).
>>
>> Namhyung, how about the patch below?
>>
>
> NAK. The uses_mmap field is needed to set up per-task-per-cpu events for
> perf record (mostly). Without it, perf suffered from severe scalability
> issues. Maybe we could skip creating per-task-per-cpu events for hwbp
> events only, but I'm not sure that's the right thing to do.
>
> Thanks,
> Namhyung
>
The perf tool creates the cpumap and threadmap per evlist, not per evsel.
This means we cannot easily create a dummy cpumap just for the hwbp
events, even when the evlist includes an evsel of hwbp type.
Any ideas?
.jovi
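To illustrate the constraint, here is a rough sketch of what an evlist-wide check could look like. It is only an illustration under assumptions: evlist_has_breakpoint() is a made-up name and not an existing perf helper, and the events are represented as a plain array of struct perf_event_attr instead of perf's evsel list.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether any event in the list is a
 * hardware breakpoint. Because the cpu map is shared by the whole
 * evlist, a "yes" here would force the dummy cpu map on every other
 * event in the list as well.
 */
static bool evlist_has_breakpoint(const struct perf_event_attr *attrs,
				  size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (attrs[i].type == PERF_TYPE_BREAKPOINT)
			return true;
	}
	return false;
}
```

The drawback is the one described above: mixing a hwbp event with, say, a cycles event in one evlist would drop the per-cpu setup for the cycles event too, unless the maps were made per evsel.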