2012-06-25 05:22:05

by Jovi Zhang

Subject: perf support user-space hw_breakpoint?

Hi guys,

Does perf support user-space hw_breakpoint on a per-task basis?

perf already supports kernel-space hw_breakpoint, but there is no
example of user-space hw_breakpoint in the code base (and it is never
mentioned).
From the perf API point of view, it should support per-task hw_breakpoint easily,
but I still want to make sure. (Unfortunately, I don't have a Linux
machine to test it on right now. :))

Thanks.

.jovi


2012-06-25 08:30:39

by Namhyung Kim

Subject: Re: perf support user-space hw_breakpoint?

Hi, Jovi

On Mon, 25 Jun 2012 13:22:01 +0800, Jovi Zhang wrote:
> Hi guys,
>
> Does perf support user-space hw_breakpoint on a per-task basis?
>
> perf already supports kernel-space hw_breakpoint, but there is no
> example of user-space hw_breakpoint in the code base (and it is never
> mentioned).
> From the perf API point of view, it should support per-task hw_breakpoint easily,
> but I still want to make sure. (Unfortunately, I don't have a Linux
> machine to test it on right now. :))
>

Here is my simple test:

namhyung@sejong:perf$ nm -nD /usr/bin/ls | grep D
0000000000619ce0 D quoting_style_args
000000000061a530 D ls_mode
000000000061a538 D Version
000000000061a540 D argmatch_die
000000000061a548 D exit_failure

namhyung@sejong:perf$ ./perf stat -e mem:0x61a530 -e mem:0x61a538 -- /usr/bin/ls > /dev/null

 Performance counter stats for '/usr/bin/ls':

                 1 mem:0x61a530:rw
                 0 mem:0x61a538:rw

       0.002213595 seconds time elapsed


So, it should work on user-space hw_breakpoints.
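
For reference, the same kind of per-task breakpoint can be requested directly through perf_event_open(2). The following is a minimal sketch, not what the perf tool itself does: the helper names init_user_bp_attr() and open_task_breakpoint() are made up for illustration, and the 4-byte watch length is an assumption. Passing cpu = -1 binds the event to the task on whichever CPU it runs, so only one event is created per address.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: describe a user-space read/write data breakpoint. */
void init_user_bp_attr(struct perf_event_attr *attr, uint64_t addr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_BREAKPOINT;
	attr->size = sizeof(*attr);
	attr->bp_type = HW_BREAKPOINT_RW;	/* count reads and writes */
	attr->bp_addr = addr;			/* e.g. an address taken from nm */
	attr->bp_len = HW_BREAKPOINT_LEN_4;	/* assumed width: 4 bytes */
	attr->exclude_kernel = 1;		/* user-space accesses only */
	attr->exclude_hv = 1;
}

/*
 * Hypothetical helper: open a counting breakpoint event bound to one task.
 * cpu = -1 means "any CPU the task runs on"; group_fd = -1 means no group.
 * Returns a file descriptor, or -1 with errno set (for example ENOSPC when
 * no debug register is free).
 */
int open_task_breakpoint(pid_t pid, uint64_t addr)
{
	struct perf_event_attr attr;

	init_user_bp_attr(&attr, addr);
	return (int)syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

The count can then be read from the returned descriptor with read(2) as a 64-bit value, provided the kernel and the perf_event_paranoid setting allow the open.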

BTW, when I ran perf record on a hwbp, it failed with ENOSPC.
I guess it's because perf record tries to create an event for each
task on each cpu, so it asks for more breakpoints than the h/w supports.
strace told me that the fifth call to perf_event_open
failed on my 6-core machine.
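
The arithmetic behind that guess can be sketched as below. This is a deliberately simplified model, not the kernel's real accounting: it assumes every perf_event_open() call for a breakpoint reserves one slot, and that there are four slots, matching the four debug address registers (DR0 to DR3) on x86. The function name first_failing_open() is made up for illustration.

```c
#include <assert.h>

/* Assumption: four breakpoint slots, as with DR0-DR3 on x86. */
#define X86_BP_SLOTS 4

/*
 * Simplified model: one perf_event_open() call is made per
 * (event, thread, cpu) combination, and each call takes one slot.
 * Returns the 1-based index of the first call that no longer fits,
 * or 0 if every call fits.
 */
int first_failing_open(int nr_events, int nr_threads, int nr_cpus, int nr_slots)
{
	int total;

	assert(nr_events >= 0 && nr_threads >= 0 && nr_cpus >= 0 && nr_slots >= 0);
	total = nr_events * nr_threads * nr_cpus;
	return total > nr_slots ? nr_slots + 1 : 0;
}
```

Under this model, two breakpoints on one task across six cpus need 12 slots, so first_failing_open(2, 1, 6, X86_BP_SLOTS) gives 5, which matches the strace observation. With a single dummy cpu entry only two slots are needed and nothing fails.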

Thanks,
Namhyung

2012-06-28 01:02:06

by Jovi Zhang

Subject: Re: perf support user-space hw_breakpoint?

Hi

On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
> Hi, Jovi
>
> On Mon, 25 Jun 2012 13:22:01 +0800, Jovi Zhang wrote:
>> Hi guys,
>>
>> Does perf support user-space hw_breakpoint on a per-task basis?
>>
>> perf already supports kernel-space hw_breakpoint, but there is no
>> example of user-space hw_breakpoint in the code base (and it is never
>> mentioned).
>> From the perf API point of view, it should support per-task hw_breakpoint easily,
>> but I still want to make sure. (Unfortunately, I don't have a Linux
>> machine to test it on right now. :))
>>
>
> Here is my simple test:
>
> namhyung@sejong:perf$ nm -nD /usr/bin/ls | grep D
> 0000000000619ce0 D quoting_style_args
> 000000000061a530 D ls_mode
> 000000000061a538 D Version
> 000000000061a540 D argmatch_die
> 000000000061a548 D exit_failure
>
> namhyung@sejong:perf$ ./perf stat -e mem:0x61a530 -e mem:0x61a538 -- /usr/bin/ls > /dev/null
>
>  Performance counter stats for '/usr/bin/ls':
>
>                 1 mem:0x61a530:rw
>                 0 mem:0x61a538:rw
>
>       0.002213595 seconds time elapsed
>
>
> So, it should work on user-space hw_breakpoints.

Thanks very much, it works.

>
> BTW, when I perf record on a hwbp, it failed with ENOSPC.
> I guess it's because each per-task-per-cpu event tried to
> create an event so it'd get more than supported by h/w.
> The strace told me that the fifth call to perf_event_open
> failed on my 6-core machine.
>
> Thanks,
> Namhyung

I get the same result as you on my Linux box.
This should be a bug caused by commit d1cb9f (perf target: Add uses_mmap field).

Namhyung, how about the patch below?


From 4b77b99df9ca3b99be4ccf8c4256e622aae9203f Mon Sep 17 00:00:00 2001
From: Jovi Zhang <[email protected]>
Date: Thu, 28 Jun 2012 07:49:41 +0800
Subject: [PATCH] perf: revert commit d1cb9f(perf target: Add uses_mmap field)

On my 4-core x86 Linux machine, using hw_breakpoint gives the following output:

Before adding the uses_mmap field:
 [root@jovi perf]# ./perf record -g -e mem:0x080652c8 -e mem:0x1098 -- /usr/bin/ls >/dev/null
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.015 MB perf.data (~640 samples) ]

After adding the uses_mmap field:
 [root@jovi perf]# ./perf record -e mem:0x080652c8 -e mem:0x1098 -- /usr/bin/ls >/dev/null
   Error: sys_perf_event_open() syscall returned with 28 (No space left on device).  /bin/dmesg may provide additional information.

   Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?

Adding the uses_mmap field to the target structure causes perf record to
create per-task-per-cpu events for each evsel. This breaks hw_breakpoint,
because the cpu has only a limited number of debug registers. In the
example above, we should create a dummy cpumap for the hw_breakpoint
events instead of per-task-per-cpu events. Fix it.

Noticed-by: Namhyung Kim <[email protected]>
Signed-off-by: Jovi Zhang <[email protected]>
---
tools/perf/builtin-record.c |    3 ---
tools/perf/builtin-test.c   |    1 -
tools/perf/builtin-top.c    |    3 ---
tools/perf/util/evlist.c    |    4 +---
tools/perf/util/target.h    |    1 -
5 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f95840d..8128213 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -754,9 +754,6 @@ static struct perf_record record = {
               .user_freq           = UINT_MAX,
               .user_interval       = ULLONG_MAX,
               .freq                = 4000,
-               .target              = {
-                       .uses_mmap   = true,
-               },
       },
       .write_mode = WRITE_FORCE,
       .file_new   = true,
diff --git a/tools/perf/builtin-test.c b/tools/perf/builtin-test.c
index 5a8727c..338a0cc 100644
--- a/tools/perf/builtin-test.c
+++ b/tools/perf/builtin-test.c
@@ -647,7 +647,6 @@ static int test__PERF_RECORD(void)
       struct perf_record_opts opts = {
               .target = {
                       .uid = UINT_MAX,
-                       .uses_mmap = true,
               },
               .no_delay   = true,
               .freq       = 10,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6bb0277..cc78e06 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1165,9 +1165,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __used)
               .freq                = 4000, /* 4 KHz */
               .mmap_pages          = 128,
               .sym_pcnt_filter     = 5,
-               .target              = {
-                       .uses_mmap   = true,
-               },
       };
       char callchain_default_opt[] = "fractal,0.5,callee";
       const struct option options[] = {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7400fb3..e791029 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -622,9 +622,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist,
       if (evlist->threads == NULL)
               return -1;

-       if (perf_target__has_task(target))
-               evlist->cpus = cpu_map__dummy_new();
-       else if (!perf_target__has_cpu(target) && !target->uses_mmap)
+       if (perf_target__has_task(target) || !perf_target__has_cpu(target))
               evlist->cpus = cpu_map__dummy_new();
       else
               evlist->cpus = cpu_map__new(target->cpu_list);
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index a4be857..c43f632 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -11,7 +11,6 @@ struct perf_target {
       const char   *uid_str;
       uid_t        uid;
       bool         system_wide;
-       bool         uses_mmap;
};

enum perf_target_errno {
--
1.7.9.7


Attachments:
0001-perf-revert-commit-d1cb9f-perf-target-Add-uses_mmap-.patch (3.60 kB)
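
The effect of the evlist.c hunk can be sketched in isolation. The two functions below mirror the conditions on the removed and added lines of the patch. The struct and enum are simplified stand-ins invented for illustration, not perf's real types, and the flag values used for each command line are assumptions about how perf fills in its target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields the decision depends on. */
struct target_model {
	bool has_task;	/* a pid, tid or uid was given (assumed meaning) */
	bool has_cpu;	/* -a or a cpu list was given (assumed meaning) */
	bool uses_mmap;	/* set by perf record and perf top before the patch */
};

enum cpu_map_kind { CPU_MAP_DUMMY, CPU_MAP_FROM_LIST };

/* Decision as in the lines the patch removes. */
enum cpu_map_kind choose_with_uses_mmap(const struct target_model *t)
{
	assert(t != NULL);
	if (t->has_task)
		return CPU_MAP_DUMMY;
	else if (!t->has_cpu && !t->uses_mmap)
		return CPU_MAP_DUMMY;
	return CPU_MAP_FROM_LIST;
}

/* Decision as in the line the patch adds. */
enum cpu_map_kind choose_reverted(const struct target_model *t)
{
	assert(t != NULL);
	if (t->has_task || !t->has_cpu)
		return CPU_MAP_DUMMY;
	return CPU_MAP_FROM_LIST;
}
```

For a forked workload such as `perf record -- /usr/bin/ls`, assuming no task or cpu option is set, the first function picks a real cpu map (one event per cpu) and the second picks the dummy map (one event per task). The two agree whenever uses_mmap is false, which is consistent with perf stat working in the earlier test.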

2012-06-28 02:24:04

by Namhyung Kim

Subject: Re: perf support user-space hw_breakpoint?

On Thu, 28 Jun 2012 09:02:02 +0800, Jovi Zhang wrote:
> On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
>> BTW, when I perf record on a hwbp, it failed with ENOSPC.
>> I guess it's because each per-task-per-cpu event tried to
>> create an event so it'd get more than supported by h/w.
>> The strace told me that the fifth call to perf_event_open
>> failed on my 6-core machine.
>>
> I get the same result as you on my Linux box.
> This should be a bug caused by commit d1cb9f (perf target: Add uses_mmap field).
>
> Namhyung, how about the patch below?
>

NAK. This uses_mmap field is needed to set up per-task-per-cpu events for
perf record (mostly). Without it, perf suffered from severe scalability
issues. Maybe we can change it not to create per-task-per-cpu events for
hwbp events only, but I'm not sure it's the right thing.

Thanks,
Namhyung



2012-06-28 04:10:55

by David Ahern

Subject: Re: perf support user-space hw_breakpoint?

On 6/27/12 8:20 PM, Namhyung Kim wrote:
> NAK. This uses_mmap field is needed to set up per-task-per-cpu events for
> perf record (mostly). Without it, perf suffered from severe scalability
> issues. Maybe we can change it not to create per-task-per-cpu events for
> hwbp events only, but I'm not sure it's the right thing.
>
> Thanks,
> Namhyung

Add -a to the record command. And if you want to dump the samples, try
the attached patch.

David


Attachments:
perf-script-bp.patch (1.32 kB)

2012-06-28 04:12:56

by Jovi Zhang

Subject: Re: perf support user-space hw_breakpoint?

Hi,

On Thu, Jun 28, 2012 at 10:20 AM, Namhyung Kim <[email protected]> wrote:
> On Thu, 28 Jun 2012 09:02:02 +0800, Jovi Zhang wrote:
>> On Mon, Jun 25, 2012 at 4:26 PM, Namhyung Kim <[email protected]> wrote:
>>> BTW, when I perf record on a hwbp, it failed with ENOSPC.
>>> I guess it's because each per-task-per-cpu event tried to
>>> create an event so it'd get more than supported by h/w.
>>> The strace told me that the fifth call to perf_event_open
>>> failed on my 6-core machine.
>>>
>> I get the same result as you on my Linux box.
>> This should be a bug caused by commit d1cb9f (perf target: Add uses_mmap field).
>>
>> Namhyung, how about the patch below?
>>
>
> NAK. This uses_mmap field is needed to set up per-task-per-cpu events for
> perf record (mostly). Without it, perf suffered from severe scalability
> issues. Maybe we can change it not to create per-task-per-cpu events for
> hwbp events only, but I'm not sure it's the right thing.
>
> Thanks,
> Namhyung
>
The perf tool creates cpumaps and threadmaps per evlist, not per evsel. This
means we cannot easily create a dummy cpumap just for the hwbp events
when the evlist includes hwbp-type evsels.
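
The constraint can be illustrated with a small model. Everything below is hypothetical: the types are invented stand-ins rather than perf's structures, and the "switch the whole evlist to a dummy cpumap if any evsel is a breakpoint" rule is only one possible workaround, shown to make the side effect visible. Because one cpumap serves every evsel, the choice is all-or-nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not perf's real event types or evlist. */
enum evsel_type_model { EVSEL_HARDWARE, EVSEL_SOFTWARE, EVSEL_BREAKPOINT };

struct evlist_model {
	const enum evsel_type_model *types;	/* one entry per evsel */
	size_t nr;
};

/* Count the breakpoint-type evsels in the list. */
size_t count_breakpoints(const struct evlist_model *evlist)
{
	size_t i, nr_bp = 0;

	assert(evlist != NULL);
	for (i = 0; i < evlist->nr; i++)
		if (evlist->types[i] == EVSEL_BREAKPOINT)
			nr_bp++;
	return nr_bp;
}

/* Hypothetical rule: any breakpoint forces the shared cpumap to be dummy. */
bool would_use_dummy_cpus(const struct evlist_model *evlist)
{
	return count_breakpoints(evlist) > 0;
}

/* Non-breakpoint evsels that would lose per-cpu events as a side effect. */
size_t evsels_losing_per_cpu(const struct evlist_model *evlist)
{
	size_t nr_bp = count_breakpoints(evlist);

	return nr_bp ? evlist->nr - nr_bp : 0;
}
```

With a mixed list such as one hardware event plus two breakpoints, the rule switches the whole list to the dummy cpumap and the hardware event loses its per-cpu events too, which is the scalability setup the uses_mmap field was added for.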

Any ideas?

.jovi