2012-05-05 15:23:17

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH] perf top: Fix a race in callchain handling

When perf top collects callchain info, main thread and display
thread share the callchain cursor so that bugs can happen. For
example, if two threads race on callchain_cursor_advance() for
the last cursor node, one of them can dereference a NULL pointer.

Because the display thread only uses the cursor when collapsing
a hist entry, use its own cursor for that.

Reported-by: Sunjin Yang <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/hist.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 9f6d630d5316..795b49a803f9 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -378,7 +378,7 @@ void hist_entry__free(struct hist_entry *he)
* collapse the histogram
*/

-static bool hists__collapse_insert_entry(struct hists *hists,
+static bool hists__collapse_insert_entry(struct hists *hists __used,
struct rb_root *root,
struct hist_entry *he)
{
@@ -386,6 +386,7 @@ static bool hists__collapse_insert_entry(struct hists *hists,
struct rb_node *parent = NULL;
struct hist_entry *iter;
int64_t cmp;
+ static struct callchain_cursor cursor;

while (*p != NULL) {
parent = *p;
@@ -397,8 +398,8 @@ static bool hists__collapse_insert_entry(struct hists *hists,
iter->period += he->period;
iter->nr_events += he->nr_events;
if (symbol_conf.use_callchain) {
- callchain_cursor_reset(&hists->callchain_cursor);
- callchain_merge(&hists->callchain_cursor, iter->callchain,
+ callchain_cursor_reset(&cursor);
+ callchain_merge(&cursor, iter->callchain,
he->callchain);
}
hist_entry__free(he);
--
1.7.9.2


2012-05-05 18:23:02

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

On Sun, 2012-05-06 at 00:23 +0900, Namhyung Kim wrote:
> + static struct callchain_cursor cursor;

This just begs to become another concurrency problem. If anybody manages
to call multiple hists__collapse_insert_entry() concurrently you're
again up some creek without no paddle.

Adding global state is never a good option when dealing with
concurrency.

2012-05-05 23:53:25

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Em Sat, May 05, 2012 at 08:22:47PM +0200, Peter Zijlstra escreveu:
> On Sun, 2012-05-06 at 00:23 +0900, Namhyung Kim wrote:
> > + static struct callchain_cursor cursor;
>
> This just begs to become another concurrency problem. If anybody manages
> to call multiple hists__collapse_insert_entry() concurrently you're
> again up some creek without no paddle.
>
> Adding global state is never a good option when dealing with
> concurrency.

But it seems to fix the current issue, so thanks to Namhyung for
following up on the report and David Ahern for reporting that it was a
cross thread corruption (Namhyung, was your work based on that report?).

I'm looking how to get that fixed with Peter concerns addressed.

First testing Namhyung patch with -F 100000 tho :-)

- Arnaldo

2012-05-06 02:25:16

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Hi,

2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
> Em Sat, May 05, 2012 at 08:22:47PM +0200, Peter Zijlstra escreveu:
> > On Sun, 2012-05-06 at 00:23 +0900, Namhyung Kim wrote:
> > > + static struct callchain_cursor cursor;
> >
> > This just begs to become another concurrency problem. If anybody manages
> > to call multiple hists__collapse_insert_entry() concurrently you're
> > again up some creek without no paddle.
> >
> > Adding global state is never a good option when dealing with
> > concurrency.
>
> But it seems to fix the current issue, so thanks to Namhyung for
> following up on the report and David Ahern for reporting that it was a
> cross thread corruption (Namhyung, was your work based on that report?).
>

No, I didn't see the David's report since I posted it using my company
email - I don't have an access to the mail outside of the company now.
It seems I have to subscribe the perf-users mailing list though :).


> I'm looking how to get that fixed with Peter concerns addressed.
>

I guess it's gonna be a non-trivial job. As far as I can see, the hists
code can handle up to two concurrent threads regardless of the callchain
cursor problem. And also guess that other areas of libperf also doesn't
support the true concurrency, right?


> First testing Namhyung patch with -F 100000 tho :-)
>

Thanks,
Namhyung

2012-05-06 18:07:15

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

> 2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
> > First testing Namhyung patch with -F 100000 tho :-)

Program received signal SIGSEGV, Segmentation fault.
0x000000000046a51a in machine__resolve_callchain (self=0x982b20, evsel=0x982440, thread=0x993210, chain=0x7ffff18462d8, parent=0x7fffffffd6f8) at util/session.c:307
307 ip = chain->ips[i];
(gdb) bt
#0 0x000000000046a51a in machine__resolve_callchain (self=0x982b20, evsel=0x982440, thread=0x993210, chain=0x7ffff18462d8, parent=0x7fffffffd6f8) at util/session.c:307
#1 0x000000000042b9d2 in perf_event__process_sample (tool=0x7fffffffd8a0, event=0x7ffff18462b8, evsel=0x982440, sample=0x7fffffffd750, machine=0x982b20) at builtin-top.c:777
#2 0x000000000042bd17 in perf_top__mmap_read_idx (top=0x7fffffffd8a0, idx=0) at builtin-top.c:858
#3 0x000000000042bdfc in perf_top__mmap_read (top=0x7fffffffd8a0) at builtin-top.c:873
#4 0x000000000042c45f in __cmd_top (top=0x7fffffffd8a0) at builtin-top.c:1033
#5 0x000000000042d514 in cmd_top (argc=0, argv=0x7fffffffe630, prefix=0x0) at builtin-top.c:1323
#6 0x00000000004142c9 in run_builtin (p=0x78a7c8, argc=4, argv=0x7fffffffe630) at perf.c:273
#7 0x00000000004144cc in handle_internal_command (argc=4, argv=0x7fffffffe630) at perf.c:345
#8 0x0000000000414618 in run_argv (argcp=0x7fffffffe51c, argv=0x7fffffffe510) at perf.c:389
#9 0x000000000041489e in main (argc=4, argv=0x7fffffffe630) at perf.c:487
(gdb) p i
$1 = 12477348
(gdb) p *chain
$2 = {nr = 9007199254740994, ips = 0x7ffff18462d8}
(gdb) frame 2
#2 0x000000000042bd17 in perf_top__mmap_read_idx (top=0x7fffffffd8a0, idx=0) at builtin-top.c:858
858 perf_event__process_sample(&top->tool, event, evsel,
(gdb) print event->header
$3 = {type = 4175878766, misc = 16327, size = 0}
(gdb) print event->header
$4 = {type = 4175878766, misc = 16327, size = 0}
(gdb)

Probably an additional bug :-\

- Arnaldo

2012-05-06 18:32:19

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Em Sun, May 06, 2012 at 11:25:07AM +0900, Namhyung Kim escreveu:
> 2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
> > I'm looking how to get that fixed with Peter concerns addressed.
> >
>
> I guess it's gonna be a non-trivial job. As far as I can see, the hists
> code can handle up to two concurrent threads regardless of the callchain
> cursor problem. And also guess that other areas of libperf also doesn't
> support the true concurrency, right?

Right, but making it even less concurrent is something we should avoid
8-)

How about this one instead? At least we would be able to, concurrently,
process multiple, unrelated hists:


diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 9f6d630..c5cde92 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -397,8 +397,8 @@ static bool hists__collapse_insert_entry(struct hists *hists,
iter->period += he->period;
iter->nr_events += he->nr_events;
if (symbol_conf.use_callchain) {
- callchain_cursor_reset(&hists->callchain_cursor);
- callchain_merge(&hists->callchain_cursor, iter->callchain,
+ callchain_cursor_reset(&hists->callchain_collapse_cursor);
+ callchain_merge(&hists->callchain_collapse_cursor, iter->callchain,
he->callchain);
}
hist_entry__free(he);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index cfc64e2..e04c80b 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -69,6 +69,7 @@ struct hists {
u16 col_len[HISTC_NR_COLS];
/* Best would be to reuse the session callchain cursor */
struct callchain_cursor callchain_cursor;
+ struct callchain_cursor callchain_collapse_cursor;
};

struct hist_entry *__hists__add_entry(struct hists *self,

2012-05-07 01:24:09

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Hi,

On Sun, 6 May 2012 15:07:07 -0300, Arnaldo Carvalho de Melo wrote:
>> 2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
>> > First testing Namhyung patch with -F 100000 tho :-)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000046a51a in machine__resolve_callchain (self=0x982b20, evsel=0x982440, thread=0x993210, chain=0x7ffff18462d8, parent=0x7fffffffd6f8) at util/session.c:307
> 307 ip = chain->ips[i];
> (gdb) bt
> #0 0x000000000046a51a in machine__resolve_callchain (self=0x982b20, evsel=0x982440, thread=0x993210, chain=0x7ffff18462d8, parent=0x7fffffffd6f8) at util/session.c:307
> #1 0x000000000042b9d2 in perf_event__process_sample (tool=0x7fffffffd8a0, event=0x7ffff18462b8, evsel=0x982440, sample=0x7fffffffd750, machine=0x982b20) at builtin-top.c:777
> #2 0x000000000042bd17 in perf_top__mmap_read_idx (top=0x7fffffffd8a0, idx=0) at builtin-top.c:858
> #3 0x000000000042bdfc in perf_top__mmap_read (top=0x7fffffffd8a0) at builtin-top.c:873
> #4 0x000000000042c45f in __cmd_top (top=0x7fffffffd8a0) at builtin-top.c:1033
> #5 0x000000000042d514 in cmd_top (argc=0, argv=0x7fffffffe630, prefix=0x0) at builtin-top.c:1323
> #6 0x00000000004142c9 in run_builtin (p=0x78a7c8, argc=4, argv=0x7fffffffe630) at perf.c:273
> #7 0x00000000004144cc in handle_internal_command (argc=4, argv=0x7fffffffe630) at perf.c:345
> #8 0x0000000000414618 in run_argv (argcp=0x7fffffffe51c, argv=0x7fffffffe510) at perf.c:389
> #9 0x000000000041489e in main (argc=4, argv=0x7fffffffe630) at perf.c:487
> (gdb) p i
> $1 = 12477348
> (gdb) p *chain
> $2 = {nr = 9007199254740994, ips = 0x7ffff18462d8}
> (gdb) frame 2
> #2 0x000000000042bd17 in perf_top__mmap_read_idx (top=0x7fffffffd8a0, idx=0) at builtin-top.c:858
> 858 perf_event__process_sample(&top->tool, event, evsel,
> (gdb) print event->header
> $3 = {type = 4175878766, misc = 16327, size = 0}
> (gdb) print event->header
> $4 = {type = 4175878766, misc = 16327, size = 0}
> (gdb)
>
> Probably an additional bug :-\
>
> - Arnaldo

Yeah, I think so :(.

The chain->nr has a strang value - it cannot be larger than 255
(PERF_MAX_STACK_DEPTH). Simple fix would be checking ->nr in the
function but it's obviously not fix a root cause.

Anyway, it seems not related to this race problem IMHO.

Thanks,
Namhyung

2012-05-07 01:33:48

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Hi,

On Sun, 6 May 2012 15:32:12 -0300, Arnaldo Carvalho de Melo wrote:
> Em Sun, May 06, 2012 at 11:25:07AM +0900, Namhyung Kim escreveu:
>> 2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
>> > I'm looking how to get that fixed with Peter concerns addressed.
>> >
>>
>> I guess it's gonna be a non-trivial job. As far as I can see, the hists
>> code can handle up to two concurrent threads regardless of the callchain
>> cursor problem. And also guess that other areas of libperf also doesn't
>> support the true concurrency, right?
>
> Right, but making it even less concurrent is something we should avoid
> 8-)
>
> How about this one instead? At least we would be able to, concurrently,
> process multiple, unrelated hists:
>

I thought about it before, but it still cannot protect it from accessing
a hists by multiple concurrent threads. IOW if two threads call the
function to a same hists at the same time, ->callchain_collapse_cursor
would still get the race problem - so crashed.

I guess callchain_cursor should be thread-local, eventually. No need to
make it hist-local IMHO.

Thanks,
Namhyung


>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 9f6d630..c5cde92 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -397,8 +397,8 @@ static bool hists__collapse_insert_entry(struct hists *hists,
> iter->period += he->period;
> iter->nr_events += he->nr_events;
> if (symbol_conf.use_callchain) {
> - callchain_cursor_reset(&hists->callchain_cursor);
> - callchain_merge(&hists->callchain_cursor, iter->callchain,
> + callchain_cursor_reset(&hists->callchain_collapse_cursor);
> + callchain_merge(&hists->callchain_collapse_cursor, iter->callchain,
> he->callchain);
> }
> hist_entry__free(he);
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index cfc64e2..e04c80b 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -69,6 +69,7 @@ struct hists {
> u16 col_len[HISTC_NR_COLS];
> /* Best would be to reuse the session callchain cursor */
> struct callchain_cursor callchain_cursor;
> + struct callchain_cursor callchain_collapse_cursor;
> };
>
> struct hist_entry *__hists__add_entry(struct hists *self,

2012-05-11 02:12:30

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Hi,

On Mon, 07 May 2012 10:32:22 +0900, Namhyung Kim wrote:
> On Sun, 6 May 2012 15:32:12 -0300, Arnaldo Carvalho de Melo wrote:
>> Em Sun, May 06, 2012 at 11:25:07AM +0900, Namhyung Kim escreveu:
>>> 2012-05-05 (토), 20:53 -0300, Arnaldo Carvalho de Melo:
>>> > I'm looking how to get that fixed with Peter concerns addressed.
>>>
>>> I guess it's gonna be a non-trivial job. As far as I can see, the hists
>>> code can handle up to two concurrent threads regardless of the callchain
>>> cursor problem. And also guess that other areas of libperf also doesn't
>>> support the true concurrency, right?
>>
>> Right, but making it even less concurrent is something we should avoid
>> 8-)
>>
>> How about this one instead? At least we would be able to, concurrently,
>> process multiple, unrelated hists:
>
> I thought about it before, but it still cannot protect it from accessing
> a hists by multiple concurrent threads. IOW if two threads call the
> function to a same hists at the same time, ->callchain_collapse_cursor
> would still get the race problem - so crashed.
>
> I guess callchain_cursor should be thread-local, eventually. No need to
> make it hist-local IMHO.
>

So, any thoughts?

Thanks,
Namhyung

2012-05-11 15:01:18

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Em Fri, May 11, 2012 at 11:10:59AM +0900, Namhyung Kim escreveu:
> On Mon, 07 May 2012 10:32:22 +0900, Namhyung Kim wrote:
> > I thought about it before, but it still cannot protect it from accessing
> > a hists by multiple concurrent threads. IOW if two threads call the
> > function to a same hists at the same time, ->callchain_collapse_cursor
> > would still get the race problem - so crashed.
> >
> > I guess callchain_cursor should be thread-local, eventually. No need to
> > make it hist-local IMHO.
>
> So, any thoughts?

Can you try writing a patch using TLS?

- Arnaldo

2012-05-11 16:26:00

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf top: Fix a race in callchain handling

Hi,

On Fri, 11 May 2012 12:01:11 -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, May 11, 2012 at 11:10:59AM +0900, Namhyung Kim escreveu:
>> On Mon, 07 May 2012 10:32:22 +0900, Namhyung Kim wrote:
>> > I thought about it before, but it still cannot protect it from accessing
>> > a hists by multiple concurrent threads. IOW if two threads call the
>> > function to a same hists at the same time, ->callchain_collapse_cursor
>> > would still get the race problem - so crashed.
>> >
>> > I guess callchain_cursor should be thread-local, eventually. No need to
>> > make it hist-local IMHO.
>>
>> So, any thoughts?
>
> Can you try writing a patch using TLS?
>

Well, I can do it, but I still think the current patch is just enough
for now though. My concern is the TLS might make perf less portable
especially for embedded systems which are not using glibc.

Anyway I'll give it a spin as you ask.

Thanks,
Namhyung