2014-04-01 07:28:29

by Namhyung Kim

Subject: Re: [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v9)

Hi Arun,

On Mon, Mar 31, 2014 at 2:26 PM, Arun Sharma <[email protected]> wrote:
> On 3/20/14, 11:06 AM, Namhyung Kim wrote:
>>
>> Hello,
>>
>> This is a new attempt to implement cumulative hist period report.
>> This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
>> rewrote it from scratch.
>
>
> While testing this patch series, we found error messages which look like
> this:
>
> Out of bounds address found:
>
> Addr: 10370
> DSO: /usr/local/lib/libgcc_s.so.1 d
> Map: 7f1b0c953000-7f1b0c968000
> Symbol: 102d0-102e9 g _Unwind_DeleteException
> Arch: x86_64
> Kernel: 3.10.23+
> Tools: 3.13.rc1.g374a4d
>
> Not all samples will be on the annotation output.
>
> Please report to [email protected]

Hmm, interesting. Is this from perf top?

>
> I first suspected it to be caused by this patch series, but I'm able to
> reproduce without these patches as of this commit:
>
> a51e87c perf tools: Remove unused simple_strtoul() function

Ah, it's good to know :)

>
> gdb attributes 0x10370 to a different/known symbol.
>
> (gdb) x /i 0x10370
> 0x10370 <get_cie_encoding+160>: cmp $0x4c,%dl
>
> Is this known? Could this possibly be caused by stale histogram entries from
> unmapped/remapped shared libs?

Possibly.

Anyway, the addr which perf reported has already been translated through
the map (it's relative to the DSO), so it's pointless to use it directly -
the runtime address is in fact 7f1b0c963370 (map start 7f1b0c953000 + 10370).
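
To spell out that arithmetic, here is a rough sketch using only the numbers
from the warning above. It assumes the library is mapped from file offset 0
(the warning doesn't show the page offset), and the helper names are made up
for illustration, they are not perf functions:

```python
# Values copied from the "Out of bounds address found" warning.
MAP_START = 0x7f1b0c953000
MAP_END = 0x7f1b0c968000
REL_ADDR = 0x10370          # "Addr" as printed by perf
SYM_START = 0x102d0         # _Unwind_DeleteException
SYM_END = 0x102e9


def to_runtime_addr(rel_addr, map_start, pgoff=0):
    """Turn a map-relative address back into a runtime address.

    pgoff=0 is an assumption; the warning doesn't print it.
    """
    return map_start + rel_addr - pgoff


def in_range(addr, start, end):
    """True if start <= addr < end."""
    return start <= addr < end


runtime_addr = to_runtime_addr(REL_ADDR, MAP_START)
print(hex(runtime_addr))
print("inside map:   ", in_range(runtime_addr, MAP_START, MAP_END))
print("inside symbol:", in_range(REL_ADDR, SYM_START, SYM_END))
```

So the address itself falls inside the map, but it is past the end of the
symbol it was attributed to, which is what the warning complains about.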

What was the exact command line though - did you use any filters
(--comms, --dsos, --symbols) or event modifiers? Those are other
possible culprits, since the map searching code was touched by recent changes.

I'm not able to reproduce the problem on my machine. It'd be great if
you could bisect or let me know how to reproduce it easily.

Thanks,
Namhyung


2014-04-01 07:36:13

by Arun Sharma

Subject: Re: [PATCHSET 00/21] perf tools: Add support to accumulate hist periods (v9)

On 4/1/14, 12:58 PM, Namhyung Kim wrote:

>>
>> gdb attributes 0x10370 to a different/known symbol.
>>
>> (gdb) x /i 0x10370
>> 0x10370 <get_cie_encoding+160>: cmp $0x4c,%dl
>>
>> Is this known? Could this possibly be caused by stale histogram entries from
>> unmapped/remapped shared libs?
>
> Possibly.
>
> Anyway the addr which perf reported is a mapped address so that it's
> pointless to use the addr directly - it's 7f1b0c963370 in fact.
>

Right - that's the address I'd use if the process in question were still
running. But running gdb on the DSO and examining relative addresses could
still tell us what the right symbol was, couldn't it?

> What was the exact command line though - did you use any filter
> (--comms, --dsos, --symbols) or event modifiers? Those are another
> possible culprits since map searching code touched by recent changes.

There were no other filters. The command used was just "perf top".

>
> I'm not able to reproduce the problem on my machine. It'd be great if
> you could bisect or let me know how to reproduce it easily.
>

I don't have a solid repro either. It involves building a binary, running
"perf top", and waiting a few minutes until that warning popup appears.

Will try to git bisect and figure out potential culprits.

-Arun