2017-06-03 13:51:08

by Milian Wolff

[permalink] [raw]
Subject: Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix

On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> > The original patch that introduced inline frame output in the
> > various browsers used this suffix already. The new centralized
> > approach that uses fake symbols for inlined frames was missing
> > this approach so far.
> >
> > Instead of changing the symbol name itself, we only print the
> > suffix where needed. This allows us to efficiently lookup
> > the symbol for a given name without first having to append the
> > suffix before the lookup.
>
> You also need to do same thing for hist_entry__sym_snprintf().

Hey Namhyung,

I'm working on the next iteration of this patch series, the WIP can be found
here: https://github.com/milianw/linux/tree/wip/distinguish-inliners.

I just stumbled upon another issue:

~~~~~
perf report --inline -s srcline -g srcline --stdio -g none
61.22% 0.00% main+378
61.22% 0.00% std::_Norm_helper<true>::_S_do_it<double>+378
61.22% 0.00% std::__complex_abs+378
61.22% 0.00% std::abs<double>+378
61.22% 0.00% std::norm<double>+378
~~~~~

The problem here is that the srcline in the hist is detached from what we
actually produce in the callchain nodes. We will have to somehow hand through
the srcline from the callchain node to the hist entry, but this seems to be a
super large invasive change - which is why I need your input on how to resolve
this.

The current API seems to pass the data around mostly using the addr_location
struct, which is usually constructed on the stack and not always memset to
zero. As such, my initial plan of adding a srcline member there would require
me to go through all the code to ensure that we memset the struct to zero...

Alternatively, I'd have to change the API of hist_iter_ops, to let the
callback take another `const char **srcline` out parameter. This is also going
to be quite a large invasive change.

Do you have any suggestions on how to make this work?

Thanks

--
Milian Wolff | [email protected] | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts


2017-06-06 01:34:13

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix

On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <[email protected]> wrote:
> On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
>> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
>> > The original patch that introduced inline frame output in the
>> > various browsers used this suffix already. The new centralized
>> > approach that uses fake symbols for inlined frames was missing
>> > this approach so far.
>> >
>> > Instead of changing the symbol name itself, we only print the
>> > suffix where needed. This allows us to efficiently lookup
>> > the symbol for a given name without first having to append the
>> > suffix before the lookup.
>>
>> You also need to do same thing for hist_entry__sym_snprintf().
>
> Hey Namhyung,
>
> I'm working on the next iteration of this patch series, the WIP can be found
> here: https://github.com/milianw/linux/tree/wip/distinguish-inliners.
>
> I just stumbled upon another issue:
>
> ~~~~~
> perf report --inline -s srcline -g srcline --stdio -g none
> 61.22% 0.00% main+378
> 61.22% 0.00% std::_Norm_helper<true>::_S_do_it<double>+378
> 61.22% 0.00% std::__complex_abs+378
> 61.22% 0.00% std::abs<double>+378
> 61.22% 0.00% std::norm<double>+378
> ~~~~~
>
> The problem here is that the srcline in the hist is detached from what we
> actually produce in the callchain nodes. We will have to somehow hand through
> the srcline from the callchain node to the hist entry, but this seems to be a
> super large invasive change - which is why I need your input on how to resolve
> this.
>
> The current API seems to pass the data around mostly using the addr_location
> struct, which is usually constructed on the stack and not always memset to
> zero. As such, my initial plan of adding a srcline member there would require
> me to go through all the code to ensure that we memset the struct to zero...
>
> Alternatively, I'd have to change the API of hist_iter_ops, to let the
> callback take another `const char **srcline` out parameter. This is also going
> to be quite a large invasive change.
>
> Do you have any suggestions on how to make this work?

I think passing srcline via addr_location might not be very invasive
since it calls machine__resolve() in most cases. Missing places that
set al->sym should set al->srcline as well IMHO.

Thanks,
Namhyung

2017-06-06 07:26:58

by Milian Wolff

[permalink] [raw]
Subject: Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix

On Tuesday, June 6, 2017 3:33:49 AM CEST Namhyung Kim wrote:
> On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <[email protected]> wrote:
> > On Montag, 22. Mai 2017 14:48:18 CEST Namhyung Kim wrote:
> >> On Thu, May 18, 2017 at 09:34:10PM +0200, Milian Wolff wrote:
> >> > The original patch that introduced inline frame output in the
> >> > various browsers used this suffix already. The new centralized
> >> > approach that uses fake symbols for inlined frames was missing
> >> > this approach so far.
> >> >
> >> > Instead of changing the symbol name itself, we only print the
> >> > suffix where needed. This allows us to efficiently lookup
> >> > the symbol for a given name without first having to append the
> >> > suffix before the lookup.
> >>
> >> You also need to do same thing for hist_entry__sym_snprintf().
> >
> > Hey Namhyung,
> >
> > I'm working on the next iteration of this patch series, the WIP can be
> > found here:
> > https://github.com/milianw/linux/tree/wip/distinguish-inliners.
> >
> > I just stumbled upon another issue:
> >
> > ~~~~~
> > perf report --inline -s srcline -g srcline --stdio -g none
> >
> > 61.22% 0.00% main+378
> > 61.22% 0.00% std::_Norm_helper<true>::_S_do_it<double>+378
> > 61.22% 0.00% std::__complex_abs+378
> > 61.22% 0.00% std::abs<double>+378
> > 61.22% 0.00% std::norm<double>+378
> >
> > ~~~~~
> >
> > The problem here is that the srcline in the hist is detached from what we
> > actually produce in the callchain nodes. We will have to somehow hand
> > through the srcline from the callchain node to the hist entry, but this
> > seems to be a super large invasive change - which is why I need your
> > input on how to resolve this.
> >
> > The current API seems to pass the data around mostly using the
> > addr_location struct, which is usually constructed on the stack and not
> > always memset to zero. As such, my initial plan of adding a srcline
> > member there would require me to go through all the code to ensure that
> > we memset the struct to zero...
> >
> > Alternatively, I'd have to change the API of hist_iter_ops, to let the
> > callback take another `const char **srcline` out parameter. This is also
> > going to be quite a large invasive change.
> >
> > Do you have any suggestions on how to make this work?
>
> I think passing srcline via addr_location might not be very invasive
> since it calls machine__resolve() in most cases. Missing places that
> set al->sym should set al->srcline as well IMHO.

OK, perfect - I'll implement that then.

Cheers

--
Milian Wolff | [email protected] | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts


Attachments:
smime.p7s (3.74 kB)

2017-06-06 19:53:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 6/7] perf report: mark inlined frames in output by " (inlined)" suffix

Em Tue, Jun 06, 2017 at 09:26:47AM +0200, Milian Wolff escreveu:
> On Tuesday, June 6, 2017 3:33:49 AM CEST Namhyung Kim wrote:
> > On Sat, Jun 3, 2017 at 10:51 PM, Milian Wolff <[email protected]> wrote:
> > > The current API seems to pass the data around mostly using the
> > > addr_location struct, which is usually constructed on the stack and not
> > > always memset to zero. As such, my initial plan of adding a srcline
> > > member there would require me to go through all the code to ensure that
> > > we memset the struct to zero...

> > > Alternatively, I'd have to change the API of hist_iter_ops, to let the
> > > callback take another `const char **srcline` out parameter. This is also
> > > going to be quite a large invasive change.

> > > Do you have any suggestions on how to make this work?

> > I think passing srcline via addr_location might not be very invasive
> > since it calls machine__resolve() in most cases. Missing places that
> > set al->sym should set al->srcline as well IMHO.

I haven't looked if it would be invasive or not, good that it isn't, but
then I think addr_location is the right place for this to be stored.

> OK, perfect - I'll implement that then.

Thanks,

- Arnaldo