Message-ID: <564C283E.6070306@huawei.com>
Date: Wed, 18 Nov 2015 15:26:54 +0800
From: "Wangnan (F)" <wangnan0@huawei.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: Namhyung Kim <namhyung@kernel.org>
CC: Jiri Olsa <jolsa@kernel.org>, Arnaldo Carvalho de Melo <acme@kernel.org>,
        lkml <linux-kernel@vger.kernel.org>, David Ahern <dsahern@gmail.com>,
        "Peter Zijlstra" <peterz@infradead.org>,
        Ingo Molnar <mingo@kernel.org>, Milian Wolff <milian.wolff@kdab.com>
Subject: Re: [PATCH 2/3] perf tools: Add callchain order support for libunwind
 DWARF unwinder
References: <1447772739-18471-1-git-send-email-jolsa@kernel.org> <1447772739-18471-3-git-send-email-jolsa@kernel.org> <564BFAD4.8060909@huawei.com> <20151118054114.GQ7062@sejong>
In-Reply-To: <20151118054114.GQ7062@sejong>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5288
Lines: 159


On 2015/11/18 13:41, Namhyung Kim wrote:
> On Wed, Nov 18, 2015 at 12:13:08PM +0800, Wangnan (F) wrote:
>>
>> On 2015/11/17 23:05, Jiri Olsa wrote:
>>> From: Jiri Olsa <jolsa@redhat.com>
>>>
>>> As reported by Milian, currently for DWARF unwind (both libdw
>>> and libunwind) we display callchain in callee order only.
>>>
>>> Adding the support to follow callchain order setup to libunwind
>>> DWARF unwinder, so we could get following output for report:
>>>
>>>    $ perf record --call-graph dwarf ls
>>>    ...
>>>    $ perf report --no-children --stdio
>>>
>>>      39.26%  ls       libc-2.21.so      [.] __strcoll_l
>>>                   |
>>>                   ---__strcoll_l
>>>                      mpsort_with_tmp
>>>                      mpsort_with_tmp
>>>                      sort_files
>>>                      main
>>>                      __libc_start_main
>>>                      _start
>>>                      0
>>>
>>>    $ perf report -g caller --no-children --stdio
>>>      ...
>>>      39.26%  ls       libc-2.21.so      [.] __strcoll_l
>>>                   |
>>>                   ---0
>>>                      _start
>>>                      __libc_start_main
>>>                      main
>>>                      sort_files
>>>                      mpsort_with_tmp
>>>                      mpsort_with_tmp
>>>                      __strcoll_l
>>>
>>> Reported-by: Milian Wolff <milian.wolff@kdab.com>
>>> Based-on-patch-by: Milian Wolff <milian.wolff@kdab.com>
>>> Link: http://lkml.kernel.org/n/tip-lmtbeqm403f3luw4jkjevsi5@git.kernel.org
>>> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>>> ---
>>>   tools/perf/util/unwind-libunwind.c | 47 ++++++++++++++++++++++++--------------
>>>   1 file changed, 30 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
>>> index 0ae8844fe7a6..705e1c19f1ea 100644
>>> --- a/tools/perf/util/unwind-libunwind.c
>>> +++ b/tools/perf/util/unwind-libunwind.c
>> [SNIP]
>>
>>> -		unw_get_reg(&c, UNW_REG_IP, &ip);
>>> -		ret = ip ? entry(ip, ui->thread, cb, arg) : 0;
>> In original code if ip == 0 entry() won't be called.
>>
>>> +		if (callchain_param.order == ORDER_CALLER)
>>> +			j = max_stack - i - 1;
>>> +		ret = entry(ips[j], ui->thread, cb, arg);
>> But in new code event if ips[j] == 0 an entry will be built, which causes
>> a behavior changes user noticable:
>>
>> Before this patch:
>>
>>
>> # perf report --no-children --stdio --call-graph=callee
>> ...
>>       3.38%  a.out    a.out             [.] funcc
>>                |
>>                ---funcc
>>                   |
>>                    --2.70%-- funcb
>>                              funca
>>                              main
>>                              __libc_start_main
>>                              _start
>>
>> After this patch:
>>
>> # perf report --no-children --stdio --call-graph=callee
>> ...
>>       3.38%  a.out    a.out             [.] funcc
>>                |
>>                ---funcc
>>                   |
>>                   |--2.70%-- funcb
>>                   |          funca
>>                   |          main
>>                   |          __libc_start_main
>>                   |          _start
>>                   |
>>                    --0.68%-- 0
>>
>>
>> I'm not sure whether we can regard this behavior changing as a bugfix? I
>> think
>> there may be some reason the original code explicitly avoid creating an '0'
>> entry.
> I think callchain value being 0 is an error or marker for the end of
> callchain.  So it'd be better avoiding 0 entry.
>
> But unfortunately, we have many 0 entries (and broken callchain after
> them) with fp recording on optimized binaries.  I think we should omit
> those callchains.
>
> Maybe something like this?
>
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 5ef90be2a249..22642c5719ab 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1850,6 +1850,15 @@ static int thread__resolve_callchain_sample(struct thread *thread,
>   #endif
>   		ip = chain->ips[j];
>   
> +		/* callchain value inside zero page means it's broken, stop */
> +		if (ip < 4096) {
> +			if (callchain_param.order == ORDER_CALLER) {
> +				callchain_cursor_reset(&callchain_cursor);
> +				continue;
> +			} else
> +				break;
> +		}
> +
>   		err = add_callchain_ip(thread, parent, root_al, &cpumode, ip);
>   
>   		if (err)

Then we totally get rid of 0 entries, but how can we explain
the sum of overhead of different branches?

Is it possible to explicitly tell user the place where perf
failed to unwind call stack? For example:

      3.38%  a.out    a.out             [.] funcc
               |
               ---funcc
                  |
                  |--2.70%-- funcb
                  |          funca
                  |          main
                  |          __libc_start_main
                  |          _start
                  |
                   --0.68%-- (unwind failure)


Thank you.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/