Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752389AbbKVP1q (ORCPT ); Sun, 22 Nov 2015 10:27:46 -0500 Received: from mail-pa0-f52.google.com ([209.85.220.52]:36412 "EHLO mail-pa0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752146AbbKVP1o (ORCPT ); Sun, 22 Nov 2015 10:27:44 -0500 Date: Mon, 23 Nov 2015 00:27:16 +0900 From: Namhyung Kim To: "Wangnan (F)" Cc: Jiri Olsa , Arnaldo Carvalho de Melo , lkml , David Ahern , Peter Zijlstra , Ingo Molnar , Milian Wolff Subject: Re: [PATCH 2/3] perf tools: Add callchain order support for libunwind DWARF unwinder Message-ID: <20151122152716.GA19229@danjae.kornet> References: <1447772739-18471-1-git-send-email-jolsa@kernel.org> <1447772739-18471-3-git-send-email-jolsa@kernel.org> <564BFAD4.8060909@huawei.com> <20151118054114.GQ7062@sejong> <564C283E.6070306@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <564C283E.6070306@huawei.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5868 Lines: 166 Hello, Sorry for late reply, I missed this in the mailbox.. On Wed, Nov 18, 2015 at 03:26:54PM +0800, Wangnan (F) wrote: > On 2015/11/18 13:41, Namhyung Kim wrote: > >On Wed, Nov 18, 2015 at 12:13:08PM +0800, Wangnan (F) wrote: > >> > >>On 2015/11/17 23:05, Jiri Olsa wrote: > >>>From: Jiri Olsa > >>> > >>>As reported by Milian, currently for DWARF unwind (both libdw > >>>and libunwind) we display callchain in callee order only. > >>> > >>>Adding the support to follow callchain order setup to libunwind > >>>DWARF unwinder, so we could get following output for report: > >>> > >>> $ perf record --call-graph dwarf ls > >>> ... > >>> $ perf report --no-children --stdio > >>> > >>> 39.26% ls libc-2.21.so [.] __strcoll_l > >>> | > >>> ---__strcoll_l > >>> mpsort_with_tmp > >>> mpsort_with_tmp > >>> sort_files > >>> main > >>> __libc_start_main > >>> _start > >>> 0 > >>> > >>> $ perf report -g caller --no-children --stdio > >>> ... > >>> 39.26% ls libc-2.21.so [.] __strcoll_l > >>> | > >>> ---0 > >>> _start > >>> __libc_start_main > >>> main > >>> sort_files > >>> mpsort_with_tmp > >>> mpsort_with_tmp > >>> __strcoll_l > >>> > >>>Reported-by: Milian Wolff > >>>Based-on-patch-by: Milian Wolff > >>>Link: http://lkml.kernel.org/n/tip-lmtbeqm403f3luw4jkjevsi5@git.kernel.org > >>>Signed-off-by: Jiri Olsa > >>>--- > >>> tools/perf/util/unwind-libunwind.c | 47 ++++++++++++++++++++++++-------------- > >>> 1 file changed, 30 insertions(+), 17 deletions(-) > >>> > >>>diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c > >>>index 0ae8844fe7a6..705e1c19f1ea 100644 > >>>--- a/tools/perf/util/unwind-libunwind.c > >>>+++ b/tools/perf/util/unwind-libunwind.c > >>[SNIP] > >> > >>>- unw_get_reg(&c, UNW_REG_IP, &ip); > >>>- ret = ip ? entry(ip, ui->thread, cb, arg) : 0; > >>In original code if ip == 0 entry() won't be called. > >> > >>>+ if (callchain_param.order == ORDER_CALLER) > >>>+ j = max_stack - i - 1; > >>>+ ret = entry(ips[j], ui->thread, cb, arg); > >>But in new code event if ips[j] == 0 an entry will be built, which causes > >>a behavior changes user noticable: > >> > >>Before this patch: > >> > >> > >># perf report --no-children --stdio --call-graph=callee > >>... > >> 3.38% a.out a.out [.] funcc > >> | > >> ---funcc > >> | > >> --2.70%-- funcb > >> funca > >> main > >> __libc_start_main > >> _start > >> > >>After this patch: > >> > >># perf report --no-children --stdio --call-graph=callee > >>... > >> 3.38% a.out a.out [.] funcc > >> | > >> ---funcc > >> | > >> |--2.70%-- funcb > >> | funca > >> | main > >> | __libc_start_main > >> | _start > >> | > >> --0.68%-- 0 > >> > >> > >>I'm not sure whether we can regard this behavior changing as a bugfix? I > >>think > >>there may be some reason the original code explicitly avoid creating an '0' > >>entry. > >I think callchain value being 0 is an error or marker for the end of > >callchain. So it'd be better avoiding 0 entry. > > > >But unfortunately, we have many 0 entries (and broken callchain after > >them) with fp recording on optimized binaries. I think we should omit > >those callchains. > > > >Maybe something like this? > > > > > >diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > >index 5ef90be2a249..22642c5719ab 100644 > >--- a/tools/perf/util/machine.c > >+++ b/tools/perf/util/machine.c > >@@ -1850,6 +1850,15 @@ static int thread__resolve_callchain_sample(struct thread *thread, > > #endif > > ip = chain->ips[j]; > >+ /* callchain value inside zero page means it's broken, stop */ > >+ if (ip < 4096) { > >+ if (callchain_param.order == ORDER_CALLER) { > >+ callchain_cursor_reset(&callchain_cursor); > >+ continue; > >+ } else > >+ break; > >+ } > >+ > > err = add_callchain_ip(thread, parent, root_al, &cpumode, ip); > > if (err) > > Then we totally get rid of 0 entries, but how can we explain > the sum of overhead of different branches? > > Is it possible to explicitly tell user the place where perf > failed to unwind call stack? For example: > > 3.38% a.out a.out [.] funcc > | > ---funcc > | > |--2.70%-- funcb > | funca > | main > | __libc_start_main > | _start > | > --0.68%-- (unwind failure) Hmm.. we have something similar in the fractal callchain mode, but it doesn't say about the failure. I think it's hard for perf to know whether a given callchain is broken or not. It seems that correct callchains end with 0 entry, but broken chains all can have 0 entries in the middle. Thanks, Namhyung -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/