Message-ID: <564BFD06.7020901@huawei.com>
Date: Wed, 18 Nov 2015 12:22:30 +0800
From: "Wangnan (F)"
To: Jiri Olsa, Arnaldo Carvalho de Melo
CC: Jan Kratochvil, lkml, David Ahern, Peter Zijlstra, "Ingo Molnar", Namhyung Kim, Milian Wolff
Subject: Re: [PATCH 0/3] perf tools DWARF libunwind: Add callchain order support
References: <1447772739-18471-1-git-send-email-jolsa@kernel.org>
In-Reply-To: <1447772739-18471-1-git-send-email-jolsa@kernel.org>

Hi Jiri,

On 2015/11/17 23:05, Jiri Olsa wrote:
> hi,
> as reported by Milian, currently for DWARF unwind (both libdw
> and libunwind) we display callchain in callee order only.
>
> Adding the support to follow callchain order setup to libunwind
> DWARF unwinder, so we could get following output for report:
>
>   $ perf record --call-graph dwarf ls
>   ...
>   $ perf report --no-children --stdio
>
>     39.26%  ls       libc-2.21.so      [.] __strcoll_l
>             |
>             ---__strcoll_l
>                mpsort_with_tmp
>                mpsort_with_tmp
>                sort_files
>                main
>                __libc_start_main
>                _start
>                0
>
>   $ perf report -g caller --no-children --stdio
>   ...
>     39.26%  ls       libc-2.21.so      [.] __strcoll_l
>             |
>             ---0
>                _start
>                __libc_start_main
>                main
>                sort_files
>                mpsort_with_tmp
>                mpsort_with_tmp
>                __strcoll_l
>
> Tested on x86_64. The change is in generic code only,
> so it should not affect other archs. Still it would be
> nice to have some confirmation.. Wang Nan? ;-)
>
> It'd be nice to have this for libdw unwind as well,
> but it looks like it's out of reach for perf code.. Jan?
>
> Also available in:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/callchain_1

Thanks for letting me know about this. I have tested it in my
environment. It works well for me, except for a small behavior
change. Please see below.

Before applying this patch set:

# perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
            |
            ---__vdso_gettimeofday
               funcc
               funcb
               funca
               main
               __libc_start_main
               _start

     3.38%  a.out    a.out             [.] funcc
            |
            ---funcc
               |
                --2.70%-- funcb
                          funca
                          main
                          __libc_start_main
                          _start

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
            |
            ---sched_clock
               perf_event_nmi_handler
               nmi_handle
               ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
            |
            ---__vdso_gettimeofday
               funcc
               funcb
               funca
               main
               __libc_start_main
               _start

     3.38%  a.out    a.out             [.] funcc
            |
            ---funcc
               |
                --2.70%-- funcb
                          funca
                          main
                          __libc_start_main
                          _start

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
            |
            ---return_from_execve
               sys_execve
               do_execveat_common.isra.27

The user-code parts of the two outputs are identical, so I confirm
the bug.
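
To spell out what a correct --call-graph=caller view should look like,
here is a minimal Python sketch. It is not perf code: the helper name
`order_chain` and its `order` argument are made up for illustration,
and the frame list is copied from the a.out output above. The only
assumption is that the unwinder yields frames innermost-first, so the
caller view is the same chain reversed.

```python
def order_chain(frames, order="callee"):
    """Return frames in display order.

    `frames` is assumed to be innermost-first (sampled function first,
    outermost caller last). Hypothetical helper, not part of perf.
    """
    if order == "callee":
        return list(frames)
    if order == "caller":
        return list(reversed(frames))
    raise ValueError(f"unknown order: {order!r}")


# User-space chain from the a.out example, innermost first.
chain = ["__vdso_gettimeofday", "funcc", "funcb", "funca",
         "main", "__libc_start_main", "_start"]

callee_view = order_chain(chain, "callee")
caller_view = order_chain(chain, "caller")
```

With a working caller order, `caller_view` starts at `_start` and ends
at `__vdso_gettimeofday`; identical callee and caller views are the
symptom of the bug.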

After applying this patch set:

# ./perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
            |
            ---__vdso_gettimeofday
               funcc
               funcb
               funca
               main
               __libc_start_main
               _start

     3.38%  a.out    a.out             [.] funcc
            |
            ---funcc
               |
               |--2.70%-- funcb
               |          funca
               |          main
               |          __libc_start_main
               |          _start
               |
                --0.68%-- 0

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
            |
            ---sched_clock
               perf_event_nmi_handler
               ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
    96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
            |
            ---_start
               __libc_start_main
               main
               funca
               funcb
               funcc
               __vdso_gettimeofday

     3.38%  a.out    a.out             [.] funcc
            |
            |--2.70%-- _start
            |          __libc_start_main
            |          main
            |          funca
            |          funcb
            |          funcc
            |
             --0.68%-- 0
                       funcc

     0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
            |
            ---return_from_execve
               sys_execve
               ...

It fixes the bug. However, do you see the extra "0.68%-- 0" in the
tree? I left a comment on patch 2/3; please have a look.

I think this change is okay if we treat the old behavior as a bug
(for example, the branch percentages not summing to the entry's own
overhead). However, the original code explicitly avoids generating a
'0' entry, so I think we should make this clear.

Thank you.
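
The branch-sum point can be checked with the numbers from the funcc
entry. This is a small Python sketch for illustration only; the
function name `branches_cover_overhead` and the rounding tolerance are
my own assumptions, not anything perf does.

```python
def branches_cover_overhead(overhead, branches, tolerance=0.005):
    """True if the branch percentages add up to the entry's overhead.

    `tolerance` allows for the two-decimal rounding in the report.
    Hypothetical helper, not part of perf.
    """
    return abs(overhead - sum(branches)) <= tolerance


# funcc entry: 3.38% overhead.
# Before the patch set only the 2.70% branch was shown.
before = branches_cover_overhead(3.38, [2.70])
# After the patch set the 0.68% "0" branch appears as well.
after = branches_cover_overhead(3.38, [2.70, 0.68])
```

Here `before` is False and `after` is True: the extra '0' branch is
exactly the 0.68% that was missing from the old tree.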

> thanks,
> jirka
>
>
> Cc: Jan Kratochvil
> ---
> Jiri Olsa (3):
>       perf tools: Move initial entry call into get_entries function
>       perf tools: Add callchain order support for libunwind DWARF unwinder
>       perf test: Add callchain order setup for DWARF unwinder test
>
>  tools/perf/tests/dwarf-unwind.c    | 22 +++++++++++++++++++---
>  tools/perf/util/unwind-libunwind.c | 60 +++++++++++++++++++++++++++++++++++++++---------------------
>  2 files changed, 58 insertions(+), 24 deletions(-)