Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752055AbdF2HLP convert rfc822-to-8bit (ORCPT ); Thu, 29 Jun 2017 03:11:15 -0400 Received: from LGEAMRELO11.lge.com ([156.147.23.51]:53197 "EHLO lgeamrelo11.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751638AbdF2HLI (ORCPT ); Thu, 29 Jun 2017 03:11:08 -0400 X-Original-SENDERIP: 156.147.1.121 X-Original-MAILFROM: namhyung@kernel.org X-Original-SENDERIP: 10.177.227.17 X-Original-MAILFROM: namhyung@kernel.org Date: Thu, 29 Jun 2017 16:11:01 +0900 From: Namhyung Kim To: Milian Wolff Cc: Taeung Song , Arnaldo Carvalho de Melo , linux-kernel@vger.kernel.org, Adrian Hunter , Andi Kleen , David Ahern , Jin Yao , Jiri Olsa , Kim Phillips , Masami Hiramatsu , Wang Nan , kernel-team@lge.com Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view Message-ID: <20170629071101.GA10884@sejong> References: <1498619888-1305-1-git-send-email-treeze.taeung@gmail.com> <1571575.U9nG2MaxHg@milian-kdab2> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8BIT In-Reply-To: <1571575.U9nG2MaxHg@milian-kdab2> User-Agent: Mutt/1.8.3 (2017-05-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6964 Lines: 131 Hello, On Wed, Jun 28, 2017 at 11:53:22AM +0200, Milian Wolff wrote: > On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote: > > Hi, > > > > The --source-only option and new source code TUI view can show > > the result of performance analysis based on full source code per > > symbol(function). (Namhyung Kim told me this idea and it was also requested > > by others some time ago..) > > > > If someone wants to see the cause, he/she will need to dig into the asm. > > But before that, looking at the source level can give a hint or clue > > for the problem. > > > > For example, if target symbol is 'hex2u64' of util/util.c, > > the output is like below. > > > > $ perf annotate --source-only --stdio -s hex2u64 > > Percent | Source code of util.c for cycles:ppp (42 samples) > > ----------------------------------------------------------------- > > 0.00 : 354 * While we find nice hex chars, build a long_val. > > 0.00 : 355 * Return number of chars processed. > > 0.00 : 356 */ > > 0.00 : 357 int hex2u64(const char *ptr, u64 *long_val) > > 2.38 : 358 { > > 2.38 : 359 const char *p = ptr; > > 0.00 : 360 *long_val = 0; > > 0.00 : 361 > > 30.95 : 362 while (*p) { > > 23.81 : 363 const int hex_val = hex(*p); > > 0.00 : 364 > > 14.29 : 365 if (hex_val < 0) > > 0.00 : 366 break; > > 0.00 : 367 > > 26.19 : 368 *long_val = (*long_val << 4) | hex_val; > > 0.00 : 369 p++; > > 0.00 : 370 } > > 0.00 : 371 > > 0.00 : 372 return p - ptr; > > 0.00 : 373 } > > > > And I added many perf developers into Cc: because I want to listen to your > > opinions about this new feature, if you don't mind. > > > > If you give some feedback, I'd appreciate it! :) > > Thanks Taeung, > > I requested this feature some time ago and it's really cool to see someone > step up and implement it - much appreciated! > > I just tested it out on my pet-example that leverages C++ instead of C: > > ~~~~~ > #include > #include > #include > #include > > using namespace std; > > int main() > { > uniform_real_distribution uniform(-1E5, 1E5); > default_random_engine engine; > double s = 0; > for (int i = 0; i < 10000000; ++i) { > s += norm(complex(uniform(engine), uniform(engine))); > } > cout << s << '\n'; > return 0; > } > ~~~~~ > > Compile it with: > > g++ -O2 -g -std=c++11 test.cpp -o test > > Then record it with perf: > > perf record --call-graph dwarf ./test > > Then analyse it with `perf report`. You'll see one entry for main with > something like: > > + 100.00% 39.69% cpp-inlining cpp-inlining [.] main > > Select it and annotate it, then switch to your new source-only view: > > main test.cpp > │ 30 > │ 31 using namespace std; > │ 32 > │ 33 int main() > │+ 34 { > │ 35 uniform_real_distribution uniform(-1E5, 1E5); > │ 36 default_random_engine engine; > │+ 37 double s = 0; > │+ 38 for (int i = 0; i < 10000000; ++i) { > 4.88 │+ 39 s += norm(complex(uniform(engine), uniform(engine))); > │ 40 } > │ 41 cout << s << '\n'; > │ 42 return 0; > │+ 43 } > > Note: the line numbers are off b/c my file contains a file-header on-top. > Ignore that. > > Note2: There is no column header shown, so it's unclear what the first column > represents. > > Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate > shows 4.88... What is that? > > What this shows, is that it's extremely important to visualize inclusive cost > _and_ self cost in this view. Additionally, we need to account for inlining. > Right now, we only see the self cost that is directly within main, I suspect. Currently perf annotate doesn't use the sample period, it uses sample count instead and print the percentage within the function. So it's a different number to the perf report. I think we need to fix this first. Thanks, Namhyung > For C++ this is usually very misleading, and basically makes the annotate view > completely useless for application-level profiling. If a second column would > be added with the inclusive cost with the ability to drill down, then I could > easily see myself using this view. > > I would appreciate if you could take this into account. > > Thanks a lot > > > -- > Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer > KDAB (Deutschland) GmbH&Co KG, a KDAB Group company > Tel: +49-30-521325470 > KDAB - The Qt Experts