Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751643AbdF1Q1r (ORCPT ); Wed, 28 Jun 2017 12:27:47 -0400 Received: from mail-pf0-f172.google.com ([209.85.192.172]:36113 "EHLO mail-pf0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751505AbdF1Q1j (ORCPT ); Wed, 28 Jun 2017 12:27:39 -0400 Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view To: Milian Wolff Cc: Arnaldo Carvalho de Melo , linux-kernel@vger.kernel.org, Adrian Hunter , Andi Kleen , David Ahern , Jin Yao , Jiri Olsa , Kim Phillips , Masami Hiramatsu , Namhyung Kim , Wang Nan References: <1498619888-1305-1-git-send-email-treeze.taeung@gmail.com> <1571575.U9nG2MaxHg@milian-kdab2> From: Taeung Song Message-ID: Date: Thu, 29 Jun 2017 01:27:34 +0900 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: <1571575.U9nG2MaxHg@milian-kdab2> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4584 Lines: 140 On 06/28/2017 06:53 PM, Milian Wolff wrote: > On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote: >> Hi, >> >> The --source-only option and new source code TUI view can show >> the result of performance analysis based on full source code per >> symbol(function). (Namhyung Kim told me this idea and it was also requested >> by others some time ago..) >> >> If someone wants to see the cause, he/she will need to dig into the asm. >> But before that, looking at the source level can give a hint or clue >> for the problem. >> >> For example, if target symbol is 'hex2u64' of util/util.c, >> the output is like below. >> >> $ perf annotate --source-only --stdio -s hex2u64 >> Percent | Source code of util.c for cycles:ppp (42 samples) >> ----------------------------------------------------------------- >> 0.00 : 354 * While we find nice hex chars, build a long_val. >> 0.00 : 355 * Return number of chars processed. >> 0.00 : 356 */ >> 0.00 : 357 int hex2u64(const char *ptr, u64 *long_val) >> 2.38 : 358 { >> 2.38 : 359 const char *p = ptr; >> 0.00 : 360 *long_val = 0; >> 0.00 : 361 >> 30.95 : 362 while (*p) { >> 23.81 : 363 const int hex_val = hex(*p); >> 0.00 : 364 >> 14.29 : 365 if (hex_val < 0) >> 0.00 : 366 break; >> 0.00 : 367 >> 26.19 : 368 *long_val = (*long_val << 4) | hex_val; >> 0.00 : 369 p++; >> 0.00 : 370 } >> 0.00 : 371 >> 0.00 : 372 return p - ptr; >> 0.00 : 373 } >> >> And I added many perf developers into Cc: because I want to listen to your >> opinions about this new feature, if you don't mind. >> >> If you give some feedback, I'd appreciate it! :) > > Thanks Taeung, > > I requested this feature some time ago and it's really cool to see someone > step up and implement it - much appreciated! Thank you so much, Milian !! :) > > I just tested it out on my pet-example that leverages C++ instead of C: > > ~~~~~ > #include > #include > #include > #include > > using namespace std; > > int main() > { > uniform_real_distribution uniform(-1E5, 1E5); > default_random_engine engine; > double s = 0; > for (int i = 0; i < 10000000; ++i) { > s += norm(complex(uniform(engine), uniform(engine))); > } > cout << s << '\n'; > return 0; > } > ~~~~~ > > Compile it with: > > g++ -O2 -g -std=c++11 test.cpp -o test > > Then record it with perf: > > perf record --call-graph dwarf ./test > > Then analyse it with `perf report`. You'll see one entry for main with > something like: > > + 100.00% 39.69% cpp-inlining cpp-inlining [.] main > > Select it and annotate it, then switch to your new source-only view: > > main test.cpp > │ 30 > │ 31 using namespace std; > │ 32 > │ 33 int main() > │+ 34 { > │ 35 uniform_real_distribution uniform(-1E5, 1E5); > │ 36 default_random_engine engine; > │+ 37 double s = 0; > │+ 38 for (int i = 0; i < 10000000; ++i) { > 4.88 │+ 39 s += norm(complex(uniform(engine), > uniform(engine))); > │ 40 } > │ 41 cout << s << '\n'; > │ 42 return 0; > │+ 43 } > > Note: the line numbers are off b/c my file contains a file-header on-top. > Ignore that. > > Note2: There is no column header shown, so it's unclear what the first column > represents. > > Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate > shows 4.88... What is that? > > What this shows, is that it's extremely important to visualize inclusive cost > _and_ self cost in this view. Additionally, we need to account for inlining. > Right now, we only see the self cost that is directly within main, I suspect. > For C++ this is usually very misleading, and basically makes the annotate view > completely useless for application-level profiling. If a second column would > be added with the inclusive cost with the ability to drill down, then I could > easily see myself using this view. > > I would appreciate if you could take this into account. > > Thanks a lot > > Sure, I got it. I'll investigate this weird case and recheck this patchset based on your comments, and then I'll reply again. :) Thanks, Taeung