Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752038AbdF3QWA (ORCPT ); Fri, 30 Jun 2017 12:22:00 -0400 Received: from mail-pg0-f65.google.com ([74.125.83.65]:36385 "EHLO mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751790AbdF3QV7 (ORCPT ); Fri, 30 Jun 2017 12:21:59 -0400 Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view To: Milian Wolff Cc: Arnaldo Carvalho de Melo , linux-kernel@vger.kernel.org, Adrian Hunter , Andi Kleen , David Ahern , Jin Yao , Jiri Olsa , Kim Phillips , Masami Hiramatsu , Namhyung Kim , Wang Nan References: <1498619888-1305-1-git-send-email-treeze.taeung@gmail.com> <1571575.U9nG2MaxHg@milian-kdab2> From: Taeung Song Message-ID: Date: Sat, 1 Jul 2017 01:21:47 +0900 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: <1571575.U9nG2MaxHg@milian-kdab2> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5038 Lines: 156 I'm late.. On 06/28/2017 06:53 PM, Milian Wolff wrote: > On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote: >> Hi, >> >> The --source-only option and new source code TUI view can show >> the result of performance analysis based on full source code per >> symbol(function). (Namhyung Kim told me this idea and it was also requested >> by others some time ago..) >> >> If someone wants to see the cause, he/she will need to dig into the asm. >> But before that, looking at the source level can give a hint or clue >> for the problem. >> >> For example, if target symbol is 'hex2u64' of util/util.c, >> the output is like below. >> >> $ perf annotate --source-only --stdio -s hex2u64 >> Percent | Source code of util.c for cycles:ppp (42 samples) >> ----------------------------------------------------------------- >> 0.00 : 354 * While we find nice hex chars, build a long_val. >> 0.00 : 355 * Return number of chars processed. >> 0.00 : 356 */ >> 0.00 : 357 int hex2u64(const char *ptr, u64 *long_val) >> 2.38 : 358 { >> 2.38 : 359 const char *p = ptr; >> 0.00 : 360 *long_val = 0; >> 0.00 : 361 >> 30.95 : 362 while (*p) { >> 23.81 : 363 const int hex_val = hex(*p); >> 0.00 : 364 >> 14.29 : 365 if (hex_val < 0) >> 0.00 : 366 break; >> 0.00 : 367 >> 26.19 : 368 *long_val = (*long_val << 4) | hex_val; >> 0.00 : 369 p++; >> 0.00 : 370 } >> 0.00 : 371 >> 0.00 : 372 return p - ptr; >> 0.00 : 373 } >> >> And I added many perf developers into Cc: because I want to listen to your >> opinions about this new feature, if you don't mind. >> >> If you give some feedback, I'd appreciate it! :) > > Thanks Taeung, > > I requested this feature some time ago and it's really cool to see someone > step up and implement it - much appreciated! > > I just tested it out on my pet-example that leverages C++ instead of C: > > ~~~~~ > #include > #include > #include > #include > > using namespace std; > > int main() > { > uniform_real_distribution uniform(-1E5, 1E5); > default_random_engine engine; > double s = 0; > for (int i = 0; i < 10000000; ++i) { > s += norm(complex(uniform(engine), uniform(engine))); > } > cout << s << '\n'; > return 0; > } > ~~~~~ > > Compile it with: > > g++ -O2 -g -std=c++11 test.cpp -o test > > Then record it with perf: > > perf record --call-graph dwarf ./test > > Then analyse it with `perf report`. You'll see one entry for main with > something like: > > + 100.00% 39.69% cpp-inlining cpp-inlining [.] main > > Select it and annotate it, then switch to your new source-only view: > > main test.cpp > │ 30 > │ 31 using namespace std; > │ 32 > │ 33 int main() > │+ 34 { > │ 35 uniform_real_distribution uniform(-1E5, 1E5); > │ 36 default_random_engine engine; > │+ 37 double s = 0; > │+ 38 for (int i = 0; i < 10000000; ++i) { > 4.88 │+ 39 s += norm(complex(uniform(engine), > uniform(engine))); > │ 40 } > │ 41 cout << s << '\n'; > │ 42 return 0; > │+ 43 } > > Note: the line numbers are off b/c my file contains a file-header on-top. > Ignore that. > > Note2: There is no column header shown, so it's unclear what the first column > represents. Okey, I'll add the first column. > > Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate > shows 4.88... What is that? My case is different from your result..but I'll keep digging things related to the inclusive and self cost to understand the above case, remaking v2 patchset. > > What this shows, is that it's extremely important to visualize inclusive cost > _and_ self cost in this view. Additionally, we need to account for inlining. > Right now, we only see the self cost that is directly within main, I suspect. I handled only one source code file on this patchset, so I also think it seems to be needed to check other source code file to handle inlining.. > For C++ this is usually very misleading, and basically makes the annotate view > completely useless for application-level profiling. If a second column would > be added with the inclusive cost with the ability to drill down, then I could > easily see myself using this view. Okey, will try to make the feature for 'drill down' after this basic patchset. BTW, I'll resend v2 patchset later - considering several source files when getting srcline info - considering inline function - investigating the inclusive and self cost Thanks, Taeung > > I would appreciate if you could take this into account. > > Thanks a lot > >