Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753237AbdFMNGy (ORCPT ); Tue, 13 Jun 2017 09:06:54 -0400 Received: from mail.kernel.org ([198.145.29.99]:38906 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752456AbdFMNGw (ORCPT ); Tue, 13 Jun 2017 09:06:52 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A96AF239BE Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=goodmis.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=rostedt@goodmis.org Date: Tue, 13 Jun 2017 09:06:49 -0400 From: Steven Rostedt To: Will Hawkins Cc: LKML Subject: Re: Ftrace vs perf user page fault statistics differences Message-ID: <20170613090649.2a5edb6a@gandalf.local.home> In-Reply-To: References: <20170612212005.1bf8c44d@gandalf.local.home> X-Mailer: Claws Mail 3.14.0 (GTK+ 2.24.31; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3649 Lines: 105 On Mon, 12 Jun 2017 22:05:05 -0400 Will Hawkins wrote: > > Can you supply this program. So I can see exactly what it does? > > I have attached the binary to this email. I also provided the source > so you can reproduce the situation directly. You can use the makefile > provided as long as you have nasm installed. > > Although this is a response to a later question, I won't bury the > lede: The program was written by hand in asm, compiled with nasm to > object code and then run through gcc to get the output binary. Here is > how that works: > > nasm -felf64 page.s && gcc -nostdlib page.o -o page > > The result, as far as ldd is concerned: > > hawkinsw@moormans:~/code/one_page_play$ ldd page > not a dynamic executable OK, that makes sense. > > > > > >> > >> I looked at the results: > >> > >> perf report > >> > >> and the results were as I expected. There were two page faults for > >> loading the code into memory and a page fault to > >> copy_user_enhanced_fast_string invoked by execve's implementation when > >> loading the binary. > > > > What does perf script show you? > > Here is what perf report --stdio shows: > > # Overhead Command Shared Object Symbol > # ........ ....... ................. .................................. > # > 33.33% page page [.] _start > | > --- _start > > 33.33% page page [.] target > | > --- target > > 33.33% page [kernel.kallsyms] [k] copy_user_enhanced_fast_string > | > --- copy_user_enhanced_fast_string > load_elf_binary > search_binary_handler > do_execve_common.isra.23 > sys_execve > stub_execve > 0x7f51698a31e7 > > > > Here is what perf script shows: > > page 28787 113626.587935: page-faults: > ffffffff81376609 copy_user_enhanced_fast_string ([kernel.kallsyms]) > ffffffff812197bf load_elf_binary ([kernel.kallsyms]) > ffffffff811c76df search_binary_handler ([kernel.kallsyms]) > ffffffff811c8c8e do_execve_common.isra.23 ([kernel.kallsyms]) > ffffffff811c9176 sys_execve ([kernel.kallsyms]) > ffffffff8173afc9 stub_execve ([kernel.kallsyms]) > 7f51698a31e7 [unknown] ([unknown]) > > page 28787 113626.587961: page-faults: > 4000e0 _start (/home/hawkinsw/code/one_page_play/page) > > page 28787 113626.587968: page-faults: > 401473 target (/home/hawkinsw/code/one_page_play/page) > > Again, this seems like exactly what I would expect. > > > > >> > >> I decided to run the application under ftrace just for fun. I wanted > >> an excuse to learn more about it and this seemed like the perfect > >> chance. I used the incredible trace-cmd suite for the actual > >> incantation of ftrace. I won't include the actual incantations here > >> because I used many of them while digging around. > > > > what events did you enable? > > Sorry for not including this in my first email. I enabled > exceptions:page_fault_user. I ran the trace-cmd record like this: > > sudo ./trace-cmd record -e exceptions:page_fault_user -T --profile -l > handle_mm_fault ../one_page_play/page See if things change if you add -F. Perf by default only traces the application you specify. trace-cmd by default traces everything. That is, "trace-cmd record -F" is similar to "perf record". Where as "trace-cmd record" is similar to "perf record -a". -- Steve