From: Will Hawkins
Date: Mon, 12 Jun 2017 16:08:48 -0400
Subject: Ftrace vs perf user page fault statistics differences
To: srostedt@redhat.com
Cc: linux-kernel@vger.kernel.org

Dear Mr. Rostedt and kernel community,

I hope that this is an appropriate place to ask this question; please forgive me for wasting your time if it is not. I searched LKML and the "net" for answers before asking, and I apologize if I missed the question and answer somewhere else.

I was doing a gut check of my understanding of how the kernel pages in a binary program for execution. I created a very small program whose code spans two pages: the entry point is on the first page (a), and the code simply jumps to the second page (b), where it terminates. I compiled the program, dropped the kernel's memory caches [1], and then ran the program under perf:

    perf record --call-graph fp -e page-faults ../one_page_play/page

I looked at the results with

    perf report

and they were as I expected: two page faults for loading the code into memory, and one page fault in copy_user_enhanced_fast_string, invoked by execve's implementation when loading the binary.

I then decided to run the application under ftrace, just for fun. I wanted an excuse to learn more about it, and this seemed like the perfect chance. I used the incredible trace-cmd suite for the actual incantation of ftrace.
I won't include the actual incantations here because I used many of them while digging around. The results are both expected and unexpected. I see output like this:

    Event: page_fault_user:0x4000e0

which indicates a page fault at the program's entry point and matches what I saw in the perf output. Another similar entry confirms the other expected page fault when loading the second page of the test application. However, I also see entries like this:

    Event: page_fault_user:0x7f4f590934c4 (1)

The addresses of the faults matching that pattern are not mapped from the application binary. What I discovered as I investigated is that those page faults seem to occur when the kernel is attempting to record the output of stack traces, etc. After thinking this through, I came up with the following hypothesis, which is the crux of this email:

Ftrace's act of recording the traces that I requested to its ring buffer generates page faults of its own. These page faults are charged to the traced program and get reported in the results.

If that is correct/reasonable, it explains the differences between what perf reports and what ftrace reports, and I am happy. If, however, that conclusion is bogus, please help me understand what is going on.

I know that everyone on this email is incredibly busy and has much to do. I hope that I've included enough information for you experts to advise, without including so much that it wastes your time. If you have the time or interest in answering, I would love to hear your responses. Please CC me directly on all responses.

Thanks again for your time!

Will

[1] I used "echo 3 > /proc/sys/vm/drop_caches" to accomplish this and issued it between every run. It may have been overkill, but I did it anyway.
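For reference, the per-run measurement described above (the cache drop from [1] followed by tracing user page faults) might be scripted roughly as follows. This is a sketch under assumptions, not one of the actual incantations used: it assumes root, an x86 kernel exposing the exceptions:page_fault_user tracepoint, and the binary path from the message; syncing before dropping caches is a common precaution.

```shell
#!/bin/sh
# Sketch (requires root): reset the page cache between runs as in [1],
# then record user page faults for the test binary with trace-cmd.
sync                               # flush dirty pages first
echo 3 > /proc/sys/vm/drop_caches  # drop page cache, dentries, inodes
trace-cmd record -e exceptions:page_fault_user ../one_page_play/page
trace-cmd report                   # inspect the recorded fault events
```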