From: Will Hawkins
Date: Mon, 12 Jun 2017 16:08:48 -0400
Subject: Ftrace vs perf user page fault statistics differences
To: srostedt@redhat.com
Cc: linux-kernel@vger.kernel.org

Dear Mr. Rostedt and kernel community,

I hope that this is an appropriate place to ask this question; please forgive me for wasting your time if it is not. I searched LKML and the "net" for answers before asking, and I apologize if I missed the question and answer somewhere else.

I was doing a gut check of my understanding of how the kernel pages in a binary program for execution. I created a very small program whose code spans two pages: the entry point is on the first page (a), and the code simply jumps to the second page (b), where it terminates. I compiled the program, dropped the kernel's memory caches [1], and then ran the program under perf:

    perf record --call-graph fp -e page-faults ../one_page_play/page

I looked at the results with

    perf report

and they were as I expected: two page faults for loading the code into memory, and one page fault in copy_user_enhanced_fast_string, invoked by execve's implementation when loading the binary.

I then decided to run the application under ftrace, just for fun. I wanted an excuse to learn more about it, and this seemed like the perfect chance. I used the incredible trace-cmd suite for the actual incantation of ftrace.
I won't include the actual incantations here because I used many of them while digging around. The results are both expected and unexpected. I see output like this:

    Event: page_fault_user:0x4000e0

which indicates a page fault at the program's entry point and matches what I saw in the perf output. Another similar entry confirms the other expected page fault when loading the second page of the test application. However, I also see entries like this:

    Event: page_fault_user:0x7f4f590934c4 (1)

The addresses of the faults matching that pattern are not mapped from the application binary. What I discovered as I investigated is that those page faults seem to occur when the kernel is attempting to record the output of stack traces, etc. After thinking this through, I came up with the following hypothesis, which is the crux of this email:

Ftrace's act of recording the traces that I requested to its ring buffer generates page faults of its own. These page faults are charged to the traced program and get reported in the results.

If that is correct/reasonable, it explains the differences between what perf reports and what ftrace reports, and I am happy. If, however, that conclusion is bogus, please help me understand what is going on.

I know that everyone on this email is incredibly busy and has much to do. I hope that I've included enough information for you experts to advise, without including so much that it wastes your time. If you have the time or interest in answering, I would love to hear your responses. Please CC me directly on all responses.

Thanks again for your time!

Will

[1] I used "echo 3 > /proc/sys/vm/drop_caches" to accomplish this and issued it between every run. It may have been overkill, but I did it anyway.
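For reference, the per-run measurement described above (the cache drop from [1] followed by tracing user page faults) might be scripted roughly as follows. This is a sketch under assumptions, not one of the actual incantations used: it assumes root, an x86 kernel exposing the exceptions:page_fault_user tracepoint, and the binary path from the message; syncing before dropping caches is a common precaution.

```shell
#!/bin/sh
# Sketch (requires root): reset the page cache between runs as in [1],
# then record user page faults for the test binary with trace-cmd.
sync                               # flush dirty pages first
echo 3 > /proc/sys/vm/drop_caches  # drop page cache, dentries, inodes
trace-cmd record -e exceptions:page_fault_user ../one_page_play/page
trace-cmd report                   # inspect the recorded fault events
```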