Date: Tue, 6 Dec 2016 19:26:47 +0100
From: Peter Zijlstra
To: "Liang, Kan"
Cc: "mingo@redhat.com", "acme@kernel.org", "linux-kernel@vger.kernel.org",
	"alexander.shishkin@linux.intel.com", "tglx@linutronix.de",
	"namhyung@kernel.org", "jolsa@kernel.org", "Hunter, Adrian",
	"wangnan0@huawei.com", "mark.rutland@arm.com", "andi@firstfloor.org"
Subject: Re: [PATCH V2 03/13] perf/x86: output sampling overhead
Message-ID: <20161206182647.GC3107@twins.programming.kicks-ass.net>
References: <1480713561-6617-1-git-send-email-kan.liang@intel.com>
	<1480713561-6617-4-git-send-email-kan.liang@intel.com>
	<20161206112013.GJ3124@twins.programming.kicks-ass.net>
	<37D7C6CF3E00A74B8858931C1DB2F07750CA9EAE@SHSMSX103.ccr.corp.intel.com>
	<20161206153222.GB3061@worktop.programming.kicks-ass.net>
	<37D7C6CF3E00A74B8858931C1DB2F07750CA9EFB@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07750CA9EFB@SHSMSX103.ccr.corp.intel.com>

On Tue, Dec 06, 2016 at 03:47:40PM +0000, Liang, Kan wrote:

> > It doesn't record anything, it generates the output. And it doesn't
> > explain why that needs to be in pmu::del(); in general that's a
> > horrible thing to do.
>
> Yes, it only generates/logs the output. Sorry for the confusing wording.
>
> The NMI overhead is PMU-specific overhead, so the NMI overhead output
> should be generated in PMU code.

True, but you're also accounting in a per-cpu bucket, which means it
includes all events. At that point the per-event overhead thing doesn't
really make sense.

It also means that previous sessions influence the numbers of our
current session; there's no explicit reset of the numbers.

> I assume that pmu::del is the last pmu function called when perf
> finishes. Is it a good place for logging?

No, it's horrible. Sure, we'll call pmu::del on events, but yuck. You
really only want _one_ invocation when you stop using the event, and we
don't really have a good place for that. But instead of creating one,
you do horrible things.

Now, I realize there's a bit of a catch-22 in that the moment we know
the event is going away, it's already gone from userspace. So we cannot
dump data from there in general.

However, if we have output redirection we can, but that would make
things depend on that, and it cannot be used for the last event whose
buffer we're using.

Another option would be to introduce PERF_EVENT_IOC_STAT or something
like that, and have the tool call that when it's 'done'.
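
Purely as an illustration of the userspace side of that last idea --
PERF_EVENT_IOC_STAT does not exist, and the ioctl number and helper
name below are made up, completely untested:

	/*
	 * Hypothetical sketch only: PERF_EVENT_IOC_STAT is not a real
	 * ioctl and the number is made up.  The idea is that the tool
	 * tells the kernel it is done with an event, so the kernel can
	 * emit whatever overhead records it accumulated into the ring
	 * buffer while userspace is still around to read them.
	 */
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <linux/ioctl.h>

	#ifndef PERF_EVENT_IOC_STAT
	#define PERF_EVENT_IOC_STAT	_IO('$', 11)	/* made-up number */
	#endif

	/*
	 * The tool would call this once per event fd, just before it
	 * stops reading the ring buffer and closes the event.
	 */
	int event_flush_overhead_stats(int fd)
	{
		if (ioctl(fd, PERF_EVENT_IOC_STAT, 0) < 0) {
			perror("PERF_EVENT_IOC_STAT");
			return -1;
		}
		return 0;
	}

That keeps the dump under the tool's control and avoids hooking it
into pmu::del() entirely.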