Date: Tue, 6 Dec 2016 12:16:23 +0100
From: Jiri Olsa <jolsa@redhat.com>
To: kan.liang@intel.com
Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
        linux-kernel@vger.kernel.org, alexander.shishkin@linux.intel.com,
        tglx@linutronix.de, namhyung@kernel.org, jolsa@kernel.org,
        adrian.hunter@intel.com, wangnan0@huawei.com, mark.rutland@arm.com,
        andi@firstfloor.org
Subject: Re: [PATCH V2 08/13] perf tools: show kernel overhead
Message-ID: <20161206111623.GC7730@krava>
References: <1480713561-6617-1-git-send-email-kan.liang@intel.com>
 <1480713561-6617-9-git-send-email-kan.liang@intel.com>
In-Reply-To: <1480713561-6617-9-git-send-email-kan.liang@intel.com>

On Fri, Dec 02, 2016 at 04:19:16PM -0500, kan.liang@intel.com wrote:

SNIP

>  --show-profiling-cost:
>  	Show extra time cost during perf profiling
> +	Sort the extra time cost by CPU
> +	If CPU information is not available in perf_sample, using -1 instead.
> +	The time cost include:
> +	- SAM: sample overhead. For x86, it's the time cost in perf NMI handler.
> +	- MUX: multiplexing overhead. The time cost spends on rotate context.
> +	- SB: side-band events overhead. The time cost spends on iterating all
> +	      events that need to receive side-band events.
>  
>  include::callchain-overhead-calculation.txt[]
>  
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index d96e215..dd4ec5c 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -245,6 +245,12 @@ enum auxtrace_error_type {
>  	PERF_AUXTRACE_ERROR_MAX
>  };
>  
> +struct total_overhead {
> +	struct perf_overhead_entry	total_sample[MAX_NR_CPUS + 1];
> +	struct perf_overhead_entry	total_mux[MAX_NR_CPUS + 1];
> +	struct perf_overhead_entry	total_sb[MAX_NR_CPUS + 1];
> +};

I think this should be either:
   - dynamically allocated (the CPU count is available in the session)
   - or made static within perf report (as in shadow stats) and counted
     in report's overhead tool callback

I also don't like that you keep the process-related counts
in the xxx[MAX_NR_CPUS] member; we should have a separate
struct perf_overhead_entry for that
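
A rough, untested sketch of the first option, combined with the separate
process entry, could look like the code below. Everything in it is an
assumption made for illustration: the fields of the stand-in
struct perf_overhead_entry, the OVERHEAD_* enum, and the
total_overhead__init/add/exit helpers are hypothetical names, not taken
from the patch series, and nr_cpus stands for whatever CPU count the
session provides.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the struct from the patch series; the fields are assumed. */
struct perf_overhead_entry {
	unsigned int		nr;	/* number of overhead records seen */
	unsigned long long	time;	/* accumulated time cost */
};

/* Hypothetical index for the three overhead kinds (SAM, MUX, SB). */
enum overhead_type {
	OVERHEAD_SAMPLE,
	OVERHEAD_MUX,
	OVERHEAD_SB,
	OVERHEAD_MAX
};

struct total_overhead {
	int				nr_cpus;
	/* per-CPU totals, nr_cpus entries per type, allocated at init */
	struct perf_overhead_entry	*cpu[OVERHEAD_MAX];
	/* totals for records without CPU information (cpu == -1) */
	struct perf_overhead_entry	process[OVERHEAD_MAX];
};

static void total_overhead__exit(struct total_overhead *oh)
{
	int i;

	for (i = 0; i < OVERHEAD_MAX; i++) {
		free(oh->cpu[i]);
		oh->cpu[i] = NULL;
	}
}

static int total_overhead__init(struct total_overhead *oh, int nr_cpus)
{
	int i;

	if (nr_cpus <= 0)
		return -1;

	memset(oh, 0, sizeof(*oh));
	oh->nr_cpus = nr_cpus;

	for (i = 0; i < OVERHEAD_MAX; i++) {
		oh->cpu[i] = calloc(nr_cpus, sizeof(*oh->cpu[i]));
		if (!oh->cpu[i]) {
			total_overhead__exit(oh);
			return -1;
		}
	}
	return 0;
}

/* Account one record; cpu == -1 goes to the separate process entry. */
static int total_overhead__add(struct total_overhead *oh,
			       enum overhead_type type, int cpu,
			       unsigned long long time)
{
	struct perf_overhead_entry *e;

	if (cpu == -1)
		e = &oh->process[type];
	else if (cpu >= 0 && cpu < oh->nr_cpus)
		e = &oh->cpu[type][cpu];
	else
		return -1;

	e->nr++;
	e->time += time;
	return 0;
}
```

With this layout the arrays are sized from the real CPU count instead of
MAX_NR_CPUS + 1, and the cpu == -1 case no longer needs a magic extra
slot at the end of each array.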

jirka