From: "Liang, Kan"
To: Stephane Eranian
CC: Peter Zijlstra, "mingo@redhat.com", Arnaldo Carvalho de Melo, "ak@linux.intel.com", LKML
Subject: RE: [PATCH V2 1/1] perf/x86: Add Intel power cstate PMUs support
Date: Thu, 6 Aug 2015 23:40:16 +0000
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F077018D34E1@SHSMSX103.ccr.corp.intel.com>

> >> On Thu, Aug 6, 2015 at 1:25 PM, Liang, Kan wrote:
> >> >
> >> >> >> >> >> +static cpumask_t power_cstate_core_cpu_mask;
> >> >> >> >> >
> >> >> >> >> > That one typically does not need a cpumask.
> >> >> >> >>
> >> >> >> >> You need to pick one CPU out of the multi-core. But it is
> >> >> >> >> for client parts, thus there is only one socket. At least
> >> >> >> >> this is my understanding.
> >> >> >> >
> >> >> >> > The CORE_C*_RESIDENCY counters are available per physical
> >> >> >> > processor core, so logical processors in the same physical
> >> >> >> > core share the same counter. I think we need the cpumask to
> >> >> >> > identify the default logical processor which does the counting.
> >> >> >>
> >> >> >> Did you restrict these events to system-wide mode only?
> >> >>
> >> >> Ok, so that means that your cpumask includes one HT per physical
> >> >> core. But then, the result is not the simple aggregation of all
> >> >> the N/2 CPUs.
> >> >
> >> > The counter counts per physical core. The result is the aggregation
> >> > of all HT CPUs in the same physical core.
> >>
> >> But then don't you need to divide by 2 to get a meaningful result?
> >
> > Thinking about it again, I was unclear about the aggregation of all
> > HT CPUs in the same physical core.
> >
> > The physical core C-state should equal min(logical CPU C-state).
> > Only when all logical CPUs enter C6-state does the physical core
> > enter C6-state, and only then does CORE_C6_RESIDENCY count.
> >
> > So if we count CORE_C6_RESIDENCY on only one logical CPU/HT, we
> > don't need to divide by 2. The count is the residency during which
> > all logical CPUs were in C6 (some possibly deeper).
>
> Ok, and here you are assuming you are only measuring one logical CPU per
> physical core. If this is the case, then I think you are alright. But I
> wonder what you'd get when perf stat -a aggregates across all measured
> CPUs, i.e., one CPU per core.

Just add them all together. I think we do the same thing for other PMUs
as well. For uncore or RAPL, we get a meaningful result by applying
--per-socket. Here we can use --per-core.
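The reasoning above (count on one HT per physical core, then simply sum across cores) can be sketched outside the kernel. Everything here, the sibling map, the helper name, the residency numbers, is illustrative and not the actual driver code:

```python
# Hypothetical illustration (not the kernel driver): why reading
# CORE_C6_RESIDENCY on one logical CPU per physical core needs no /2.
# The counter is per physical core and advances only while *all* SMT
# siblings are in C6, so reading it on a single sibling and summing
# across cores already gives the system-wide residency.

def pick_one_cpu_per_core(siblings):
    """siblings: dict cpu -> frozenset of SMT siblings (including itself).
    Pick the lowest-numbered CPU of each core as the counting CPU,
    mirroring how a driver might fill its counting cpumask."""
    return {min(core) for core in set(siblings.values())}

# Two physical cores, two hyperthreads each (CPU numbering is made up).
siblings = {
    0: frozenset({0, 2}), 2: frozenset({0, 2}),  # core A: CPUs 0 and 2
    1: frozenset({1, 3}), 3: frozenset({1, 3}),  # core B: CPUs 1 and 3
}
counting_cpus = pick_one_cpu_per_core(siblings)
print(sorted(counting_cpus))  # -> [0, 1]: one logical CPU per core

# Per-core residency as read on each counting CPU; the system total is
# a plain sum, which is what per-core aggregation in perf stat relies
# on. No division by the number of hyperthreads is required.
residency = {0: 1200, 1: 800}  # made-up counts
print(sum(residency.values()))  # -> 2000
```

Note the counting CPUs form one representative per core, so summing never double-counts a shared per-core counter.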
Thanks,
Kan