Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752048AbbEJTXF (ORCPT ); Sun, 10 May 2015 15:23:05 -0400 Received: from mga01.intel.com ([192.55.52.88]:21574 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751705AbbEJTW4 (ORCPT ); Sun, 10 May 2015 15:22:56 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,402,1427785200"; d="scan'208";a="708067992" From: Andi Kleen To: peterz@infradead.org Cc: eranian@google.com, linux-kernel@vger.kernel.org, Andi Kleen Subject: [PATCH 2/9] x86, perf: Add support for PEBSv3 profiling Date: Sun, 10 May 2015 12:22:40 -0700 Message-Id: <1431285767-27027-3-git-send-email-andi@firstfloor.org> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1431285767-27027-1-git-send-email-andi@firstfloor.org> References: <1431285767-27027-1-git-send-email-andi@firstfloor.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3591 Lines: 109 From: Andi Kleen PEBSv3 is the same as the existing PEBSv2 used on Haswell, but it adds a new TSC field. Add support to the generic PEBS handler to handle the new format, and overwrite the perf time stamp using the new native_sched_clock_from_tsc() Right now the time stamp is just slightly more accurate, as it is nearer the actual event trigger point. With the PEBS threshold > 1 patchkit it will be much more accurate, avoid the problems with MMAP mismatches earlier. The accurate time stamping is only implemented for the default trace clock for now. v2: Use _skl prefix. Check for default clock_id. Signed-off-by: Andi Kleen --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 36 ++++++++++++++++++++++++++++--- 1 file changed, 33 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 813f75d..2eed6fc 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -224,6 +224,19 @@ union hsw_tsx_tuning { #define PEBS_HSW_TSX_FLAGS 0xff00000000ULL +/* Same as HSW, plus TSC */ + +struct pebs_record_skl { + u64 flags, ip; + u64 ax, bx, cx, dx; + u64 si, di, bp, sp; + u64 r8, r9, r10, r11; + u64 r12, r13, r14, r15; + u64 status, dla, dse, lat; + u64 real_ip, tsx_tuning; + u64 tsc; +}; + void init_debug_store_on_cpu(int cpu) { struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds; @@ -827,7 +840,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs) return 0; } -static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs) +static inline u64 intel_hsw_weight(struct pebs_record_skl *pebs) { if (pebs->tsx_tuning) { union hsw_tsx_tuning tsx = { .value = pebs->tsx_tuning }; @@ -836,7 +849,7 @@ static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs) return 0; } -static inline u64 intel_hsw_transaction(struct pebs_record_hsw *pebs) +static inline u64 intel_hsw_transaction(struct pebs_record_skl *pebs) { u64 txn = (pebs->tsx_tuning & PEBS_HSW_TSX_FLAGS) >> 32; @@ -858,7 +871,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event, * unconditionally access the 'extra' entries. */ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - struct pebs_record_hsw *pebs = __pebs; + struct pebs_record_skl *pebs = __pebs; struct perf_sample_data data; struct pt_regs regs; u64 sample_type; @@ -958,6 +971,16 @@ static void __intel_pmu_pebs_event(struct perf_event *event, data.txn = intel_hsw_transaction(pebs); } + /* + * v3 supplies an accurate time stamp, so we use that + * for the time stamp. + * + * We can only do this for the default trace clock. + */ + if (x86_pmu.intel_cap.pebs_format >= 3 && + event->attr.use_clockid == 0) + data.time = native_sched_clock_from_tsc(pebs->tsc); + if (has_branch_stack(event)) data.br_stack = &cpuc->lbr_stack; @@ -1098,6 +1121,13 @@ void __init intel_ds_init(void) x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm; break; + case 3: + pr_cont("PEBS fmt3%c, ", pebs_type); + x86_pmu.pebs_record_size = + sizeof(struct pebs_record_skl); + x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm; + break; + default: printk(KERN_CONT "no PEBS fmt%d%c, ", format, pebs_type); x86_pmu.pebs = 0; -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/