Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752380AbdFON4s (ORCPT ); Thu, 15 Jun 2017 09:56:48 -0400
Received: from mail-wm0-f46.google.com ([74.125.82.46]:36469 "EHLO
	mail-wm0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751985AbdFON4q (ORCPT );
	Thu, 15 Jun 2017 09:56:46 -0400
From: Stephane Eranian
To: linux-kernel@vger.kernel.org
Cc: acme@redhat.com, peterz@infradead.org, mingo@elte.hu,
	ak@linux.intel.com, kan.liang@intel.com, jolsa@redhat.com
Subject: [PATCH 1/5] perf/core: add PERF_SAMPLE_SKID_IP record type
Date: Thu, 15 Jun 2017 06:56:25 -0700
Message-Id: <1497534989-29231-2-git-send-email-eranian@google.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1497534989-29231-1-git-send-email-eranian@google.com>
References: <1497534989-29231-1-git-send-email-eranian@google.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3984
Lines: 113

This patch adds a new sample record type called
PERF_SAMPLE_SKID_IP. The goal is to record the interrupted
instruction pointer (IP) as seen by the kernel and
reflected in the machine state.

On some architectures, it is possible to avoid the IP skid
using hardware support. For instance, on Intel x86, the use
of PEBS helps eliminate the skid on Haswell and later
processors.

Without this patch, on Haswell processors, if you set:
  - attr.precise_ip = 0, then you get the skid IP
  - attr.precise_ip = 1, then you get the skid PEBS IP (off-by-1)
  - attr.precise_ip = 2, then you get the skidless PEBS IP

The IP is captured when the event has PERF_SAMPLE_IP set
in sample_type.

However, there are certain measurements where you need to have
BOTH the skidless IP and the skid IP. For instance, when studying
branches, the skid IP usually points to the target of the branch
while the skidless IP points to the branch instruction itself.
Today, it is not possible to retrieve both at the same time.
This patch makes this possible by specifying
PERF_SAMPLE_IP|PERF_SAMPLE_SKID_IP.

Signed-off-by: Stephane Eranian
---
 include/linux/perf_event.h      |  2 ++
 include/uapi/linux/perf_event.h |  4 +++-
 kernel/events/core.c            | 14 ++++++++++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7d6aa29094b2..ac63f6af6f53 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -944,6 +944,7 @@ struct perf_sample_data {
 
 	struct perf_regs		regs_intr;
 	u64				stack_user_size;
+	u64				skid_ip;
 } ____cacheline_aligned;
 
 /* default value for data source */
@@ -964,6 +965,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
+	data->skid_ip = 0; /* mark as uninitialized */
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b1c0b187acfe..48e736eb20bd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -139,8 +139,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
 	PERF_SAMPLE_REGS_INTR			= 1U << 18,
+	PERF_SAMPLE_SKID_IP			= 1U << 19,
 
-	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 20,		/* non-ABI */
 };
 
 /*
@@ -791,6 +792,7 @@ enum perf_event_type {
 	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
 	 *	{ u64			abi; # enum perf_sample_regs_abi
 	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
+	 *	{ u64			skid_ip; } && PERF_SAMPLE_SKID_IP
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4d2c32f98482..e5afc3cbf287 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1563,6 +1563,9 @@ static void __perf_event_header_size(struct perf_event *event, u64 sample_type)
 	if (sample_type & PERF_SAMPLE_TRANSACTION)
 		size += sizeof(data->txn);
 
+	if (sample_type & PERF_SAMPLE_SKID_IP)
+		size += sizeof(data->skid_ip);
+
 	event->header_size = size;
 }
 
@@ -5947,6 +5950,9 @@ void perf_output_sample(struct perf_output_handle *handle,
 		}
 	}
 
+	if (sample_type & PERF_SAMPLE_SKID_IP)
+		perf_output_put(handle, data->skid_ip);
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -5980,6 +5986,14 @@ void perf_prepare_sample(struct perf_event_header *header,
 	if (sample_type & PERF_SAMPLE_IP)
 		data->ip = perf_instruction_pointer(regs);
 
+	/*
+	 * if skid_ip has not been set by arch specific code, then
+	 * we initialize it to IP as interrupt-based sampling has
+	 * skid
+	 */
+	if (!data->skid_ip && (sample_type & PERF_SAMPLE_SKID_IP))
+		data->skid_ip = perf_instruction_pointer(regs);
+
 	if (sample_type & PERF_SAMPLE_CALLCHAIN) {
 		int size = 1;
 
-- 
2.13.1.518.g3df882009-goog
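
For illustration only, here is a minimal userspace sketch of how a tool
might decode a sample body when sample_type is exactly
PERF_SAMPLE_IP|PERF_SAMPLE_SKID_IP. It is not part of this patch. It
assumes the record layout implied by the diff above: ip is written
first, and skid_ip is written last by perf_output_sample(). The
SKETCH_* constants mirror the values used in this patch and are defined
locally so the code builds without patched uapi headers; struct
sketch_ips, sketch_parse_ips() and sketch_has_skid() are hypothetical
names, not existing perf APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit values as used in this patch; local copies, not the uapi header. */
#define SKETCH_PERF_SAMPLE_IP		(1U << 0)
#define SKETCH_PERF_SAMPLE_SKID_IP	(1U << 19)

/* Hypothetical container for the two instruction pointers of a sample. */
struct sketch_ips {
	uint64_t ip;		/* skidless IP when precise sampling is used */
	uint64_t skid_ip;	/* interrupted IP as seen by the kernel */
};

/*
 * Decode the body of a PERF_RECORD_SAMPLE, i.e. the bytes that follow
 * struct perf_event_header. Only the case where sample_type is exactly
 * IP|SKID_IP is handled, so the body is two consecutive u64 values.
 * Returns 0 on success, -1 on an unsupported sample_type or short buffer.
 */
static int sketch_parse_ips(const void *body, size_t len,
			    uint64_t sample_type, struct sketch_ips *out)
{
	const unsigned char *p = body;

	assert(out != NULL);

	if (sample_type != (SKETCH_PERF_SAMPLE_IP | SKETCH_PERF_SAMPLE_SKID_IP))
		return -1;
	if (len < 2 * sizeof(uint64_t))
		return -1;

	/* memcpy avoids unaligned loads from the ring buffer copy. */
	memcpy(&out->ip, p, sizeof(uint64_t));
	memcpy(&out->skid_ip, p + sizeof(uint64_t), sizeof(uint64_t));
	return 0;
}

/* Non-zero when the two IPs differ, i.e. the sample exhibits skid. */
static int sketch_has_skid(const struct sketch_ips *s)
{
	return s->ip != s->skid_ip;
}
```

With interrupt-based sampling (attr.precise_ip = 0), the fallback in
perf_prepare_sample() sets skid_ip to the same value as ip, so
sketch_has_skid() would be expected to return 0 in that case.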