From: kan.liang@linux.intel.com
To: peterz@infradead.org, acme@kernel.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, jolsa@kernel.org, eranian@google.com, alexander.shishkin@linux.intel.com, ak@linux.intel.com, Kan Liang
Subject: [PATCH 01/22] perf/core: Support outputting registers from a separate array
Date: Mon, 18 Mar 2019 14:41:23 -0700
Message-Id: <20190318214144.4639-2-kan.liang@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190318214144.4639-1-kan.liang@linux.intel.com>
References: <20190318214144.4639-1-kan.liang@linux.intel.com>

From: Andi Kleen

Add support to the perf core for outputting registers from a separate
array, and add support for outputting XMM registers on x86.

This requires changing the perf_reg_value() function of every
architecture to take the additional argument. All architectures except
x86 simply ignore it.

XMM registers are 128 bits wide. To simplify the code, each one is
handled as two separate 64-bit registers, which means setting two bits
in the register bitmap. This also allows sampling only the lower 64 bits
of an XMM register.

Signed-off-by: Andi Kleen
Signed-off-by: Kan Liang
---
 arch/arm/kernel/perf_regs.c           |  2 +-
 arch/arm64/kernel/perf_regs.c         |  2 +-
 arch/powerpc/perf/perf_regs.c         |  2 +-
 arch/s390/kernel/perf_regs.c          |  2 +-
 arch/x86/include/uapi/asm/perf_regs.h | 25 +++++++++++++++++++++++--
 arch/x86/kernel/perf_regs.c           | 17 ++++++++++++-----
 include/linux/perf_event.h            |  2 ++
 include/linux/perf_regs.h             |  4 ++--
 kernel/events/core.c                  |  7 +++++--
 9 files changed, 48 insertions(+), 15 deletions(-)

diff --git a/arch/arm/kernel/perf_regs.c b/arch/arm/kernel/perf_regs.c
index 05fe92aa7d98..1feedc151e87 100644
--- a/arch/arm/kernel/perf_regs.c
+++ b/arch/arm/kernel/perf_regs.c
@@ -8,7 +8,7 @@
 #include
 #include
 
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
 	if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM_MAX))
 		return 0;
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index 0bbac612146e..85d7db5e5428 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -9,7 +9,7 @@
 #include
 #include
 
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
 	if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_MAX))
 		return 0;
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index 3349f3f8fe84..29380a99473a 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -73,7 +73,7 @@ static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
 	PT_REGS_OFFSET(PERF_REG_POWERPC_MMCRA, dsisr),
 };
 
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
 	if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
 		return 0;
diff --git a/arch/s390/kernel/perf_regs.c b/arch/s390/kernel/perf_regs.c
index 4352a504f235..88974116dbc6 100644
--- a/arch/s390/kernel/perf_regs.c
+++ b/arch/s390/kernel/perf_regs.c
@@ -8,7 +8,7 @@
 #include
 #include
 
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
 	freg_t fp;
 
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index f3329cabce5c..1ff0df1c97ae 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -28,7 +28,28 @@ enum perf_event_x86_regs {
 	PERF_REG_X86_R14,
 	PERF_REG_X86_R15,
 
-	PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
-	PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
+	/* These all need two bits set because they are 128-bit */
+	PERF_REG_X86_XMM0  = 32,
+	PERF_REG_X86_XMM1  = 34,
+	PERF_REG_X86_XMM2  = 36,
+	PERF_REG_X86_XMM3  = 38,
+	PERF_REG_X86_XMM4  = 40,
+	PERF_REG_X86_XMM5  = 42,
+	PERF_REG_X86_XMM6  = 44,
+	PERF_REG_X86_XMM7  = 46,
+	PERF_REG_X86_XMM8  = 48,
+	PERF_REG_X86_XMM9  = 50,
+	PERF_REG_X86_XMM10 = 52,
+	PERF_REG_X86_XMM11 = 54,
+	PERF_REG_X86_XMM12 = 56,
+	PERF_REG_X86_XMM13 = 58,
+	PERF_REG_X86_XMM14 = 60,
+	PERF_REG_X86_XMM15 = 62,
+
+	/* This does not include the XMM registers */
+	PERF_REG_GPR_X86_32_MAX = PERF_REG_X86_GS + 1,
+	PERF_REG_GPR_X86_64_MAX = PERF_REG_X86_R15 + 1,
+
+	PERF_REG_X86_MAX = PERF_REG_X86_XMM15 + 2,
 };
 #endif /* _ASM_X86_PERF_REGS_H */
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index c06c4c16c6b6..8b44a4c5a161 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -10,14 +10,14 @@
 #include
 
 #ifdef CONFIG_X86_32
-#define PERF_REG_X86_MAX PERF_REG_X86_32_MAX
+#define PERF_REG_GPR_X86_MAX PERF_REG_GPR_X86_32_MAX
 #else
-#define PERF_REG_X86_MAX PERF_REG_X86_64_MAX
+#define PERF_REG_GPR_X86_MAX PERF_REG_GPR_X86_64_MAX
 #endif
 
 #define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r)
 
-static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = {
+static unsigned int pt_regs_offset[PERF_REG_GPR_X86_MAX] = {
 	PT_REGS_OFFSET(PERF_REG_X86_AX, ax),
 	PT_REGS_OFFSET(PERF_REG_X86_BX, bx),
 	PT_REGS_OFFSET(PERF_REG_X86_CX, cx),
@@ -57,15 +57,22 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = {
 #endif
 };
 
-u64 perf_reg_value(struct pt_regs *regs, int idx)
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
+	if (idx >= 32 && idx < 64) {
+		if (!extra_regs)
+			return 0;
+		return extra_regs[idx - 32];
+	}
+
 	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(pt_regs_offset)))
 		return 0;
 
 	return regs_get_register(regs, pt_regs_offset[idx]);
 }
 
-#define REG_RESERVED (~((1ULL << PERF_REG_X86_MAX) - 1ULL))
+#define REG_RESERVED \
+	(PERF_REG_X86_MAX == 64 ? 0 : ~((1ULL << PERF_REG_X86_MAX) - 1ULL))
 
 #ifdef CONFIG_X86_32
 int perf_reg_validate(u64 mask)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e47ef764f613..bd3d6a89ccd4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -948,6 +948,7 @@ struct perf_sample_data {
 	u64				stack_user_size;
 
 	u64				phys_addr;
+	u64				*extra_regs;
 } ____cacheline_aligned;
 
 /* default value for data source */
@@ -968,6 +969,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
+	data->extra_regs = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/linux/perf_regs.h b/include/linux/perf_regs.h
index 476747456bca..9884c64d5598 100644
--- a/include/linux/perf_regs.h
+++ b/include/linux/perf_regs.h
@@ -11,14 +11,14 @@ struct perf_regs {
 
 #ifdef CONFIG_HAVE_PERF_REGS
 #include
-u64 perf_reg_value(struct pt_regs *regs, int idx);
+u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx);
 int perf_reg_validate(u64 mask);
 u64 perf_reg_abi(struct task_struct *task);
 void perf_get_regs_user(struct perf_regs *regs_user,
 			struct pt_regs *regs,
 			struct pt_regs *regs_user_copy);
 #else
-static inline u64 perf_reg_value(struct pt_regs *regs, int idx)
+static inline u64 perf_reg_value(struct pt_regs *regs, u64 *extra_regs, int idx)
 {
 	return 0;
 }
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5f59d848171e..560ac237b8be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5858,7 +5858,8 @@ EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
 
 static void
 perf_output_sample_regs(struct perf_output_handle *handle,
-			struct pt_regs *regs, u64 mask)
+			struct pt_regs *regs,
+			u64 *extra_regs, u64 mask)
 {
 	int bit;
 	DECLARE_BITMAP(_mask, 64);
@@ -5867,7 +5868,7 @@ perf_output_sample_regs(struct perf_output_handle *handle,
 	for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) {
 		u64 val;
 
-		val = perf_reg_value(regs, bit);
+		val = perf_reg_value(regs, extra_regs, bit);
 		perf_output_put(handle, val);
 	}
 }
@@ -6274,6 +6275,7 @@ void perf_output_sample(struct perf_output_handle *handle,
 			u64 mask = event->attr.sample_regs_user;
 			perf_output_sample_regs(handle,
 						data->regs_user.regs,
+						NULL,
 						mask);
 		}
 	}
@@ -6306,6 +6308,7 @@ void perf_output_sample(struct perf_output_handle *handle,
 
 			perf_output_sample_regs(handle,
 						data->regs_intr.regs,
+						data->extra_regs,
 						mask);
 		}
 	}
-- 
2.17.1
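
As a reading aid for the two-bits-per-XMM-register layout described in the changelog, here is a minimal userspace sketch in C. It is not part of the patch. The two constants are copied from the x86 uapi enum in the diff. The helper names xmm_full_mask(), xmm_low_mask(), extra_reg_value() and reserved_mask() are hypothetical and exist only for illustration. Treating the first bit of each pair as the low 64 bits is an assumption drawn from the changelog; the diff itself only shows that index idx reads extra_regs[idx - 32].

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the enum in arch/x86/include/uapi/asm/perf_regs.h. */
#define PERF_REG_X86_XMM0	32
#define PERF_REG_X86_MAX	64	/* PERF_REG_X86_XMM15 + 2 */

/* Hypothetical helper: both mask bits of XMM<n>, i.e. all 128 bits. */
static uint64_t xmm_full_mask(unsigned int n)
{
	assert(n < 16);
	return 3ULL << (PERF_REG_X86_XMM0 + 2 * n);
}

/* Hypothetical helper: first mask bit of XMM<n> only (assumed low half). */
static uint64_t xmm_low_mask(unsigned int n)
{
	assert(n < 16);
	return 1ULL << (PERF_REG_X86_XMM0 + 2 * n);
}

/* Userspace model of the XMM branch of perf_reg_value(). */
static uint64_t extra_reg_value(const uint64_t *extra_regs, int idx)
{
	if (idx < PERF_REG_X86_XMM0 || idx >= PERF_REG_X86_MAX)
		return 0;	/* not an XMM index; GPRs are out of scope here */
	if (!extra_regs)
		return 0;	/* no separate array was supplied */
	return extra_regs[idx - PERF_REG_X86_XMM0];
}

/*
 * Reserved-bit mask for a given register count. Shifting a 64-bit value
 * by 64 is undefined in C, so max == 64 is special-cased, which is the
 * same reason REG_RESERVED gained its "== 64 ? 0 :" test.
 */
static uint64_t reserved_mask(unsigned int max)
{
	return max >= 64 ? 0 : ~((1ULL << max) - 1ULL);
}
```

Under these assumptions, xmm_full_mask(0) | xmm_low_mask(1) would select all of XMM0 plus only the low half of XMM1. Since this patch passes data->extra_regs only on the regs_intr path and NULL on the regs_user path, such bits would read back as zero in user-register samples.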