From: kan.liang@intel.com
To: a.p.zijlstra@chello.nl
Cc: mingo@redhat.com, acme@kernel.org, eranian@google.com,
    ak@linux.intel.com, mark.rutland@arm.com, adrian.hunter@intel.com,
    dsahern@gmail.com, jolsa@kernel.org, namhyung@kernel.org,
    linux-kernel@vger.kernel.org, Kan Liang <kan.liang@intel.com>
Subject: [PATCH 4/9] perf/x86: special case per-cpu core misc PMU events
Date: Thu, 16 Jul 2015 16:33:46 -0400
Message-Id: <1437078831-10152-5-git-send-email-kan.liang@intel.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1437078831-10152-1-git-send-email-kan.liang@intel.com>
References: <1437078831-10152-1-git-send-email-kan.liang@intel.com>

From: Kan Liang <kan.liang@intel.com>

This patch special-cases per-cpu core_misc PMU events and allows them to
be part of any hardware/software group for system-wide monitoring.

A useful example is including the ASTATE/MSTATE events in a sampling
group. Their values can be used to calculate the frequency during each
sampling period and to track it over time.

A new context type (perf_free_context) is introduced to identify these
per-cpu core_misc PMU events. Such events:
 - are free-running counters
 - have no state to switch on context switch and never fail to
   schedule
 - do not support sampling
 - only support system-wide monitoring
 - are per-cpu

A new perf event type, PERF_TYPE_CORE_MISC_FREE, is also defined for
them.

It is safe to mix cpu PMU events and CORE_MISC_FREE events in a group.
This is because, when the cpu PMU events are disabled or enabled, the
CORE_MISC_FREE events can be disabled or enabled at the same time
without failure.

Since these events do not support sampling, they are only available for
group reading and system-wide monitoring.

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_core_misc.c |  8 +++++---
 include/linux/perf_event.h                       | 10 ++++++++++
 include/linux/sched.h                            |  1 +
 include/uapi/linux/perf_event.h                  |  1 +
 kernel/events/core.c                             |  9 +++++++++
 5 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_core_misc.c b/arch/x86/kernel/cpu/perf_event_intel_core_misc.c
index 4efe842..dad4495 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_core_misc.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_core_misc.c
@@ -889,7 +889,6 @@ static void __init core_misc_pmus_register(void)
 
 		type->pmu = (struct pmu) {
 			.attr_groups	= type->pmu_group,
-			.task_ctx_nr	= perf_invalid_context,
 			.event_init	= core_misc_pmu_event_init,
 			.add		= core_misc_pmu_event_add, /* must have */
 			.del		= core_misc_pmu_event_del, /* must have */
@@ -902,9 +901,12 @@ static void __init core_misc_pmus_register(void)
 		if (type->type == perf_intel_core_misc_thread) {
 			type->pmu.pmu_disable = (void *) intel_core_misc_pmu_disable;
 			type->pmu.pmu_enable = (void *) intel_core_misc_pmu_enable;
+			type->pmu.task_ctx_nr = perf_free_context;
+			err = perf_pmu_register(&type->pmu, type->name, PERF_TYPE_CORE_MISC_FREE);
+		} else {
+			type->pmu.task_ctx_nr = perf_invalid_context;
+			err = perf_pmu_register(&type->pmu, type->name, -1);
 		}
-
-		err = perf_pmu_register(&type->pmu, type->name, -1);
 		if (WARN_ON(err))
 			pr_info("Failed to register PMU %s error %d\n",
 				type->pmu.name, err);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fea0ddf..3538f1c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -773,6 +773,16 @@ static inline int is_hardware_event(struct perf_event *event)
 	return event->pmu->task_ctx_nr == perf_hw_context;
 }
 
+static inline int is_free_event(struct perf_event *event)
+{
+	return event->pmu->task_ctx_nr == perf_free_context;
+}
+
+static inline int has_context_event(struct perf_event *event)
+{
+	return event->pmu->task_ctx_nr > perf_invalid_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ae21f15..717f492 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1335,6 +1335,7 @@ union rcu_special {
 struct rcu_node;
 
 enum perf_event_task_context {
+	perf_free_context = -2,
 	perf_invalid_context = -1,
 	perf_hw_context = 0,
 	perf_sw_context,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d97f84c..232b674 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -32,6 +32,7 @@ enum perf_type_id {
 	PERF_TYPE_HW_CACHE			= 3,
 	PERF_TYPE_RAW				= 4,
 	PERF_TYPE_BREAKPOINT			= 5,
+	PERF_TYPE_CORE_MISC_FREE		= 6,
 
 	PERF_TYPE_MAX,				/* non-ABI */
 };
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9077867..995b436 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8019,6 +8019,15 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	/*
+	 * Special case per-cpu free counter events and allow them to be part of
+	 * any hardware/software group for system-wide monitoring.
+	 */
+	if (group_leader && !task &&
+	    is_free_event(event) &&
+	    has_context_event(group_leader))
+		pmu = group_leader->pmu;
+
+	/*
 	 * Get the target context (task or percpu):
 	 */
 	ctx = find_get_context(pmu, task, event);
-- 
1.8.3.1
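
For readers who want to check the new condition in perf_event_open()
without building a kernel, the PMU selection can be modelled in user
space. What follows is only a sketch under stated assumptions, not
kernel code: struct pmu and struct perf_event are cut down to the
fields the check reads, the has_task flag stands in for the kernel's
task pointer, and select_pmu() is a hypothetical helper that does not
exist in the kernel. The enum values and the two predicates are copied
from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Context types, with the values used in this patch. */
enum perf_event_task_context {
	perf_free_context = -2,
	perf_invalid_context = -1,
	perf_hw_context = 0,
	perf_sw_context,
};

/* Reduced stand-ins for the kernel structures (assumption: only the
 * fields needed by the check are modelled). */
struct pmu {
	const char *name;
	int task_ctx_nr;
};

struct perf_event {
	struct pmu *pmu;
};

/* Same predicates as added to include/linux/perf_event.h. */
static inline int is_free_event(const struct perf_event *event)
{
	return event->pmu->task_ctx_nr == perf_free_context;
}

static inline int has_context_event(const struct perf_event *event)
{
	return event->pmu->task_ctx_nr > perf_invalid_context;
}

/*
 * Hypothetical helper mirroring the condition added to
 * perf_event_open(): a free-running event opened system-wide
 * (no task) under a group leader that has a real context uses the
 * leader's PMU to look up the target context.
 */
struct pmu *select_pmu(struct perf_event *event,
		       struct perf_event *group_leader, bool has_task)
{
	struct pmu *pmu = event->pmu;

	if (group_leader && !has_task &&
	    is_free_event(event) &&
	    has_context_event(group_leader))
		pmu = group_leader->pmu;

	return pmu;
}
```

In this model, a perf_free_context event grouped under a leader whose
PMU uses perf_hw_context or perf_sw_context gets the leader's PMU when
has_task is false. With a task, with no leader, or with a leader on a
perf_invalid_context PMU, the event keeps its own PMU.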