Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751389AbbD2Suw (ORCPT ); Wed, 29 Apr 2015 14:50:52 -0400 Received: from mail-la0-f43.google.com ([209.85.215.43]:36366 "EHLO mail-la0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750808AbbD2Sut (ORCPT ); Wed, 29 Apr 2015 14:50:49 -0400 MIME-Version: 1.0 In-Reply-To: <20150429090941.GO5029@twins.programming.kicks-ass.net> References: <2c37309d20afadf88ad4a82cf0ce02b9152801e2.1430256154.git.luto@kernel.org> <20150429090941.GO5029@twins.programming.kicks-ass.net> From: Andy Lutomirski Date: Wed, 29 Apr 2015 11:50:27 -0700 Message-ID: Subject: Re: [RFC] x86, perf: Add an aperfmperf driver To: Peter Zijlstra Cc: "linux-kernel@vger.kernel.org" , Len Brown , Paul Mackerras , Arnaldo Carvalho de Melo , Ingo Molnar Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3509 Lines: 96 On Apr 29, 2015 2:09 AM, "Peter Zijlstra" wrote: > > On Tue, Apr 28, 2015 at 02:25:37PM -0700, Andy Lutomirski wrote: > > diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile > > index 80091ae54c2b..fadc822efc90 100644 > > --- a/arch/x86/kernel/cpu/Makefile > > +++ b/arch/x86/kernel/cpu/Makefile > > @@ -45,6 +45,8 @@ obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += perf_event_intel_uncore.o \ > > perf_event_intel_uncore_snb.o \ > > perf_event_intel_uncore_snbep.o \ > > perf_event_intel_uncore_nhmex.o > > +obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_aperf_mperf.o > > +obj-$(CONFIG_CPU_SUP_AMD) += perf_event_aperf_mperf.o > > Does this actually work? I would expect it to go complain about having > to build it twice if you have both set. No, but only because I spelled the filename wrong while regenerating the patch. Oops! > > > diff --git a/arch/x86/kernel/cpu/perf_event_aperfmperf.c b/arch/x86/kernel/cpu/perf_event_aperfmperf.c > > new file mode 100644 > > index 000000000000..6e6d113bd9ce > > --- /dev/null > > +++ b/arch/x86/kernel/cpu/perf_event_aperfmperf.c > > @@ -0,0 +1,119 @@ > > +#include > > + > > +#define APERFMPERF_EVENT_APERF 0 > > +#define APERFMPERF_EVENT_MPERF 1 > > + > > > +static int aperfmperf_event_init(struct perf_event *event) > > +{ > > + if (event->attr.type != event->pmu->type) > > + return -ENOENT; > > + > > + if (event->attr.config != APERFMPERF_EVENT_APERF && > > + event->attr.config != APERFMPERF_EVENT_MPERF) > > + return -ENOENT; > > Once we pass the type test we know its 'our' event, and we can go return > fatal errors. No other PMU will pick this up. > > This could therefore turn into an -EINVAL. > > > + > > + if (event->attr.config1 != 0) > > + return -ENOENT; > > Idem. > > > + /* no sampling */ > > + if (event->hw.sample_period) > > + return -EINVAL; > > You could have set pmu::capabilities = > PERF_PMU_CAP_NO_INTERRUPT which would also have killed that dead. That checks attr.sample_period. I'm a bit confused about the relationship between event->hw and event->attr. Do I not need to check hw.sample_period? > > > + /* unsupported modes and filters */ > > + if (event->attr.exclude_user || > > + event->attr.exclude_kernel || > > + event->attr.exclude_hv || > > + event->attr.exclude_idle || > > + event->attr.exclude_host || > > + event->attr.exclude_guest || > > + event->attr.freq || > > + event->attr.sample_period) /* no sampling */ > > + return -EINVAL; > > + > > + event->hw.idx = -1; > > + event->hw.event_base = (event->attr.config == APERFMPERF_EVENT_APERF ? > > + MSR_IA32_APERF : MSR_IA32_MPERF); > > + > > + return 0; > > +} > > The rest looks about right. Very simple thing indeed ;-) Before I submit v2, do you think this is actually worth doing? I can see it being useful for answering questions like "did this workload end up running at full speed". --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/