From: Arnaldo Carvalho de Melo
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, Andi Kleen, Arnaldo Carvalho de Melo
Subject: [PATCH 05/19] perf list: Document event specifications better
Date: Thu, 7 Apr 2016 17:58:26 -0300
Message-Id: <1460062720-21736-6-git-send-email-acme@kernel.org>
In-Reply-To: <1460062720-21736-1-git-send-email-acme@kernel.org>
References: <1460062720-21736-1-git-send-email-acme@kernel.org>

From: Andi Kleen

Document some features for specifying events in the perf list manpage:

- Event groups
- Leader sampling
- How to specify raw PMU events in the new syntax
- Global versus per process PMUs.
- Access restrictions
- Fix Intel SDM URL

v2: Lots of new content. address review feedback.

Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Link: http://lkml.kernel.org/r/1459810686-15913-1-git-send-email-andi@firstfloor.org
[ Add quotes to some keywords, such as "any" ]
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Documentation/perf-list.txt | 107 ++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index ec723d0a5bb3..a126e97a8114 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -93,6 +93,67 @@ raw encoding of 0x1A8 can be used:
 You should refer to the processor specific documentation for getting these details.
 Some of them are referenced in the SEE ALSO section below.
 
+ARBITRARY PMUS
+--------------
+
+perf also supports an extended syntax for specifying raw parameters
+to PMUs. Using this typically requires looking up the specific event
+in the CPU vendor specific documentation.
+
+The available PMUs and their raw parameters can be listed with
+
+  ls /sys/devices/*/format
+
+For example the raw "LSD.UOPS" core PMU event above could
+be specified as
+
+  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...
+
+PER SOCKET PMUS
+---------------
+
+Some PMUs are not associated with a core, but with a whole CPU socket.
+Events on these PMUs generally cannot be sampled, but only counted globally
+with perf stat -a. They can be bound to one logical CPU, but will measure
+all the CPUs in the same socket.
+
+This example measures memory bandwidth every second
+on the first memory controller on socket 0 of an Intel Xeon system
+
+  perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...
+
+Each memory controller has its own PMU. Measuring the complete system
+bandwidth would require specifying all imc PMUs (see perf list output),
+and adding the values together.
+
+This example measures the combined core power every second
+
+  perf stat -I 1000 -e power/energy-cores/ -a
+
+ACCESS RESTRICTIONS
+-------------------
+
+For non-root users generally only context switched PMU events are available.
+This is normally only the events in the cpu PMU, the predefined events
+like cycles and instructions and some software events.
+
+Other PMUs and global measurements are normally root only.
+Some event qualifiers, such as "any", are also root only.
+
+This can be overridden by setting the kernel.perf_event_paranoid
+sysctl to -1, which allows non-root users to use these events.
+
+For accessing trace point events perf needs to have read access to
+/sys/kernel/debug/tracing, even when perf_event_paranoid is in a relaxed
+setting.
+
+TRACING
+-------
+
+Some PMUs control advanced hardware tracing capabilities, such as Intel PT,
+that allow low overhead execution tracing. These are described in a separate
+intel-pt.txt document.
+
 PARAMETERIZED EVENTS
 --------------------
 
@@ -106,6 +167,50 @@ also be supplied. For example:
 
   perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
 
+EVENT GROUPS
+------------
+
+Perf supports time based multiplexing of events, when the number of events
+active exceeds the number of hardware performance counters. Multiplexing
+can cause measurement errors when the workload changes its execution
+profile.
+
+When metrics are computed using formulas from event counts, it is useful to
+ensure some events are always measured together as a group to minimize multiplexing
+errors. Event groups can be specified using { }.
+
+  perf stat -e '{instructions,cycles}' ...
+
+The number of available performance counters depends on the CPU. A group
+cannot contain more events than available counters.
+For example Intel Core CPUs typically have four generic performance counters
+for the core, plus three fixed counters for instructions, cycles and ref-cycles.
+Some special events have restrictions on which counter they
+can schedule, and may not support multiple instances in a single group.
+When too many events are specified in the group none of them will
+be measured.
+
+Globally pinned events can limit the number of counters available for
+other groups. On x86 systems, the NMI watchdog pins a counter by default.
+The NMI watchdog can be disabled as root with
+
+  echo 0 > /proc/sys/kernel/nmi_watchdog
+
+Events from multiple different PMUs cannot be mixed in a group, with
+some exceptions for software events.
+
+LEADER SAMPLING
+---------------
+
+perf also supports group leader sampling using the :S specifier.
+
+  perf record -e '{cycles,instructions}:S' ...
+  perf report --group
+
+Normally all events in an event group sample, but with :S only
+the first event (the leader) samples, and it only reads the values of the
+other events in the group.
+
 OPTIONS
 -------
 
@@ -143,5 +248,5 @@ SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-top[1],
 linkperf:perf-record[1],
-http://www.intel.com/Assets/PDF/manual/253669.pdf[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
+http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
 http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf[AMD64 Architecture Programmer’s Manual Volume 2: System Programming]
-- 
2.5.5
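
Not part of the patch: the PER SOCKET PMUS text says that measuring
complete system bandwidth means specifying every imc PMU. Below is a
minimal shell sketch of building that event list. It assumes the memory
controller PMUs appear as /sys/devices/uncore_imc_* and that each one
exposes cas_count_read and cas_count_write, as in the patch's own
example. The function name build_imc_events and its optional directory
argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated perf event list covering
# every uncore_imc_* PMU found under a sysfs devices directory.
# $1 is optional and defaults to /sys/devices (assumed location).
build_imc_events() {
    devdir="${1:-/sys/devices}"
    events=""
    for pmu in "$devdir"/uncore_imc_*; do
        # Skip the unexpanded glob when no such PMU exists.
        [ -d "$pmu" ] || continue
        name="${pmu##*/}"
        # Append read and write CAS counters, comma-separating entries.
        events="${events:+$events,}$name/cas_count_read/,$name/cas_count_write/"
    done
    printf '%s\n' "$events"
}
```

It could then be used as

  perf stat -a -I 1000 -e "$(build_imc_events)" sleep 10

The per-PMU values still have to be added together afterwards. On a
machine without uncore_imc PMUs the function prints an empty line, so the
result should be checked before it is passed to -e.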