From: Jin Yao
To: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org, mingo@redhat.com, alexander.shishkin@linux.intel.com
Cc: Linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com, Jin Yao
Subject: [PATCH v3 27/27] perf Documentation: Document intel-hybrid support
Date: Mon, 29 Mar 2021 15:00:46 +0800
Message-Id: <20210329070046.8815-28-yao.jin@linux.intel.com>
X-Mailer:
 git-send-email 2.17.1
In-Reply-To: <20210329070046.8815-1-yao.jin@linux.intel.com>
References: <20210329070046.8815-1-yao.jin@linux.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Add some words and examples to help understand Intel hybrid perf support.

Signed-off-by: Jin Yao
---
v3:
 - No functional change.

 tools/perf/Documentation/intel-hybrid.txt | 228 ++++++++++++++++++++++
 tools/perf/Documentation/perf-record.txt  |   1 +
 tools/perf/Documentation/perf-stat.txt    |   2 +
 3 files changed, 231 insertions(+)
 create mode 100644 tools/perf/Documentation/intel-hybrid.txt

diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt
new file mode 100644
index 000000000000..784f598dd36f
--- /dev/null
+++ b/tools/perf/Documentation/intel-hybrid.txt
@@ -0,0 +1,228 @@
+Intel hybrid support
+--------------------
+Support for Intel hybrid events within perf tools.
+
+Some Intel platforms, such as AlderLake, are hybrid platforms that
+consist of atom cpus and core cpus. Each cpu type has a dedicated event list.
+Some events are available only on core cpus, some only on atom cpus,
+and some on both.
+
+The kernel exports two new cpu pmus via sysfs:
+/sys/devices/cpu_core
+/sys/devices/cpu_atom
+
+A 'cpus' file is created under each directory. For example,
+
+cat /sys/devices/cpu_core/cpus
+0-15
+
+cat /sys/devices/cpu_atom/cpus
+16-23
+
+This indicates that cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
+
+Quickstart
+
+List hybrid event
+-----------------
+
+As before, use perf-list to list the symbolic events.
+
+perf list
+
+inst_retired.any
+       [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
+inst_retired.any
+       [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
+
+The 'Unit: xxx' is added to the brief description to indicate which pmu
+the event belongs to.
+The same event name can be supported by different pmus.
+
+Enable hybrid event with a specific pmu
+---------------------------------------
+
+To enable a core-only or atom-only event, the following syntax is supported:
+
+	cpu_core/<event name>/
+or
+	cpu_atom/<event name>/
+
+For example, count the 'cycles' event on core cpus:
+
+	perf stat -e cpu_core/cycles/
+
+Create two events for one hardware event automatically
+------------------------------------------------------
+
+When an event is created and it is available on both atom and core,
+two events are created automatically. One is for atom, the other is for
+core. Most hardware events and cache events are available on both
+cpu_core and cpu_atom.
+
+Hardware events have pre-defined configs (e.g. 0 for cycles).
+But on a hybrid platform, the kernel needs to know where the event comes from
+(from atom or from core). The original perf event type PERF_TYPE_HARDWARE
+can't carry pmu information, so a new type PERF_TYPE_HARDWARE_PMU is
+introduced.
+
+The new attr.config layout for PERF_TYPE_HARDWARE_PMU:
+
+0xDD000000AA
+AA: original hardware event ID
+DD: PMU type ID
+
+Cache events are similar. A new type PERF_TYPE_HW_CACHE_PMU is introduced.
+
+The new attr.config layout for PERF_TYPE_HW_CACHE_PMU:
+
+0xDD00CCBBAA
+AA: original hardware cache ID
+BB: original hardware cache op ID
+CC: original hardware cache op result ID
+DD: PMU type ID
+
+The PMU type ID is retrieved from sysfs:
+
+cat /sys/devices/cpu_atom/type
+10
+
+cat /sys/devices/cpu_core/type
+4
+
+When enabling a hardware event without a specified pmu, such as
+perf stat -e cycles -a (system-wide in this example), two events
+are created automatically.
+
+  ------------------------------------------------------------
+  perf_event_attr:
+    type                             6
+    size                             120
+    config                           0x400000000
+    sample_type                      IDENTIFIER
+    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+    disabled                         1
+    inherit                          1
+    exclude_guest                    1
+  ------------------------------------------------------------
+
+and
+
+  ------------------------------------------------------------
+  perf_event_attr:
+    type                             6
+    size                             120
+    config                           0xa00000000
+    sample_type                      IDENTIFIER
+    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+    disabled                         1
+    inherit                          1
+    exclude_guest                    1
+  ------------------------------------------------------------
+
+type 6 is PERF_TYPE_HARDWARE_PMU.
+0x4 in 0x400000000 indicates the cpu_core pmu.
+0xa in 0xa00000000 indicates the cpu_atom pmu (atom pmu type id is not fixed).
+
+The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
+and creates 'cycles' (0xa00000000) on cpu16-cpu23 (atom cpus).
+
+The perf-stat result displays two events:
+
+ Performance counter stats for 'system wide':
+
+    12,869,720,529      cpu_core/cycles/
+     6,405,459,328      cpu_atom/cycles/
+
+The first 'cycles' is the core event, the second 'cycles' is the atom event.
+
+Thread mode example:
+--------------------
+
+perf-stat reports the scaled counts for hybrid events, with a percentage
+displayed. The percentage is the event's running time / enabled time.
+
+For example, 'triad_loop' runs on cpu16 (an atom cpu), and we can see the
+scaled value for core cycles is 262,458,394 and the percentage is 0.30%.
+
+perf stat -e cycles -- taskset -c 16 ./triad_loop
+
+As before, two events are created.
+
+------------------------------------------------------------
+perf_event_attr:
+  type                             6
+  size                             120
+  config                           0x400000000
+  sample_type                      IDENTIFIER
+  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+  disabled                         1
+  inherit                          1
+  enable_on_exec                   1
+  exclude_guest                    1
+------------------------------------------------------------
+
+and
+
+------------------------------------------------------------
+perf_event_attr:
+  type                             6
+  size                             120
+  config                           0xa00000000
+  sample_type                      IDENTIFIER
+  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+  disabled                         1
+  inherit                          1
+  enable_on_exec                   1
+  exclude_guest                    1
+------------------------------------------------------------
+
+cycles: 0: 784136 339210144 1013444
+cycles: 0: 601988862 339210144 338196700
+cycles: 262458394 339210144 1013444
+cycles: 603792788 339210144 338196700
+
+ Performance counter stats for 'taskset -c 16 ./triad_loop':
+
+       262,458,394      cpu_core/cycles/          (0.30%)
+       603,792,788      cpu_atom/cycles/          (99.70%)
+
+       0.340467913 seconds time elapsed
+
+perf-record:
+------------
+
+If no '-e' is specified in perf record on a hybrid platform,
+it creates two default 'cycles' events and adds them to the event list. One
+is for core, the other is for atom.
+
+perf-stat:
+----------
+
+If no '-e' is specified in perf stat on a hybrid platform,
+besides the software events, the following events are created and
+added to the event list in order.
+
+cpu_core/cycles/,
+cpu_atom/cycles/,
+cpu_core/instructions/,
+cpu_atom/instructions/,
+cpu_core/branches/,
+cpu_atom/branches/,
+cpu_core/branch-misses/,
+cpu_atom/branch-misses/
+
+Of course, both perf-stat and perf-record support enabling
+a hybrid event with a specific pmu.
+
+e.g.
+perf stat -e cpu_core/cycles/
+perf stat -e cpu_atom/cycles/
+perf stat -e cpu_core/r1a/
+perf stat -e cpu_atom/L1-icache-loads/
+perf stat -e cpu_core/cycles/,cpu_atom/instructions/
+perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
+
+But '{cpu_core/cycles/,cpu_atom/instructions/}' will return
+"<not supported>" for 'instructions', because the pmus in
+the group do not match (cpu_core vs. cpu_atom).
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index f3161c9673e9..d71bac847936 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -695,6 +695,7 @@ measurements:
  wait -n ${perf_pid}
  exit $?
 
+include::intel-hybrid.txt[]
 
 SEE ALSO
 --------
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 744211fa8c18..13ec4165004b 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -541,6 +541,8 @@ The fields are in this order:
 
 Additional metrics may be printed with all earlier fields being empty.
 
+include::intel-hybrid.txt[]
+
 SEE ALSO
 --------
 linkperf:perf-top[1], linkperf:perf-list[1]
-- 
2.17.1
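
A note for readers who want to check the numbers in intel-hybrid.txt by hand. The Python sketch below is only an illustration and is not part of the patch, perf, or the kernel. It assumes the attr.config layouts exactly as quoted in the document (0xDD000000AA and 0xDD00CCBBAA), and it assumes that the three numeric columns in the 'cycles:' lines are raw count, enabled time and running time, with the scaled count being raw * enabled / running truncated to an integer. All function names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical and the field
# positions are taken from the layouts quoted in intel-hybrid.txt.

PMU_SHIFT = 32  # DD (PMU type ID) sits in bits 32-39 in both layouts


def hw_pmu_config(pmu_type, hw_id):
    """Build a PERF_TYPE_HARDWARE_PMU config following 0xDD000000AA."""
    return ((pmu_type & 0xFF) << PMU_SHIFT) | (hw_id & 0xFF)


def decode_hw_pmu_config(config):
    """Split 0xDD000000AA into (pmu_type, hw_id)."""
    return (config >> PMU_SHIFT) & 0xFF, config & 0xFF


def decode_hw_cache_pmu_config(config):
    """Split 0xDD00CCBBAA into (pmu_type, cache_id, op_id, result_id)."""
    return (
        (config >> PMU_SHIFT) & 0xFF,  # DD: PMU type ID
        config & 0xFF,                 # AA: hardware cache ID
        (config >> 8) & 0xFF,          # BB: hardware cache op ID
        (config >> 16) & 0xFF,         # CC: hardware cache op result ID
    )


def scale(raw, enabled, running):
    """Return (scaled count, running share of enabled time in percent).

    Assumption: scaled = raw * enabled / running, truncated to an integer.
    """
    if running == 0:
        return 0, 0.0
    return raw * enabled // running, 100.0 * running / enabled


if __name__ == "__main__":
    # Configs from the perf_event_attr dumps in the document.
    print(decode_hw_pmu_config(0x400000000))
    print(decode_hw_pmu_config(0xA00000000))
    # Raw counts from the thread mode example.
    print(scale(784136, 339210144, 1013444))
    print(scale(601988862, 339210144, 338196700))
```

Under these assumptions the two decoded configs give PMU type IDs 4 and 10, which match the sysfs 'type' values shown for cpu_core and cpu_atom, and the scaled counts match the cpu_core and cpu_atom lines of the thread mode output.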