Date: Wed, 26 Apr 2023 18:09:36 -0300
From: Arnaldo Carvalho de Melo
To: Kan Liang, Ian Rogers
Cc: Ahmad Yasin, Peter Zijlstra, Ingo Molnar, Stephane Eranian,
 Andi Kleen, Perry Taylor, Samantha Alt, Caleb Biggers, Weilin Wang,
 Edward Baker, Mark Rutland, Alexander Shishkin, Jiri Olsa,
 Namhyung Kim, Adrian Hunter, Florian Fischer, Rob Herring,
 Zhengjun Xing, John Garry, Kajol Jain, Sumanth Korikkar,
 Thomas Richter, Tiezhu Yang, Ravi Bangoria, Leo Yan, Yang Jihong,
 James Clark, Suzuki Poulouse, Kang Minchul, Athira Rajeev,
 linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 00/40] Fix perf on Intel hybrid CPUs
References: <20230426070050.1315519-1-irogers@google.com>
In-Reply-To: <20230426070050.1315519-1-irogers@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 26, 2023 at 12:00:10AM -0700, Ian Rogers wrote:
> TL;DR: hybrid doesn't crash, json metrics work on hybrid on both PMUs
> or individually, event parsing doesn't always scan all PMUs, more and
> new tests that also run without hybrid, less code.
>
> The first patches were previously posted to improve metrics here:
> "perf stat: Introduce skippable evsels"
> https://lore.kernel.org/all/20230414051922.3625666-1-irogers@google.com/
> "perf vendor events intel: Add xxx metric constraints"
> https://lore.kernel.org/all/20230419005423.343862-1-irogers@google.com/
>
> Next are some general test improvements.

Kan,

Have you looked at this? I'm doing a test build on it now.

- Arnaldo

> Next event parsing is rewritten to not scan all PMUs for the benefit
> of raw and legacy cache parsing, instead these are handled by the
> lexer and a new term type. This ultimately removes the need for the
> event parser for hybrid to be recursive as legacy cache can be just a
> term. Tests are re-enabled for events with hyphens, so AMD's
> branch-brs event is now parsable.
>
> The cputype option is made a generic pmu filter flag and is tested
> even on non-hybrid systems.
>
> The final patches address specific json metric issues on hybrid, in
> both the json metrics and the metric code. They also bring in a new
> json option to not group events when matching a metricgroup, this
> helps reduce counter pressure for TopdownL1 and TopdownL2 metric
> groups. The updates to the script that updates the json are posted in:
> https://github.com/intel/perfmon/pull/73
>
> The patches add slightly more code than they remove, in areas like
> better json metric constraints and tests, but in the core util code,
> the removal of hybrid is a net reduction:
> 20 files changed, 631 insertions(+), 951 deletions(-)
>
> There's specific detail with each patch, but for now here is the 6.3
> output followed by that from perf-tools-next with the patch series
> applied. The tool is running on an Alderlake CPU on an elderly 5.15
> kernel:
>
> Events on hybrid that parse and pass tests:
> '''
> $ perf-6.3 version
> perf version 6.3.rc7.gb7bc77e2f2c7
> $ perf-6.3 test
> ...
>   6.1: Test event parsing : FAILED!
> ...
> $ perf test
> ...
>   6: Parse event definition strings :
>   6.1: Test event parsing : Ok
>   6.2: Parsing of all PMU events from sysfs : Ok
>   6.3: Parsing of given PMU events from sysfs : Ok
>   6.4: Parsing of aliased events from sysfs : Skip (no aliases in sysfs)
>   6.5: Parsing of aliased events : Ok
>   6.6: Parsing of terms (event modifiers) : Ok
> ...
> '''
>
> No event/metric running with json metrics and TopdownL1 on both PMUs:
> '''
> $ perf-6.3 stat -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      24,073.58 msec  cpu-clock  # 23.975 CPUs utilized
>      350  context-switches  # 14.539 /sec
>      25  cpu-migrations  # 1.038 /sec
>      66  page-faults  # 2.742 /sec
>      21,257,199  cpu_core/cycles/  # 883.009 K/sec
>      2,162,192  cpu_atom/cycles/  # 89.816 K/sec
>      6,679,379  cpu_core/instructions/  # 277.457 K/sec
>      753,197  cpu_atom/instructions/  # 31.287 K/sec
>      1,300,647  cpu_core/branches/  # 54.028 K/sec
>      148,652  cpu_atom/branches/  # 6.175 K/sec
>      117,429  cpu_core/branch-misses/  # 4.878 K/sec
>      14,396  cpu_atom/branch-misses/  # 598.000 /sec
>      123,097,644  cpu_core/slots/  # 5.113 M/sec
>      9,241,207  cpu_core/topdown-retiring/  # 7.5% Retiring
>      8,903,288  cpu_core/topdown-bad-spec/  # 7.2% Bad Speculation
>      66,590,029  cpu_core/topdown-fe-bound/  # 54.1% Frontend Bound
>      38,397,500  cpu_core/topdown-be-bound/  # 31.2% Backend Bound
>      3,294,283  cpu_core/topdown-heavy-ops/  # 2.7% Heavy Operations  # 4.8% Light Operations
>      8,855,769  cpu_core/topdown-br-mispredict/  # 7.2% Branch Mispredict  # 0.0% Machine Clears
>      57,695,714  cpu_core/topdown-fetch-lat/  # 46.9% Fetch Latency  # 7.2% Fetch Bandwidth
>      12,823,926  cpu_core/topdown-mem-bound/  # 10.4% Memory Bound  # 20.8% Core Bound
>
>      1.004093622 seconds time elapsed
>
> $ perf stat -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      24,064.65 msec  cpu-clock  # 23.973 CPUs utilized
>      384  context-switches  # 15.957 /sec
>      24  cpu-migrations  # 0.997 /sec
>      71  page-faults  # 2.950 /sec
>      19,737,646  cpu_core/cycles/  # 820.192 K/sec
>      122,018,505  cpu_atom/cycles/  # 5.070 M/sec (63.32%)
>      7,636,653  cpu_core/instructions/  # 317.339 K/sec
>      16,266,629  cpu_atom/instructions/  # 675.955 K/sec (72.50%)
>      1,552,995  cpu_core/branches/  # 64.534 K/sec
>      3,208,143  cpu_atom/branches/  # 133.314 K/sec (72.50%)
>      132,151  cpu_core/branch-misses/  # 5.491 K/sec
>      547,285  cpu_atom/branch-misses/  # 22.742 K/sec (72.49%)
>      32,110,597  cpu_atom/TOPDOWN_RETIRING.ALL/  # 1.334 M/sec
>          # 18.4 % tma_bad_speculation (72.48%)
>      228,006,765  cpu_atom/TOPDOWN_FE_BOUND.ALL/  # 9.475 M/sec
>          # 38.1 % tma_frontend_bound (72.47%)
>      225,866,251  cpu_atom/TOPDOWN_BE_BOUND.ALL/  # 9.386 M/sec
>          # 37.7 % tma_backend_bound
>          # 37.7 % tma_backend_bound_aux (72.73%)
>      119,748,254  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 4.976 M/sec
>          # 5.2 % tma_retiring (73.14%)
>      31,363,579  cpu_atom/TOPDOWN_RETIRING.ALL/  # 1.303 M/sec (73.37%)
>      227,907,321  cpu_atom/TOPDOWN_FE_BOUND.ALL/  # 9.471 M/sec (63.95%)
>      228,803,268  cpu_atom/TOPDOWN_BE_BOUND.ALL/  # 9.508 M/sec (63.55%)
>      113,357,334  cpu_core/TOPDOWN.SLOTS/  # 30.5 % tma_backend_bound
>          # 9.2 % tma_retiring
>          # 8.7 % tma_bad_speculation
>          # 51.6 % tma_frontend_bound
>      10,451,044  cpu_core/topdown-retiring/
>      9,687,449  cpu_core/topdown-bad-spec/
>      58,703,214  cpu_core/topdown-fe-bound/
>      34,540,660  cpu_core/topdown-be-bound/
>      154,902  cpu_core/INT_MISC.UOP_DROPPING/  # 6.437 K/sec
>
>      1.003818397 seconds time elapsed
> '''
>
> Json metrics that don't crash:
> '''
> $ perf-6.3 stat -M TopdownL1 -a sleep 1
> WARNING: events in group from different hybrid PMUs!
> WARNING: grouped events cpus do not match, disabling group:
>   anon group { topdown-retiring, topdown-retiring, INT_MISC.UOP_DROPPING, topdown-fe-bound, topdown-fe-bound, CPU_CLK_UNHALTED.CORE, topdown-be-bound, topdown-be-bound, topdown-bad-spec, topdown-bad-spec }
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-retiring).
> /bin/dmesg | grep -i perf may provide additional information.
>
> $ perf stat -M TopdownL1 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      811,810  cpu_atom/TOPDOWN_RETIRING.ALL/  # 26.6 % tma_bad_speculation
>      3,239,281  cpu_atom/TOPDOWN_FE_BOUND.ALL/  # 38.8 % tma_frontend_bound
>      2,037,667  cpu_atom/TOPDOWN_BE_BOUND.ALL/  # 24.4 % tma_backend_bound
>          # 24.4 % tma_backend_bound_aux
>      1,670,438  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 9.7 % tma_retiring
>      808,138  cpu_atom/TOPDOWN_RETIRING.ALL/
>      3,234,707  cpu_atom/TOPDOWN_FE_BOUND.ALL/
>      2,081,420  cpu_atom/TOPDOWN_BE_BOUND.ALL/
>      122,795,280  cpu_core/TOPDOWN.SLOTS/  # 31.7 % tma_backend_bound
>          # 7.0 % tma_bad_speculation
>          # 54.1 % tma_frontend_bound
>          # 7.2 % tma_retiring
>      8,817,636  cpu_core/topdown-retiring/
>      8,480,817  cpu_core/topdown-bad-spec/
>      3,108,926  cpu_core/topdown-heavy-ops/
>      66,566,215  cpu_core/topdown-fe-bound/
>      38,958,811  cpu_core/topdown-be-bound/
>      134,194  cpu_core/INT_MISC.UOP_DROPPING/
>
>      1.003607796 seconds time elapsed
>
> $ perf stat -M TopdownL2 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      162,334,218  cpu_atom/TOPDOWN_FE_BOUND.FRONTEND_LATENCY/  # 27.7 % tma_fetch_latency (38.99%)
>      16,191,486  cpu_atom/INST_RETIRED.ANY/  (45.76%)
>      68,443,205  cpu_atom/TOPDOWN_BE_BOUND.MEM_SCHEDULER/  # 32.2 % tma_memory_bound
>          # 5.8 % tma_core_bound (45.77%)
>      14,920,109  cpu_atom/UOPS_RETIRED.MS/  # 2.9 % tma_base (45.92%)
>      14,829,879  cpu_atom/UOPS_RETIRED.MS/  # 2.5 % tma_ms_uops (46.31%)
>      31,860,520  cpu_atom/TOPDOWN_RETIRING.ALL/  (46.71%)
>      117,323,055  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 18.7 % tma_branch_mispredicts
>          # 11.5 % tma_fetch_bandwidth
>          # 0.3 % tma_machine_clears
>          # 37.9 % tma_resource_bound (53.49%)
>      222,579,768  cpu_atom/TOPDOWN_BE_BOUND.ALL/  (53.90%)
>      13,672,174  cpu_atom/MEM_SCHEDULER_BLOCK.ST_BUF/  (54.23%)
>      24,264,262  cpu_atom/LD_HEAD.ANY_AT_RET/  (47.46%)
>      13,872,813  cpu_atom/MEM_SCHEDULER_BLOCK.ALL/  (47.45%)
>      223,722,007  cpu_atom/TOPDOWN_BE_BOUND.ALL/  (47.31%)
>      2,005,972  cpu_atom/TOPDOWN_BAD_SPECULATION.MACHINE_CLEARS/  (46.91%)
>      109,423,013  cpu_atom/TOPDOWN_BAD_SPECULATION.MISPREDICT/  (39.72%)
>      67,420,790  cpu_atom/TOPDOWN_FE_BOUND.FRONTEND_BANDWIDTH/  (39.33%)
>      92,790,312  cpu_core/TOPDOWN.SLOTS/  # 24.3 % tma_core_bound
>          # 3.0 % tma_heavy_operations
>          # 5.6 % tma_light_operations
>          # 10.8 % tma_memory_bound
>          # 7.8 % tma_branch_mispredicts
>          # 40.4 % tma_fetch_latency
>          # 0.2 % tma_machine_clears
>          # 7.8 % tma_fetch_bandwidth
>      8,041,595  cpu_core/topdown-retiring/
>      10,060,500  cpu_core/topdown-mem-bound/
>      7,314,344  cpu_core/topdown-bad-spec/
>      2,824,600  cpu_core/topdown-heavy-ops/
>      37,630,164  cpu_core/topdown-fetch-lat/
>      7,278,843  cpu_core/topdown-br-mispredict/
>      44,863,148  cpu_core/topdown-fe-bound/
>      32,573,458  cpu_core/topdown-be-bound/
>      5,785,074  cpu_core/INST_RETIRED.ANY/
>      2,325,424  cpu_core/UOPS_RETIRED.MS/
>      15,972,774  cpu_core/CPU_CLK_UNHALTED.THREAD/
>      117,750  cpu_core/INT_MISC.UOP_DROPPING/
>
>      1.003519749 seconds time elapsed
> '''
>
> Note, flags are added below to reduce the size of the output by
> removing event groups and threshold printing support:
> '''
> $ perf stat --metric-no-threshold --metric-no-group -M TopdownL3 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      3,506,641  cpu_atom/TOPDOWN_BE_BOUND.ALLOC_RESTRICTIONS/  # 0.6 % tma_alloc_restriction (17.14%)
>      133,962,390  cpu_atom/TOPDOWN_BE_BOUND.SERIALIZATION/  # 22.2 % tma_serialization (17.48%)
>      11,201,207  cpu_atom/TOPDOWN_FE_BOUND.ITLB/  # 1.9 % tma_itlb_misses (17.88%)
>      63,876,838  cpu_atom/TOPDOWN_BE_BOUND.MEM_SCHEDULER/  # 10.6 % tma_mem_scheduler
>          # 10.5 % tma_store_bound
>          # 2.4 % tma_other_load_store (18.28%)
>      14,386,940  cpu_atom/UOPS_RETIRED.MS/  (18.68%)
>      14,432,493  cpu_atom/UOPS_RETIRED.MS/  # 2.7 % tma_other_ret (19.09%)
>      81,582,687  cpu_atom/TOPDOWN_FE_BOUND.ICACHE/  # 13.5 % tma_icache_misses (19.14%)
>      30,467,546  cpu_atom/TOPDOWN_RETIRING.ALL/  (19.14%)
>      16,788,753  cpu_atom/MEM_BOUND_STALLS.LOAD/  # 4.2 % tma_dram_bound
>          # 3.7 % tma_l2_bound
>          # 6.7 % tma_l3_bound (19.14%)
>      14,514,040  cpu_atom/TOPDOWN_FE_BOUND.DECODE/  # 2.4 % tma_decode (19.14%)
>      688,307  cpu_atom/TOPDOWN_BAD_SPECULATION.NUKE/  # 0.1 % tma_nuke (19.13%)
>      0  cpu_atom/UOPS_RETIRED.FPDIV/  (19.12%)
>      4,408,466  cpu_atom/MEM_BOUND_STALLS.LOAD_L2_HIT/  (19.12%)
>      120,556,998  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 9.3 % tma_branch_detect
>          # 1.0 % tma_branch_resteer
>          # 5.8 % tma_cisc
>          # 0.3 % tma_fast_nuke
>          # 0.0 % tma_fpdiv_uops
>          # 4.3 % tma_l1_bound
>          # 3.2 % tma_non_mem_scheduler
>          # 1.9 % tma_other_fb
>          # 1.1 % tma_predecode
>          # 0.1 % tma_register
>          # 0.1 % tma_reorder_buffer (22.30%)
>      34,773,106  cpu_atom/TOPDOWN_FE_BOUND.CISC/  (22.30%)
>      591,112  cpu_atom/TOPDOWN_BE_BOUND.REGISTER/  (22.30%)
>      11,286,706  cpu_atom/TOPDOWN_FE_BOUND.OTHER/  (22.30%)
>      5,082,636  cpu_atom/MEM_BOUND_STALLS.LOAD_DRAM_HIT/  (22.30%)
>      14,146,185  cpu_atom/MEM_SCHEDULER_BLOCK.ST_BUF/  (22.31%)
>      55,833,686  cpu_atom/TOPDOWN_FE_BOUND.BRANCH_DETECT/  (22.30%)
>      25,714,051  cpu_atom/LD_HEAD.ANY_AT_RET/  (19.12%)
>      456,549  cpu_atom/TOPDOWN_BE_BOUND.REORDER_BUFFER/  (19.12%)
>      1,616,862  cpu_atom/TOPDOWN_BAD_SPECULATION.FASTNUKE/  (19.12%)
>      6,680,782  cpu_atom/TOPDOWN_FE_BOUND.PREDECODE/  (19.12%)
>      14,229,195  cpu_atom/MEM_SCHEDULER_BLOCK.ALL/  (19.12%)
>      8,128,921  cpu_atom/MEM_BOUND_STALLS.LOAD_LLC_HIT/  (19.12%)
>      20,941,725  cpu_atom/LD_HEAD.L1_MISS_AT_RET/  (19.11%)
>      6,177,125  cpu_atom/TOPDOWN_FE_BOUND.BRANCH_RESTEER/  (18.78%)
>      228,066,346  cpu_atom/TOPDOWN_BE_BOUND.ALL/  (18.38%)
>      5,204,897  cpu_atom/LD_HEAD.L1_BOUND_AT_RET/  (17.99%)
>      19,060,104  cpu_atom/TOPDOWN_BE_BOUND.NON_MEM_SCHEDULER/  (17.58%)
>      0  cpu_atom/UOPS_RETIRED.FPDIV/  (17.19%)
>      864,565,692  cpu_core/TOPDOWN.SLOTS/  # 4.7 % tma_microcode_sequencer
>          # 0.4 % tma_few_uops_instructions
>          # 0.3 % tma_fused_instructions
>          # 1.8 % tma_memory_operations
>          # 0.1 % tma_nop_instructions
>          # 8.9 % tma_ms_switches
>          # 0.4 % tma_non_fused_branches
>          # 0.0 % tma_fp_arith
>          # 0.0 % tma_int_operations
>          # 35.7 % tma_ports_utilization
>          # 3.8 % tma_other_light_ops (18.03%)
>      100,519,954  cpu_core/topdown-retiring/  (18.03%)
>      68,964,454  cpu_core/topdown-bad-spec/  (18.03%)
>      44,732,021  cpu_core/topdown-heavy-ops/  (18.03%)
>      435,618,316  cpu_core/topdown-fe-bound/  (18.03%)
>      262,842,804  cpu_core/topdown-be-bound/  (18.03%)
>      10,368,608  cpu_core/BR_INST_RETIRED.ALL_BRANCHES/  (18.43%)
>      55,947,727  cpu_core/RESOURCE_STALLS.SCOREBOARD/  (18.84%)
>      125,718,255  cpu_core/UOPS_ISSUED.ANY/  (19.24%)
>      23,178,652  cpu_core/EXE_ACTIVITY.1_PORTS_UTIL/  (19.65%)
>      0  cpu_core/INT_VEC_RETIRED.ADD_256/  (20.05%)
>      1,119,514  cpu_core/DSB2MITE_SWITCHES.PENALTY_CYCLES/  # 0.5 % tma_dsb_switches (20.46%)
>      27,684,795  cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/  # 10.6 % tma_l1_bound
>          # 0.7 % tma_l2_bound (20.86%)
>      108,813,079  cpu_core/UOPS_EXECUTED.THREAD/  (21.27%)
>      16,563,036  cpu_core/IDQ.MITE_CYCLES_ANY/  # 5.2 % tma_mite (19.14%)
>      53,037,471  cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/  (19.14%)
>      41,005,510  cpu_core/UOPS_RETIRED.MS/  (19.14%)
>      575,534  cpu_core/ARITH.DIV_ACTIVE/  # 0.2 % tma_divider (19.14%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.SCALAR_SINGLE,umask=0x03/  (19.14%)
>      2,207,021  cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/  # 0.9 % tma_store_bound (19.13%)
>      5,685,032  cpu_core/UOPS_RETIRED.MS,cmask=1,edge/  (19.13%)
>      25,523  cpu_core/DECODE.LCP/  # 0.0 % tma_lcp (19.12%)
>      26,095,298  cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/  # 10.8 % tma_l3_bound (19.13%)
>      108,516  cpu_core/MEMORY_ACTIVITY.STALLS_L3_MISS/  # 0.0 % tma_dram_bound (19.13%)
>      192,239,590  cpu_core/CYCLE_ACTIVITY.STALLS_TOTAL/  (19.12%)
>      5,978  cpu_core/LSD.CYCLES_ACTIVE/  # -0.0 % tma_lsd (19.12%)
>      0  cpu_core/INT_VEC_RETIRED.VNNI_128/  (19.13%)
>      137,530,949  cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/  # 0.1 % tma_dsb (19.12%)
>      240,070,549  cpu_core/CPU_CLK_UNHALTED.THREAD/  # 17.5 % tma_icache_misses
>          # 6.1 % tma_itlb_misses
>          # 40.3 % tma_branch_resteers (21.52%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE,umask=0x3c/  (21.51%)
>      595,051  cpu_core/ARITH.DIV_ACTIVE/  (21.52%)
>      461,041  cpu_core/IDQ.DSB_CYCLES_ANY/  (21.51%)
>      0  cpu_core/INT_VEC_RETIRED.MUL_256/  (21.52%)
>      0  cpu_core/UOPS_EXECUTED.X87/  (21.52%)
>      237,196  cpu_core/IDQ.DSB_CYCLES_OK/  (21.52%)
>      125,009  cpu_core/LSD.CYCLES_OK/  (21.52%)
>      0  cpu_core/INT_VEC_RETIRED.ADD_128/  (21.40%)
>      28,388,778  cpu_core/MEM_UOP_RETIRED.ANY/  (18.61%)
>      1,806,629  cpu_core/INST_RETIRED.NOP/  (18.21%)
>      41,928,018  cpu_core/ICACHE_DATA.STALLS/  (17.81%)
>      0  cpu_core/INT_VEC_RETIRED.VNNI_256/  (17.41%)
>      18,230,137  cpu_core/EXE_ACTIVITY.2_PORTS_UTIL,umask=0xc/  (17.02%)
>      28,052,001  cpu_core/EXE_ACTIVITY.3_PORTS_UTIL,umask=0x80/  (16.61%)
>      4,073,568  cpu_core/INST_RETIRED.MACRO_FUSED/  (16.20%)
>      66,509,871  cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/  (15.92%)
>      2,307,447  cpu_core/IDQ.MITE_CYCLES_OK/  (15.91%)
>      30,345,769  cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/  (15.91%)
>      0  cpu_core/INT_VEC_RETIRED.SHUFFLES/  (15.91%)
>      14,722,079  cpu_core/ICACHE_TAG.STALLS/  (15.90%)
>
>      1.004474469 seconds time elapsed
>
> $ perf stat --metric-no-threshold --metric-no-group -M TopdownL4 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      1,004,834,399 ns  duration_time  # 0.3 % tma_false_sharing
>          # 40.2 % tma_l3_hit_latency
>          # 4.4 % tma_contested_accesses
>          # 1.6 % tma_data_sharing
>      3,762,410  cpu_atom/LD_HEAD.PGWALK_AT_RET/  # 3.1 % tma_stlb_miss (33.58%)
>      10  cpu_atom/MACHINE_CLEARS.SMC/  # 0.0 % tma_smc (33.98%)
>      66,500,689  cpu_atom/TOPDOWN_BE_BOUND.MEM_SCHEDULER/  # 0.0 % tma_ld_buffer
>          # 0.0 % tma_rsv
>          # 11.0 % tma_st_buffer (29.60%)
>      1,051,312  cpu_atom/LD_HEAD.OTHER_AT_RET/  # 0.9 % tma_other_l1 (30.00%)
>      14,740,093  cpu_atom/UOPS_RETIRED.MS/  (30.39%)
>      117,899  cpu_atom/LD_HEAD.DTLB_MISS_AT_RET/  # 0.1 % tma_stlb_hit (30.79%)
>      701,548  cpu_atom/TOPDOWN_BAD_SPECULATION.NUKE/  # 0.0 % tma_disambiguation
>          # 0.0 % tma_fp_assist
>          # 0.1 % tma_memory_ordering
>          # 0.0 % tma_page_fault (31.08%)
>      12,873  cpu_atom/MACHINE_CLEARS.MEMORY_ORDERING/  (31.07%)
>      58,321  cpu_atom/MEM_SCHEDULER_BLOCK.LD_BUF/  (31.07%)
>      43,458  cpu_atom/MEM_SCHEDULER_BLOCK.RSV/  (31.07%)
>      14,256,005  cpu_atom/MEM_SCHEDULER_BLOCK.ALL/  (31.06%)
>      122,156,534  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 0.0 % tma_store_fwd_blk (36.16%)
>      0  cpu_atom/MACHINE_CLEARS.FP_ASSIST/  (35.76%)
>      13,804  cpu_atom/MACHINE_CLEARS.SLOW/  (35.35%)
>      14,388,300  cpu_atom/MEM_SCHEDULER_BLOCK.ST_BUF/  (34.95%)
>      493,070,443  cpu_atom/CPU_CLK_UNHALTED.REF_TSC/  (39.73%)
>      2  cpu_atom/MACHINE_CLEARS.PAGE_FAULT/  (39.33%)
>      1,101  cpu_atom/LD_HEAD.ST_ADDR_AT_RET/  (38.93%)
>      929  cpu_atom/MACHINE_CLEARS.DISAMBIGUATION/  (38.55%)
>      14,241,213  cpu_atom/MEM_SCHEDULER_BLOCK.ALL/  (33.45%)
>      1,010,981,054  cpu_core/TOPDOWN.SLOTS/  # 0.0 % tma_assists
>          # 4.3 % tma_cisc
>          # 0.0 % tma_fp_scalar
>          # 0.0 % tma_fp_vector
>          # 0.0 % tma_shuffles
>          # 0.0 % tma_int_vector_128b
>          # 0.0 % tma_x87_use
>          # 0.0 % tma_int_vector_256b
>          # 0.7 % tma_clears_resteers
>          # 12.4 % tma_mispredicts_resteers (8.14%)
>      132,375,316  cpu_core/topdown-retiring/  (8.14%)
>      88,303,327  cpu_core/topdown-bad-spec/  (8.14%)
>      85,519,216  cpu_core/topdown-br-mispredict/  (8.14%)
>      495,722,455  cpu_core/topdown-fe-bound/  (8.14%)
>      298,147,134  cpu_core/topdown-be-bound/  (8.14%)
>      21,418,803  cpu_core/UOPS_EXECUTED.CYCLES_GE_3/  # 8.8 % tma_ports_utilized_3m (10.12%)
>      35,208,716  cpu_core/OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD,cmask=4/  # 14.5 % tma_mem_bandwidth
>          # 33.3 % tma_mem_latency (10.52%)
>      17,358  cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM/  (10.91%)
>      55,883,811  cpu_core/RESOURCE_STALLS.SCOREBOARD/  # 24.1 % tma_ports_utilized_0 (12.91%)
>      0  cpu_core/INT_VEC_RETIRED.ADD_256/  (14.89%)
>      139,890  cpu_core/DTLB_STORE_MISSES.STLB_HIT,cmask=1/  # 2.8 % tma_dtlb_store (15.30%)
>      216,886  cpu_core/MEM_INST_RETIRED.LOCK_LOADS/  # 3.8 % tma_store_latency
>          # 0.1 % tma_lock_latency (15.71%)
>      115,948,790  cpu_core/UOPS_EXECUTED.THREAD/  (17.69%)
>      52,155,508  cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/  (15.93%)
>      6  cpu_core/ASSISTS.ANY,umask=0x1B/  (15.93%)
>      87,422,517  cpu_core/CYCLE_ACTIVITY.CYCLES_MEM_ANY/  # 5.2 % tma_dtlb_load (15.81%)
>      37,420,652  cpu_core/MEMORY_ACTIVITY.CYCLES_L1D_MISS/  (15.44%)
>      43,527,357  cpu_core/UOPS_RETIRED.MS/  (15.04%)
>      31,787,227  cpu_core/INT_MISC.CLEAR_RESTEER_CYCLES/  (14.64%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.SCALAR_SINGLE,umask=0x03/  (14.24%)
>      4,899,130  cpu_core/XQ.FULL_CYCLES/  # 2.0 % tma_sq_full (13.84%)
>      1,365  cpu_core/OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM/  (13.44%)
>      23,904,338  cpu_core/EXE_ACTIVITY.1_PORTS_UTIL/  # 9.9 % tma_ports_utilized_1 (13.05%)
>      251,479  cpu_core/L2_RQSTS.ALL_RFO/  (12.76%)
>      188,701,010  cpu_core/CYCLE_ACTIVITY.STALLS_TOTAL/  (12.74%)
>      6,909  cpu_core/MEM_INST_RETIRED.SPLIT_STORES/  # 0.0 % tma_split_stores (12.74%)
>      619,775  cpu_core/MEM_LOAD_RETIRED.L1_MISS/  (9.56%)
>      136,716,345  cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/  # 0.9 % tma_decoder0_alone (11.15%)
>      0  cpu_core/INT_VEC_RETIRED.VNNI_128/  (12.74%)
>      605,850  cpu_core/L1D_PEND_MISS.FB_FULL/  # 0.2 % tma_fb_full (12.73%)
>      60,079  cpu_core/MEM_STORE_RETIRED.L2_HIT/  (11.14%)
>      242,508,080  cpu_core/CPU_CLK_UNHALTED.THREAD/  # 4.2 % tma_ports_utilized_2
>          # 0.2 % tma_store_fwd_blk
>          # 0.0 % tma_streaming_stores
>          # 27.5 % tma_unknown_branches
>          # 0.0 % tma_split_loads (12.74%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE,umask=0x3c/  (14.33%)
>      32,573  cpu_core/LD_BLOCKS.STORE_FORWARD/  (12.74%)
>      1,130  cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD/  (12.74%)
>      4,029  cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/  (9.56%)
>      4,844,548  cpu_core/INST_DECODED.DECODERS,cmask=1/  (9.56%)
>      5,266  cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_NO_FWD/  (6.37%)
>      0  cpu_core/UOPS_EXECUTED.X87/  (7.96%)
>      0  cpu_core/INT_VEC_RETIRED.MUL_256/  (9.56%)
>      2,786,473  cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/  (9.56%)
>      961,614,001  cpu_core/CPU_CLK_UNHALTED.REF_TSC/  (11.15%)
>      2,433,107  cpu_core/INST_DECODED.DECODERS,cmask=2/  (11.15%)
>      0  cpu_core/INT_VEC_RETIRED.ADD_128/  (12.74%)
>      9,058,046  cpu_core/OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO/  (12.74%)
>      6,399,992  cpu_core/MEM_INST_RETIRED.ALL_STORES/  (12.74%)
>      45,519,749  cpu_core/L1D_PEND_MISS.PENDING/  (9.56%)
>      12,200,559  cpu_core/DTLB_LOAD_MISSES.WALK_ACTIVE/  (7.97%)
>      115,944,190  cpu_core/OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD/  (6.37%)
>      0  cpu_core/INT_VEC_RETIRED.VNNI_256/  (7.96%)
>      1,885,278  cpu_core/INT_MISC.UOP_DROPPING/  (9.56%)
>      524,819  cpu_core/MEM_LOAD_RETIRED.FB_HIT/  (9.56%)
>      26,866,872  cpu_core/EXE_ACTIVITY.3_PORTS_UTIL,umask=0x80/  (11.15%)
>      10,265,977  cpu_core/EXE_ACTIVITY.2_PORTS_UTIL/  (12.74%)
>      66,662,934  cpu_core/INT_MISC.UNKNOWN_BRANCH_CYCLES/  (12.74%)
>      0  cpu_core/OCR.STREAMING_WR.ANY_RESPONSE/  (12.74%)
>      12,499  cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/  (12.74%)
>      0  cpu_core/INT_VEC_RETIRED.SHUFFLES/  (12.74%)
>      47,649  cpu_core/DTLB_LOAD_MISSES.STLB_HIT,cmask=1/  (12.74%)
>      106,424  cpu_core/L2_RQSTS.RFO_HIT/  (12.74%)
>      0  cpu_core/LD_BLOCKS.NO_SR/  (7.97%)
>      1,343,692  cpu_core/MEM_LOAD_COMPLETED.L1_MISS_ANY/  (7.96%)
>      28,517  cpu_core/L1D_PEND_MISS.L2_STALLS/  (6.37%)
>      394,101  cpu_core/MEM_LOAD_RETIRED.L3_HIT/  (6.36%)
>      76,860,165,929  TSC
>
>      1.004834399 seconds time elapsed
>
> $ perf stat --metric-no-threshold --metric-no-group -M TopdownL5 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      839,538,302  cpu_core/TOPDOWN.SLOTS/  # 0.0 % tma_avx_assists
>          # 0.0 % tma_fp_assists
>          # 0.0 % tma_page_faults
>          # 0.0 % tma_fp_vector_128b
>          # 0.0 % tma_fp_vector_256b (32.40%)
>      100,274,045  cpu_core/topdown-retiring/  (32.40%)
>      77,425,642  cpu_core/topdown-bad-spec/  (32.40%)
>      424,563,652  cpu_core/topdown-fe-bound/  (32.40%)
>      245,420,564  cpu_core/topdown-be-bound/  (32.40%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE/  (32.79%)
>      54,372,921  cpu_core/RESOURCE_STALLS.SCOREBOARD/  # 22.2 % tma_serializing_operation (33.20%)
>      23,018,585  cpu_core/UOPS_DISPATCHED.PORT_6/  # 8.0 % tma_alu_op_utilization (33.61%)
>      17,748,101  cpu_core/UOPS_DISPATCHED.PORT_2_3_10/  # 4.2 % tma_load_op_utilization (34.02%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE/  (34.43%)
>      7,616,700  cpu_core/UOPS_DISPATCHED.PORT_0/  (34.83%)
>      96,571  cpu_core/DTLB_STORE_MISSES.STLB_HIT,cmask=1/  # 0.6 % tma_store_stlb_hit (35.25%)
>      84,909,672  cpu_core/CYCLE_ACTIVITY.CYCLES_MEM_ANY/  # 0.2 % tma_load_stlb_hit (35.66%)
>      32,935,744  cpu_core/MEMORY_ACTIVITY.CYCLES_L1D_MISS/  (31.95%)
>      16,597,385  cpu_core/UOPS_DISPATCHED.PORT_5_11/  (31.95%)
>      9,452,844  cpu_core/UOPS_DISPATCHED.PORT_1/  (31.94%)
>      2,620,695  cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/  # 1.8 % tma_store_stlb_miss (31.95%)
>      15,699,364  cpu_core/UOPS_DISPATCHED.PORT_7_8/  # 5.7 % tma_store_op_utilization (31.95%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE/  (31.94%)
>      142,096,670  cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/  (31.95%)
>      244,591,239  cpu_core/CPU_CLK_UNHALTED.THREAD/  # 5.2 % tma_load_stlb_miss
>          # 0.0 % tma_mixing_vectors (35.92%)
>      2,728,385  cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/  (35.66%)
>      0  cpu_core/ASSISTS.SSE_AVX_MIX/  (35.27%)
>      0  cpu_core/FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE/  (34.86%)
>      12,664,768  cpu_core/DTLB_LOAD_MISSES.WALK_ACTIVE/  (34.46%)
>      12,629,733  cpu_core/DTLB_LOAD_MISSES.WALK_ACTIVE/  (34.04%)
>      0  cpu_core/ASSISTS.FP/  (33.63%)
>      12  cpu_core/ASSISTS.PAGE_FAULT/  (33.23%)
>      16,704,699  cpu_core/UOPS_DISPATCHED.PORT_4_9/  (32.81%)
>      48,386  cpu_core/DTLB_LOAD_MISSES.STLB_HIT,cmask=1/  (28.68%)
>
>      1.002806967 seconds time elapsed
>
> $ perf stat --metric-no-threshold --metric-no-group -M TopdownL6 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      743,684  cpu_core/UOPS_DISPATCHED.PORT_0/  # 4.6 % tma_port_0
>      1,514  cpu_core/MISC2_RETIRED.LFENCE/  # 0.1 % tma_memory_fence
>      22,120  cpu_core/CPU_CLK_UNHALTED.PAUSE/  # 0.1 % tma_slow_pause
>      16,187,637  cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/  # 4.5 % tma_port_1
>          # 12.6 % tma_port_6
>      16,754,672  cpu_core/CPU_CLK_UNHALTED.THREAD/
>      728,805  cpu_core/UOPS_DISPATCHED.PORT_1/
>      2,040,181  cpu_core/UOPS_DISPATCHED.PORT_6/
>
>      1.002727371 seconds time elapsed
> '''
>
> Using --cputype:
> '''
> $ perf stat --cputype=core -M TopdownL1 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      90,542,172  cpu_core/TOPDOWN.SLOTS/  # 31.3 % tma_backend_bound
>          # 7.0 % tma_bad_speculation
>          # 54.0 % tma_frontend_bound
>          # 7.6 % tma_retiring
>      6,917,885  cpu_core/topdown-retiring/
>      6,242,227  cpu_core/topdown-bad-spec/
>      2,353,956  cpu_core/topdown-heavy-ops/
>      49,034,945  cpu_core/topdown-fe-bound/
>      28,390,484  cpu_core/topdown-be-bound/
>      98,299  cpu_core/INT_MISC.UOP_DROPPING/
>
>      1.002395582 seconds time elapsed
>
> $ perf stat --cputype=atom -M TopdownL1 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      645,836  cpu_atom/TOPDOWN_RETIRING.ALL/  # 26.4 % tma_bad_speculation
>      2,404,468  cpu_atom/TOPDOWN_FE_BOUND.ALL/  # 38.9 % tma_frontend_bound
>      1,455,604  cpu_atom/TOPDOWN_BE_BOUND.ALL/  # 23.6 % tma_backend_bound
>          # 23.6 % tma_backend_bound_aux
>      1,235,109  cpu_atom/CPU_CLK_UNHALTED.CORE/  # 10.4 % tma_retiring
>      642,124  cpu_atom/TOPDOWN_RETIRING.ALL/
>      2,398,892  cpu_atom/TOPDOWN_FE_BOUND.ALL/
>      1,503,157  cpu_atom/TOPDOWN_BE_BOUND.ALL/
>
>      1.002061651 seconds time elapsed
> '''
>
> Ian Rogers (40):
>   perf stat: Introduce skippable evsels
>   perf vendor events intel: Add alderlake metric constraints
>   perf vendor events intel: Add icelake metric constraints
>   perf vendor events intel: Add icelakex metric constraints
>   perf vendor events intel: Add sapphirerapids metric constraints
>   perf vendor events intel: Add tigerlake metric constraints
>   perf stat: Avoid segv on counter->name
>   perf test: Test more sysfs events
>   perf test: Use valid for PMU tests
>   perf test: Mask config then test
>   perf test: Test more with config_cache
>   perf test: Roundtrip name, don't assume 1 event per name
>   perf parse-events: Set attr.type to PMU type early
>   perf print-events: Avoid unnecessary strlist
>   perf parse-events: Avoid scanning PMUs before parsing
>   perf test: Validate events with hyphens in
>   perf evsel: Modify group pmu name for software events
>   perf test: Move x86 hybrid tests to arch/x86
>   perf test x86 hybrid: Don't assume evlist order
>   perf parse-events: Support PMUs for legacy cache events
>   perf parse-events: Wildcard legacy cache events
>   perf print-events: Print legacy cache events for each PMU
>   perf parse-events: Support wildcards on raw events
>   perf parse-events: Remove now unused hybrid logic
>   perf parse-events: Minor type safety cleanup
>   perf parse-events: Add pmu filter
>   perf stat: Make cputype filter generic
>   perf test: Add cputype testing to perf stat
>   perf test: Fix parse-events tests for >1 core PMU
>   perf parse-events: Support hardware events as terms
>   perf parse-events: Avoid error when assigning a term
>   perf parse-events: Avoid error when assigning a legacy cache term
>   perf parse-events: Don't auto merge hybrid wildcard events
>   perf parse-events: Don't reorder atom cpu events
>   perf metrics: Be PMU specific for referenced metrics.
>   perf metric: Json flag to not group events if gathering a metric group
>   perf stat: Command line PMU metric filtering
>   perf vendor events intel: Correct alderlake metrics
>   perf jevents: Don't rewrite metrics across PMUs
>   perf metrics: Be PMU specific in event match
>
>  tools/perf/arch/x86/include/arch-tests.h | 1 +
>  tools/perf/arch/x86/tests/Build | 1 +
>  tools/perf/arch/x86/tests/arch-tests.c | 10 +
>  tools/perf/arch/x86/tests/hybrid.c | 275 ++++++
>  tools/perf/arch/x86/util/evlist.c | 4 +-
>  tools/perf/builtin-list.c | 19 +-
>  tools/perf/builtin-record.c | 13 +-
>  tools/perf/builtin-stat.c | 73 +-
>  tools/perf/builtin-top.c | 5 +-
>  tools/perf/builtin-trace.c | 5 +-
>  .../arch/x86/alderlake/adl-metrics.json | 275 +++---
>  .../arch/x86/alderlaken/adln-metrics.json | 20 +-
>  .../arch/x86/broadwell/bdw-metrics.json | 12 +
>  .../arch/x86/broadwellde/bdwde-metrics.json | 12 +
>  .../arch/x86/broadwellx/bdx-metrics.json | 12 +
>  .../arch/x86/cascadelakex/clx-metrics.json | 12 +
>  .../arch/x86/haswell/hsw-metrics.json | 12 +
>  .../arch/x86/haswellx/hsx-metrics.json | 12 +
>  .../arch/x86/icelake/icl-metrics.json | 23 +
>  .../arch/x86/icelakex/icx-metrics.json | 23 +
>  .../arch/x86/ivybridge/ivb-metrics.json | 12 +
>  .../arch/x86/ivytown/ivt-metrics.json | 12 +
>  .../arch/x86/jaketown/jkt-metrics.json | 12 +
>  .../arch/x86/sandybridge/snb-metrics.json | 12 +
>  .../arch/x86/sapphirerapids/spr-metrics.json | 23 +
>  .../arch/x86/skylake/skl-metrics.json | 12 +
>  .../arch/x86/skylakex/skx-metrics.json | 12 +
>  .../arch/x86/tigerlake/tgl-metrics.json | 23 +
>  tools/perf/pmu-events/jevents.py | 10 +-
>  tools/perf/pmu-events/metric.py | 28 +-
>  tools/perf/pmu-events/metric_test.py | 6 +-
>  tools/perf/pmu-events/pmu-events.h | 2 +
>  tools/perf/tests/evsel-roundtrip-name.c | 119 ++-
>  tools/perf/tests/parse-events.c | 826 +++++++++---------
>  tools/perf/tests/pmu-events.c | 12 +-
>  tools/perf/tests/shell/stat.sh | 44 +
>  tools/perf/util/Build | 1 -
>  tools/perf/util/evlist.h | 1 -
>  tools/perf/util/evsel.c | 30 +-
>  tools/perf/util/evsel.h | 1 +
>  tools/perf/util/metricgroup.c | 111 ++-
>  tools/perf/util/metricgroup.h | 3 +-
>  tools/perf/util/parse-events-hybrid.c | 214 -----
>  tools/perf/util/parse-events-hybrid.h | 25 -
>  tools/perf/util/parse-events.c | 646 ++++++--------
>  tools/perf/util/parse-events.h | 61 +-
>  tools/perf/util/parse-events.l | 108 +--
>  tools/perf/util/parse-events.y | 222 ++---
>  tools/perf/util/pmu-hybrid.c | 20 -
>  tools/perf/util/pmu-hybrid.h | 1 -
>  tools/perf/util/pmu.c | 16 +-
>  tools/perf/util/pmu.h | 3 +
>  tools/perf/util/pmus.c | 25 +-
>  tools/perf/util/pmus.h | 3 +
>  tools/perf/util/print-events.c | 85 +-
>  tools/perf/util/stat-display.c | 6 +-
>  56 files changed, 1939 insertions(+), 1627 deletions(-)
>  create mode 100644 tools/perf/arch/x86/tests/hybrid.c
>  delete mode 100644 tools/perf/util/parse-events-hybrid.c
>  delete mode 100644 tools/perf/util/parse-events-hybrid.h
>
> --
> 2.40.1.495.gc816e09b53d-goog
>

--

- Arnaldo
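
For anyone cross-checking the numbers: the Topdown level-1 percentages in the quoted perf-6.3 "perf stat -a sleep 1" output look like plain ratios of each cpu_core/topdown-*/ count against cpu_core/slots/. The sketch below is only a rough check of that reading, not a description of how perf computes them internally. The function name and the dictionary are made up for illustration, the counts are copied from the quoted output, and the json tma_* metrics printed by the patched tool use their own formulas, which are not modeled here.

```python
# Counts copied from the quoted perf-6.3 `perf stat -a sleep 1` output (cpu_core PMU).
SLOTS = 123_097_644
TOPDOWN_COUNTS = {
    "Retiring": 9_241_207,
    "Bad Speculation": 8_903_288,
    "Frontend Bound": 66_590_029,
    "Backend Bound": 38_397_500,
}


def topdown_l1_percentages(slots, counts):
    """Return each level-1 bucket as a percentage of slots, rounded to 0.1.

    Hypothetical helper: it assumes every bucket is simply count / slots,
    which is an assumption checked only against the quoted output.
    """
    if slots <= 0:
        raise ValueError("slots must be positive")
    return {name: round(100.0 * count / slots, 1) for name, count in counts.items()}


if __name__ == "__main__":
    for name, pct in topdown_l1_percentages(SLOTS, TOPDOWN_COUNTS).items():
        print(f"{pct:5.1f}% {name}")
```

With the counts above, the ratios round to the same 7.5 / 7.2 / 54.1 / 31.2 figures that perf-6.3 printed next to the four level-1 events.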