2024-03-21 06:00:39

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 00/12] perf vendor events intel update

Update events to the latest on:
https://github.com/intel/perfmon

Using the converter script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Ian Rogers (12):
perf vendor events intel: Update cascadelakex to 1.21
perf vendor events intel: Update emeraldrapids to 1.06
perf vendor events intel: Update grandridge to 1.02
perf vendor events intel: Update icelakex to 1.24
perf vendor events intel: Update lunarlake to 1.01
perf vendor events intel: Update meteorlake to 1.08
perf vendor events intel: Update sapphirerapids to 1.20
perf vendor events intel: Update sierraforest to 1.02
perf vendor events intel: Update skylakex to 1.33
perf vendor events intel: Update skylake to v58
perf vendor events intel: Update snowridgex to 1.22
perf vendor events intel: Remove info metrics erroneously in TopdownL1

.../arch/x86/broadwellx/bdx-metrics.json | 35 +++---
.../arch/x86/cascadelakex/clx-metrics.json | 85 +++++--------
.../arch/x86/cascadelakex/frontend.json | 10 +-
.../arch/x86/cascadelakex/memory.json | 2 +-
.../arch/x86/cascadelakex/other.json | 2 +-
.../arch/x86/cascadelakex/pipeline.json | 2 +-
.../x86/cascadelakex/uncore-interconnect.json | 14 +--
.../arch/x86/cascadelakex/virtual-memory.json | 2 +-
.../arch/x86/emeraldrapids/frontend.json | 2 +-
.../arch/x86/emeraldrapids/memory.json | 1 +
.../arch/x86/emeraldrapids/pipeline.json | 3 +
.../arch/x86/emeraldrapids/uncore-cache.json | 112 ++++++++++++++++-
.../emeraldrapids/uncore-interconnect.json | 26 ++--
.../arch/x86/grandridge/pipeline.json | 43 ++++++-
.../arch/x86/grandridge/uncore-cache.json | 28 ++++-
.../arch/x86/haswellx/hsx-metrics.json | 35 +++---
.../arch/x86/icelakex/frontend.json | 2 +-
.../arch/x86/icelakex/icx-metrics.json | 95 ++++++--------
.../pmu-events/arch/x86/icelakex/memory.json | 1 +
.../arch/x86/icelakex/uncore-cache.json | 22 +++-
.../x86/icelakex/uncore-interconnect.json | 64 +++++-----
.../arch/x86/icelakex/uncore-io.json | 11 --
.../pmu-events/arch/x86/lunarlake/cache.json | 24 ++--
.../arch/x86/lunarlake/frontend.json | 2 +-
.../pmu-events/arch/x86/lunarlake/memory.json | 4 +-
.../pmu-events/arch/x86/lunarlake/other.json | 4 +-
.../arch/x86/lunarlake/pipeline.json | 109 +++++++++++++---
tools/perf/pmu-events/arch/x86/mapfile.csv | 20 +--
.../pmu-events/arch/x86/meteorlake/cache.json | 30 +++++
.../arch/x86/meteorlake/frontend.json | 4 +-
.../arch/x86/meteorlake/memory.json | 20 +++
.../pmu-events/arch/x86/meteorlake/other.json | 42 ++++++-
.../arch/x86/meteorlake/pipeline.json | 44 +++++--
.../x86/meteorlake/uncore-interconnect.json | 22 +++-
.../arch/x86/sapphirerapids/cache.json | 1 +
.../arch/x86/sapphirerapids/frontend.json | 2 +-
.../arch/x86/sapphirerapids/memory.json | 1 +
.../arch/x86/sapphirerapids/pipeline.json | 19 +--
.../arch/x86/sapphirerapids/spr-metrics.json | 119 +++++++-----------
.../arch/x86/sapphirerapids/uncore-cache.json | 112 ++++++++++++++++-
.../sapphirerapids/uncore-interconnect.json | 26 ++--
.../arch/x86/sierraforest/pipeline.json | 36 +++++-
.../pmu-events/arch/x86/skylake/frontend.json | 10 +-
.../pmu-events/arch/x86/skylakex/cache.json | 9 ++
.../arch/x86/skylakex/frontend.json | 10 +-
.../pmu-events/arch/x86/skylakex/memory.json | 2 +-
.../pmu-events/arch/x86/skylakex/other.json | 2 +-
.../arch/x86/skylakex/pipeline.json | 2 +-
.../arch/x86/skylakex/skx-metrics.json | 85 +++++--------
.../x86/skylakex/uncore-interconnect.json | 14 +--
.../arch/x86/skylakex/uncore-io.json | 2 +-
.../arch/x86/skylakex/virtual-memory.json | 2 +-
.../arch/x86/snowridgex/uncore-cache.json | 4 +-
.../x86/snowridgex/uncore-interconnect.json | 6 +-
.../arch/x86/snowridgex/uncore-io.json | 11 --
55 files changed, 911 insertions(+), 486 deletions(-)

--
2.44.0.396.g6e790dbe36-goog



2024-03-21 06:00:53

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 01/12] perf vendor events intel: Update cascadelakex to 1.21

Update events from 1.20 to 1.21 as released in:
https://github.com/intel/perfmon/commit/fcfdba3be8f3be81ad6b509fdebf953ead92dc2c

Largely fixes spelling and descriptions.

Signed-off-by: Ian Rogers <[email protected]>
---
.../pmu-events/arch/x86/cascadelakex/frontend.json | 10 +++++-----
.../pmu-events/arch/x86/cascadelakex/memory.json | 2 +-
.../pmu-events/arch/x86/cascadelakex/other.json | 2 +-
.../pmu-events/arch/x86/cascadelakex/pipeline.json | 2 +-
.../arch/x86/cascadelakex/uncore-interconnect.json | 14 +++++++-------
.../arch/x86/cascadelakex/virtual-memory.json | 2 +-
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json b/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
index 095904c77001..d6f543471b24 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
@@ -19,7 +19,7 @@
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
"EventName": "DSB2MITE_SWITCHES.COUNT",
- "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses.\nNote: Invoking MITE requires two or three cycles delay.",
+ "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses. Note: Invoking MITE requires two or three cycles delay.",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -267,11 +267,11 @@
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -321,11 +321,11 @@
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/memory.json b/tools/perf/pmu-events/arch/x86/cascadelakex/memory.json
index a00ad0aaf1ba..c69b2c33334b 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/memory.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/memory.json
@@ -6866,7 +6866,7 @@
"BriefDescription": "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one).",
"EventCode": "0xC9",
"EventName": "RTM_RETIRED.ABORTED",
- "PEBS": "1",
+ "PEBS": "2",
"PublicDescription": "Number of times RTM abort was triggered.",
"SampleAfterValue": "2000003",
"UMask": "0x4"
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/other.json b/tools/perf/pmu-events/arch/x86/cascadelakex/other.json
index 3ab5e91a4c1c..95d42ac36717 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/other.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/other.json
@@ -19,7 +19,7 @@
"BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo schedule.",
"EventCode": "0x28",
"EventName": "CORE_POWER.LVL2_TURBO_LICENSE",
- "PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server michroarchtecture). This includes high current AVX 512-bit instructions.",
+ "PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server microarchitecture). This includes high current AVX 512-bit instructions.",
"SampleAfterValue": "200003",
"UMask": "0x20"
},
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json b/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
index 66d686cc933e..c50ddf5b40dd 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
@@ -396,7 +396,7 @@
"Errata": "SKL091, SKL044",
"EventCode": "0xC0",
"EventName": "INST_RETIRED.NOP",
- "PEBS": "1",
+ "PEBS": "2",
"SampleAfterValue": "2000003",
"UMask": "0x2"
},
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
index 1a342dff1503..3fe9ce483bbe 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
@@ -38,7 +38,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CLFLUSH",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x80",
"Unit": "IRP"
},
@@ -47,7 +47,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CRD",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x2",
"Unit": "IRP"
},
@@ -56,7 +56,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.DRD",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x4",
"Unit": "IRP"
},
@@ -65,7 +65,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.PCIDCAHINT",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x20",
"Unit": "IRP"
},
@@ -74,7 +74,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.PCIRDCUR",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x1",
"Unit": "IRP"
},
@@ -101,7 +101,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.WBMTOI",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x40",
"Unit": "IRP"
},
@@ -500,7 +500,7 @@
"EventCode": "0x11",
"EventName": "UNC_I_TRANSACTIONS.WRITES",
"PerPkg": "1",
- "PublicDescription": "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID.; Trackes only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
+ "PublicDescription": "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID.; Tracks only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
"UMask": "0x2",
"Unit": "IRP"
},
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json b/tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json
index f59405877ae8..73feadaf7674 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/virtual-memory.json
@@ -205,7 +205,7 @@
"BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake.",
"EventCode": "0x85",
"EventName": "ITLB_MISSES.WALK_PENDING",
- "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake michroarchitecture.",
+ "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake microarchitecture.",
"SampleAfterValue": "100003",
"UMask": "0x10"
},
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 5297d25f4e03..281419d2edeb 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -5,7 +5,7 @@ GenuineIntel-6-(1C|26|27|35|36),v5,bonnell,core
GenuineIntel-6-(3D|47),v29,broadwell,core
GenuineIntel-6-56,v11,broadwellde,core
GenuineIntel-6-4F,v22,broadwellx,core
-GenuineIntel-6-55-[56789ABCDEF],v1.20,cascadelakex,core
+GenuineIntel-6-55-[56789ABCDEF],v1.21,cascadelakex,core
GenuineIntel-6-9[6C],v1.04,elkhartlake,core
GenuineIntel-6-CF,v1.03,emeraldrapids,core
GenuineIntel-6-5[CF],v13,goldmont,core
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:01:08

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 02/12] perf vendor events intel: Update emeraldrapids to 1.06

Update events from 1.03 to 1.96 as released in:
https://github.com/intel/perfmon/commit/21a8be3ea7918749141db4036fb65a2343cd865d

Fixes spelling and descriptions. Adds cache miss latency events
UNC_CHA_TOR_(INSERTS|OCCUPANCY).IO_(PCIRDCUR|ITOM|ITOMCACHENEAR)_(LOCAL|REMOTE).

Signed-off-by: Ian Rogers <[email protected]>
---
.../arch/x86/emeraldrapids/frontend.json | 2 +-
.../arch/x86/emeraldrapids/memory.json | 1 +
.../arch/x86/emeraldrapids/pipeline.json | 3 +
.../arch/x86/emeraldrapids/uncore-cache.json | 112 +++++++++++++++++-
.../emeraldrapids/uncore-interconnect.json | 26 ++--
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
6 files changed, 129 insertions(+), 17 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/emeraldrapids/frontend.json b/tools/perf/pmu-events/arch/x86/emeraldrapids/frontend.json
index 9e53da55d0c1..93d99318a623 100644
--- a/tools/perf/pmu-events/arch/x86/emeraldrapids/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/emeraldrapids/frontend.json
@@ -267,7 +267,7 @@
"CounterMask": "6",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+ "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the DSB (Decode Stream Buffer) path. Count includes uops that may 'bypass' the IDQ.",
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
diff --git a/tools/perf/pmu-events/arch/x86/emeraldrapids/memory.json b/tools/perf/pmu-events/arch/x86/emeraldrapids/memory.json
index e8bf7c9c44e1..5420f529f491 100644
--- a/tools/perf/pmu-events/arch/x86/emeraldrapids/memory.json
+++ b/tools/perf/pmu-events/arch/x86/emeraldrapids/memory.json
@@ -264,6 +264,7 @@
"BriefDescription": "Number of times an RTM execution aborted.",
"EventCode": "0xc9",
"EventName": "RTM_RETIRED.ABORTED",
+ "PEBS": "1",
"PublicDescription": "Counts the number of times RTM abort was triggered.",
"SampleAfterValue": "100003",
"UMask": "0x4"
diff --git a/tools/perf/pmu-events/arch/x86/emeraldrapids/pipeline.json b/tools/perf/pmu-events/arch/x86/emeraldrapids/pipeline.json
index 1f8200fb8964..e2086bedeca8 100644
--- a/tools/perf/pmu-events/arch/x86/emeraldrapids/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/emeraldrapids/pipeline.json
@@ -428,6 +428,7 @@
"BriefDescription": "INST_RETIRED.MACRO_FUSED",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.MACRO_FUSED",
+ "PEBS": "1",
"SampleAfterValue": "2000003",
"UMask": "0x10"
},
@@ -435,6 +436,7 @@
"BriefDescription": "Retired NOP instructions.",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.NOP",
+ "PEBS": "1",
"PublicDescription": "Counts all retired NOP or ENDBR32/64 instructions",
"SampleAfterValue": "2000003",
"UMask": "0x2"
@@ -451,6 +453,7 @@
"BriefDescription": "Iterations of Repeat string retired instructions.",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.REP_ITERATION",
+ "PEBS": "1",
"PublicDescription": "Number of iterations of Repeat (REP) string retired instructions such as MOVS, CMPS, and SCAS. Each has a byte, word, and doubleword version and string instructions can be repeated using a repetition prefix, REP, that allows their architectural execution to be repeated a number of times as specified by the RCX register. Note the number of iterations is implementation-dependent.",
"SampleAfterValue": "2000003",
"UMask": "0x8"
diff --git a/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-cache.json b/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-cache.json
index 86a8f3b7fe1d..141dab46682e 100644
--- a/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-cache.json
@@ -4197,6 +4197,42 @@
"UMask": "0xcd43ff04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "ItoMCacheNear (partial write) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd42ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoMCacheNear (partial write) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd437f04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoM (write) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc42ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoM (write) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc437f04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; Misses from local IO",
"EventCode": "0x35",
@@ -4207,7 +4243,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "TOR Inserts; ItoM misses from local IO",
+ "BriefDescription": "TOR Inserts : ItoM, indicating a full cacheline write request, from IO Devices that missed the LLC",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_ITOM",
"PerPkg": "1",
@@ -4225,7 +4261,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "TOR Inserts; RdCur and FsRdCur misses from local IO",
+ "BriefDescription": "TOR Inserts; RdCur and FsRdCur requests from local IO that miss LLC",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_PCIRDCUR",
"PerPkg": "1",
@@ -4251,6 +4287,24 @@
"UMask": "0xc8f3ff04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f2ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f37f04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; RFO from local IO",
"EventCode": "0x35",
@@ -5713,6 +5767,42 @@
"UMask": "0xcd43fe04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd42fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd437e04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; ITOM misses from local IO and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc42fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; ITOM misses from local IO and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc437e04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO",
"EventCode": "0x36",
@@ -5722,6 +5812,24 @@
"UMask": "0xc8f3fe04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f2fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f37e04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RFO misses from local IO",
"EventCode": "0x36",
diff --git a/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-interconnect.json
index 65d088556bae..22bb490e9666 100644
--- a/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/emeraldrapids/uncore-interconnect.json
@@ -4888,7 +4888,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AD)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AD)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AD",
"PerPkg": "1",
@@ -4897,7 +4897,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AK)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AK)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AK",
"PerPkg": "1",
@@ -4906,7 +4906,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AKC)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AKC)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AKC",
"PerPkg": "1",
@@ -4915,7 +4915,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (BL)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (BL)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.BL",
"PerPkg": "1",
@@ -4924,7 +4924,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (IV)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (IV)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.IV",
"PerPkg": "1",
@@ -5291,7 +5291,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -5300,7 +5300,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -5309,7 +5309,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -5318,7 +5318,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
@@ -5763,7 +5763,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -5772,7 +5772,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -5781,7 +5781,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -5790,7 +5790,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 281419d2edeb..85b89b1098cf 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -7,7 +7,7 @@ GenuineIntel-6-56,v11,broadwellde,core
GenuineIntel-6-4F,v22,broadwellx,core
GenuineIntel-6-55-[56789ABCDEF],v1.21,cascadelakex,core
GenuineIntel-6-9[6C],v1.04,elkhartlake,core
-GenuineIntel-6-CF,v1.03,emeraldrapids,core
+GenuineIntel-6-CF,v1.06,emeraldrapids,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
GenuineIntel-6-B6,v1.01,grandridge,core
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:02:19

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 04/12] perf vendor events intel: Update icelakex to 1.24

Update events from 1.23 to 1.24 as released in:
https://github.com/intel/perfmon/commit/d883888ae60882028e387b6fe1ebf683beb693fa

Fixes spelling and descriptions. Adds the uncore events
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL and
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE, while removing
UNC_IIO_NUM_REQ_FROM_CPU.IRP.

Signed-off-by: Ian Rogers <[email protected]>
---
.../arch/x86/icelakex/frontend.json | 2 +-
.../pmu-events/arch/x86/icelakex/memory.json | 1 +
.../arch/x86/icelakex/uncore-cache.json | 22 ++++++-
.../x86/icelakex/uncore-interconnect.json | 64 +++++++++----------
.../arch/x86/icelakex/uncore-io.json | 11 ----
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
6 files changed, 55 insertions(+), 47 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/icelakex/frontend.json b/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
index f6edc4222f42..66669d062e68 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
@@ -282,7 +282,7 @@
"CounterMask": "5",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+ "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the DSB (Decode Stream Buffer) path. Count includes uops that may 'bypass' the IDQ.",
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/memory.json b/tools/perf/pmu-events/arch/x86/icelakex/memory.json
index f36ac04f8d76..875b584b8443 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/memory.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/memory.json
@@ -319,6 +319,7 @@
"BriefDescription": "Number of times an RTM execution aborted.",
"EventCode": "0xc9",
"EventName": "RTM_RETIRED.ABORTED",
+ "PEBS": "1",
"PublicDescription": "Counts the number of times RTM abort was triggered.",
"SampleAfterValue": "100003",
"UMask": "0x4"
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/uncore-cache.json b/tools/perf/pmu-events/arch/x86/icelakex/uncore-cache.json
index b6ce14ebf844..a950ba3ddcb4 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/uncore-cache.json
@@ -1580,7 +1580,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "This event is deprecated. Refer to new event UNC_CHA_LLC_LOOKUP.CODE_READ",
+ "BriefDescription": "This event is deprecated.",
"Deprecated": "1",
"EventCode": "0x34",
"EventName": "UNC_CHA_LLC_LOOKUP.CODE",
@@ -1677,7 +1677,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "This event is deprecated. Refer to new event UNC_CHA_LLC_LOOKUP.DATA_READ",
+ "BriefDescription": "This event is deprecated.",
"Deprecated": "1",
"EventCode": "0x34",
"EventName": "UNC_CHA_LLC_LOOKUP.DATA_READ_ALL",
@@ -6782,6 +6782,24 @@
"UMask": "0xc8f3ff04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : PCIRdCurs issued by IO Devices and targets local memory : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f2ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : PCIRdCurs issued by IO Devices and targets remote memory : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f37f04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts : RFOs issued by IO Devices",
"EventCode": "0x35",
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
index a066a009c511..6997e6f7d366 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
@@ -13523,7 +13523,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -13532,7 +13532,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -13541,7 +13541,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -13550,7 +13550,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
@@ -13559,7 +13559,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.REQ",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Request : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Request : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x8",
"Unit": "UPI"
},
@@ -13568,7 +13568,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.REQ_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Request, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Request, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x108",
"Unit": "UPI"
},
@@ -13577,7 +13577,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSPCNFLT",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - Conflict : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - Conflict : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x1aa",
"Unit": "UPI"
},
@@ -13586,7 +13586,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSPI",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - Invalid : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - Invalid : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x12a",
"Unit": "UPI"
},
@@ -13595,7 +13595,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSP_DATA",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - Data : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - Data : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xc",
"Unit": "UPI"
},
@@ -13604,7 +13604,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSP_DATA_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - Data, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - Data, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10c",
"Unit": "UPI"
},
@@ -13613,7 +13613,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSP_NODATA",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - No Data : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - No Data : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xa",
"Unit": "UPI"
},
@@ -13622,7 +13622,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.RSP_NODATA_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Response - No Data, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Response - No Data, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10a",
"Unit": "UPI"
},
@@ -13631,7 +13631,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.SNP",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Snoop : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Snoop : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x9",
"Unit": "UPI"
},
@@ -13640,7 +13640,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.SNP_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Snoop, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Snoop, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x109",
"Unit": "UPI"
},
@@ -13649,7 +13649,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.WB",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Writeback : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Writeback : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xd",
"Unit": "UPI"
},
@@ -13658,7 +13658,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.WB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Writeback, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Writeback, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10d",
"Unit": "UPI"
},
@@ -14038,7 +14038,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -14047,7 +14047,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -14056,7 +14056,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -14065,7 +14065,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
@@ -14074,7 +14074,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.REQ",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Request : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Request : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x8",
"Unit": "UPI"
},
@@ -14083,7 +14083,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.REQ_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Request, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Request, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x108",
"Unit": "UPI"
},
@@ -14092,7 +14092,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSPCNFLT",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Conflict : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Conflict : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x1aa",
"Unit": "UPI"
},
@@ -14101,7 +14101,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSPI",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Invalid : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Invalid : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x12a",
"Unit": "UPI"
},
@@ -14110,7 +14110,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSP_DATA",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Data : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Data : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xc",
"Unit": "UPI"
},
@@ -14119,7 +14119,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSP_DATA_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Data, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - Data, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10c",
"Unit": "UPI"
},
@@ -14128,7 +14128,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSP_NODATA",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - No Data : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - No Data : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xa",
"Unit": "UPI"
},
@@ -14137,7 +14137,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.RSP_NODATA_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Response - No Data, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Response - No Data, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10a",
"Unit": "UPI"
},
@@ -14146,7 +14146,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.SNP",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Snoop : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Snoop : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x9",
"Unit": "UPI"
},
@@ -14155,7 +14155,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.SNP_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Snoop, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Snoop, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x109",
"Unit": "UPI"
},
@@ -14164,7 +14164,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.WB",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Writeback : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Writeback : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xd",
"Unit": "UPI"
},
@@ -14173,7 +14173,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.WB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Writeback, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Writeback, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10d",
"Unit": "UPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/uncore-io.json b/tools/perf/pmu-events/arch/x86/icelakex/uncore-io.json
index 9cef8862c428..1b8a719b81a5 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/uncore-io.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/uncore-io.json
@@ -2476,17 +2476,6 @@
"UMask": "0x10",
"Unit": "IIO"
},
- {
- "BriefDescription": "Number requests sent to PCIe from main die : From IRP",
- "EventCode": "0xC2",
- "EventName": "UNC_IIO_NUM_REQ_FROM_CPU.IRP",
- "FCMask": "0x07",
- "PerPkg": "1",
- "PortMask": "0xFF",
- "PublicDescription": "Number requests sent to PCIe from main die : From IRP : Captures Posted/Non-posted allocations from IRP. i.e. either non-confined P2P traffic or from the CPU",
- "UMask": "0x1",
- "Unit": "IIO"
- },
{
"BriefDescription": "Number requests sent to PCIe from main die : From ITC",
"EventCode": "0xC2",
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 650c28574576..a219dc3bd1ef 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -15,7 +15,7 @@ GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v35,haswell,core
GenuineIntel-6-3F,v28,haswellx,core
GenuineIntel-6-7[DE],v1.21,icelake,core
-GenuineIntel-6-6[AC],v1.23,icelakex,core
+GenuineIntel-6-6[AC],v1.24,icelakex,core
GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v24,ivytown,core
GenuineIntel-6-2D,v24,jaketown,core
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:02:51

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 09/12] perf vendor events intel: Update skylakex to 1.33

Update events from 1.32 to 1.33 as released in:
https://github.com/intel/perfmon/commit/3fe7390dd18496c35ec3a9cf17de0473fd5485cb

Various description updates. Adds the event
OFFCORE_RESPONSE.ALL_READS.L3_HIT.HIT_OTHER_CORE_FWD.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
tools/perf/pmu-events/arch/x86/skylakex/cache.json | 9 +++++++++
.../pmu-events/arch/x86/skylakex/frontend.json | 10 +++++-----
.../perf/pmu-events/arch/x86/skylakex/memory.json | 2 +-
tools/perf/pmu-events/arch/x86/skylakex/other.json | 2 +-
.../pmu-events/arch/x86/skylakex/pipeline.json | 2 +-
.../arch/x86/skylakex/uncore-interconnect.json | 14 +++++++-------
.../pmu-events/arch/x86/skylakex/uncore-io.json | 2 +-
.../arch/x86/skylakex/virtual-memory.json | 2 +-
9 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index ac32377ab01a..e67e7906b2cf 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -30,7 +30,7 @@ GenuineIntel-6-8F,v1.20,sapphirerapids,core
GenuineIntel-6-AF,v1.02,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v58,skylake,core
-GenuineIntel-6-55-[01234],v1.32,skylakex,core
+GenuineIntel-6-55-[01234],v1.33,skylakex,core
GenuineIntel-6-86,v1.21,snowridgex,core
GenuineIntel-6-8[CD],v1.15,tigerlake,core
GenuineIntel-6-2C,v5,westmereep-dp,core
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/cache.json b/tools/perf/pmu-events/arch/x86/skylakex/cache.json
index d28d8822a51a..14229f4b29d8 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/cache.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/cache.json
@@ -763,6 +763,15 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "OFFCORE_RESPONSE.ALL_READS.L3_HIT.HIT_OTHER_CORE_FWD hit in the L3 and the snoop to one of the sibling cores hits the line in E/S/F state and the line is forwarded.",
+ "EventCode": "0xB7, 0xBB",
+ "EventName": "OFFCORE_RESPONSE.ALL_READS.L3_HIT.HIT_OTHER_CORE_FWD",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x8003C07F7",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Counts all demand & prefetch RFOs that have any response type.",
"EventCode": "0xB7, 0xBB",
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
index 095904c77001..d6f543471b24 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
@@ -19,7 +19,7 @@
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
"EventName": "DSB2MITE_SWITCHES.COUNT",
- "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses.\nNote: Invoking MITE requires two or three cycles delay.",
+ "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses. Note: Invoking MITE requires two or three cycles delay.",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -267,11 +267,11 @@
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -321,11 +321,11 @@
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/memory.json b/tools/perf/pmu-events/arch/x86/skylakex/memory.json
index 2b797dbc75fe..dba3cd6b3690 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/memory.json
@@ -864,7 +864,7 @@
"BriefDescription": "Number of times an RTM execution aborted due to any reasons (multiple categories may count as one).",
"EventCode": "0xC9",
"EventName": "RTM_RETIRED.ABORTED",
- "PEBS": "1",
+ "PEBS": "2",
"PublicDescription": "Number of times RTM abort was triggered.",
"SampleAfterValue": "2000003",
"UMask": "0x4"
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/other.json b/tools/perf/pmu-events/arch/x86/skylakex/other.json
index cda8a7a45f0c..2511d722327a 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/other.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/other.json
@@ -19,7 +19,7 @@
"BriefDescription": "Core cycles where the core was running in a manner where Turbo may be clipped to the AVX512 turbo schedule.",
"EventCode": "0x28",
"EventName": "CORE_POWER.LVL2_TURBO_LICENSE",
- "PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server michroarchtecture). This includes high current AVX 512-bit instructions.",
+ "PublicDescription": "Core cycles where the core was running with power-delivery for license level 2 (introduced in Skylake Server microarchitecture). This includes high current AVX 512-bit instructions.",
"SampleAfterValue": "200003",
"UMask": "0x20"
},
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
index 66d686cc933e..c50ddf5b40dd 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
@@ -396,7 +396,7 @@
"Errata": "SKL091, SKL044",
"EventCode": "0xC0",
"EventName": "INST_RETIRED.NOP",
- "PEBS": "1",
+ "PEBS": "2",
"SampleAfterValue": "2000003",
"UMask": "0x2"
},
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
index 3eece8a728b5..f32d4d9d283a 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
@@ -38,7 +38,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CLFLUSH",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x80",
"Unit": "IRP"
},
@@ -47,7 +47,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CRD",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x2",
"Unit": "IRP"
},
@@ -56,7 +56,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.DRD",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x4",
"Unit": "IRP"
},
@@ -65,7 +65,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.PCIDCAHINT",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x20",
"Unit": "IRP"
},
@@ -74,7 +74,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.PCIRDCUR",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x1",
"Unit": "IRP"
},
@@ -101,7 +101,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.WBMTOI",
"PerPkg": "1",
- "PublicDescription": "Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x40",
"Unit": "IRP"
},
@@ -500,7 +500,7 @@
"EventCode": "0x11",
"EventName": "UNC_I_TRANSACTIONS.WRITES",
"PerPkg": "1",
- "PublicDescription": "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID.; Trackes only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
+ "PublicDescription": "Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID.; Tracks only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
"UMask": "0x2",
"Unit": "IRP"
},
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-io.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-io.json
index 2a3a709018bb..743c91f3d2f0 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-io.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-io.json
@@ -34,7 +34,7 @@
"EventCode": "0x1",
"EventName": "UNC_IIO_CLOCKTICKS",
"PerPkg": "1",
- "PublicDescription": "Counts clockticks of the 1GHz trafiic controller clock in the IIO unit.",
+ "PublicDescription": "Counts clockticks of the 1GHz traffic controller clock in the IIO unit.",
"Unit": "IIO"
},
{
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
index f59405877ae8..73feadaf7674 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
@@ -205,7 +205,7 @@
"BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake.",
"EventCode": "0x85",
"EventName": "ITLB_MISSES.WALK_PENDING",
- "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake michroarchitecture.",
+ "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake microarchitecture.",
"SampleAfterValue": "100003",
"UMask": "0x10"
},
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:02:59

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 07/12] perf vendor events intel: Update sapphirerapids to 1.20

Update events from 1.17 to 1.20 as released in:
https://github.com/intel/perfmon/commit/6f674057745acf0125395638ca6be36458a59bda

Various description updates. Adds uncore events
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE,
UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL, UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_LOCAL,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_REMOTE,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_LOCAL,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_REMOTE,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_LOCAL,
UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_REMOTE and removes core events
AMX_OPS_RETIRED.BF16 and AMX_OPS_RETIRED.INT8.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../arch/x86/sapphirerapids/cache.json | 1 +
.../arch/x86/sapphirerapids/frontend.json | 2 +-
.../arch/x86/sapphirerapids/memory.json | 1 +
.../arch/x86/sapphirerapids/pipeline.json | 19 +--
.../arch/x86/sapphirerapids/uncore-cache.json | 112 +++++++++++++++++-
.../sapphirerapids/uncore-interconnect.json | 26 ++--
7 files changed, 130 insertions(+), 33 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index fedaacbe981a..e5de35c96358 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -26,7 +26,7 @@ GenuineIntel-6-1[AEF],v4,nehalemep,core
GenuineIntel-6-2E,v4,nehalemex,core
GenuineIntel-6-A7,v1.02,rocketlake,core
GenuineIntel-6-2A,v19,sandybridge,core
-GenuineIntel-6-8F,v1.17,sapphirerapids,core
+GenuineIntel-6-8F,v1.20,sapphirerapids,core
GenuineIntel-6-AF,v1.01,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v58,skylake,core
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/cache.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/cache.json
index 9606e76b98d6..b0447aad0dfc 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/cache.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/cache.json
@@ -432,6 +432,7 @@
"BriefDescription": "Retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source where the data request missed all caches.",
"EventCode": "0xd3",
"EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM",
+ "PEBS": "1",
"PublicDescription": "Counts retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source and the data request missed L3.",
"SampleAfterValue": "100007",
"UMask": "0x10"
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/frontend.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/frontend.json
index 9e53da55d0c1..93d99318a623 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/frontend.json
@@ -267,7 +267,7 @@
"CounterMask": "6",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+ "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the DSB (Decode Stream Buffer) path. Count includes uops that may 'bypass' the IDQ.",
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/memory.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/memory.json
index e8bf7c9c44e1..5420f529f491 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/memory.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/memory.json
@@ -264,6 +264,7 @@
"BriefDescription": "Number of times an RTM execution aborted.",
"EventCode": "0xc9",
"EventName": "RTM_RETIRED.ABORTED",
+ "PEBS": "1",
"PublicDescription": "Counts the number of times RTM abort was triggered.",
"SampleAfterValue": "100003",
"UMask": "0x4"
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
index 2cfe814d2015..e2086bedeca8 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
@@ -1,20 +1,4 @@
[
- {
- "BriefDescription": "AMX retired arithmetic BF16 operations.",
- "EventCode": "0xce",
- "EventName": "AMX_OPS_RETIRED.BF16",
- "PublicDescription": "Number of AMX-based retired arithmetic bfloat16 (BF16) floating-point operations. Counts TDPBF16PS FP instructions. SW to use operation multiplier of 4",
- "SampleAfterValue": "1000003",
- "UMask": "0x2"
- },
- {
- "BriefDescription": "AMX retired arithmetic integer 8-bit operations.",
- "EventCode": "0xce",
- "EventName": "AMX_OPS_RETIRED.INT8",
- "PublicDescription": "Number of AMX-based retired arithmetic integer operations of 8-bit width source operands. Counts TDPB[SS,UU,US,SU]D instructions. SW should use operation multiplier of 8.",
- "SampleAfterValue": "1000003",
- "UMask": "0x1"
- },
{
"BriefDescription": "This event is deprecated. Refer to new event ARITH.DIV_ACTIVE",
"CounterMask": "1",
@@ -444,6 +428,7 @@
"BriefDescription": "INST_RETIRED.MACRO_FUSED",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.MACRO_FUSED",
+ "PEBS": "1",
"SampleAfterValue": "2000003",
"UMask": "0x10"
},
@@ -451,6 +436,7 @@
"BriefDescription": "Retired NOP instructions.",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.NOP",
+ "PEBS": "1",
"PublicDescription": "Counts all retired NOP or ENDBR32/64 instructions",
"SampleAfterValue": "2000003",
"UMask": "0x2"
@@ -467,6 +453,7 @@
"BriefDescription": "Iterations of Repeat string retired instructions.",
"EventCode": "0xc0",
"EventName": "INST_RETIRED.REP_ITERATION",
+ "PEBS": "1",
"PublicDescription": "Number of iterations of Repeat (REP) string retired instructions such as MOVS, CMPS, and SCAS. Each has a byte, word, and doubleword version and string instructions can be repeated using a repetition prefix, REP, that allows their architectural execution to be repeated a number of times as specified by the RCX register. Note the number of iterations is implementation-dependent.",
"SampleAfterValue": "2000003",
"UMask": "0x8"
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
index cf6fa70f37c1..25a2b9695135 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
@@ -4143,6 +4143,42 @@
"UMask": "0xcd43ff04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "ItoMCacheNear (partial write) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd42ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoMCacheNear (partial write) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Inserts : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC : Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd437f04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoM (write) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc42ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "ItoM (write) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc437f04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; Misses from local IO",
"EventCode": "0x35",
@@ -4153,7 +4189,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "TOR Inserts; ItoM misses from local IO",
+ "BriefDescription": "TOR Inserts : ItoM, indicating a full cacheline write request, from IO Devices that missed the LLC",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_ITOM",
"PerPkg": "1",
@@ -4171,7 +4207,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "TOR Inserts; RdCur and FsRdCur misses from local IO",
+ "BriefDescription": "TOR Inserts; RdCur and FsRdCur requests from local IO that miss LLC",
"EventCode": "0x35",
"EventName": "UNC_CHA_TOR_INSERTS.IO_MISS_PCIRDCUR",
"PerPkg": "1",
@@ -4197,6 +4233,24 @@
"UMask": "0xc8f3ff04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on a remote socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f2ff04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "PCIRDCUR (read) transactions from an IO device that addresses memory on the local socket",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "Counts the number of entries successfully inserted into the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f37f04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; RFO from local IO",
"EventCode": "0x35",
@@ -5565,6 +5619,42 @@
"UMask": "0xcd43fe04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd42fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy : ItoMCacheNears, indicating a partial write request, from IO Devices that missed the LLC and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOMCACHENEAR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcd437e04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; ITOM misses from local IO and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc42fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; ITOM misses from local IO and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_ITOM_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xcc437e04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO",
"EventCode": "0x36",
@@ -5574,6 +5664,24 @@
"UMask": "0xc8f3fe04",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO and targets local memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_LOCAL",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f2fe04",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "TOR Occupancy; RdCur and FsRdCur misses from local IO and targets remote memory",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IO_MISS_PCIRDCUR_REMOTE",
+ "PerPkg": "1",
+ "PublicDescription": "For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent. Does not include addressless requests such as locks and interrupts.",
+ "UMask": "0xc8f37e04",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RFO misses from local IO",
"EventCode": "0x36",
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
index 65d088556bae..22bb490e9666 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
@@ -4888,7 +4888,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AD)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AD)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AD",
"PerPkg": "1",
@@ -4897,7 +4897,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AK)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AK)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AK",
"PerPkg": "1",
@@ -4906,7 +4906,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (AKC)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (AKC)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.AKC",
"PerPkg": "1",
@@ -4915,7 +4915,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (BL)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (BL)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.BL",
"PerPkg": "1",
@@ -4924,7 +4924,7 @@
"Unit": "MDF"
},
{
- "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO\r\nIngress (V-EMIB) (IV)",
+ "BriefDescription": "Number of cycles incoming messages from the vertical ring that are bounced at the SBO Ingress (V-EMIB) (IV)",
"EventCode": "0x4B",
"EventName": "UNC_MDF_CRS_TxR_V_BOUNCES.IV",
"PerPkg": "1",
@@ -5291,7 +5291,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -5300,7 +5300,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -5309,7 +5309,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -5318,7 +5318,7 @@
"EventCode": "0x05",
"EventName": "UNC_UPI_RxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Receive path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Receive path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
@@ -5763,7 +5763,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xe",
"Unit": "UPI"
},
@@ -5772,7 +5772,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCB_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Bypass, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10e",
"Unit": "UPI"
},
@@ -5781,7 +5781,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0xf",
"Unit": "UPI"
},
@@ -5790,7 +5790,7 @@
"EventCode": "0x04",
"EventName": "UNC_UPI_TxL_BASIC_HDR_MATCH.NCS_OPC",
"PerPkg": "1",
- "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port.\r\nMatch based on UMask specific bits:\r\nZ: Message Class (3-bit)\r\nY: Message Class Enable\r\nW: Opcode (4-bit)\r\nV: Opcode Enable\r\nU: Local Enable\r\nT: Remote Enable\r\nS: Data Hdr Enable\r\nR: Non-Data Hdr Enable\r\nQ: Dual Slot Hdr Enable\r\nP: Single Slot Hdr Enable\r\nLink Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases.\r\nNote: If Message Class is disabled, we expect opcode to also be disabled.",
+ "PublicDescription": "Matches on Transmit path of a UPI Port : Non-Coherent Standard, Match Opcode : Matches on Transmit path of a UPI port. Match based on UMask specific bits: Z: Message Class (3-bit) Y: Message Class Enable W: Opcode (4-bit) V: Opcode Enable U: Local Enable T: Remote Enable S: Data Hdr Enable R: Non-Data Hdr Enable Q: Dual Slot Hdr Enable P: Single Slot Hdr Enable Link Layer control types are excluded (LL CTRL, slot NULL, LLCRD) even under specific opcode match_en cases. Note: If Message Class is disabled, we expect opcode to also be disabled.",
"UMask": "0x10f",
"Unit": "UPI"
},
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:03:37

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 11/12] perf vendor events intel: Update snowridgex to 1.22

Update events from 1.21 to 1.22 as released in:
https://github.com/intel/perfmon/commit/ba4f96039f96231b51e3eb69d5a21e2b00f6de5b

Updates various descriptions and removes the event
UNC_IIO_NUM_REQ_FROM_CPU.IRP.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../pmu-events/arch/x86/snowridgex/uncore-cache.json | 4 ++--
.../arch/x86/snowridgex/uncore-interconnect.json | 6 +++---
.../pmu-events/arch/x86/snowridgex/uncore-io.json | 11 -----------
4 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index e67e7906b2cf..c372e3594a69 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -31,7 +31,7 @@ GenuineIntel-6-AF,v1.02,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v58,skylake,core
GenuineIntel-6-55-[01234],v1.33,skylakex,core
-GenuineIntel-6-86,v1.21,snowridgex,core
+GenuineIntel-6-86,v1.22,snowridgex,core
GenuineIntel-6-8[CD],v1.15,tigerlake,core
GenuineIntel-6-2C,v5,westmereep-dp,core
GenuineIntel-6-25,v4,westmereep-sp,core
diff --git a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-cache.json b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-cache.json
index a68a5bb05c22..4090e4da1bd0 100644
--- a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-cache.json
@@ -1444,7 +1444,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "This event is deprecated. Refer to new event UNC_CHA_LLC_LOOKUP.DATA_READ_LOCAL",
+ "BriefDescription": "This event is deprecated.",
"Deprecated": "1",
"EventCode": "0x34",
"EventName": "UNC_CHA_LLC_LOOKUP.DMND_READ_LOCAL",
@@ -1638,7 +1638,7 @@
"Unit": "CHA"
},
{
- "BriefDescription": "This event is deprecated. Refer to new event UNC_CHA_LLC_LOOKUP.RFO_LOCAL",
+ "BriefDescription": "This event is deprecated.",
"Deprecated": "1",
"EventCode": "0x34",
"EventName": "UNC_CHA_LLC_LOOKUP.RFO_PREF_LOCAL",
diff --git a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-interconnect.json
index 7e2895f7fe3d..7cc3635b118b 100644
--- a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-interconnect.json
@@ -38,7 +38,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.CLFLUSH",
"PerPkg": "1",
- "PublicDescription": "Coherent Ops : CLFlush : Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Coherent Ops : CLFlush : Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x80",
"Unit": "IRP"
},
@@ -65,7 +65,7 @@
"EventCode": "0x10",
"EventName": "UNC_I_COHERENT_OPS.WBMTOI",
"PerPkg": "1",
- "PublicDescription": "Coherent Ops : WbMtoI : Counts the number of coherency related operations servied by the IRP",
+ "PublicDescription": "Coherent Ops : WbMtoI : Counts the number of coherency related operations serviced by the IRP",
"UMask": "0x40",
"Unit": "IRP"
},
@@ -454,7 +454,7 @@
"EventCode": "0x11",
"EventName": "UNC_I_TRANSACTIONS.WRITES",
"PerPkg": "1",
- "PublicDescription": "Inbound Transaction Count : Writes : Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID. : Trackes only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
+ "PublicDescription": "Inbound Transaction Count : Writes : Counts the number of Inbound transactions from the IRP to the Uncore. This can be filtered based on request type in addition to the source queue. Note the special filtering equation. We do OR-reduction on the request type. If the SOURCE bit is set, then we also do AND qualification based on the source portID. : Tracks only write requests. Each write request should have a prefetch, so there is no need to explicitly track these requests. For writes that are tickled and have to retry, the counter will be incremented for each retry.",
"UMask": "0x2",
"Unit": "IRP"
},
diff --git a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-io.json b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-io.json
index ecdd6f0f8e8f..de156e499f56 100644
--- a/tools/perf/pmu-events/arch/x86/snowridgex/uncore-io.json
+++ b/tools/perf/pmu-events/arch/x86/snowridgex/uncore-io.json
@@ -2505,17 +2505,6 @@
"UMask": "0x10",
"Unit": "IIO"
},
- {
- "BriefDescription": "Number requests sent to PCIe from main die : From IRP",
- "EventCode": "0xC2",
- "EventName": "UNC_IIO_NUM_REQ_FROM_CPU.IRP",
- "FCMask": "0x07",
- "PerPkg": "1",
- "PortMask": "0xFF",
- "PublicDescription": "Number requests sent to PCIe from main die : From IRP : Captures Posted/Non-posted allocations from IRP. i.e. either non-confined P2P traffic or from the CPU",
- "UMask": "0x1",
- "Unit": "IIO"
- },
{
"BriefDescription": "Number requests sent to PCIe from main die : From ITC",
"EventCode": "0xC2",
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:03:44

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 10/12] perf vendor events intel: Update skylake to v58

Update events from:
https://github.com/intel/perfmon/commit/f2e5136e062a91ae554dc40530132e66f9271848
This change didn't increase the version number from v58.

Updates various descriptions.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/skylake/frontend.json | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/skylake/frontend.json b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
index 095904c77001..d6f543471b24 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
@@ -19,7 +19,7 @@
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
"EventName": "DSB2MITE_SWITCHES.COUNT",
- "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses.\nNote: Invoking MITE requires two or three cycles delay.",
+ "PublicDescription": "This event counts the number of the Decode Stream Buffer (DSB)-to-MITE switches including all misses because of missing Decode Stream Buffer (DSB) cache and u-arch forced misses. Note: Invoking MITE requires two or three cycles delay.",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -267,11 +267,11 @@
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -321,11 +321,11 @@
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 or more Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "PublicDescription": "Counts the number of cycles 4 or more uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 14:25:50

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v1 00/12] perf vendor events intel update

On Wed, Mar 20, 2024 at 11:00:04PM -0700, Ian Rogers wrote:
> Update events to the latest on:
> https://github.com/intel/perfmon
>
> Using the converter script:
> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Kan,

Can you please take a look and provide an Acked or Reviewed-by?

Thanks!

- Arnaldo

> Ian Rogers (12):
> perf vendor events intel: Update cascadelakex to 1.21
> perf vendor events intel: Update emeraldrapids to 1.06
> perf vendor events intel: Update grandridge to 1.02
> perf vendor events intel: Update icelakex to 1.24
> perf vendor events intel: Update lunarlake to 1.01
> perf vendor events intel: Update meteorlake to 1.08
> perf vendor events intel: Update sapphirerapids to 1.20
> perf vendor events intel: Update sierraforest to 1.02
> perf vendor events intel: Update skylakex to 1.33
> perf vendor events intel: Update skylake to v58
> perf vendor events intel: Update snowridgex to 1.22
> perf vendor events intel: Remove info metrics erroneously in TopdownL1
>
> .../arch/x86/broadwellx/bdx-metrics.json | 35 +++---
> .../arch/x86/cascadelakex/clx-metrics.json | 85 +++++--------
> .../arch/x86/cascadelakex/frontend.json | 10 +-
> .../arch/x86/cascadelakex/memory.json | 2 +-
> .../arch/x86/cascadelakex/other.json | 2 +-
> .../arch/x86/cascadelakex/pipeline.json | 2 +-
> .../x86/cascadelakex/uncore-interconnect.json | 14 +--
> .../arch/x86/cascadelakex/virtual-memory.json | 2 +-
> .../arch/x86/emeraldrapids/frontend.json | 2 +-
> .../arch/x86/emeraldrapids/memory.json | 1 +
> .../arch/x86/emeraldrapids/pipeline.json | 3 +
> .../arch/x86/emeraldrapids/uncore-cache.json | 112 ++++++++++++++++-
> .../emeraldrapids/uncore-interconnect.json | 26 ++--
> .../arch/x86/grandridge/pipeline.json | 43 ++++++-
> .../arch/x86/grandridge/uncore-cache.json | 28 ++++-
> .../arch/x86/haswellx/hsx-metrics.json | 35 +++---
> .../arch/x86/icelakex/frontend.json | 2 +-
> .../arch/x86/icelakex/icx-metrics.json | 95 ++++++--------
> .../pmu-events/arch/x86/icelakex/memory.json | 1 +
> .../arch/x86/icelakex/uncore-cache.json | 22 +++-
> .../x86/icelakex/uncore-interconnect.json | 64 +++++-----
> .../arch/x86/icelakex/uncore-io.json | 11 --
> .../pmu-events/arch/x86/lunarlake/cache.json | 24 ++--
> .../arch/x86/lunarlake/frontend.json | 2 +-
> .../pmu-events/arch/x86/lunarlake/memory.json | 4 +-
> .../pmu-events/arch/x86/lunarlake/other.json | 4 +-
> .../arch/x86/lunarlake/pipeline.json | 109 +++++++++++++---
> tools/perf/pmu-events/arch/x86/mapfile.csv | 20 +--
> .../pmu-events/arch/x86/meteorlake/cache.json | 30 +++++
> .../arch/x86/meteorlake/frontend.json | 4 +-
> .../arch/x86/meteorlake/memory.json | 20 +++
> .../pmu-events/arch/x86/meteorlake/other.json | 42 ++++++-
> .../arch/x86/meteorlake/pipeline.json | 44 +++++--
> .../x86/meteorlake/uncore-interconnect.json | 22 +++-
> .../arch/x86/sapphirerapids/cache.json | 1 +
> .../arch/x86/sapphirerapids/frontend.json | 2 +-
> .../arch/x86/sapphirerapids/memory.json | 1 +
> .../arch/x86/sapphirerapids/pipeline.json | 19 +--
> .../arch/x86/sapphirerapids/spr-metrics.json | 119 +++++++-----------
> .../arch/x86/sapphirerapids/uncore-cache.json | 112 ++++++++++++++++-
> .../sapphirerapids/uncore-interconnect.json | 26 ++--
> .../arch/x86/sierraforest/pipeline.json | 36 +++++-
> .../pmu-events/arch/x86/skylake/frontend.json | 10 +-
> .../pmu-events/arch/x86/skylakex/cache.json | 9 ++
> .../arch/x86/skylakex/frontend.json | 10 +-
> .../pmu-events/arch/x86/skylakex/memory.json | 2 +-
> .../pmu-events/arch/x86/skylakex/other.json | 2 +-
> .../arch/x86/skylakex/pipeline.json | 2 +-
> .../arch/x86/skylakex/skx-metrics.json | 85 +++++--------
> .../x86/skylakex/uncore-interconnect.json | 14 +--
> .../arch/x86/skylakex/uncore-io.json | 2 +-
> .../arch/x86/skylakex/virtual-memory.json | 2 +-
> .../arch/x86/snowridgex/uncore-cache.json | 4 +-
> .../x86/snowridgex/uncore-interconnect.json | 6 +-
> .../arch/x86/snowridgex/uncore-io.json | 11 --
> 55 files changed, 911 insertions(+), 486 deletions(-)
>
> --
> 2.44.0.396.g6e790dbe36-goog
>

2024-03-21 15:14:51

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 00/12] perf vendor events intel update



On 2024-03-21 10:25 a.m., Arnaldo Carvalho de Melo wrote:
> On Wed, Mar 20, 2024 at 11:00:04PM -0700, Ian Rogers wrote:
>> Update events to the latest on:
>> https://github.com/intel/perfmon
>>
>> Using the converter script:
>> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
>
> Kan,
>
> Can you please take a look and provide an Acked or Reviewed-by?
>


Sure. The patch series look good to me. Thanks Ian!

Reviewed-by: Kan Liang <[email protected]>

Thanks,
Kan

> Thanks!
>
> - Arnaldo
>
>> Ian Rogers (12):
>> perf vendor events intel: Update cascadelakex to 1.21
>> perf vendor events intel: Update emeraldrapids to 1.06
>> perf vendor events intel: Update grandridge to 1.02
>> perf vendor events intel: Update icelakex to 1.24
>> perf vendor events intel: Update lunarlake to 1.01
>> perf vendor events intel: Update meteorlake to 1.08
>> perf vendor events intel: Update sapphirerapids to 1.20
>> perf vendor events intel: Update sierraforest to 1.02
>> perf vendor events intel: Update skylakex to 1.33
>> perf vendor events intel: Update skylake to v58
>> perf vendor events intel: Update snowridgex to 1.22
>> perf vendor events intel: Remove info metrics erroneously in TopdownL1
>>
>> .../arch/x86/broadwellx/bdx-metrics.json | 35 +++---
>> .../arch/x86/cascadelakex/clx-metrics.json | 85 +++++--------
>> .../arch/x86/cascadelakex/frontend.json | 10 +-
>> .../arch/x86/cascadelakex/memory.json | 2 +-
>> .../arch/x86/cascadelakex/other.json | 2 +-
>> .../arch/x86/cascadelakex/pipeline.json | 2 +-
>> .../x86/cascadelakex/uncore-interconnect.json | 14 +--
>> .../arch/x86/cascadelakex/virtual-memory.json | 2 +-
>> .../arch/x86/emeraldrapids/frontend.json | 2 +-
>> .../arch/x86/emeraldrapids/memory.json | 1 +
>> .../arch/x86/emeraldrapids/pipeline.json | 3 +
>> .../arch/x86/emeraldrapids/uncore-cache.json | 112 ++++++++++++++++-
>> .../emeraldrapids/uncore-interconnect.json | 26 ++--
>> .../arch/x86/grandridge/pipeline.json | 43 ++++++-
>> .../arch/x86/grandridge/uncore-cache.json | 28 ++++-
>> .../arch/x86/haswellx/hsx-metrics.json | 35 +++---
>> .../arch/x86/icelakex/frontend.json | 2 +-
>> .../arch/x86/icelakex/icx-metrics.json | 95 ++++++--------
>> .../pmu-events/arch/x86/icelakex/memory.json | 1 +
>> .../arch/x86/icelakex/uncore-cache.json | 22 +++-
>> .../x86/icelakex/uncore-interconnect.json | 64 +++++-----
>> .../arch/x86/icelakex/uncore-io.json | 11 --
>> .../pmu-events/arch/x86/lunarlake/cache.json | 24 ++--
>> .../arch/x86/lunarlake/frontend.json | 2 +-
>> .../pmu-events/arch/x86/lunarlake/memory.json | 4 +-
>> .../pmu-events/arch/x86/lunarlake/other.json | 4 +-
>> .../arch/x86/lunarlake/pipeline.json | 109 +++++++++++++---
>> tools/perf/pmu-events/arch/x86/mapfile.csv | 20 +--
>> .../pmu-events/arch/x86/meteorlake/cache.json | 30 +++++
>> .../arch/x86/meteorlake/frontend.json | 4 +-
>> .../arch/x86/meteorlake/memory.json | 20 +++
>> .../pmu-events/arch/x86/meteorlake/other.json | 42 ++++++-
>> .../arch/x86/meteorlake/pipeline.json | 44 +++++--
>> .../x86/meteorlake/uncore-interconnect.json | 22 +++-
>> .../arch/x86/sapphirerapids/cache.json | 1 +
>> .../arch/x86/sapphirerapids/frontend.json | 2 +-
>> .../arch/x86/sapphirerapids/memory.json | 1 +
>> .../arch/x86/sapphirerapids/pipeline.json | 19 +--
>> .../arch/x86/sapphirerapids/spr-metrics.json | 119 +++++++-----------
>> .../arch/x86/sapphirerapids/uncore-cache.json | 112 ++++++++++++++++-
>> .../sapphirerapids/uncore-interconnect.json | 26 ++--
>> .../arch/x86/sierraforest/pipeline.json | 36 +++++-
>> .../pmu-events/arch/x86/skylake/frontend.json | 10 +-
>> .../pmu-events/arch/x86/skylakex/cache.json | 9 ++
>> .../arch/x86/skylakex/frontend.json | 10 +-
>> .../pmu-events/arch/x86/skylakex/memory.json | 2 +-
>> .../pmu-events/arch/x86/skylakex/other.json | 2 +-
>> .../arch/x86/skylakex/pipeline.json | 2 +-
>> .../arch/x86/skylakex/skx-metrics.json | 85 +++++--------
>> .../x86/skylakex/uncore-interconnect.json | 14 +--
>> .../arch/x86/skylakex/uncore-io.json | 2 +-
>> .../arch/x86/skylakex/virtual-memory.json | 2 +-
>> .../arch/x86/snowridgex/uncore-cache.json | 4 +-
>> .../x86/snowridgex/uncore-interconnect.json | 6 +-
>> .../arch/x86/snowridgex/uncore-io.json | 11 --
>> 55 files changed, 911 insertions(+), 486 deletions(-)
>>
>> --
>> 2.44.0.396.g6e790dbe36-goog
>>
>

2024-03-21 15:23:54

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v1 00/12] perf vendor events intel update

On Thu, Mar 21, 2024 at 11:14:35AM -0400, Liang, Kan wrote:
>
>
> On 2024-03-21 10:25 a.m., Arnaldo Carvalho de Melo wrote:
> > On Wed, Mar 20, 2024 at 11:00:04PM -0700, Ian Rogers wrote:
> >> Update events to the latest on:
> >> https://github.com/intel/perfmon
> >>
> >> Using the converter script:
> >> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
> >
> > Kan,
> >
> > Can you please take a look and provide an Acked or Reviewed-by?
> >
>
>
> Sure. The patch series look good to me. Thanks Ian!
>
> Reviewed-by: Kan Liang <[email protected]>

Thanks!

- Arnaldo

2024-03-21 06:03:53

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 12/12] perf vendor events intel: Remove info metrics erroneously in TopdownL1

Bug affected server metrics only. This doesn't impact default metrics
but if the TopdownL1 metric group is specified. Passes on the fix in:
https://github.com/intel/perfmon/commit/b09f0a3953234ec592b4a872b87764c78da05d8b

Signed-off-by: Ian Rogers <[email protected]>
---
.../arch/x86/broadwellx/bdx-metrics.json | 35 +++---
.../arch/x86/cascadelakex/clx-metrics.json | 85 +++++--------
.../arch/x86/haswellx/hsx-metrics.json | 35 +++---
.../arch/x86/icelakex/icx-metrics.json | 95 ++++++--------
.../arch/x86/sapphirerapids/spr-metrics.json | 119 +++++++-----------
.../arch/x86/skylakex/skx-metrics.json | 85 +++++--------
6 files changed, 181 insertions(+), 273 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json b/tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json
index f2d378c9d68f..0aed533da882 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json
@@ -732,9 +732,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "",
@@ -745,9 +744,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -764,9 +762,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "L2 cache hits per kilo instruction for all request types (including speculative)",
@@ -807,9 +804,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -838,16 +834,14 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -867,9 +861,8 @@
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "tma_info_memory_tlb_page_walks_utilization",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
index 7f88b156f73b..297046818efe 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
@@ -670,23 +670,20 @@
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - tma_core_bound / (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) if tma_core_bound < (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0)",
- "MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_core_bound_likely",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Cor;SMT",
+ "MetricName": "tma_info_botlnk_core_bound_likely"
},
{
"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fetch_BW Bottleneck.",
"MetricExpr": "100 * (100 * (tma_fetch_latency * (DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) / ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 9 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(2 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) + tma_fetch_bandwidth * tma_mite / (tma_mite + tma_dsb)))",
- "MetricGroup": "DSBmiss;Fed;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_dsb_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "DSBmiss;Fed",
+ "MetricName": "tma_info_botlnk_dsb_misses"
},
{
"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bottleneck.",
"MetricExpr": "100 * (100 * (tma_fetch_latency * ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD) / ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 9 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(2 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD)))",
- "MetricGroup": "Fed;FetchLat;IcMiss;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_ic_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;FetchLat;IcMiss",
+ "MetricName": "tma_info_botlnk_ic_misses"
},
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
@@ -1045,9 +1042,8 @@
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_code_stlb_mpki",
- "MetricGroup": "Fed;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_code_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;MemoryTLB",
+ "MetricName": "tma_info_memory_code_stlb_mpki"
},
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
@@ -1088,9 +1084,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "Fill Buffer (FB) hits per kilo instructions for retired demand loads (L1D misses that merge into ongoing miss-handling entries)",
@@ -1107,9 +1102,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -1132,23 +1126,20 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "Rate of non silent evictions from the L2 cache per Kilo instruction",
"MetricExpr": "1e3 * L2_LINES_OUT.NON_SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki"
},
{
"BriefDescription": "Rate of silent evictions from the L2 cache per Kilo instruction where the evicted lines are dropped (no writeback to L3 or memory)",
"MetricExpr": "1e3 * L2_LINES_OUT.SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_silent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_silent_pki"
},
{
"BriefDescription": "L2 cache hits per kilo instruction for all request types (including speculative)",
@@ -1189,9 +1180,8 @@
{
"BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * OFFCORE_REQUESTS.ALL_REQUESTS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_access_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW;Offcore",
+ "MetricName": "tma_info_memory_l3_cache_access_bw_2t"
},
{
"BriefDescription": "",
@@ -1202,9 +1192,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -1233,16 +1222,14 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -1253,9 +1240,8 @@
{
"BriefDescription": "STLB (2nd level TLB) data load speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_load_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_load_stlb_mpki"
},
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
@@ -1273,16 +1259,14 @@
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "tma_info_memory_tlb_page_walks_utilization",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "STLB (2nd level TLB) data store speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_store_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_store_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_store_stlb_mpki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
@@ -1313,9 +1297,8 @@
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
"MetricExpr": "1e3 * MEM_LOAD_MISC_RETIRED.UC / INST_RETIRED.ANY",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_uc_load_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_uc_load_pki"
},
{
"BriefDescription": "",
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json b/tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json
index 21e2cb5e3178..83d50d80a148 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/hsx-metrics.json
@@ -618,9 +618,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "",
@@ -631,9 +630,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -650,9 +648,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "L2 cache true misses per kilo instruction for retired demand loads",
@@ -669,9 +666,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -700,16 +696,14 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -729,9 +723,8 @@
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "tma_info_memory_tlb_page_walks_utilization",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
index c015b8277dc7..769ba12bef87 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
@@ -667,23 +667,20 @@
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots - (CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES) * (topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots)) / (((cpu@EXE_ACTIVITY.3_PORTS_UTIL\\,umask\\=0x80@ + max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots - (CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES) * (topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots)) * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) if max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots - (CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES) * (topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISCCLEARS_COUNT / slots)) < (((cpu@EXE_ACTIVITY.3_PORTS_UTIL\\,umask\\=0x80@ + max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots - (CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES) * (topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 5 * INT_MISC.CLEARS_COUNT / slots)) * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0)",
- "MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_core_bound_likely",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Cor;SMT",
+ "MetricName": "tma_info_botlnk_core_bound_likely"
},
{
"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fetch_BW Bottleneck.",
"MetricExpr": "100 * (100 * ((5 * IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE - INT_MISC.UOP_DROPPING) / slots * (DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) / (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 10 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(3 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) + max(0, topdown\\-fe\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - INT_MISC.UOP_DROPPING / slots - (5 * IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE - INT_MISC.UOP_DROPPING) / slots) * ((IDQ.MITE_CYCLES_ANY - IDQ.MITE_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2) / ((IDQ.MITE_CYCLES_ANY - IDQ.MITE_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2 + (IDQ.DSB_CYCLES_ANY - IDQ.DSB_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2)))",
- "MetricGroup": "DSBmiss;Fed;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_dsb_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "DSBmiss;Fed",
+ "MetricName": "tma_info_botlnk_dsb_misses"
},
{
"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bottleneck.",
"MetricExpr": "100 * (100 * ((5 * IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE - INT_MISC.UOP_DROPPING) / slots * (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD) / (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 10 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(3 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD)))",
- "MetricGroup": "Fed;FetchLat;IcMiss;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_ic_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;FetchLat;IcMiss",
+ "MetricName": "tma_info_botlnk_ic_misses"
},
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
@@ -1045,16 +1042,14 @@
{
"BriefDescription": "\"Bus lock\" per kilo instruction",
"MetricExpr": "tma_info_memory_mix_bus_lock_pki",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_bus_lock_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_bus_lock_pki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_code_stlb_mpki",
- "MetricGroup": "Fed;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_code_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;MemoryTLB",
+ "MetricName": "tma_info_memory_code_stlb_mpki"
},
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
@@ -1095,9 +1090,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "Fill Buffer (FB) hits per kilo instructions for retired demand loads (L1D misses that merge into ongoing miss-handling entries)",
@@ -1114,9 +1108,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -1139,23 +1132,20 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "Rate of non silent evictions from the L2 cache per Kilo instruction",
"MetricExpr": "1e3 * L2_LINES_OUT.NON_SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki"
},
{
"BriefDescription": "Rate of silent evictions from the L2 cache per Kilo instruction where the evicted lines are dropped (no writeback to L3 or memory)",
"MetricExpr": "1e3 * L2_LINES_OUT.SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_silent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_silent_pki"
},
{
"BriefDescription": "L2 cache hits per kilo instruction for all demand loads (including speculative)",
@@ -1190,9 +1180,8 @@
{
"BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * OFFCORE_REQUESTS.ALL_REQUESTS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_access_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW;Offcore",
+ "MetricName": "tma_info_memory_l3_cache_access_bw_2t"
},
{
"BriefDescription": "",
@@ -1203,9 +1192,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -1240,23 +1228,20 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / cpu@OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD\\,cmask\\=0x1@",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Average Latency for L3 cache miss demand Loads",
"MetricExpr": "cpu@OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD\\,umask\\=0x10@ / OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l3_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l3_miss_latency"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -1267,9 +1252,8 @@
{
"BriefDescription": "STLB (2nd level TLB) data load speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_load_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_load_stlb_mpki"
},
{
"BriefDescription": "\"Bus lock\" per kilo instruction",
@@ -1293,16 +1277,14 @@
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "(ITLB_MISSES.WALK_PENDING + DTLB_LOAD_MISSES.WALK_PENDING + DTLB_STORE_MISSES.WALK_PENDING) / (2 * (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD))",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "STLB (2nd level TLB) data store speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_store_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_store_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_store_stlb_mpki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
@@ -1332,9 +1314,8 @@
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
"MetricExpr": "1e3 * MEM_LOAD_MISC_RETIRED.UC / INST_RETIRED.ANY",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_uc_load_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_uc_load_pki"
},
{
"BriefDescription": "",
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
index 6f0e6360e989..f8c0eac8b828 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
@@ -727,23 +727,20 @@
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - topdown\\-mem\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound)) / (((cpu@EXE_ACTIVITY.3_PORTS_UTIL\\,umask\\=0x80@ + [email protected]\\,umask\\=0x1@) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - EXE_ACTIVITY.BOUND_ON_LOADS) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * cpu@EXE_ACTIVITY.2_PORTS_UTIL\\,umask\\=0xc@)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIV_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - EXE_ACTIVITY.BOUND_ON_LOADS else (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * cpu@EXE_ACTIVITY.2_PORTS_UTIL\\,umask\\=0xc@) / CPU_CLK_UNHALTED.THREAD) if max(0, topdown\\-be\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - topdown\\-mem\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound)) < (((cpu@EXE_ACTIVITY.3_PORTS_UTIL\\,umask\\=0x80@ + [email protected]\\,umask\\=0x1@) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - EXE_ACTIVITY.BOUND_ON_LOADS) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * cpu@EXE_ACTIVITY.2_PORTS_UTIL\\,umask\\=0xc@)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIV_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - EXE_ACTIVITY.BOUND_ON_LOADS else (EXE_ACTIVITY.1_PORTS_UTIL + topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) * cpu@EXE_ACTIVITY.2_PORTS_UTIL\\,umask\\=0xc@) / CPU_CLK_UNHALTED.THREAD) else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0) + 0 * slots",
- "MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_core_bound_likely",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Cor;SMT",
+ "MetricName": "tma_info_botlnk_core_bound_likely"
},
{
"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fetch_BW Bottleneck.",
"MetricExpr": "100 * (100 * ((topdown\\-fetch\\-lat / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - INT_MISC.UOP_DROPPING / slots) * (DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) / (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + INT_MISC.UNKNOWN_BRANCH_CYCLES / CPU_CLK_UNHALTED.THREAD) + min(3 * cpu@UOPS_RETIRED.MS\\,cmask\\=0x1\\,edge\\=0x1@ / (UOPS_RETIRED.SLOTS / UOPS_ISSUED.ANY) / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) + max(0, topdown\\-fe\\-bound / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - INT_MISC.UOP_DROPPING / slots - (topdown\\-fetch\\-lat / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - INT_MISC.UOP_DROPPING / slots)) * ((IDQ.MITE_CYCLES_ANY - IDQ.MITE_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2) / ((IDQ.MITE_CYCLES_ANY - IDQ.MITE_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2 + (IDQ.DSB_CYCLES_ANY - IDQDSB_CYCLES_OK) / (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD) / 2)))",
- "MetricGroup": "DSBmiss;Fed;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_dsb_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "DSBmiss;Fed",
+ "MetricName": "tma_info_botlnk_dsb_misses"
},
{
"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bottleneck.",
"MetricExpr": "100 * (100 * ((topdown\\-fetch\\-lat / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) - INT_MISC.UOP_DROPPING / slots) * (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD) / (ICACHE_DATA.STALLS / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + INT_MISC.UNKNOWN_BRANCH_CYCLES / CPU_CLK_UNHALTED.THREAD) + min(3 * cpu@UOPS_RETIRED.MS\\,cmask\\=0x1\\,edge\\=0x1@ / (UOPS_RETIREDSLOTS / UOPS_ISSUED.ANY) / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD)))",
- "MetricGroup": "Fed;FetchLat;IcMiss;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_ic_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;FetchLat;IcMiss",
+ "MetricName": "tma_info_botlnk_ic_misses"
},
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
@@ -1113,16 +1110,14 @@
{
"BriefDescription": "\"Bus lock\" per kilo instruction",
"MetricExpr": "tma_info_memory_mix_bus_lock_pki",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_bus_lock_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_bus_lock_pki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_code_stlb_mpki",
- "MetricGroup": "Fed;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_code_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;MemoryTLB",
+ "MetricName": "tma_info_memory_code_stlb_mpki"
},
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
@@ -1163,9 +1158,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "Fill Buffer (FB) hits per kilo instructions for retired demand loads (L1D misses that merge into ongoing miss-handling entries)",
@@ -1182,9 +1176,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -1207,23 +1200,20 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "Rate of non silent evictions from the L2 cache per Kilo instruction",
"MetricExpr": "1e3 * L2_LINES_OUT.NON_SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki"
},
{
"BriefDescription": "Rate of silent evictions from the L2 cache per Kilo instruction where the evicted lines are dropped (no writeback to L3 or memory)",
"MetricExpr": "1e3 * L2_LINES_OUT.SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_silent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_silent_pki"
},
{
"BriefDescription": "L2 cache hits per kilo instruction for all request types (including speculative)",
@@ -1264,9 +1254,8 @@
{
"BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * OFFCORE_REQUESTS.ALL_REQUESTS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_access_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW;Offcore",
+ "MetricName": "tma_info_memory_l3_cache_access_bw_2t"
},
{
"BriefDescription": "",
@@ -1277,9 +1266,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -1314,23 +1302,20 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / cpu@OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD\\,cmask\\=0x1@",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Average Latency for L3 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD / OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l3_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l3_miss_latency"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -1341,9 +1326,8 @@
{
"BriefDescription": "STLB (2nd level TLB) data load speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_load_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_load_stlb_mpki"
},
{
"BriefDescription": "\"Bus lock\" per kilo instruction",
@@ -1385,53 +1369,46 @@
{
"BriefDescription": "Off-core accesses per kilo instruction for modified write requests",
"MetricExpr": "1e3 * OCR.MODIFIED_WRITE.ANY_RESPONSE / INST_RETIRED.ANY",
- "MetricGroup": "Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_offcore_mwrite_any_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Offcore",
+ "MetricName": "tma_info_memory_offcore_mwrite_any_pki"
},
{
"BriefDescription": "Off-core accesses per kilo instruction for reads-to-core requests (speculative; including in-core HW prefetches)",
"MetricExpr": "1e3 * OCR.READS_TO_CORE.ANY_RESPONSE / INST_RETIREDANY",
- "MetricGroup": "CacheHits;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_offcore_read_any_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "CacheHits;Offcore",
+ "MetricName": "tma_info_memory_offcore_read_any_pki"
},
{
"BriefDescription": "L3 cache misses per kilo instruction for reads-to-core requests (speculative; including in-core HW prefetches)",
"MetricExpr": "1e3 * OCR.READS_TO_CORE.L3_MISS / INST_RETIRED.ANY",
- "MetricGroup": "Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_offcore_read_l3m_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Offcore",
+ "MetricName": "tma_info_memory_offcore_read_l3m_pki"
},
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "(ITLB_MISSES.WALK_PENDING + DTLB_LOAD_MISSES.WALK_PENDING + DTLB_STORE_MISSES.WALK_PENDING) / (4 * (CPU_CLK_UNHALTED.DISTRIBUTED if #SMT_on else CPU_CLK_UNHALTED.THREAD))",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "Average DRAM BW for Reads-to-Core (R2C) covering for memory attached to local- and remote-socket",
"MetricExpr": "64 * OCR.READS_TO_CORE.DRAM / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "HPC;Mem;MemoryBW;SoC;TopdownL1;tma_L1_group",
+ "MetricGroup": "HPC;Mem;MemoryBW;SoC",
"MetricName": "tma_info_memory_r2c_dram_bw",
- "MetricgroupNoGroup": "TopdownL1",
"PublicDescription": "Average DRAM BW for Reads-to-Core (R2C) covering for memory attached to local- and remote-socket. See R2C_Offcore_BW."
},
{
"BriefDescription": "Average L3-cache miss BW for Reads-to-Core (R2C)",
"MetricExpr": "64 * OCR.READS_TO_CORE.L3_MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "HPC;Mem;MemoryBW;SoC;TopdownL1;tma_L1_group",
+ "MetricGroup": "HPC;Mem;MemoryBW;SoC",
"MetricName": "tma_info_memory_r2c_l3m_bw",
- "MetricgroupNoGroup": "TopdownL1",
"PublicDescription": "Average L3-cache miss BW for Reads-to-Core (R2C). This covering going to DRAM or other memory off-chip memory tears. See R2C_Offcore_BW."
},
{
"BriefDescription": "Average Off-core access BW for Reads-to-Core (R2C)",
"MetricExpr": "64 * OCR.READS_TO_CORE.ANY_RESPONSE / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "HPC;Mem;MemoryBW;SoC;TopdownL1;tma_L1_group",
+ "MetricGroup": "HPC;Mem;MemoryBW;SoC",
"MetricName": "tma_info_memory_r2c_offcore_bw",
- "MetricgroupNoGroup": "TopdownL1",
"PublicDescription": "Average Off-core access BW for Reads-to-Core (R2C). R2C account for demand or prefetch load/RFO/code access that fill data into the Core caches."
},
{
@@ -1458,9 +1435,8 @@
{
"BriefDescription": "STLB (2nd level TLB) data store speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_store_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_store_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_store_stlb_mpki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
@@ -1490,9 +1466,8 @@
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
"MetricExpr": "1e3 * MEM_LOAD_MISC_RETIRED.UC / INST_RETIRED.ANY",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_uc_load_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_uc_load_pki"
},
{
"BriefDescription": "",
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json b/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
index 025e836a1c80..8126f952a30c 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
@@ -652,23 +652,20 @@
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - tma_core_bound / (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) if tma_core_bound < (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound * RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD * (CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) / CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD) else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0)",
- "MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_core_bound_likely",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Cor;SMT",
+ "MetricName": "tma_info_botlnk_core_bound_likely"
},
{
"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fetch_BW Bottleneck.",
"MetricExpr": "100 * (100 * (tma_fetch_latency * (DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) / ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 9 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(2 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD) + tma_fetch_bandwidth * tma_mite / (tma_mite + tma_dsb)))",
- "MetricGroup": "DSBmiss;Fed;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_dsb_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "DSBmiss;Fed",
+ "MetricName": "tma_info_botlnk_dsb_misses"
},
{
"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bottleneck.",
"MetricExpr": "100 * (100 * (tma_fetch_latency * ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD) / ((ICACHE_16B.IFDATA_STALL + 2 * cpu@ICACHE_16B.IFDATA_STALL\\,cmask\\=0x1\\,edge\\=0x1@) / CPU_CLK_UNHALTED.THREAD + ICACHE_TAG.STALLS / CPU_CLK_UNHALTED.THREAD + (INT_MISC.CLEAR_RESTEER_CYCLES / CPU_CLK_UNHALTED.THREAD + 9 * BACLEARS.ANY / CPU_CLK_UNHALTED.THREAD) + min(2 * IDQ.MS_SWITCHES / CPU_CLK_UNHALTED.THREAD, 1) + DECODE.LCP / CPU_CLK_UNHALTED.THREAD + DSB2MITE_SWITCHES.PENALTY_CYCLES / CPU_CLK_UNHALTED.THREAD)))",
- "MetricGroup": "Fed;FetchLat;IcMiss;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_botlnk_ic_misses",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;FetchLat;IcMiss",
+ "MetricName": "tma_info_botlnk_ic_misses"
},
{
"BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
@@ -1021,9 +1018,8 @@
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_code_stlb_mpki",
- "MetricGroup": "Fed;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_code_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Fed;MemoryTLB",
+ "MetricName": "tma_info_memory_code_stlb_mpki"
},
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
@@ -1064,9 +1060,8 @@
{
"BriefDescription": "Average Parallel L2 cache miss data reads",
"MetricExpr": "tma_info_memory_latency_data_l2_mlp",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_data_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_data_l2_mlp"
},
{
"BriefDescription": "Fill Buffer (FB) hits per kilo instructions for retired demand loads (L1D misses that merge into ongoing miss-handling entries)",
@@ -1083,9 +1078,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
"MetricExpr": "64 * L1D.REPLACEMENT / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l1d_cache_fill_bw_2t"
},
{
"BriefDescription": "L1 cache true misses per kilo instruction for retired demand loads",
@@ -1108,23 +1102,20 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
"MetricExpr": "64 * L2_LINES_IN.ALL / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l2_cache_fill_bw_2t"
},
{
"BriefDescription": "Rate of non silent evictions from the L2 cache per Kilo instruction",
"MetricExpr": "1e3 * L2_LINES_OUT.NON_SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_nonsilent_pki"
},
{
"BriefDescription": "Rate of silent evictions from the L2 cache per Kilo instruction where the evicted lines are dropped (no writeback to L3 or memory)",
"MetricExpr": "1e3 * L2_LINES_OUT.SILENT / INST_RETIRED.ANY",
- "MetricGroup": "L2Evicts;Mem;Server;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l2_evictions_silent_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "L2Evicts;Mem;Server",
+ "MetricName": "tma_info_memory_l2_evictions_silent_pki"
},
{
"BriefDescription": "L2 cache hits per kilo instruction for all request types (including speculative)",
@@ -1165,9 +1156,8 @@
{
"BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * OFFCORE_REQUESTS.ALL_REQUESTS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_access_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW;Offcore",
+ "MetricName": "tma_info_memory_l3_cache_access_bw_2t"
},
{
"BriefDescription": "",
@@ -1178,9 +1168,8 @@
{
"BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
"MetricExpr": "64 * LONGEST_LAT_CACHE.MISS / 1e9 / (duration_time * 1e3 / 1e3)",
- "MetricGroup": "Mem;MemoryBW;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_l3_cache_fill_bw_2t",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryBW",
+ "MetricName": "tma_info_memory_l3_cache_fill_bw_2t"
},
{
"BriefDescription": "L3 cache true misses per kilo instruction for retired demand loads",
@@ -1209,16 +1198,14 @@
{
"BriefDescription": "Average Latency for L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS.DEMAND_DATA_RD",
- "MetricGroup": "Memory_Lat;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_miss_latency",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_Lat;Offcore",
+ "MetricName": "tma_info_memory_load_l2_miss_latency"
},
{
"BriefDescription": "Average Parallel L2 cache miss demand Loads",
"MetricExpr": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD / OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
- "MetricGroup": "Memory_BW;Offcore;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_l2_mlp",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Memory_BW;Offcore",
+ "MetricName": "tma_info_memory_load_l2_mlp"
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core cycles)",
@@ -1229,9 +1216,8 @@
{
"BriefDescription": "STLB (2nd level TLB) data load speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_load_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_load_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_load_stlb_mpki"
},
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
@@ -1249,16 +1235,14 @@
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
"MetricExpr": "tma_info_memory_tlb_page_walks_utilization",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_page_walks_utilization",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_page_walks_utilization"
},
{
"BriefDescription": "STLB (2nd level TLB) data store speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
"MetricExpr": "tma_info_memory_tlb_store_stlb_mpki",
- "MetricGroup": "Mem;MemoryTLB;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_store_stlb_mpki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem;MemoryTLB",
+ "MetricName": "tma_info_memory_store_stlb_mpki"
},
{
"BriefDescription": "STLB (2nd level TLB) code speculative misses per kilo instruction (misses of any page-size that complete the page walk)",
@@ -1289,9 +1273,8 @@
{
"BriefDescription": "Un-cacheable retired load per kilo instruction",
"MetricExpr": "1e3 * MEM_LOAD_MISC_RETIRED.UC / INST_RETIRED.ANY",
- "MetricGroup": "Mem;TopdownL1;tma_L1_group",
- "MetricName": "tma_info_memory_uc_load_pki",
- "MetricgroupNoGroup": "TopdownL1"
+ "MetricGroup": "Mem",
+ "MetricName": "tma_info_memory_uc_load_pki"
},
{
"BriefDescription": "",
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:01:48

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 05/12] perf vendor events intel: Update lunarlake to 1.01

Update events from 1.00 to 1.01 as released in:
https://github.com/intel/perfmon/commit/56ab8d837ac566d51a4d8748b6b4b817a22c9b84

Various encoding and description updates. Adds the events
CPU_CLK_UNHALTED.CORE, CPU_CLK_UNHALTED.CORE_P,
CPU_CLK_UNHALTED.REF_TSC_P, CPU_CLK_UNHALTED.THREAD,
MISC_RETIRED.LBR_INSERTS, TOPDOWN_BAD_SPECULATION.ALL_P,
TOPDOWN_BE_BOUND.ALL_P, TOPDOWN_FE_BOUND.ALL_P,
TOPDOWN_RETIRING.ALL_P.

Signed-off-by: Ian Rogers <[email protected]>
---
.../pmu-events/arch/x86/lunarlake/cache.json | 24 ++--
.../arch/x86/lunarlake/frontend.json | 2 +-
.../pmu-events/arch/x86/lunarlake/memory.json | 4 +-
.../pmu-events/arch/x86/lunarlake/other.json | 4 +-
.../arch/x86/lunarlake/pipeline.json | 109 +++++++++++++++---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
6 files changed, 113 insertions(+), 32 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/cache.json b/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
index 1823149067b5..fb48be357c4e 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
@@ -31,7 +31,7 @@
"EventCode": "0x2e",
"EventName": "LONGEST_LAT_CACHE.REFERENCE",
"PublicDescription": "Counts the number of cacheable memory requests that access the Last Level Cache (LLC). Requests include demand loads, reads for ownership (RFO), instruction fetches and L1 HW prefetches. If the platform has an L3 cache, the LLC is the L3 cache, otherwise it is the L2 cache. Counts on a per core basis.",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x4f",
"Unit": "cpu_atom"
},
@@ -94,7 +94,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x400",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -106,7 +106,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x80",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -118,7 +118,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x10",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -130,7 +130,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x800",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -142,7 +142,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x100",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -154,7 +154,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x20",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -166,7 +166,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x4",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -178,7 +178,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x200",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -190,7 +190,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x40",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -202,7 +202,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x8",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -212,7 +212,7 @@
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.STORE_LATENCY",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x6",
"Unit": "cpu_atom"
}
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json b/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
index 5e4ef81b43d6..3a24934e8d6e 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
@@ -19,7 +19,7 @@
"BriefDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.",
"EventCode": "0x9c",
"EventName": "IDQ_BUBBLES.CORE",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.\nSoftware can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations. Software can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/memory.json b/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
index 51d70ba00bd4..9c188d80b7b9 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
@@ -155,7 +155,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_DATA_RD.L3_MISS",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x3FBFC00001",
+ "MSRValue": "0xFE7F8000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
@@ -175,7 +175,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_RFO.L3_MISS",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x3FBFC00002",
+ "MSRValue": "0xFE7F8000002",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/other.json b/tools/perf/pmu-events/arch/x86/lunarlake/other.json
index 69adaed5686d..377f717db6cc 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/other.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/other.json
@@ -24,7 +24,7 @@
"EventCode": "0xB7",
"EventName": "OCR.DEMAND_DATA_RD.DRAM",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x184000001",
+ "MSRValue": "0x1FBC000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_atom"
@@ -34,7 +34,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_DATA_RD.DRAM",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x184000001",
+ "MSRValue": "0x1E780000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json b/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
index 2bde664fdc0f..2c9f85ec8c4a 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
@@ -38,10 +38,18 @@
{
"BriefDescription": "Fixed Counter: Counts the number of unhalted core clock cycles",
"EventName": "CPU_CLK_UNHALTED.CORE",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Core cycles when the core is not in a halt state.",
+ "EventName": "CPU_CLK_UNHALTED.CORE",
+ "PublicDescription": "Counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the programmable counters available for other events.",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x2",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "Counts the number of unhalted core clock cycles [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
"EventCode": "0x3c",
@@ -49,10 +57,18 @@
"SampleAfterValue": "2000003",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Thread cycles when thread is not in halt state [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
+ "EventCode": "0x3c",
+ "EventName": "CPU_CLK_UNHALTED.CORE_P",
+ "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
+ "SampleAfterValue": "2000003",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "Fixed Counter: Counts the number of unhalted reference clock cycles",
"EventName": "CPU_CLK_UNHALTED.REF_TSC",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x3",
"Unit": "cpu_atom"
},
@@ -64,6 +80,15 @@
"UMask": "0x3",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts the number of unhalted reference clock cycles",
+ "EventCode": "0x3c",
+ "EventName": "CPU_CLK_UNHALTED.REF_TSC_P",
+ "PublicDescription": "Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is not affected by core frequency changes and increments at a fixed frequency that is also used for the Time Stamp Counter (TSC). This event uses a programmable general purpose performance counter.",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Reference cycles when the core is not in halt state.",
"EventCode": "0x3c",
@@ -74,9 +99,16 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Core cycles when the thread is not in halt state",
+ "BriefDescription": "Fixed Counter: Counts the number of unhalted core clock cycles",
+ "EventName": "CPU_CLK_UNHALTED.THREAD",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Core cycles when the thread is not in a halt state.",
"EventName": "CPU_CLK_UNHALTED.THREAD",
- "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the eight programmable counters available for other events.",
+ "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the programmable counters available for other events.",
"SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_core"
@@ -89,10 +121,10 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Thread cycles when thread is not in halt state",
+ "BriefDescription": "Thread cycles when thread is not in halt state [This event is alias to CPU_CLK_UNHALTED.CORE_P]",
"EventCode": "0x3c",
"EventName": "CPU_CLK_UNHALTED.THREAD_P",
- "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+ "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. [This event is alias to CPU_CLK_UNHALTED.CORE_P]",
"SampleAfterValue": "2000003",
"Unit": "cpu_core"
},
@@ -100,7 +132,7 @@
"BriefDescription": "Fixed Counter: Counts the number of instructions retired",
"EventName": "INST_RETIRED.ANY",
"PEBS": "1",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
@@ -148,11 +180,29 @@
"UMask": "0x82",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts the number of LBR entries recorded. Requires LBRs to be enabled in IA32_LBR_CTL.",
+ "EventCode": "0xe4",
+ "EventName": "MISC_RETIRED.LBR_INSERTS",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "LBR record is inserted",
+ "EventCode": "0xe4",
+ "EventName": "MISC_RETIRED.LBR_INSERTS",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.\nSoftware can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions. Software can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "10000003",
"UMask": "0x2",
"Unit": "cpu_core"
@@ -175,21 +225,35 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Fixed Counter: Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL_P]",
+ "EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
- "PublicDescription": "Fixed Counter: Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Counts all issue slots blocked during this recovery window including relevant microcode flows and while uops are not yet available in the IQ. Also, includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
"SampleAfterValue": "1000003",
- "UMask": "0x5",
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL]",
+ "EventCode": "0x73",
+ "EventName": "TOPDOWN_BAD_SPECULATION.ALL_P",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"EventCode": "0xa4",
"EventName": "TOPDOWN_BE_BOUND.ALL",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
+ "EventCode": "0xa4",
+ "EventName": "TOPDOWN_BE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Fixed Counter: Counts the number of retirement slots not consumed due to front end stalls",
"EventName": "TOPDOWN_FE_BOUND.ALL",
@@ -198,18 +262,35 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Fixed Counter: Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
+ "EventCode": "0x9c",
+ "EventName": "TOPDOWN_FE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Fixed Counter: Counts the number of consumed retirement slots.",
"EventName": "TOPDOWN_RETIRING.ALL",
"PEBS": "1",
"SampleAfterValue": "1000003",
"UMask": "0x7",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of consumed retirement slots.",
+ "EventCode": "0xc2",
+ "EventName": "TOPDOWN_RETIRING.ALL_P",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.\nSoftware can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric. Software can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index a219dc3bd1ef..710f8dfefeed 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -20,7 +20,7 @@ GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v24,ivytown,core
GenuineIntel-6-2D,v24,jaketown,core
GenuineIntel-6-(57|85),v16,knightslanding,core
-GenuineIntel-6-BD,v1.00,lunarlake,core
+GenuineIntel-6-BD,v1.01,lunarlake,core
GenuineIntel-6-A[AC],v1.07,meteorlake,core
GenuineIntel-6-1[AEF],v4,nehalemep,core
GenuineIntel-6-2E,v4,nehalemex,core
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:02:04

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 06/12] perf vendor events intel: Update meteorlake to 1.08

Update events from 1.07 to 1.08 as released in:
https://github.com/intel/perfmon/commit/f0f8f3e163d9eb84e6ce8e2108a22cb43b2527e5

Various description updates. Adds topdown, offcore and uncore events
OCR.DEMAND_DATA_RD.L3_HIT, OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_NO_FWD,
OCR.DEMAND_RFO.L3_HIT, OCR.DEMAND_DATA_RD.L3_MISS,
OCR.DEMAND_RFO.L3_MISS, OCR.DEMAND_DATA_RD.ANY_RESPONSE,
OCR.DEMAND_DATA_RD.DRAM, OCR.DEMAND_RFO.ANY_RESPONSE,
OCR.DEMAND_RFO.DRAM, TOPDOWN_BAD_SPECULATION.ALL_P,
TOPDOWN_BE_BOUND.ALL_P, TOPDOWN_FE_BOUND.ALL_P,
TOPDOWN_RETIRING.ALL_P, UNC_ARB_DAT_OCCUPANCY.RD and
UNC_HAC_ARB_COH_TRK_REQUESTS.ALL.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../pmu-events/arch/x86/meteorlake/cache.json | 30 +++++++++++++
.../arch/x86/meteorlake/frontend.json | 4 +-
.../arch/x86/meteorlake/memory.json | 20 +++++++++
.../pmu-events/arch/x86/meteorlake/other.json | 42 +++++++++++++++++-
.../arch/x86/meteorlake/pipeline.json | 44 ++++++++++++++++---
.../x86/meteorlake/uncore-interconnect.json | 22 ++++++++--
7 files changed, 150 insertions(+), 14 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 710f8dfefeed..fedaacbe981a 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -21,7 +21,7 @@ GenuineIntel-6-3E,v24,ivytown,core
GenuineIntel-6-2D,v24,jaketown,core
GenuineIntel-6-(57|85),v16,knightslanding,core
GenuineIntel-6-BD,v1.01,lunarlake,core
-GenuineIntel-6-A[AC],v1.07,meteorlake,core
+GenuineIntel-6-A[AC],v1.08,meteorlake,core
GenuineIntel-6-1[AEF],v4,nehalemep,core
GenuineIntel-6-2E,v4,nehalemex,core
GenuineIntel-6-A7,v1.02,rocketlake,core
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/cache.json b/tools/perf/pmu-events/arch/x86/meteorlake/cache.json
index 47861a6dd8e9..af7acb15f661 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/cache.json
@@ -966,6 +966,16 @@
"UMask": "0x3",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand data reads that were supplied by the L3 cache.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_DATA_RD.L3_HIT",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x3F803C0001",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, and modified data was forwarded.",
"EventCode": "0xB7",
@@ -986,6 +996,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, but no data was forwarded.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_NO_FWD",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x4003C0001",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand data reads that were supplied by the L3 cache where a snoop was sent, the snoop hit, and non-modified data was forwarded.",
"EventCode": "0xB7",
@@ -1006,6 +1026,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were supplied by the L3 cache.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_RFO.L3_HIT",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x3F803C0002",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were supplied by the L3 cache where a snoop was sent, the snoop hit, and modified data was forwarded.",
"EventCode": "0xB7",
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/frontend.json b/tools/perf/pmu-events/arch/x86/meteorlake/frontend.json
index 9da8689eda81..f3b7b211afb5 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/frontend.json
@@ -378,7 +378,7 @@
"CounterMask": "6",
"EventCode": "0x79",
"EventName": "IDQ.DSB_CYCLES_OK",
- "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+ "PublicDescription": "Counts the number of cycles where optimal number of uops was delivered to the Instruction Decode Queue (IDQ) from the DSB (Decode Stream Buffer) path. Count includes uops that may 'bypass' the IDQ.",
"SampleAfterValue": "2000003",
"UMask": "0x8",
"Unit": "cpu_core"
@@ -455,7 +455,7 @@
"BriefDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.",
"EventCode": "0x9c",
"EventName": "IDQ_BUBBLES.CORE",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.\nThe count may be distributed among unhalted logical processors (hyper-threads) who share the same physical core, in processors that support Intel Hyper-Threading Technology. Software can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations. The count may be distributed among unhalted logical processors (hyper-threads) who share the same physical core, in processors that support Intel Hyper-Threading Technology. Software can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/memory.json b/tools/perf/pmu-events/arch/x86/meteorlake/memory.json
index a5b83293f157..617d0e255fd5 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/memory.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/memory.json
@@ -296,6 +296,16 @@
"UMask": "0x4",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts demand data reads that were not supplied by the L3 cache.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_DATA_RD.L3_MISS",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x3FBFC00001",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand data reads that were not supplied by the L3 cache.",
"EventCode": "0x2A,0x2B",
@@ -306,6 +316,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_RFO.L3_MISS",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x3FBFC00002",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand read for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that were not supplied by the L3 cache.",
"EventCode": "0x2A,0x2B",
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/other.json b/tools/perf/pmu-events/arch/x86/meteorlake/other.json
index 7effc1f271e7..0bc2cb2eabb3 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/other.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/other.json
@@ -17,6 +17,16 @@
"UMask": "0x1",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts demand data reads that have any type of response.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_DATA_RD.ANY_RESPONSE",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x10001",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand data reads that have any type of response.",
"EventCode": "0x2A,0x2B",
@@ -27,6 +37,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand data reads that were supplied by DRAM.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_DATA_RD.DRAM",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x184000001",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand data reads that were supplied by DRAM.",
"EventCode": "0x2A,0x2B",
@@ -37,6 +57,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that have any type of response.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_RFO.ANY_RESPONSE",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x10002",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts demand read for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response.",
"EventCode": "0x2A,0x2B",
@@ -47,6 +77,16 @@
"UMask": "0x1",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts demand reads for ownership (RFO) and software prefetches for exclusive ownership (PREFETCHW) that were supplied by DRAM.",
+ "EventCode": "0xB7",
+ "EventName": "OCR.DEMAND_RFO.DRAM",
+ "MSRIndex": "0x1a6,0x1a7",
+ "MSRValue": "0x184000002",
+ "SampleAfterValue": "100003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts streaming stores that have any type of response.",
"EventCode": "0xB7",
@@ -97,7 +137,7 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Counts the number of issue slots in a UMWAIT or TPAUSE instruction where no uop issues due to the instruction putting the CPU into the C0.1 activity state. For Tremont, UMWAIT and TPAUSE will only put the CPU into C0.1 activity state (not C0.2 activity state)",
+ "BriefDescription": "Counts the number of issue slots in a UMWAIT or TPAUSE instruction where no uop issues due to the instruction putting the CPU into the C0.1 activity state.",
"EventCode": "0x75",
"EventName": "SERIALIZATION.C01_MS_SCB",
"SampleAfterValue": "200003",
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/pipeline.json b/tools/perf/pmu-events/arch/x86/meteorlake/pipeline.json
index 24bbfcebd2be..5ff4a7a32250 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/pipeline.json
@@ -1067,7 +1067,7 @@
"BriefDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.\nThe count is distributed among unhalted logical processors (hyper-threads) who share the same physical core, in processors that support Intel Hyper-Threading Technology. Software can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core, in processors that support Intel Hyper-Threading Technology. Software can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "10000003",
"UMask": "0x2",
"Unit": "cpu_core"
@@ -1116,10 +1116,18 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL_P]",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
- "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL_P]",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL]",
+ "EventCode": "0x73",
+ "EventName": "TOPDOWN_BAD_SPECULATION.ALL_P",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL]",
"SampleAfterValue": "1000003",
"Unit": "cpu_atom"
},
@@ -1156,7 +1164,7 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.ALL",
"SampleAfterValue": "1000003",
@@ -1170,6 +1178,13 @@
"UMask": "0x1",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
+ "EventCode": "0x74",
+ "EventName": "TOPDOWN_BE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to memory reservation stall (scheduler not being able to accept another uop). This could be caused by RSV full or load/store buffer block.",
"EventCode": "0x74",
@@ -1211,12 +1226,19 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL_P]",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.ALL",
"SampleAfterValue": "1000003",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL]",
+ "EventCode": "0x71",
+ "EventName": "TOPDOWN_FE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to BAClear",
"EventCode": "0x71",
@@ -1299,13 +1321,21 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL",
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL_P]",
"EventCode": "0x72",
"EventName": "TOPDOWN_RETIRING.ALL",
"PEBS": "1",
"SampleAfterValue": "1000003",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL]",
+ "EventCode": "0x72",
+ "EventName": "TOPDOWN_RETIRING.ALL_P",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Number of non dec-by-all uops decoded by decoder",
"EventCode": "0x76",
@@ -1591,7 +1621,7 @@
"BriefDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.\nSoftware can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric. Software can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/meteorlake/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/meteorlake/uncore-interconnect.json
index 08b5c7574cfc..901d8510f90f 100644
--- a/tools/perf/pmu-events/arch/x86/meteorlake/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/meteorlake/uncore-interconnect.json
@@ -1,4 +1,20 @@
[
+ {
+ "BriefDescription": "Each cycle counts number of coherent reads pending on data return from memory controller that were issued by any core.",
+ "EventCode": "0x85",
+ "EventName": "UNC_ARB_DAT_OCCUPANCY.RD",
+ "PerPkg": "1",
+ "UMask": "0x2",
+ "Unit": "ARB"
+ },
+ {
+ "BriefDescription": "Number of entries allocated. Account for Any type: e.g. Snoop, etc.",
+ "EventCode": "0x84",
+ "EventName": "UNC_HAC_ARB_COH_TRK_REQUESTS.ALL",
+ "PerPkg": "1",
+ "UMask": "0x1",
+ "Unit": "HAC_ARB"
+ },
{
"BriefDescription": "Number of all coherent Data Read entries. Doesn't include prefetches",
"EventCode": "0x81",
@@ -9,7 +25,7 @@
},
{
"BriefDescription": "Number of all CMI transactions",
- "EventCode": "0x8a",
+ "EventCode": "0x8A",
"EventName": "UNC_HAC_ARB_TRANSACTIONS.ALL",
"PerPkg": "1",
"UMask": "0x1",
@@ -17,7 +33,7 @@
},
{
"BriefDescription": "Number of all CMI reads",
- "EventCode": "0x8a",
+ "EventCode": "0x8A",
"EventName": "UNC_HAC_ARB_TRANSACTIONS.READS",
"PerPkg": "1",
"UMask": "0x2",
@@ -25,7 +41,7 @@
},
{
"BriefDescription": "Number of all CMI writes not including Mflush",
- "EventCode": "0x8a",
+ "EventCode": "0x8A",
"EventName": "UNC_HAC_ARB_TRANSACTIONS.WRITES",
"PerPkg": "1",
"UMask": "0x4",
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:01:16

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 03/12] perf vendor events intel: Update grandridge to 1.02

Update events from 1.01 to 1.02 as released in:
https://github.com/intel/perfmon/commit/b2a81e803add1ba0af68a442c975683d226d868c

Fixes spelling and descriptions. Adds topdown events and uncore cache
UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD_OPT,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT,
UNC_CHA_TOR_OCCUPANCY.IA_DRD_OPT.

Signed-off-by: Ian Rogers <[email protected]>
---
.../arch/x86/grandridge/pipeline.json | 43 ++++++++++++++++---
.../arch/x86/grandridge/uncore-cache.json | 28 +++++++++++-
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
3 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/grandridge/pipeline.json b/tools/perf/pmu-events/arch/x86/grandridge/pipeline.json
index daa0639bb1ca..90292dc03d33 100644
--- a/tools/perf/pmu-events/arch/x86/grandridge/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/grandridge/pipeline.json
@@ -249,10 +249,17 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL_P]",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
- "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL_P]",
+ "SampleAfterValue": "1000003"
+ },
+ {
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL]",
+ "EventCode": "0x73",
+ "EventName": "TOPDOWN_BAD_SPECULATION.ALL_P",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL]",
"SampleAfterValue": "1000003"
},
{
@@ -284,7 +291,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.ALL",
"SampleAfterValue": "1000003"
@@ -296,6 +303,12 @@
"SampleAfterValue": "1000003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
+ "EventCode": "0x74",
+ "EventName": "TOPDOWN_BE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to memory reservation stall (scheduler not being able to accept another uop). This could be caused by RSV full or load/store buffer block.",
"EventCode": "0x74",
@@ -317,6 +330,13 @@
"SampleAfterValue": "1000003",
"UMask": "0x20"
},
+ {
+ "BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to ROB full",
+ "EventCode": "0x74",
+ "EventName": "TOPDOWN_BE_BOUND.REORDER_BUFFER",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x40"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to iq/jeu scoreboards or ms scb",
"EventCode": "0x74",
@@ -325,11 +345,17 @@
"UMask": "0x10"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL_P]",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.ALL",
"SampleAfterValue": "1000003"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL]",
+ "EventCode": "0x71",
+ "EventName": "TOPDOWN_FE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to BAClear",
"EventCode": "0x71",
@@ -402,12 +428,19 @@
"UMask": "0x4"
},
{
- "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL",
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL_P]",
"EventCode": "0x72",
"EventName": "TOPDOWN_RETIRING.ALL",
"PEBS": "1",
"SampleAfterValue": "1000003"
},
+ {
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL]",
+ "EventCode": "0x72",
+ "EventName": "TOPDOWN_RETIRING.ALL_P",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of uops issued by the front end every cycle.",
"EventCode": "0x0e",
diff --git a/tools/perf/pmu-events/arch/x86/grandridge/uncore-cache.json b/tools/perf/pmu-events/arch/x86/grandridge/uncore-cache.json
index 74dfd9272cef..36614429dd72 100644
--- a/tools/perf/pmu-events/arch/x86/grandridge/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/grandridge/uncore-cache.json
@@ -5,7 +5,6 @@
"EventName": "UNC_CHACMS_CLOCKTICKS",
"PerPkg": "1",
"PortMask": "0x000",
- "PublicDescription": "UNC_CHACMS_CLOCKTICKS",
"Unit": "CHACMS"
},
{
@@ -1216,6 +1215,15 @@
"UMask": "0xc88fff01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for Data read opt from local IA that miss the cache",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_DRD_OPT",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Occupancy : DRd_Opts issued by iA Cores",
+ "UMask": "0xc827ff01",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for Data read opt prefetch from local IA that miss the cache",
"EventCode": "0x36",
@@ -1252,6 +1260,15 @@
"UMask": "0xc88ffd01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for Data read opt from local IA that hit the cache",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD_OPT",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Occupancy : DRd_Opts issued by iA Cores that hit the LLC",
+ "UMask": "0xc827fd01",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for Data read opt prefetch from local IA that hit the cache",
"EventCode": "0x36",
@@ -1405,6 +1422,15 @@
"UMask": "0xc88efe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for Data read opt from local IA that miss the cache",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT",
+ "PerPkg": "1",
+ "PublicDescription": "TOR Occupancy : DRd_Opt issued by iA Cores that missed the LLC",
+ "UMask": "0xc827fe01",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for Data read opt prefetch from local IA that miss the cache",
"EventCode": "0x36",
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 85b89b1098cf..650c28574576 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -10,7 +10,7 @@ GenuineIntel-6-9[6C],v1.04,elkhartlake,core
GenuineIntel-6-CF,v1.06,emeraldrapids,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
-GenuineIntel-6-B6,v1.01,grandridge,core
+GenuineIntel-6-B6,v1.02,grandridge,core
GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v35,haswell,core
GenuineIntel-6-3F,v28,haswellx,core
--
2.44.0.396.g6e790dbe36-goog


2024-03-21 06:02:21

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 08/12] perf vendor events intel: Update sierraforest to 1.02

Update events from 1.01 to 1.02 as released in:
https://github.com/intel/perfmon/commit/451dd41ae627b56433ad4065bf3632789eb70834

Various description updates. Adds topdown events
TOPDOWN_BAD_SPECULATION.ALL_P, TOPDOWN_BE_BOUND.ALL_P,
TOPDOWN_FE_BOUND.ALL_P and TOPDOWN_RETIRING.ALL_P.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../arch/x86/sierraforest/pipeline.json | 36 ++++++++++++++++---
2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index e5de35c96358..ac32377ab01a 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -27,7 +27,7 @@ GenuineIntel-6-2E,v4,nehalemex,core
GenuineIntel-6-A7,v1.02,rocketlake,core
GenuineIntel-6-2A,v19,sandybridge,core
GenuineIntel-6-8F,v1.20,sapphirerapids,core
-GenuineIntel-6-AF,v1.01,sierraforest,core
+GenuineIntel-6-AF,v1.02,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v58,skylake,core
GenuineIntel-6-55-[01234],v1.32,skylakex,core
diff --git a/tools/perf/pmu-events/arch/x86/sierraforest/pipeline.json b/tools/perf/pmu-events/arch/x86/sierraforest/pipeline.json
index ba9843110f07..90292dc03d33 100644
--- a/tools/perf/pmu-events/arch/x86/sierraforest/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/sierraforest/pipeline.json
@@ -249,10 +249,17 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL_P]",
"EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
- "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL_P]",
+ "SampleAfterValue": "1000003"
+ },
+ {
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL]",
+ "EventCode": "0x73",
+ "EventName": "TOPDOWN_BAD_SPECULATION.ALL_P",
+ "PublicDescription": "Counts the total number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Only issue slots wasted due to fast nukes such as memory ordering nukes are counted. Other nukes are not accounted for. Counts all issue slots blocked during this recovery window, including relevant microcode flows, and while uops are not yet available in the instruction queue (IQ) or until an FE_BOUND event occurs besides OTHER and CISC. Also includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear. [This event is alias to TOPDOWN_BAD_SPECULATION.ALL]",
"SampleAfterValue": "1000003"
},
{
@@ -284,7 +291,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"EventCode": "0x74",
"EventName": "TOPDOWN_BE_BOUND.ALL",
"SampleAfterValue": "1000003"
@@ -296,6 +303,12 @@
"SampleAfterValue": "1000003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
+ "EventCode": "0x74",
+ "EventName": "TOPDOWN_BE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not consumed by the backend due to memory reservation stall (scheduler not being able to accept another uop). This could be caused by RSV full or load/store buffer block.",
"EventCode": "0x74",
@@ -332,11 +345,17 @@
"UMask": "0x10"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL_P]",
"EventCode": "0x71",
"EventName": "TOPDOWN_FE_BOUND.ALL",
"SampleAfterValue": "1000003"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls [This event is alias to TOPDOWN_FE_BOUND.ALL]",
+ "EventCode": "0x71",
+ "EventName": "TOPDOWN_FE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of issue slots every cycle that were not delivered by the frontend due to BAClear",
"EventCode": "0x71",
@@ -409,12 +428,19 @@
"UMask": "0x4"
},
{
- "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL",
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL_P]",
"EventCode": "0x72",
"EventName": "TOPDOWN_RETIRING.ALL",
"PEBS": "1",
"SampleAfterValue": "1000003"
},
+ {
+ "BriefDescription": "Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL [This event is alias to TOPDOWN_RETIRING.ALL]",
+ "EventCode": "0x72",
+ "EventName": "TOPDOWN_RETIRING.ALL_P",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003"
+ },
{
"BriefDescription": "Counts the number of uops issued by the front end every cycle.",
"EventCode": "0x0e",
--
2.44.0.396.g6e790dbe36-goog