2020-09-01 22:11:41

by Kim Phillips

Subject: [PATCH 1/4] perf vendor events amd: Add L2 Prefetch events for zen1

Later revisions of PPRs that post-date the original Family 17h events
submission patch add these events.

Specifically, they were not in this 2017 revision of the F17h PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017

But they are included in, e.g., this 2019 version of the PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019

Signed-off-by: Kim Phillips <[email protected]>
Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: "Martin Liška" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
.../pmu-events/arch/x86/amdzen1/cache.json | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
index 404d4c569c01..695ed3ffa3a6 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
@@ -249,6 +249,24 @@
"BriefDescription": "Cycles with fill pending from L2. Total cycles spent with one or more fill requests in flight from L2.",
"UMask": "0x1"
},
+ {
+ "EventName": "l2_pf_hit_l2",
+ "EventCode": "0x70",
+ "BriefDescription": "L2 prefetch hit in L2.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_pf_miss_l2_hit_l3",
+ "EventCode": "0x71",
+ "BriefDescription": "L2 prefetcher hits in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_pf_miss_l2_l3",
+ "EventCode": "0x72",
+ "BriefDescription": "L2 prefetcher misses in L3. All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.",
+ "UMask": "0xff"
+ },
{
"EventName": "l3_request_g1.caching_l3_cache_accesses",
"EventCode": "0x01",
--
2.27.0
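For reference, a JSON entry like l2_pf_hit_l2 above maps onto perf's raw x86 event
encoding, which is why `cpu/event=0x70,umask=0xff/` selects the same event. A
sketch of that encoding (the helper name is mine, not perf's; the bit layout is
the usual x86 event-select register, with the high bits of 12-bit AMD event codes
in bits 32-35):

```python
def amd_core_raw_config(event_code: int, umask: int) -> int:
    """Sketch of perf's x86 raw event layout: event select in bits 0-7,
    unit mask in bits 8-15, high event-select bits in bits 32-35."""
    return (event_code & 0xFF) | ((umask & 0xFF) << 8) | ((event_code & 0xF00) << 24)

# l2_pf_hit_l2: EventCode 0x70, UMask 0xff -> cpu/event=0x70,umask=0xff/
print(hex(amd_core_raw_config(0x70, 0xFF)))  # -> 0xff70
```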


2020-09-01 22:12:08

by Kim Phillips

Subject: [PATCH 3/4] perf vendor events amd: Add recommended events

Add support for events listed in Section 2.1.15.2 "Performance
Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
Rev 0.54 - Sep 12, 2019".

perf now supports these new events (-e):

all_dc_accesses
all_tlbs_flushed
l1_dtlb_misses
l2_cache_accesses_from_dc_misses
l2_cache_accesses_from_ic_misses
l2_cache_hits_from_dc_misses
l2_cache_hits_from_ic_misses
l2_cache_misses_from_dc_misses
l2_cache_misses_from_ic_miss
l2_dtlb_misses
l2_itlb_misses
sse_avx_stalls
uops_dispatched
uops_retired
l3_accesses
l3_misses

and these metrics (-M):

branch_misprediction_ratio
all_l2_cache_accesses
all_l2_cache_hits
all_l2_cache_misses
ic_fetch_miss_ratio
l2_cache_accesses_from_l2_hwpf
l2_cache_hits_from_l2_hwpf
l2_cache_misses_from_l2_hwpf
l3_read_miss_latency
l1_itlb_misses
all_remote_links_outbound
nps1_die_to_dram

The nps1_die_to_dram metric may need perf stat's --metric-no-group
switch if the number of available data fabric counters is less
than the number of events it uses (8).
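The ScaleUnit values in the new data fabric metrics fall out of the packet
sizes: each dram_channel_data_controller_N increment is one 64-byte request
(64 / 2^20 ~= 6.1e-5 MiB), and each remote_outbound_data_controller_N increment
is a 32-byte packet (~= 3e-5 MiB). A sketch of the arithmetic, with hypothetical
counts:

```python
# Data fabric counts -> MiB, matching the ScaleUnit fields in recommended.json.
DRAM_REQ_BYTES = 64    # dram_channel_data_controller_* counts 64B requests
REMOTE_PKT_BYTES = 32  # remote_outbound_data_controller_* counts 32B packets

dram_counts = [12_000] * 8  # hypothetical per-channel counts (NPS1: 8 channels)
nps1_die_to_dram_mib = sum(dram_counts) * DRAM_REQ_BYTES / 2**20

# 64 / 2**20 = 6.103515625e-05, rounded to the 6.1e-5 ScaleUnit in the JSON.
assert abs(nps1_die_to_dram_mib - sum(dram_counts) * 6.1e-5) < 0.001 * nps1_die_to_dram_mib
```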

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: "Martin Liška" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
.../pmu-events/arch/x86/amdzen1/cache.json | 23 +++
.../arch/x86/amdzen1/data-fabric.json | 98 ++++++++++
.../arch/x86/amdzen1/recommended.json | 178 ++++++++++++++++++
.../pmu-events/arch/x86/amdzen2/cache.json | 23 +++
.../arch/x86/amdzen2/data-fabric.json | 98 ++++++++++
.../arch/x86/amdzen2/recommended.json | 178 ++++++++++++++++++
tools/perf/pmu-events/jevents.c | 1 +
7 files changed, 599 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/recommended.json

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
index 695ed3ffa3a6..4ea7ec4f496e 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
@@ -117,6 +117,11 @@
"BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
"UMask": "0x1"
},
+ {
+ "EventName": "l2_request_g1.all_no_prefetch",
+ "EventCode": "0x60",
+ "UMask": "0xf9"
+ },
{
"EventName": "l2_request_g2.group1",
"EventCode": "0x61",
@@ -243,6 +248,24 @@
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2.",
"UMask": "0x1"
},
+ {
+ "EventName": "l2_cache_req_stat.ic_access_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
+ "UMask": "0x7"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
+ "UMask": "0x9"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
+ "UMask": "0xf6"
+ },
{
"EventName": "l2_fill_pending.l2_fill_busy",
"EventCode": "0x6d",
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json b/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
new file mode 100644
index 000000000000..40271df40015
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
@@ -0,0 +1,98 @@
+[
+ {
+ "EventName": "remote_outbound_data_controller_0",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 0",
+ "EventCode": "0x7c7",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_1",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 1",
+ "EventCode": "0x807",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_2",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 2",
+ "EventCode": "0x847",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_3",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 3",
+ "EventCode": "0x887",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_0",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x07",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_1",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 1",
+ "EventCode": "0x47",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_2",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 2",
+ "EventCode": "0x87",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_3",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 3",
+ "EventCode": "0xc7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_4",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 4",
+ "EventCode": "0x107",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_5",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 5",
+ "EventCode": "0x147",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_6",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 6",
+ "EventCode": "0x187",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_7",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 7",
+ "EventCode": "0x1c7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
new file mode 100644
index 000000000000..2cfe2d2f3bfd
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
@@ -0,0 +1,178 @@
+[
+ {
+ "MetricName": "branch_misprediction_ratio",
+ "BriefDescription": "Execution-Time Branch Misprediction Ratio (Non-Speculative)",
+ "MetricExpr": "d_ratio(ex_ret_brn_misp, ex_ret_brn)",
+ "MetricGroup": "branch_prediction",
+ "ScaleUnit": "100%"
+ },
+ {
+ "EventName": "all_dc_accesses",
+ "EventCode": "0x29",
+ "BriefDescription": "All L1 Data Cache Accesses",
+ "UMask": "0x7"
+ },
+ {
+ "MetricName": "all_l2_cache_accesses",
+ "BriefDescription": "All L2 Cache Accesses",
+ "MetricExpr": "l2_request_g1.all_no_prefetch + l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_ic_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_dc_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Data Cache Misses (including prefetch)",
+ "UMask": "0xc8"
+ },
+ {
+ "MetricName": "l2_cache_accesses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Accesses from L2 HWPF",
+ "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_misses",
+ "BriefDescription": "All L2 Cache Misses",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_miss_in_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_misses_from_ic_miss",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Instruction Cache Misses",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_cache_misses_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Data Cache Misses",
+ "UMask": "0x08"
+ },
+ {
+ "MetricName": "l2_cache_misses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Misses from L2 HWPF",
+ "MetricExpr": "l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_hits",
+ "BriefDescription": "All L2 Cache Hits",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_hit_in_l2 + l2_pf_hit_l2",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_hits_from_ic_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Instruction Cache Misses",
+ "UMask": "0x06"
+ },
+ {
+ "EventName": "l2_cache_hits_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Data Cache Misses",
+ "UMask": "0x70"
+ },
+ {
+ "MetricName": "l2_cache_hits_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Hits from L2 HWPF",
+ "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l3_accesses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Accesses",
+ "UMask": "0xff",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_misses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Misses (includes Chg2X)",
+ "UMask": "0x01",
+ "Unit": "L3PMC"
+ },
+ {
+ "MetricName": "l3_read_miss_latency",
+ "BriefDescription": "Average L3 Read Miss Latency (in core clocks)",
+ "MetricExpr": "(xi_sys_fill_latency * 16) / xi_ccx_sdp_req1.all_l3_miss_req_typs",
+ "MetricGroup": "l3_cache",
+ "ScaleUnit": "1core clocks"
+ },
+ {
+ "MetricName": "ic_fetch_miss_ratio",
+ "BriefDescription": "L1 Instruction Cache (32B) Fetch Miss Ratio",
+ "MetricExpr": "d_ratio(l2_cache_req_stat.ic_access_in_l2, bp_l1_tlb_fetch_hit + bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_miss)",
+ "MetricGroup": "l2_cache",
+ "ScaleUnit": "100%"
+ },
+ {
+ "MetricName": "l1_itlb_misses",
+ "BriefDescription": "L1 ITLB Misses",
+ "MetricExpr": "bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_miss",
+ "MetricGroup": "tlb"
+ },
+ {
+ "EventName": "l2_itlb_misses",
+ "EventCode": "0x85",
+ "BriefDescription": "L2 ITLB Misses & Instruction page walks",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "l1_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Misses",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L2 DTLB Misses & Data page walks",
+ "UMask": "0xf0"
+ },
+ {
+ "EventName": "all_tlbs_flushed",
+ "EventCode": "0x78",
+ "BriefDescription": "All TLBs Flushed",
+ "UMask": "0xdf"
+ },
+ {
+ "EventName": "uops_dispatched",
+ "EventCode": "0xaa",
+ "BriefDescription": "Micro-ops Dispatched",
+ "UMask": "0x03"
+ },
+ {
+ "EventName": "sse_avx_stalls",
+ "EventCode": "0x0e",
+ "BriefDescription": "Mixed SSE/AVX Stalls",
+ "UMask": "0x0e"
+ },
+ {
+ "EventName": "uops_retired",
+ "EventCode": "0xc1",
+ "BriefDescription": "Micro-ops Retired"
+ },
+ {
+ "MetricName": "all_remote_links_outbound",
+ "BriefDescription": "Approximate: Outbound data bytes for all Remote Links for a node (die)",
+ "MetricExpr": "remote_outbound_data_controller_0 + remote_outbound_data_controller_1 + remote_outbound_data_controller_2 + remote_outbound_data_controller_3",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "3e-5MiB"
+ },
+ {
+ "MetricName": "nps1_die_to_dram",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "6.1e-5MiB"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
index 1c60bfa0f00b..f61b982f83ca 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
@@ -47,6 +47,11 @@
"BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
"UMask": "0x1"
},
+ {
+ "EventName": "l2_request_g1.all_no_prefetch",
+ "EventCode": "0x60",
+ "UMask": "0xf9"
+ },
{
"EventName": "l2_request_g2.group1",
"EventCode": "0x61",
@@ -173,6 +178,24 @@
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2.",
"UMask": "0x1"
},
+ {
+ "EventName": "l2_cache_req_stat.ic_access_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
+ "UMask": "0x7"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
+ "UMask": "0x9"
+ },
+ {
+ "EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
+ "EventCode": "0x64",
+ "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
+ "UMask": "0xf6"
+ },
{
"EventName": "l2_fill_pending.l2_fill_busy",
"EventCode": "0x6d",
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json b/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
new file mode 100644
index 000000000000..40271df40015
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
@@ -0,0 +1,98 @@
+[
+ {
+ "EventName": "remote_outbound_data_controller_0",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 0",
+ "EventCode": "0x7c7",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_1",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 1",
+ "EventCode": "0x807",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_2",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 2",
+ "EventCode": "0x847",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "remote_outbound_data_controller_3",
+ "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 3",
+ "EventCode": "0x887",
+ "UMask": "0x02",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_0",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
+ "EventCode": "0x07",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_1",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 1",
+ "EventCode": "0x47",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_2",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 2",
+ "EventCode": "0x87",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_3",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 3",
+ "EventCode": "0xc7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_4",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 4",
+ "EventCode": "0x107",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_5",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 5",
+ "EventCode": "0x147",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_6",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 6",
+ "EventCode": "0x187",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ },
+ {
+ "EventName": "dram_channel_data_controller_7",
+ "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 7",
+ "EventCode": "0x1c7",
+ "UMask": "0x38",
+ "PerPkg": "1",
+ "Unit": "DFPMC"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
new file mode 100644
index 000000000000..2ef91e25e661
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
@@ -0,0 +1,178 @@
+[
+ {
+ "MetricName": "branch_misprediction_ratio",
+ "BriefDescription": "Execution-Time Branch Misprediction Ratio (Non-Speculative)",
+ "MetricExpr": "d_ratio(ex_ret_brn_misp, ex_ret_brn)",
+ "MetricGroup": "branch_prediction",
+ "ScaleUnit": "100%"
+ },
+ {
+ "EventName": "all_dc_accesses",
+ "EventCode": "0x29",
+ "BriefDescription": "All L1 Data Cache Accesses",
+ "UMask": "0x7"
+ },
+ {
+ "MetricName": "all_l2_cache_accesses",
+ "BriefDescription": "All L2 Cache Accesses",
+ "MetricExpr": "l2_request_g1.all_no_prefetch + l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_ic_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)",
+ "UMask": "0x10"
+ },
+ {
+ "EventName": "l2_cache_accesses_from_dc_misses",
+ "EventCode": "0x60",
+ "BriefDescription": "L2 Cache Accesses from L1 Data Cache Misses (including prefetch)",
+ "UMask": "0xc8"
+ },
+ {
+ "MetricName": "l2_cache_accesses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Accesses from L2 HWPF",
+ "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_misses",
+ "BriefDescription": "All L2 Cache Misses",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_miss_in_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_misses_from_ic_miss",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Instruction Cache Misses",
+ "UMask": "0x01"
+ },
+ {
+ "EventName": "l2_cache_misses_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Misses from L1 Data Cache Misses",
+ "UMask": "0x08"
+ },
+ {
+ "MetricName": "l2_cache_misses_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Misses from L2 HWPF",
+ "MetricExpr": "l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "MetricName": "all_l2_cache_hits",
+ "BriefDescription": "All L2 Cache Hits",
+ "MetricExpr": "l2_cache_req_stat.ic_dc_hit_in_l2 + l2_pf_hit_l2",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l2_cache_hits_from_ic_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Instruction Cache Misses",
+ "UMask": "0x06"
+ },
+ {
+ "EventName": "l2_cache_hits_from_dc_misses",
+ "EventCode": "0x64",
+ "BriefDescription": "L2 Cache Hits from L1 Data Cache Misses",
+ "UMask": "0x70"
+ },
+ {
+ "MetricName": "l2_cache_hits_from_l2_hwpf",
+ "BriefDescription": "L2 Cache Hits from L2 HWPF",
+ "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
+ "MetricGroup": "l2_cache"
+ },
+ {
+ "EventName": "l3_accesses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Accesses",
+ "UMask": "0xff",
+ "Unit": "L3PMC"
+ },
+ {
+ "EventName": "l3_misses",
+ "EventCode": "0x04",
+ "BriefDescription": "L3 Misses (includes Chg2X)",
+ "UMask": "0x01",
+ "Unit": "L3PMC"
+ },
+ {
+ "MetricName": "l3_read_miss_latency",
+ "BriefDescription": "Average L3 Read Miss Latency (in core clocks)",
+ "MetricExpr": "(xi_sys_fill_latency * 16) / xi_ccx_sdp_req1.all_l3_miss_req_typs",
+ "MetricGroup": "l3_cache",
+ "ScaleUnit": "1core clocks"
+ },
+ {
+ "MetricName": "ic_fetch_miss_ratio",
+ "BriefDescription": "L1 Instruction Cache (32B) Fetch Miss Ratio",
+ "MetricExpr": "d_ratio(l2_cache_req_stat.ic_access_in_l2, bp_l1_tlb_fetch_hit + bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_tlb_miss)",
+ "MetricGroup": "l2_cache",
+ "ScaleUnit": "100%"
+ },
+ {
+ "MetricName": "l1_itlb_misses",
+ "BriefDescription": "L1 ITLB Misses",
+ "MetricExpr": "bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_tlb_miss",
+ "MetricGroup": "tlb"
+ },
+ {
+ "EventName": "l2_itlb_misses",
+ "EventCode": "0x85",
+ "BriefDescription": "L2 ITLB Misses & Instruction page walks",
+ "UMask": "0x07"
+ },
+ {
+ "EventName": "l1_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L1 DTLB Misses",
+ "UMask": "0xff"
+ },
+ {
+ "EventName": "l2_dtlb_misses",
+ "EventCode": "0x45",
+ "BriefDescription": "L2 DTLB Misses & Data page walks",
+ "UMask": "0xf0"
+ },
+ {
+ "EventName": "all_tlbs_flushed",
+ "EventCode": "0x78",
+ "BriefDescription": "All TLBs Flushed",
+ "UMask": "0xdf"
+ },
+ {
+ "EventName": "uops_dispatched",
+ "EventCode": "0xaa",
+ "BriefDescription": "Micro-ops Dispatched",
+ "UMask": "0x03"
+ },
+ {
+ "EventName": "sse_avx_stalls",
+ "EventCode": "0x0e",
+ "BriefDescription": "Mixed SSE/AVX Stalls",
+ "UMask": "0x0e"
+ },
+ {
+ "EventName": "uops_retired",
+ "EventCode": "0xc1",
+ "BriefDescription": "Micro-ops Retired"
+ },
+ {
+ "MetricName": "all_remote_links_outbound",
+ "BriefDescription": "Approximate: Outbound data bytes for all Remote Links for a node (die)",
+ "MetricExpr": "remote_outbound_data_controller_0 + remote_outbound_data_controller_1 + remote_outbound_data_controller_2 + remote_outbound_data_controller_3",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "3e-5MiB"
+ },
+ {
+ "MetricName": "nps1_die_to_dram",
+ "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+ "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+ "MetricGroup": "data_fabric",
+ "PerPkg": "1",
+ "ScaleUnit": "6.1e-5MiB"
+ }
+]
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index fa86c5f997cc..5984906b6893 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -240,6 +240,7 @@ static struct map {
{ "hisi_sccl,hha", "hisi_sccl,hha" },
{ "hisi_sccl,l3c", "hisi_sccl,l3c" },
{ "L3PMC", "amd_l3" },
+ { "DFPMC", "amd_df" },
{}
};

--
2.27.0
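The MetricExpr strings added above are evaluated by perf's metric expression
parser; d_ratio(x, y) is its divide-by-zero-safe ratio helper. A sketch of how
ic_fetch_miss_ratio combines its events, with made-up counts:

```python
def d_ratio(num: float, den: float) -> float:
    # perf's metric helper: returns 0 instead of dividing by zero.
    return num / den if den else 0.0

# ic_fetch_miss_ratio = d_ratio(l2_cache_req_stat.ic_access_in_l2,
#                               bp_l1_tlb_fetch_hit + bp_l1_tlb_miss_l2_hit
#                               + bp_l1_tlb_miss_l2_miss), hypothetical counts:
ic_access_in_l2 = 500
tlb_fetch_hit, tlb_miss_l2_hit, tlb_miss_l2_miss = 9_000, 800, 200
ratio = d_ratio(ic_access_in_l2, tlb_fetch_hit + tlb_miss_l2_hit + tlb_miss_l2_miss)
print(f"{ratio:.1%}")  # the "100%" ScaleUnit renders the ratio as a percentage
```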

2020-09-01 22:12:49

by Kim Phillips

Subject: [PATCH 4/4] perf vendor events amd: Enable Family 19h users by matching Zen2 events

This enables Family 19h (zen3) users by reusing the mostly-compatible
zen2 events until the official public list of zen3 events is published
in a future PPR.

Signed-off-by: Kim Phillips <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: "Martin Liška" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 25b06cf98747..2f2a209e87e1 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -38,3 +38,4 @@ GenuineIntel-6-7E,v1,icelake,core
GenuineIntel-6-86,v1,tremontx,core
AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
+AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen2,core
--
2.27.0

2020-09-01 22:14:59

by Kim Phillips

Subject: [PATCH 2/4] perf vendor events amd: Add ITLB Instruction Fetch Hits event for zen1

The ITLB Instruction Fetch Hits event isn't documented even in
later zen1 PPRs, but it seems to count correctly on zen1 hardware.

Add it to the zen1 event set so zen1 users can use the upcoming IC Fetch
Miss Ratio metric.

The IF1G, IF2M, IF4K (Instruction fetches to a 1 GB, 2 MB, and 4 KB page)
unit masks are not added because, unlike zen2 hardware, zen1 hardware
counts all its unit masks with a 0 unit mask, according to the old
convention:

zen1$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1

Performance counter stats for 'sleep 1':

211,318 cpu/event=0x94/u
211,318 cpu/event=0x94,umask=0xff/u

Rome/zen2:

zen2$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1

Performance counter stats for 'sleep 1':

0 cpu/event=0x94/u
190,744 cpu/event=0x94,umask=0xff/u

Signed-off-by: Kim Phillips <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: "Martin Liška" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
tools/perf/pmu-events/arch/x86/amdzen1/branch.json | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
index a9943eeb8d6b..4ceb67a0db21 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
@@ -19,5 +19,10 @@
"EventName": "bp_de_redirect",
"EventCode": "0x91",
"BriefDescription": "Decoder Overrides Existing Branch Prediction (speculative)."
+ },
+ {
+ "EventName": "bp_l1_tlb_fetch_hit",
+ "EventCode": "0x94",
+ "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB."
}
]
--
2.27.0
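The old-versus-new unit-mask convention demonstrated by the perf stat runs
above boils down to: on zen1 a 0 umask counts every sub-event, while on zen2
a 0 umask counts nothing and the sub-events must be selected explicitly. A
sketch with a hypothetical per-page-size split of the 211,318 fetches seen in
the zen1 run:

```python
# Hypothetical IF1G/IF2M/IF4K split of event 0x94, totalling 211,318.
subevents = {"IF1G": 10, "IF2M": 30, "IF4K": 211_278}

def count(umask_selects_all: bool, old_convention: bool) -> int:
    total = sum(subevents.values())
    if old_convention:                        # zen1: umask ignored, always the total
        return total
    return total if umask_selects_all else 0  # zen2: umask=0 counts nothing

assert count(umask_selects_all=False, old_convention=True) == 211_318   # zen1, umask=0
assert count(umask_selects_all=False, old_convention=False) == 0        # zen2, umask=0
```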

2020-09-03 05:42:05

by Ian Rogers

Subject: Re: [PATCH 1/4] perf vendor events amd: Add L2 Prefetch events for zen1

On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
>
> Later revisions of PPRs that post-date the original Family 17h events
> submission patch add these events.
>
> Specifically, they were not in this 2017 revision of the F17h PPR:
>
> Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017
>
> But they are included in, e.g., this 2019 version of the PPR:
>
> Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019
>
> Signed-off-by: Kim Phillips <[email protected]>

Reviewed-by: Ian Rogers <[email protected]>

Sanity-checked against the manual and ran tests. Thanks,
Ian

> Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Vijay Thakkar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Yunfeng Ye <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: "Martin Liška" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jon Grimm <[email protected]>
> Cc: Martin Jambor <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: William Cohen <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> .../pmu-events/arch/x86/amdzen1/cache.json | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> index 404d4c569c01..695ed3ffa3a6 100644
> --- a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> @@ -249,6 +249,24 @@
> "BriefDescription": "Cycles with fill pending from L2. Total cycles spent with one or more fill requests in flight from L2.",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_pf_hit_l2",
> + "EventCode": "0x70",
> + "BriefDescription": "L2 prefetch hit in L2.",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_pf_miss_l2_hit_l3",
> + "EventCode": "0x71",
> + "BriefDescription": "L2 prefetcher hits in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_pf_miss_l2_l3",
> + "EventCode": "0x72",
> + "BriefDescription": "L2 prefetcher misses in L3. All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.",
> + "UMask": "0xff"
> + },
> {
> "EventName": "l3_request_g1.caching_l3_cache_accesses",
> "EventCode": "0x01",
> --
> 2.27.0
>

2020-09-03 06:07:01

by Ian Rogers

Subject: Re: [PATCH 2/4] perf vendor events amd: Add ITLB Instruction Fetch Hits event for zen1

On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
>
> The ITLB Instruction Fetch Hits event isn't documented even in
> later zen1 PPRs, but it seems to count correctly on zen1 hardware.
>
> Add it to zen1 group so zen1 users can use the upcoming IC Fetch Miss
> Ratio Metric.
>
> The IF1G, IF2M, IF4K (Instruction fetches to a 1 GB, 2 MB, and 4K page)
> unit masks are not added because unlike zen2 hardware, zen1 hardware
> counts all its unit masks with a 0 unit mask according to the old
> convention:
>
> zen1$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 211,318 cpu/event=0x94/u
> 211,318 cpu/event=0x94,umask=0xff/u
>
> Rome/zen2:
>
> zen2$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 0 cpu/event=0x94/u
> 190,744 cpu/event=0x94,umask=0xff/u
>
> Signed-off-by: Kim Phillips <[email protected]>

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Vijay Thakkar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Yunfeng Ye <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: "Martin Liška" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jon Grimm <[email protected]>
> Cc: Martin Jambor <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: William Cohen <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> tools/perf/pmu-events/arch/x86/amdzen1/branch.json | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
> index a9943eeb8d6b..4ceb67a0db21 100644
> --- a/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/branch.json
> @@ -19,5 +19,10 @@
> "EventName": "bp_de_redirect",
> "EventCode": "0x91",
> "BriefDescription": "Decoder Overrides Existing Branch Prediction (speculative)."
> + },
> + {
> + "EventName": "bp_l1_tlb_fetch_hit",
> + "EventCode": "0x94",
> + "BriefDescription": "The number of instruction fetches that hit in the L1 ITLB."
> }
> ]
> --
> 2.27.0
>

2020-09-03 06:21:28

by Ian Rogers

Subject: Re: [PATCH 3/4] perf vendor events amd: Add recommended events

On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
>
> Add support for events listed in Section 2.1.15.2 "Performance
> Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
> Rev 0.54 - Sep 12, 2019".
>
> perf now supports these new events (-e):
>
> all_dc_accesses
> all_tlbs_flushed
> l1_dtlb_misses
> l2_cache_accesses_from_dc_misses
> l2_cache_accesses_from_ic_misses
> l2_cache_hits_from_dc_misses
> l2_cache_hits_from_ic_misses
> l2_cache_misses_from_dc_misses
> l2_cache_misses_from_ic_miss
> l2_dtlb_misses
> l2_itlb_misses
> sse_avx_stalls
> uops_dispatched
> uops_retired
> l3_accesses
> l3_misses
>
> and these metrics (-M):
>
> branch_misprediction_ratio
> all_l2_cache_accesses
> all_l2_cache_hits
> all_l2_cache_misses
> ic_fetch_miss_ratio
> l2_cache_accesses_from_l2_hwpf
> l2_cache_hits_from_l2_hwpf
> l2_cache_misses_from_l2_hwpf
> l3_read_miss_latency
> l1_itlb_misses
> all_remote_links_outbound
> nps1_die_to_dram
>
> The nps1_die_to_dram event may need perf stat's --metric-no-group
> switch if the number of available data fabric counters is less
> than the number it uses (8).

These are really excellent additions! Does:
"MetricConstraint": "NO_NMI_WATCHDOG"
solve the grouping issue? Perhaps the MetricConstraint needs to be
named more generically to cover this case, as it seems sub-optimal to
require the use of --metric-no-group.
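
For illustration only, and assuming the existing NO_NMI_WATCHDOG value
(apparently the only constraint jevents/metricgroup currently handles)
were simply reused, the metric entry would just gain one field:

```json
{
    "MetricName": "nps1_die_to_dram",
    "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
    "MetricGroup": "data_fabric",
    "MetricConstraint": "NO_NMI_WATCHDOG",
    "PerPkg": "1",
    "ScaleUnit": "6.1e-5MiB"
}
```

Though as the name suggests, that constraint is about the NMI watchdog
stealing a counter, whereas the problem here is plain counter pressure
on the data fabric PMU, hence the suggestion to generalize it.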

>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Kim Phillips <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Vijay Thakkar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Yunfeng Ye <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: "Martin Liška" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jon Grimm <[email protected]>
> Cc: Martin Jambor <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: William Cohen <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> .../pmu-events/arch/x86/amdzen1/cache.json | 23 +++
> .../arch/x86/amdzen1/data-fabric.json | 98 ++++++++++
> .../arch/x86/amdzen1/recommended.json | 178 ++++++++++++++++++
> .../pmu-events/arch/x86/amdzen2/cache.json | 23 +++
> .../arch/x86/amdzen2/data-fabric.json | 98 ++++++++++
> .../arch/x86/amdzen2/recommended.json | 178 ++++++++++++++++++
> tools/perf/pmu-events/jevents.c | 1 +
> 7 files changed, 599 insertions(+)
> create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
> create mode 100644 tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
> create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
> create mode 100644 tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
>
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> index 695ed3ffa3a6..4ea7ec4f496e 100644
> --- a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> @@ -117,6 +117,11 @@
> "BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_request_g1.all_no_prefetch",
> + "EventCode": "0x60",
> + "UMask": "0xf9"
> + },

Would it be possible to have a BriefDescription here?
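
For example, something along these lines (the wording is only a
suggestion extrapolated from the PMCx060 umask definitions in the PPR,
not official text):

```json
{
    "EventName": "l2_request_g1.all_no_prefetch",
    "EventCode": "0x60",
    "BriefDescription": "All L2 cache requests, excluding L2 prefetches (umask 0xf9 of PMCx060).",
    "UMask": "0xf9"
}
```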

> {
> "EventName": "l2_request_g2.group1",
> "EventCode": "0x61",
> @@ -243,6 +248,24 @@
> "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2.",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_cache_req_stat.ic_access_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
> + "UMask": "0x7"
> + },
> + {
> + "EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
> + "UMask": "0x9"
> + },
> + {
> + "EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
> + "UMask": "0xf6"
> + },
> {
> "EventName": "l2_fill_pending.l2_fill_busy",
> "EventCode": "0x6d",
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json b/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
> new file mode 100644
> index 000000000000..40271df40015
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/data-fabric.json
> @@ -0,0 +1,98 @@
> +[
> + {
> + "EventName": "remote_outbound_data_controller_0",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 0",
> + "EventCode": "0x7c7",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_1",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 1",
> + "EventCode": "0x807",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_2",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 2",
> + "EventCode": "0x847",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_3",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 3",
> + "EventCode": "0x887",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_0",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
> + "EventCode": "0x07",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_1",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 1",
> + "EventCode": "0x47",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_2",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 2",
> + "EventCode": "0x87",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_3",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 3",
> + "EventCode": "0xc7",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_4",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 4",
> + "EventCode": "0x107",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_5",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 5",
> + "EventCode": "0x147",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_6",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 6",
> + "EventCode": "0x187",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_7",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 7",
> + "EventCode": "0x1c7",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + }
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
> new file mode 100644
> index 000000000000..2cfe2d2f3bfd
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
> @@ -0,0 +1,178 @@
> +[
> + {
> + "MetricName": "branch_misprediction_ratio",
> + "BriefDescription": "Execution-Time Branch Misprediction Ratio (Non-Speculative)",
> + "MetricExpr": "d_ratio(ex_ret_brn_misp, ex_ret_brn)",
> + "MetricGroup": "branch_prediction",
> + "ScaleUnit": "100%"
> + },
> + {
> + "EventName": "all_dc_accesses",
> + "EventCode": "0x29",
> + "BriefDescription": "All L1 Data Cache Accesses",
> + "UMask": "0x7"
> + },
> + {
> + "MetricName": "all_l2_cache_accesses",
> + "BriefDescription": "All L2 Cache Accesses",
> + "MetricExpr": "l2_request_g1.all_no_prefetch + l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_accesses_from_ic_misses",
> + "EventCode": "0x60",
> + "BriefDescription": "L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)",
> + "UMask": "0x10"
> + },
> + {
> + "EventName": "l2_cache_accesses_from_dc_misses",
> + "EventCode": "0x60",
> + "BriefDescription": "L2 Cache Accesses from L1 Data Cache Misses (including prefetch)",
> + "UMask": "0xc8"
> + },
> + {
> + "MetricName": "l2_cache_accesses_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Accesses from L2 HWPF",
> + "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "MetricName": "all_l2_cache_misses",
> + "BriefDescription": "All L2 Cache Misses",
> + "MetricExpr": "l2_cache_req_stat.ic_dc_miss_in_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_misses_from_ic_miss",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Misses from L1 Instruction Cache Misses",
> + "UMask": "0x01"
> + },
> + {
> + "EventName": "l2_cache_misses_from_dc_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Misses from L1 Data Cache Misses",
> + "UMask": "0x08"
> + },
> + {
> + "MetricName": "l2_cache_misses_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Misses from L2 HWPF",
> + "MetricExpr": "l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "MetricName": "all_l2_cache_hits",
> + "BriefDescription": "All L2 Cache Hits",
> + "MetricExpr": "l2_cache_req_stat.ic_dc_hit_in_l2 + l2_pf_hit_l2",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_hits_from_ic_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Hits from L1 Instruction Cache Misses",
> + "UMask": "0x06"
> + },
> + {
> + "EventName": "l2_cache_hits_from_dc_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Hits from L1 Data Cache Misses",
> + "UMask": "0x70"
> + },
> + {
> + "MetricName": "l2_cache_hits_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Hits from L2 HWPF",
> + "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l3_accesses",
> + "EventCode": "0x04",
> + "BriefDescription": "L3 Accesses",
> + "UMask": "0xff",
> + "Unit": "L3PMC"
> + },
> + {
> + "EventName": "l3_misses",
> + "EventCode": "0x04",
> + "BriefDescription": "L3 Misses (includes Chg2X)",

Would it be possible to add a slightly more expanded description of
what Chg2X means? I don't see it in the PPR either :-(

> + "UMask": "0x01",
> + "Unit": "L3PMC"
> + },
> + {
> + "MetricName": "l3_read_miss_latency",
> + "BriefDescription": "Average L3 Read Miss Latency (in core clocks)",
> + "MetricExpr": "(xi_sys_fill_latency * 16) / xi_ccx_sdp_req1.all_l3_miss_req_typs",
> + "MetricGroup": "l3_cache",
> + "ScaleUnit": "1core clocks"
> + },
> + {
> + "MetricName": "ic_fetch_miss_ratio",
> + "BriefDescription": "L1 Instruction Cache (32B) Fetch Miss Ratio",
> + "MetricExpr": "d_ratio(l2_cache_req_stat.ic_access_in_l2, bp_l1_tlb_fetch_hit + bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_miss)",
> + "MetricGroup": "l2_cache",
> + "ScaleUnit": "100%"
> + },
> + {
> + "MetricName": "l1_itlb_misses",
> + "BriefDescription": "L1 ITLB Misses",
> + "MetricExpr": "bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_miss",
> + "MetricGroup": "tlb"
> + },
> + {
> + "EventName": "l2_itlb_misses",
> + "EventCode": "0x85",
> + "BriefDescription": "L2 ITLB Misses & Instruction page walks",
> + "UMask": "0x07"
> + },
> + {
> + "EventName": "l1_dtlb_misses",
> + "EventCode": "0x45",
> + "BriefDescription": "L1 DTLB Misses",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_dtlb_misses",
> + "EventCode": "0x45",
> + "BriefDescription": "L2 DTLB Misses & Data page walks",
> + "UMask": "0xf0"
> + },
> + {
> + "EventName": "all_tlbs_flushed",
> + "EventCode": "0x78",
> + "BriefDescription": "All TLBs Flushed",
> + "UMask": "0xdf"
> + },
> + {
> + "EventName": "uops_dispatched",
> + "EventCode": "0xaa",
> + "BriefDescription": "Micro-ops Dispatched",
> + "UMask": "0x03"
> + },
> + {
> + "EventName": "sse_avx_stalls",
> + "EventCode": "0x0e",
> + "BriefDescription": "Mixed SSE/AVX Stalls",
> + "UMask": "0x0e"
> + },
> + {
> + "EventName": "uops_retired",
> + "EventCode": "0xc1",
> + "BriefDescription": "Micro-ops Retired"
> + },
> + {
> + "MetricName": "all_remote_links_outbound",
> + "BriefDescription": "Approximate: Outbound data bytes for all Remote Links for a node (die)",
> + "MetricExpr": "remote_outbound_data_controller_0 + remote_outbound_data_controller_1 + remote_outbound_data_controller_2 + remote_outbound_data_controller_3",
> + "MetricGroup": "data_fabric",
> + "PerPkg": "1",
> + "ScaleUnit": "3e-5MiB"
> + },
> + {
> + "MetricName": "nps1_die_to_dram",
> + "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
> + "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
> + "MetricGroup": "data_fabric",
> + "PerPkg": "1",
> + "ScaleUnit": "6.1e-5MiB"
> + }
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
> index 1c60bfa0f00b..f61b982f83ca 100644
> --- a/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
> @@ -47,6 +47,11 @@
> "BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_request_g1.all_no_prefetch",
> + "EventCode": "0x60",
> + "UMask": "0xf9"
> + },

Possible BriefDescription?

> {
> "EventName": "l2_request_g2.group1",
> "EventCode": "0x61",
> @@ -173,6 +178,24 @@
> "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2.",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_cache_req_stat.ic_access_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
> + "UMask": "0x7"
> + },
> + {
> + "EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
> + "UMask": "0x9"
> + },
> + {
> + "EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
> + "EventCode": "0x64",
> + "BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
> + "UMask": "0xf6"
> + },
> {
> "EventName": "l2_fill_pending.l2_fill_busy",
> "EventCode": "0x6d",
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json b/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
> new file mode 100644
> index 000000000000..40271df40015
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/data-fabric.json
> @@ -0,0 +1,98 @@
> +[
> + {
> + "EventName": "remote_outbound_data_controller_0",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 0",
> + "EventCode": "0x7c7",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_1",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 1",
> + "EventCode": "0x807",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_2",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 2",
> + "EventCode": "0x847",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "remote_outbound_data_controller_3",
> + "PublicDescription": "Remote Link Controller Outbound Packet Types: Data (32B): Remote Link Controller 3",
> + "EventCode": "0x887",
> + "UMask": "0x02",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_0",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 0",
> + "EventCode": "0x07",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_1",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 1",
> + "EventCode": "0x47",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_2",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 2",
> + "EventCode": "0x87",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_3",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 3",
> + "EventCode": "0xc7",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_4",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 4",
> + "EventCode": "0x107",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_5",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 5",
> + "EventCode": "0x147",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_6",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 6",
> + "EventCode": "0x187",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + },
> + {
> + "EventName": "dram_channel_data_controller_7",
> + "PublicDescription": "DRAM Channel Controller Request Types: Requests with Data (64B): DRAM Channel Controller 7",
> + "EventCode": "0x1c7",
> + "UMask": "0x38",
> + "PerPkg": "1",
> + "Unit": "DFPMC"
> + }
> +]
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
> new file mode 100644
> index 000000000000..2ef91e25e661
> --- /dev/null
> +++ b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
> @@ -0,0 +1,178 @@
> +[
> + {
> + "MetricName": "branch_misprediction_ratio",
> + "BriefDescription": "Execution-Time Branch Misprediction Ratio (Non-Speculative)",
> + "MetricExpr": "d_ratio(ex_ret_brn_misp, ex_ret_brn)",
> + "MetricGroup": "branch_prediction",
> + "ScaleUnit": "100%"
> + },
> + {
> + "EventName": "all_dc_accesses",
> + "EventCode": "0x29",
> + "BriefDescription": "All L1 Data Cache Accesses",
> + "UMask": "0x7"
> + },
> + {
> + "MetricName": "all_l2_cache_accesses",
> + "BriefDescription": "All L2 Cache Accesses",
> + "MetricExpr": "l2_request_g1.all_no_prefetch + l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_accesses_from_ic_misses",
> + "EventCode": "0x60",
> + "BriefDescription": "L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)",
> + "UMask": "0x10"
> + },
> + {
> + "EventName": "l2_cache_accesses_from_dc_misses",
> + "EventCode": "0x60",
> + "BriefDescription": "L2 Cache Accesses from L1 Data Cache Misses (including prefetch)",
> + "UMask": "0xc8"
> + },
> + {
> + "MetricName": "l2_cache_accesses_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Accesses from L2 HWPF",
> + "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "MetricName": "all_l2_cache_misses",
> + "BriefDescription": "All L2 Cache Misses",
> + "MetricExpr": "l2_cache_req_stat.ic_dc_miss_in_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_misses_from_ic_miss",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Misses from L1 Instruction Cache Misses",
> + "UMask": "0x01"
> + },
> + {
> + "EventName": "l2_cache_misses_from_dc_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Misses from L1 Data Cache Misses",
> + "UMask": "0x08"
> + },
> + {
> + "MetricName": "l2_cache_misses_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Misses from L2 HWPF",
> + "MetricExpr": "l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "MetricName": "all_l2_cache_hits",
> + "BriefDescription": "All L2 Cache Hits",
> + "MetricExpr": "l2_cache_req_stat.ic_dc_hit_in_l2 + l2_pf_hit_l2",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l2_cache_hits_from_ic_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Hits from L1 Instruction Cache Misses",
> + "UMask": "0x06"
> + },
> + {
> + "EventName": "l2_cache_hits_from_dc_misses",
> + "EventCode": "0x64",
> + "BriefDescription": "L2 Cache Hits from L1 Data Cache Misses",
> + "UMask": "0x70"
> + },
> + {
> + "MetricName": "l2_cache_hits_from_l2_hwpf",
> + "BriefDescription": "L2 Cache Hits from L2 HWPF",
> + "MetricExpr": "l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3",
> + "MetricGroup": "l2_cache"
> + },
> + {
> + "EventName": "l3_accesses",
> + "EventCode": "0x04",
> + "BriefDescription": "L3 Accesses",
> + "UMask": "0xff",
> + "Unit": "L3PMC"
> + },
> + {
> + "EventName": "l3_misses",
> + "EventCode": "0x04",
> + "BriefDescription": "L3 Misses (includes Chg2X)",
> + "UMask": "0x01",
> + "Unit": "L3PMC"
> + },
> + {
> + "MetricName": "l3_read_miss_latency",
> + "BriefDescription": "Average L3 Read Miss Latency (in core clocks)",
> + "MetricExpr": "(xi_sys_fill_latency * 16) / xi_ccx_sdp_req1.all_l3_miss_req_typs",
> + "MetricGroup": "l3_cache",
> + "ScaleUnit": "1core clocks"
> + },
> + {
> + "MetricName": "ic_fetch_miss_ratio",
> + "BriefDescription": "L1 Instruction Cache (32B) Fetch Miss Ratio",
> + "MetricExpr": "d_ratio(l2_cache_req_stat.ic_access_in_l2, bp_l1_tlb_fetch_hit + bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_tlb_miss)",
> + "MetricGroup": "l2_cache",
> + "ScaleUnit": "100%"
> + },
> + {
> + "MetricName": "l1_itlb_misses",
> + "BriefDescription": "L1 ITLB Misses",
> + "MetricExpr": "bp_l1_tlb_miss_l2_hit + bp_l1_tlb_miss_l2_tlb_miss",
> + "MetricGroup": "tlb"
> + },
> + {
> + "EventName": "l2_itlb_misses",
> + "EventCode": "0x85",
> + "BriefDescription": "L2 ITLB Misses & Instruction page walks",
> + "UMask": "0x07"
> + },
> + {
> + "EventName": "l1_dtlb_misses",
> + "EventCode": "0x45",
> + "BriefDescription": "L1 DTLB Misses",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_dtlb_misses",
> + "EventCode": "0x45",
> + "BriefDescription": "L2 DTLB Misses & Data page walks",
> + "UMask": "0xf0"
> + },
> + {
> + "EventName": "all_tlbs_flushed",
> + "EventCode": "0x78",
> + "BriefDescription": "All TLBs Flushed",
> + "UMask": "0xdf"
> + },
> + {
> + "EventName": "uops_dispatched",
> + "EventCode": "0xaa",
> + "BriefDescription": "Micro-ops Dispatched",
> + "UMask": "0x03"
> + },
> + {
> + "EventName": "sse_avx_stalls",
> + "EventCode": "0x0e",
> + "BriefDescription": "Mixed SSE/AVX Stalls",
> + "UMask": "0x0e"
> + },
> + {
> + "EventName": "uops_retired",
> + "EventCode": "0xc1",
> + "BriefDescription": "Micro-ops Retired"
> + },
> + {
> + "MetricName": "all_remote_links_outbound",
> + "BriefDescription": "Approximate: Outbound data bytes for all Remote Links for a node (die)",
> + "MetricExpr": "remote_outbound_data_controller_0 + remote_outbound_data_controller_1 + remote_outbound_data_controller_2 + remote_outbound_data_controller_3",
> + "MetricGroup": "data_fabric",
> + "PerPkg": "1",
> + "ScaleUnit": "3e-5MiB"
> + },
> + {
> + "MetricName": "nps1_die_to_dram",
> + "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
> + "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
> + "MetricGroup": "data_fabric",
> + "PerPkg": "1",
> + "ScaleUnit": "6.1e-5MiB"
> + }
> +]
> diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
> index fa86c5f997cc..5984906b6893 100644
> --- a/tools/perf/pmu-events/jevents.c
> +++ b/tools/perf/pmu-events/jevents.c
> @@ -240,6 +240,7 @@ static struct map {
> { "hisi_sccl,hha", "hisi_sccl,hha" },
> { "hisi_sccl,l3c", "hisi_sccl,l3c" },
> { "L3PMC", "amd_l3" },
> + { "DFPMC", "amd_df" },
> {}
> };
>
> --
> 2.27.0
>
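[Editorial note: the ScaleUnit constants in the metric JSON quoted above can be sanity-checked with a short sketch. The 64-byte and 32-byte transfer sizes below are an inference from the constants themselves, not something stated in the patch:]

```python
# Sanity-check the ScaleUnit constants from the metric JSON above.
# Assumption (not from the patch): each dram_channel_data_controller
# event counts a 64-byte transfer and each remote_outbound_data_controller
# event a 32-byte one, so converting raw counts to MiB divides by 2**20.
MIB = 2 ** 20

dram_scale = 64 / MIB     # ~6.1e-5, matches "ScaleUnit": "6.1e-5MiB"
remote_scale = 32 / MIB   # ~3.05e-5, matches the rounded "3e-5MiB"

print(f"dram: {dram_scale:.2e} MiB/count, remote: {remote_scale:.2e} MiB/count")

# Example: 1,000,000 raw counts summed across the eight DRAM channels
print(f"{1_000_000 * dram_scale:.1f} MiB")
```

This also explains why the metrics are labeled "Approximate": the scale factors are rounded, and the summed counters are sampled independently.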

2020-09-03 06:24:40

by Ian Rogers

Subject: Re: [PATCH 4/4] perf vendor events amd: Enable Family 19h users by matching Zen2 events

On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
>
> This enables zen3 users by reusing mostly-compatible zen2 events
> until the official public list of zen3 events is published in a
> future PPR.
>
> Signed-off-by: Kim Phillips <[email protected]>

Acked-by: Ian Rogers <[email protected]>

Thanks!
Ian

> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Vijay Thakkar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Yunfeng Ye <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: "Martin Liška" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jon Grimm <[email protected]>
> Cc: Martin Jambor <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: William Cohen <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
> index 25b06cf98747..2f2a209e87e1 100644
> --- a/tools/perf/pmu-events/arch/x86/mapfile.csv
> +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
> @@ -38,3 +38,4 @@ GenuineIntel-6-7E,v1,icelake,core
> GenuineIntel-6-86,v1,tremontx,core
> AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
> AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
> +AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen2,core
> --
> 2.27.0
>
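[Editorial note: the mapfile.csv change above can be illustrated with a small sketch of how perf selects an event table from the CPUID-derived "vendor-family-model" string (decimal family, hex model). This is a simplified model — perf's real matching uses POSIX extended regexes ([[:xdigit:]] below is rewritten as a plain character class) and its exact anchoring semantics may differ:]

```python
import re

# Simplified model of perf's mapfile.csv lookup: the first row whose
# regex matches the CPUID string wins, so the more specific zen1 row
# must precede the catch-all zen2 row for Family 17h (0x17 = 23).
MAPFILE_ROWS = [
    (r"AuthenticAMD-23-([12][0-9A-F]|[0-9A-F])", "amdzen1"),
    (r"AuthenticAMD-23-[0-9A-Fa-f]+", "amdzen2"),  # [[:xdigit:]] in POSIX
    (r"AuthenticAMD-25-[0-9A-Fa-f]+", "amdzen2"),  # Family 19h reuses zen2
]

def lookup(cpuid: str) -> str:
    for pattern, table in MAPFILE_ROWS:
        if re.fullmatch(pattern, cpuid):
            return table
    raise KeyError(cpuid)

print(lookup("AuthenticAMD-23-1"))   # model 0x01 -> amdzen1
print(lookup("AuthenticAMD-23-31"))  # model 0x31 (Rome) -> amdzen2
print(lookup("AuthenticAMD-25-1"))   # Family 19h -> amdzen2 table reused
```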

2020-09-03 18:30:25

by Kim Phillips

Subject: Re: [PATCH 3/4] perf vendor events amd: Add recommended events

On 9/3/20 1:19 AM, Ian Rogers wrote:
> On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
>> The nps1_die_to_dram event may need perf stat's --metric-no-group
>> switch if the number of available data fabric counters is less
>> than the number it uses (8).
>
> These are really excellent additions! Does:
> "MetricConstraint": "NO_NMI_WATCHDOG"
> solve the grouping issue? Perhaps the MetricConstraint needs to be
> named more generically to cover this case as it seems sub-optimal to
> require the use of --metric-no-group.

That metric uses data fabric (DFPMC/amd_df) events, not Core PMC
events, which the watchdog uses, so NO_NMI_WATCHDOG wouldn't have
an effect. The event is defined as an approximation anyway.

I'll have to get back to you on the other items.

Thanks for your review!

Kim

2020-09-04 05:50:47

by Ian Rogers

Subject: Re: [PATCH 3/4] perf vendor events amd: Add recommended events

On Thu, Sep 3, 2020 at 11:27 AM Kim Phillips <[email protected]> wrote:
>
> On 9/3/20 1:19 AM, Ian Rogers wrote:
> > On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
> >> The nps1_die_to_dram event may need perf stat's --metric-no-group
> >> switch if the number of available data fabric counters is less
> >> than the number it uses (8).
> >
> > These are really excellent additions! Does:
> > "MetricConstraint": "NO_NMI_WATCHDOG"
> > solve the grouping issue? Perhaps the MetricConstraint needs to be
> > named more generically to cover this case as it seems sub-optimal to
> > require the use of --metric-no-group.
>
> That metric uses data fabric (DFPMC/amd_df) events, not Core PMC
> events, which the watchdog uses, so NO_NMI_WATCHDOG wouldn't have
> an effect. The event is defined as an approximation anyway.
>
> I'll have to get back to you on the other items.
>
> Thanks for your review!

NP, more nits than anything else.

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> Kim

2020-09-04 19:20:53

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 1/4] perf vendor events amd: Add L2 Prefetch events for zen1

Em Tue, Sep 01, 2020 at 05:09:41PM -0500, Kim Phillips escreveu:
> Later revisions of PPRs that post-date the original Family 17h events
> submission patch add these events.
>
> Specifically, they were not in this 2017 revision of the F17h PPR:
>
> Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017
>
> But e.g., are included in this 2019 version of the PPR:
>
> Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019


Thanks, applied.

- Arnaldo

> Signed-off-by: Kim Phillips <[email protected]>
> Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Vijay Thakkar <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Yunfeng Ye <[email protected]>
> Cc: Jin Yao <[email protected]>
> Cc: "Martin Liška" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Jon Grimm <[email protected]>
> Cc: Martin Jambor <[email protected]>
> Cc: Michael Petlan <[email protected]>
> Cc: William Cohen <[email protected]>
> Cc: Stephane Eranian <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> .../pmu-events/arch/x86/amdzen1/cache.json | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> index 404d4c569c01..695ed3ffa3a6 100644
> --- a/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> +++ b/tools/perf/pmu-events/arch/x86/amdzen1/cache.json
> @@ -249,6 +249,24 @@
> "BriefDescription": "Cycles with fill pending from L2. Total cycles spent with one or more fill requests in flight from L2.",
> "UMask": "0x1"
> },
> + {
> + "EventName": "l2_pf_hit_l2",
> + "EventCode": "0x70",
> + "BriefDescription": "L2 prefetch hit in L2.",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_pf_miss_l2_hit_l3",
> + "EventCode": "0x71",
> + "BriefDescription": "L2 prefetcher hits in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.",
> + "UMask": "0xff"
> + },
> + {
> + "EventName": "l2_pf_miss_l2_l3",
> + "EventCode": "0x72",
> + "BriefDescription": "L2 prefetcher misses in L3. All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.",
> + "UMask": "0xff"
> + },
> {
> "EventName": "l3_request_g1.caching_l3_cache_accesses",
> "EventCode": "0x01",
> --
> 2.27.0
>

--

- Arnaldo

2020-09-04 19:23:09

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 2/4] perf vendor events amd: Add ITLB Instruction Fetch Hits event for zen1

Em Wed, Sep 02, 2020 at 11:03:38PM -0700, Ian Rogers escreveu:
> On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
> >
> > The ITLB Instruction Fetch Hits event isn't documented even in
> > later zen1 PPRs, but it seems to count correctly on zen1 hardware.
> >
> > Add it to zen1 group so zen1 users can use the upcoming IC Fetch Miss
> > Ratio Metric.
> >
> > The IF1G, IF2M, IF4K (Instruction fetches to a 1 GB, 2 MB, and 4K page)
> > unit masks are not added because unlike zen2 hardware, zen1 hardware
> > counts all its unit masks with a 0 unit mask according to the old
> > convention:
> >
> > zen1$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
> >
> > Performance counter stats for 'sleep 1':
> >
> > 211,318 cpu/event=0x94/u
> > 211,318 cpu/event=0x94,umask=0xff/u
> >
> > Rome/zen2:
> >
> > zen2$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
> >
> > Performance counter stats for 'sleep 1':
> >
> > 0 cpu/event=0x94/u
> > 190,744 cpu/event=0x94,umask=0xff/u
> >
> > Signed-off-by: Kim Phillips <[email protected]>
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied.

- Arnaldo

2020-09-04 19:31:54

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 3/4] perf vendor events amd: Add recommended events

Em Thu, Sep 03, 2020 at 10:48:15PM -0700, Ian Rogers escreveu:
> On Thu, Sep 3, 2020 at 11:27 AM Kim Phillips <[email protected]> wrote:
> > On 9/3/20 1:19 AM, Ian Rogers wrote:
> > > On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
> > >> The nps1_die_to_dram event may need perf stat's --metric-no-group
> > >> switch if the number of available data fabric counters is less
> > >> than the number it uses (8).

> > > These are really excellent additions! Does:
> > > "MetricConstraint": "NO_NMI_WATCHDOG"
> > > solve the grouping issue? Perhaps the MetricConstraint needs to be
> > > named more generically to cover this case as it seems sub-optimal to
> > > require the use of --metric-no-group.

> > That metric uses data fabric (DFPMC/amd_df) events, not Core PMC
> > events, which the watchdog uses, so NO_NMI_WATCHDOG wouldn't have
> > an effect. The event is defined as an approximation anyway.

> > I'll have to get back to you on the other items.

> > Thanks for your review!

> NP, more nits than anything else.

> Acked-by: Ian Rogers <[email protected]>

Thanks, applied, testing notes added to the cset:

Committer testing:

On an AMD Ryzen 3900x system:

Before:

# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$"
#

After:

# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$"
all_dc_accesses
[All L1 Data Cache Accesses]
all_tlbs_flushed
[All TLBs Flushed]
l1_dtlb_misses
[L1 DTLB Misses]
l2_cache_accesses_from_dc_misses
[L2 Cache Accesses from L1 Data Cache Misses (including prefetch)]
l2_cache_accesses_from_ic_misses
[L2 Cache Accesses from L1 Instruction Cache Misses (including
prefetch)]
l2_cache_hits_from_dc_misses
[L2 Cache Hits from L1 Data Cache Misses]
l2_cache_hits_from_ic_misses
[L2 Cache Hits from L1 Instruction Cache Misses]
l2_cache_misses_from_dc_misses
[L2 Cache Misses from L1 Data Cache Misses]
l2_cache_misses_from_ic_miss
[L2 Cache Misses from L1 Instruction Cache Misses]
l2_dtlb_misses
[L2 DTLB Misses & Data page walks]
l2_itlb_misses
[L2 ITLB Misses & Instruction page walks]
sse_avx_stalls
[Mixed SSE/AVX Stalls]
uops_dispatched
[Micro-ops Dispatched]
uops_retired
[Micro-ops Retired]
l3_accesses
[L3 Accesses. Unit: amd_l3]
l3_misses
[L3 Misses (includes Chg2X). Unit: amd_l3]
#

# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2

Performance counter stats for 'system wide':

433,439,949 all_dc_accesses (35.66%)
443 all_tlbs_flushed (35.66%)
2,985,885 l1_dtlb_misses (35.66%)
18,318,019 l2_cache_accesses_from_dc_misses (35.68%)
50,114,810 l2_cache_accesses_from_ic_misses (35.72%)
12,423,978 l2_cache_hits_from_dc_misses (35.74%)
40,703,103 l2_cache_hits_from_ic_misses (35.74%)
6,698,673 l2_cache_misses_from_dc_misses (35.74%)
12,090,892 l2_cache_misses_from_ic_miss (35.74%)
614,267 l2_dtlb_misses (35.74%)
216,036 l2_itlb_misses (35.74%)
11,977 sse_avx_stalls (35.74%)
999,276,223 uops_dispatched (35.73%)
1,075,311,620 uops_retired (35.69%)
1,420,763 l3_accesses
540,164 l3_misses

2.002344121 seconds time elapsed

# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2

Performance counter stats for 'system wide':

175,943,104 all_dc_accesses
310 all_tlbs_flushed
2,280,359 l1_dtlb_misses
11,700,151 l2_cache_accesses_from_dc_misses
25,414,963 l2_cache_accesses_from_ic_misses

2.001957818 seconds time elapsed

#

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
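[Editorial note: the counters in the perf stat run above lend themselves to simple derived ratios. A quick sketch, using values copied from that output — the ratio name is illustrative, not an official metric:]

```python
# Derive an L2 data-side hit rate from the perf stat output above.
l2_acc_dc = 18_318_019    # l2_cache_accesses_from_dc_misses
l2_hit_dc = 12_423_978    # l2_cache_hits_from_dc_misses
l2_miss_dc = 6_698_673    # l2_cache_misses_from_dc_misses

hit_rate = l2_hit_dc / l2_acc_dc
print(f"L2 data-side hit rate: {hit_rate:.1%}")

# hits + misses only roughly account for all accesses: the events were
# multiplexed (the (35.xx%) figures), so each count is scaled from a
# different sampling window.
print(l2_hit_dc + l2_miss_dc, "vs", l2_acc_dc)
```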

2020-09-04 19:36:24

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 4/4] perf vendor events amd: Enable Family 19h users by matching Zen2 events

Em Wed, Sep 02, 2020 at 11:20:20PM -0700, Ian Rogers escreveu:
> On Tue, Sep 1, 2020 at 3:10 PM Kim Phillips <[email protected]> wrote:
> >
> > This enables zen3 users by reusing mostly-compatible zen2 events
> > until the official public list of zen3 events is published in a
> > future PPR.
> >
> > Signed-off-by: Kim Phillips <[email protected]>
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied,

- Arnaldo

> Thanks!
> Ian
>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Arnaldo Carvalho de Melo <[email protected]>
> > Cc: Mark Rutland <[email protected]>
> > Cc: Alexander Shishkin <[email protected]>
> > Cc: Jiri Olsa <[email protected]>
> > Cc: Namhyung Kim <[email protected]>
> > Cc: Vijay Thakkar <[email protected]>
> > Cc: Andi Kleen <[email protected]>
> > Cc: John Garry <[email protected]>
> > Cc: Kan Liang <[email protected]>
> > Cc: Yunfeng Ye <[email protected]>
> > Cc: Jin Yao <[email protected]>
> > Cc: "Martin Liška" <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > Cc: Jon Grimm <[email protected]>
> > Cc: Martin Jambor <[email protected]>
> > Cc: Michael Petlan <[email protected]>
> > Cc: William Cohen <[email protected]>
> > Cc: Stephane Eranian <[email protected]>
> > Cc: Ian Rogers <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > ---
> > tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
> > index 25b06cf98747..2f2a209e87e1 100644
> > --- a/tools/perf/pmu-events/arch/x86/mapfile.csv
> > +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
> > @@ -38,3 +38,4 @@ GenuineIntel-6-7E,v1,icelake,core
> > GenuineIntel-6-86,v1,tremontx,core
> > AuthenticAMD-23-([12][0-9A-F]|[0-9A-F]),v2,amdzen1,core
> > AuthenticAMD-23-[[:xdigit:]]+,v1,amdzen2,core
> > +AuthenticAMD-25-[[:xdigit:]]+,v1,amdzen2,core
> > --
> > 2.27.0
> >

--

- Arnaldo