2023-03-13 08:06:25

by Thomas Richter

Subject: [PATCH 1/6] tools/perf/json: Add common metrics for s390

Add 3 metrics for s390 machines:
- Cycles per instruction: Number of CPU cycles used per instruction,
named cpi.
- Problem state ratio: Ratio of instructions executed in problem state
compared to total number of instructions, named prbstate.
- Level one instruction and data cache misses per 100 instructions,
named l1mp.

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:
# ./perf stat -M cpi -- dd if=/dev/zero of=/dev/null bs=1M count=10K
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 1.30151 s, 8.2 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null .....':

6,779,778,802 CPU_CYCLES # 1.96 cpi
3,461,975,090 INSTRUCTIONS

1.306873021 seconds time elapsed

0.001034000 seconds user
1.305677000 seconds sys
#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../pmu-events/arch/s390/cf_z13/transaction.json | 15 +++++++++++++++
.../pmu-events/arch/s390/cf_z14/transaction.json | 15 +++++++++++++++
.../pmu-events/arch/s390/cf_z15/transaction.json | 15 +++++++++++++++
.../pmu-events/arch/s390/cf_z16/transaction.json | 15 +++++++++++++++
4 files changed, 60 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
index 1a0034f79f73..86bf83b4504e 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
@@ -3,5 +3,20 @@
"BriefDescription": "Transaction count",
"MetricName": "transaction",
"MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
+ },
+ {
+ "BriefDescription": "Cycles per Instruction",
+ "MetricName": "cpi",
+ "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Problem State Instruction Ratio",
+ "MetricName": "prbstate",
+ "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Level One Miss per 100 Instructions",
+ "MetricName": "l1mp",
+ "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
index 1a0034f79f73..86bf83b4504e 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
@@ -3,5 +3,20 @@
"BriefDescription": "Transaction count",
"MetricName": "transaction",
"MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
+ },
+ {
+ "BriefDescription": "Cycles per Instruction",
+ "MetricName": "cpi",
+ "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Problem State Instruction Ratio",
+ "MetricName": "prbstate",
+ "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Level One Miss per 100 Instructions",
+ "MetricName": "l1mp",
+ "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
index 1a0034f79f73..86bf83b4504e 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
@@ -3,5 +3,20 @@
"BriefDescription": "Transaction count",
"MetricName": "transaction",
"MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
+ },
+ {
+ "BriefDescription": "Cycles per Instruction",
+ "MetricName": "cpi",
+ "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Problem State Instruction Ratio",
+ "MetricName": "prbstate",
+ "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Level One Miss per 100 Instructions",
+ "MetricName": "l1mp",
+ "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
index 1a0034f79f73..86bf83b4504e 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
@@ -3,5 +3,20 @@
"BriefDescription": "Transaction count",
"MetricName": "transaction",
"MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
+ },
+ {
+ "BriefDescription": "Cycles per Instruction",
+ "MetricName": "cpi",
+ "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Problem State Instruction Ratio",
+ "MetricName": "prbstate",
+ "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Level One Miss per 100 Instructions",
+ "MetricName": "l1mp",
+ "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
}
]
--
2.39.1
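
As a sanity check on the sample output in this patch, the three expressions can be evaluated by hand. The sketch below is plain Python and not part of perf; the function names are hypothetical, and only the cpi counter values are taken from the output shown in the commit message.

```python
# Hand-evaluation of the three MetricExpr formulas added by this patch.
# Function names are illustrative only; perf evaluates the JSON expressions itself.

def cpi(cpu_cycles, instructions):
    """MetricExpr: CPU_CYCLES / INSTRUCTIONS"""
    return cpu_cycles / instructions

def prbstate(problem_state_instructions, instructions):
    """MetricExpr: (PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"""
    return problem_state_instructions / instructions * 100

def l1mp(l1i_dir_writes, l1d_dir_writes, instructions):
    """MetricExpr: ((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"""
    return (l1i_dir_writes + l1d_dir_writes) / instructions * 100

# Counter values copied from the 'perf stat -M cpi' sample output.
cpi_text = f"{cpi(6_779_778_802, 3_461_975_090):.2f}"
print(cpi_text, "cpi")  # agrees with the 1.96 reported by perf stat
```

Note that prbstate and l1mp are scaled by 100, so they read as percentages (or counts per 100 instructions), while cpi is a plain ratio.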



2023-03-13 08:06:30

by Thomas Richter

Subject: [PATCH 4/6] tools/perf/json: Add cache metrics for s390 z14

Add metrics for s390 z14
- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:
# ./perf stat -M l4rp -- find /
.... find output deleted

Performance counter stats for 'find /':

0 L1I_OFFDRAWER_L4_SOURCED_WRITES # 0.01 l4rp
84 L1D_OFFDRAWER_L4_SOURCED_WRITES
0 L1I_OFFDRAWER_L3_SOURCED_WRITES
71,535,353 L1I_DIR_WRITES
219 L1D_OFFDRAWER_L3_SOURCED_WRITES
16,436 L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
0 L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
46,343,940 L1D_DIR_WRITES

10.530805537 seconds time elapsed

0.774396000 seconds user
1.602714000 seconds sys

#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../arch/s390/cf_z14/transaction.json | 25 +++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
index 86bf83b4504e..cca237bdb7ba 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
@@ -18,5 +18,30 @@
"BriefDescription": "Level One Miss per 100 Instructions",
"MetricName": "l1mp",
"MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 2 cache",
+ "MetricName": "l2p",
+ "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
+ "MetricName": "l3p",
+ "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
+ "MetricName": "l4lp",
+ "MetricExpr": "((L1D_ONCLUSTER_L3_SOURCED_WRITES + L1D_ONCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L4_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L4_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONCHIP_L3_SOURCED_WRITES_RO + L1I_OFFCLUSTER_L3_SOURCED_WRITES + L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
+ "MetricName": "l4rp",
+ "MetricExpr": "((L1D_OFFDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_L4_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_L4_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from memory",
+ "MetricName": "memp",
+ "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
}
]
--
2.39.1
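
The l4rp value in the z14 sample output can be reproduced from the listed counters. This is a minimal sketch in plain Python, not perf code; the variable names are made up for illustration, and the counter values are copied from the 'find /' run in the commit message.

```python
# Counter values copied from the 'perf stat -M l4rp -- find /' sample output (z14).
counters = {
    "L1I_OFFDRAWER_L4_SOURCED_WRITES": 0,
    "L1D_OFFDRAWER_L4_SOURCED_WRITES": 84,
    "L1I_OFFDRAWER_L3_SOURCED_WRITES": 0,
    "L1D_OFFDRAWER_L3_SOURCED_WRITES": 219,
    "L1D_OFFDRAWER_L3_SOURCED_WRITES_IV": 16_436,
    "L1I_OFFDRAWER_L3_SOURCED_WRITES_IV": 0,
    "L1I_DIR_WRITES": 71_535_353,
    "L1D_DIR_WRITES": 46_343_940,
}

# Denominator shared by l2p, l3p, l4lp, l4rp and memp: all L1 directory writes.
l1_writes = counters["L1I_DIR_WRITES"] + counters["L1D_DIR_WRITES"]

# Numerator of l4rp: the six OFFDRAWER counters named in the MetricExpr.
off_drawer = sum(v for name, v in counters.items() if "OFFDRAWER" in name)

l4rp = off_drawer / l1_writes * 100
l4rp_text = f"{l4rp:.2f}"
print(l4rp_text, "l4rp")  # agrees with the 0.01 reported by perf stat
```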


2023-03-13 08:06:41

by Thomas Richter

Subject: [PATCH 5/6] tools/perf/json: Add cache metrics for s390 z13

Add metrics for s390 z13
- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

# ./perf stat -M l4rp -- find /
...find output deleted

Performance counter stats for 'find /':

2 L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES # 0.02 l4rp
252 L1D_ONDRAWER_L4_SOURCED_WRITES
3,465 L1D_ONDRAWER_L3_SOURCED_WRITES_IV
80 L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES
761 L1D_ONDRAWER_L3_SOURCED_WRITES
0 L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES
131,817,067 L1I_DIR_WRITES
1 L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES
447 L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES
22 L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES
7 L1I_ONDRAWER_L4_SOURCED_WRITES
0 L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES
1,071 L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES
3 L1I_ONDRAWER_L3_SOURCED_WRITES
13,352 L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
15,252 L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV
0 L1I_ONDRAWER_L3_SOURCED_WRITES_IV
0 L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
57,431,083 L1D_DIR_WRITES
0 L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV

15.386502874 seconds time elapsed

0.647348000 seconds user
3.537041000 seconds sys

#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../arch/s390/cf_z13/transaction.json | 25 +++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
index 86bf83b4504e..71e2c7fa734c 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
@@ -18,5 +18,30 @@
"BriefDescription": "Level One Miss per 100 Instructions",
"MetricName": "l1mp",
"MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 2 cache",
+ "MetricName": "l2p",
+ "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
+ "MetricName": "l3p",
+ "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
+ "MetricName": "l4lp",
+ "MetricExpr": "((L1D_ONNODE_L4_SOURCED_WRITES + L1D_ONNODE_L3_SOURCED_WRITES_IV + L1D_ONNODE_L3_SOURCED_WRITES + L1I_ONNODE_L4_SOURCED_WRITES + L1I_ONNODE_L3_SOURCED_WRITES_IV + L1I_ONNODE_L3_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
+ "MetricName": "l4rp",
+ "MetricExpr": "((L1D_ONDRAWER_L4_SOURCED_WRITES + L1D_ONDRAWER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES + L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES + L1I_ONDRAWER_L4_SOURCED_WRITES + L1I_ONDRAWER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES + L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from memory",
+ "MetricName": "memp",
+ "MetricExpr": "((L1D_ONNODE_MEM_SOURCED_WRITES + L1D_ONDRAWER_MEM_SOURCED_WRITES + L1D_OFFDRAWER_MEM_SOURCED_WRITES + L1D_ONCHIP_MEM_SOURCED_WRITES + L1I_ONNODE_MEM_SOURCED_WRITES + L1I_ONDRAWER_MEM_SOURCED_WRITES + L1I_OFFDRAWER_MEM_SOURCED_WRITES + L1I_ONCHIP_MEM_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
}
]
--
2.39.1
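
With 18 counters in the z13 l4rp numerator, checking the sample output by hand is error-prone. The sketch below evaluates the MetricExpr string directly against the counter values from the commit message. It is an assumption-laden shortcut, not how perf parses expressions: the hypothetical evaluate() helper uses Python's eval() with builtins disabled, which only works because the expressions in this patch consist of identifiers, + - * / and parentheses.

```python
# Sketch: evaluate a MetricExpr string against a dict of counter values.
# NOT perf's expression parser; evaluate() is a hypothetical helper.

def evaluate(metric_expr, counters):
    # Builtins are disabled so that only counter names can be resolved.
    return eval(metric_expr, {"__builtins__": {}}, dict(counters))

# MetricExpr of "l4rp" copied from the cf_z13 hunk of this patch.
L4RP_Z13 = (
    "((L1D_ONDRAWER_L4_SOURCED_WRITES + L1D_ONDRAWER_L3_SOURCED_WRITES_IV"
    " + L1D_ONDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES"
    " + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES"
    " + L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV"
    " + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES + L1I_ONDRAWER_L4_SOURCED_WRITES"
    " + L1I_ONDRAWER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L3_SOURCED_WRITES"
    " + L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV"
    " + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES + L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES"
    " + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES)"
    " / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
)

# Counter values copied from the 'perf stat -M l4rp -- find /' sample output (z13).
counters = {
    "L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES": 2,
    "L1D_ONDRAWER_L4_SOURCED_WRITES": 252,
    "L1D_ONDRAWER_L3_SOURCED_WRITES_IV": 3_465,
    "L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES": 80,
    "L1D_ONDRAWER_L3_SOURCED_WRITES": 761,
    "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES": 0,
    "L1I_DIR_WRITES": 131_817_067,
    "L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES": 1,
    "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES": 447,
    "L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES": 22,
    "L1I_ONDRAWER_L4_SOURCED_WRITES": 7,
    "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES": 0,
    "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES": 1_071,
    "L1I_ONDRAWER_L3_SOURCED_WRITES": 3,
    "L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV": 13_352,
    "L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV": 15_252,
    "L1I_ONDRAWER_L3_SOURCED_WRITES_IV": 0,
    "L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV": 0,
    "L1D_DIR_WRITES": 57_431_083,
    "L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV": 0,
}

l4rp_text = f"{evaluate(L4RP_Z13, counters):.2f}"
print(l4rp_text, "l4rp")  # agrees with the 0.02 reported by perf stat
```

A counter missing from the dict raises NameError, which makes a mistyped event name easy to spot.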


2023-03-13 08:30:09

by Thomas Richter

Subject: [PATCH 6/6] tools/perf/json: Add metric for tlb and cache s390

Add metrics for TLB and cache statistics:
- finite_cpi: Cycles per Instructions from Finite cache/memory
- est_cpi: Estimated Instruction Complexity CPI infinite Level 1
- scpl1m: Estimated Sourcing Cycles per Level 1 Miss
- tlb_percent: Estimated TLB CPU percentage of Total CPU
- tlb_miss: Estimated Cycles per TLB Miss
- pte_miss: Page Table Entry misses (z13 only)

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:
# ./perf stat -M tlb_miss -- dd if=/dev/zero of=/dev/null bs=1M count=10K
... dd output removed

Performance counter stats for
'dd if=/dev/zero of=/dev/null bs=1M count=10K':

667,726 DTLB2_MISSES # 440.96 tlb_miss
198 ITLB2_WRITES
795,170,260 L1C_TLB2_MISSES
9,478 ITLB2_MISSES
820 DTLB2_WRITES
1,197,126,869 L1D_PENALTY_CYCLES
2,457,447 L1I_PENALTY_CYCLES

1.249342187 seconds time elapsed

0.001030000 seconds user
1.248105000 seconds sys

#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../arch/s390/cf_z13/transaction.json | 30 +++++++++++++++++++
.../arch/s390/cf_z14/transaction.json | 25 ++++++++++++++++
.../arch/s390/cf_z15/transaction.json | 25 ++++++++++++++++
.../arch/s390/cf_z16/transaction.json | 25 ++++++++++++++++
4 files changed, 105 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
index 71e2c7fa734c..b941a7212a4d 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
@@ -43,5 +43,35 @@
"BriefDescription": "Percentage sourced from memory",
"MetricName": "memp",
"MetricExpr": "((L1D_ONNODE_MEM_SOURCED_WRITES + L1D_ONDRAWER_MEM_SOURCED_WRITES + L1D_OFFDRAWER_MEM_SOURCED_WRITES + L1D_ONCHIP_MEM_SOURCED_WRITES + L1I_ONNODE_MEM_SOURCED_WRITES + L1I_ONDRAWER_MEM_SOURCED_WRITES + L1I_OFFDRAWER_MEM_SOURCED_WRITES + L1I_ONCHIP_MEM_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Cycles per Instructions from Finite cache/memory",
+ "MetricName": "finite_cpi",
+ "MetricExpr": "L1C_TLB1_MISSES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
+ "MetricName": "est_cpi",
+ "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB1_MISSES / INSTRUCTIONS)"
+ },
+ {
+ "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
+ "MetricName": "scpl1m",
+ "MetricExpr": "L1C_TLB1_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
+ },
+ {
+ "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
+ "MetricName": "tlb_percent",
+ "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / CPU_CYCLES) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
+ },
+ {
+ "BriefDescription": "Estimated Cycles per TLB Miss",
+ "MetricName": "tlb_miss",
+ "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / (DTLB1_WRITES + ITLB1_WRITES)) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
+ },
+ {
+ "BriefDescription": "Page Table Entry misses",
+ "MetricName": "pte_miss",
+ "MetricExpr": "(TLB2_PTE_WRITES / (DTLB1_WRITES + ITLB1_WRITES)) * 100"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
index cca237bdb7ba..ce814ea93396 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
@@ -43,5 +43,30 @@
"BriefDescription": "Percentage sourced from memory",
"MetricName": "memp",
"MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Cycles per Instructions from Finite cache/memory",
+ "MetricName": "finite_cpi",
+ "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
+ "MetricName": "est_cpi",
+ "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
+ },
+ {
+ "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
+ "MetricName": "scpl1m",
+ "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
+ },
+ {
+ "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
+ "MetricName": "tlb_percent",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
+ },
+ {
+ "BriefDescription": "Estimated Cycles per TLB Miss",
+ "MetricName": "tlb_miss",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
index cca237bdb7ba..ce814ea93396 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
@@ -43,5 +43,30 @@
"BriefDescription": "Percentage sourced from memory",
"MetricName": "memp",
"MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Cycles per Instructions from Finite cache/memory",
+ "MetricName": "finite_cpi",
+ "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
+ "MetricName": "est_cpi",
+ "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
+ },
+ {
+ "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
+ "MetricName": "scpl1m",
+ "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
+ },
+ {
+ "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
+ "MetricName": "tlb_percent",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
+ },
+ {
+ "BriefDescription": "Estimated Cycles per TLB Miss",
+ "MetricName": "tlb_miss",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
}
]
diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
index dde0735a7d22..ec2ff78e2b5f 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
@@ -43,5 +43,30 @@
"BriefDescription": "Percentage sourced from memory",
"MetricName": "memp",
"MetricExpr": "((DCW_ON_CHIP_MEMORY + DCW_ON_MODULE_MEMORY + DCW_ON_DRAWER_MEMORY + DCW_OFF_DRAWER_MEMORY + ICW_ON_CHIP_MEMORY + ICW_ON_MODULE_MEMORY + ICW_ON_DRAWER_MEMORY + ICW_OFF_DRAWER_MEMORY) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Cycles per Instructions from Finite cache/memory",
+ "MetricName": "finite_cpi",
+ "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
+ },
+ {
+ "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
+ "MetricName": "est_cpi",
+ "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
+ },
+ {
+ "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
+ "MetricName": "scpl1m",
+ "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
+ },
+ {
+ "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
+ "MetricName": "tlb_percent",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
+ },
+ {
+ "BriefDescription": "Estimated Cycles per TLB Miss",
+ "MetricName": "tlb_miss",
+ "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
}
]
--
2.39.1
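
The tlb_miss figure in the sample output is the product of two ratios, which is easy to misread in the one-line MetricExpr. The following sketch recomputes it in plain Python from the counters in the commit message, using the TLB2 form of the expression (the z14, z15 and z16 hunks). Variable names are illustrative only.

```python
# Counter values copied from the 'perf stat -M tlb_miss' sample output.
dtlb2_misses = 667_726
itlb2_misses = 9_478
dtlb2_writes = 820
itlb2_writes = 198
l1c_tlb2_misses = 795_170_260
l1d_penalty_cycles = 1_197_126_869
l1i_penalty_cycles = 2_457_447

# First factor: (DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)
first_factor = (dtlb2_misses + itlb2_misses) / (dtlb2_writes + itlb2_writes)

# Second factor: L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)
second_factor = l1c_tlb2_misses / (l1i_penalty_cycles + l1d_penalty_cycles)

tlb_miss = first_factor * second_factor
tlb_miss_text = f"{tlb_miss:.2f}"
print(tlb_miss_text, "tlb_miss")  # agrees with the 440.96 reported by perf stat
```

The z13 variant has the same shape with the TLB1 counters substituted for the TLB2 ones.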


2023-03-13 08:30:52

by Thomas Richter

Subject: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

Add metrics for s390 z16
- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:
# ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
.... dd output deleted

Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':

0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
431,866 L1I_DIR_WRITES
2,395 IDCW_OFF_DRAWER_IV
0 ICW_OFF_DRAWER
0 IDCW_OFF_DRAWER_DRAWER_HIT
1,437 DCW_OFF_DRAWER
425,960,793 L1D_DIR_WRITES

12.165030699 seconds time elapsed

0.001037000 seconds user
12.162140000 seconds sys

#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../arch/s390/cf_z16/transaction.json | 25 +++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
index 86bf83b4504e..dde0735a7d22 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
@@ -18,5 +18,30 @@
"BriefDescription": "Level One Miss per 100 Instructions",
"MetricName": "l1mp",
"MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 2 cache",
+ "MetricName": "l2p",
+ "MetricExpr": "((DCW_REQ + DCW_REQ_IV + ICW_REQ + ICW_REQ_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
+ "MetricName": "l3p",
+ "MetricExpr": "((DCW_REQ_CHIP_HIT + DCW_ON_CHIP + DCW_ON_CHIP_IV + DCW_ON_CHIP_CHIP_HIT + ICW_REQ_CHIP_HIT + ICW_ON_CHIP + ICW_ON_CHIP_IV + ICW_ON_CHIP_CHIP_HIT) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
+ "MetricName": "l4lp",
+ "MetricExpr": "((DCW_REQ_DRAWER_HIT + DCW_ON_CHIP_DRAWER_HIT + DCW_ON_MODULE + DCW_ON_DRAWER + IDCW_ON_MODULE_IV + IDCW_ON_MODULE_CHIP_HIT + IDCW_ON_MODULE_DRAWER_HIT + IDCW_ON_DRAWER_IV + IDCW_ON_DRAWER_CHIP_HIT + IDCW_ON_DRAWER_DRAWER_HIT + ICW_REQ_DRAWER_HIT + ICW_ON_CHIP_DRAWER_HIT + ICW_ON_MODULE + ICW_ON_DRAWER) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
+ "MetricName": "l4rp",
+ "MetricExpr": "((DCW_OFF_DRAWER + IDCW_OFF_DRAWER_IV + IDCW_OFF_DRAWER_CHIP_HIT + IDCW_OFF_DRAWER_DRAWER_HIT + ICW_OFF_DRAWER) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from memory",
+ "MetricName": "memp",
+ "MetricExpr": "((DCW_ON_CHIP_MEMORY + DCW_ON_MODULE_MEMORY + DCW_ON_DRAWER_MEMORY + DCW_OFF_DRAWER_MEMORY + ICW_ON_CHIP_MEMORY + ICW_ON_MODULE_MEMORY + ICW_ON_DRAWER_MEMORY + ICW_OFF_DRAWER_MEMORY) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
}
]
--
2.39.1
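
The 0.00 shown for l4rp in the z16 sample output is a rounding effect, not a zero numerator: two of the five counters are non-zero. A small sketch in plain Python (not perf code, names are illustrative) makes this visible with the counter values from the commit message.

```python
# Counter values copied from the 'perf stat -M l4rp -- dd ...' sample output (z16).
numerator_counters = {
    "DCW_OFF_DRAWER": 1_437,
    "IDCW_OFF_DRAWER_IV": 2_395,
    "IDCW_OFF_DRAWER_CHIP_HIT": 0,
    "IDCW_OFF_DRAWER_DRAWER_HIT": 0,
    "ICW_OFF_DRAWER": 0,
}
l1i_dir_writes = 431_866
l1d_dir_writes = 425_960_793

# MetricExpr: (sum of off-drawer counters / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100
l4rp = sum(numerator_counters.values()) / (l1i_dir_writes + l1d_dir_writes) * 100

l4rp_text = f"{l4rp:.2f}"  # two decimals, as in the perf stat output above
print(l4rp_text, "l4rp")
print(f"{l4rp:.6f} l4rp (more digits)")  # small but non-zero
```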


2023-03-13 08:31:31

by Thomas Richter

Subject: [PATCH 3/6] tools/perf/json: Add cache metrics for s390 z15

Add metrics for s390 z15
- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:
# ./perf stat -M l4rp -- find /
.... find output deleted

Performance counter stats for 'find /':

5 L1I_OFFDRAWER_L4_SOURCED_WRITES # 0.01 l4rp
187 L1D_OFFDRAWER_L4_SOURCED_WRITES
0 L1I_OFFDRAWER_L3_SOURCED_WRITES
231,333,165 L1I_DIR_WRITES
3,303 L1D_OFFDRAWER_L3_SOURCED_WRITES
47,461 L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
0 L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
126,706,244 L1D_DIR_WRITES

27.870355461 seconds time elapsed

0.521562000 seconds user
12.494503000 seconds sys
#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
.../arch/s390/cf_z15/transaction.json | 25 +++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
index 86bf83b4504e..cca237bdb7ba 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
@@ -18,5 +18,30 @@
"BriefDescription": "Level One Miss per 100 Instructions",
"MetricName": "l1mp",
"MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 2 cache",
+ "MetricName": "l2p",
+ "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
+ "MetricName": "l3p",
+ "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
+ "MetricName": "l4lp",
+ "MetricExpr": "((L1D_ONCLUSTER_L3_SOURCED_WRITES + L1D_ONCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L4_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L4_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONCHIP_L3_SOURCED_WRITES_RO + L1I_OFFCLUSTER_L3_SOURCED_WRITES + L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
+ "MetricName": "l4rp",
+ "MetricExpr": "((L1D_OFFDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_L4_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_L4_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
+ },
+ {
+ "BriefDescription": "Percentage sourced from memory",
+ "MetricName": "memp",
+ "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
}
]
--
2.39.1
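
When reviewing metric files like this one, it can help to list which counters each MetricExpr refers to, for example to compare them against the event names defined for the machine. The sketch below does this for two entries copied from the cf_z15 hunk. It is a standalone Python illustration with a hypothetical events_used() helper, and it assumes that counter names are upper-case identifiers, which holds for the expressions in this series.

```python
import json
import re

# Two entries copied from the cf_z15 hunk of this patch.
METRICS_JSON = """
[
  {
    "BriefDescription": "Percentage sourced from Level 2 cache",
    "MetricName": "l2p",
    "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
  },
  {
    "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
    "MetricName": "l3p",
    "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
  }
]
"""

def events_used(metric_expr):
    """Return the distinct counter names in an expression, in order of first use."""
    names = re.findall(r"[A-Z][A-Z0-9_]*", metric_expr)
    return list(dict.fromkeys(names))  # de-duplicate while keeping order

required = {
    metric["MetricName"]: events_used(metric["MetricExpr"])
    for metric in json.loads(METRICS_JSON)
}

for name, events in required.items():
    print(name, "uses", len(events), "counters:", ", ".join(events))
```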


2023-03-13 15:22:19

by Ian Rogers

Subject: Re: [PATCH 1/6] tools/perf/json: Add common metrics for s390

On Mon, Mar 13, 2023 at 1:06 AM Thomas Richter <[email protected]> wrote:
>
> Add 3 metrics for s390 machines:
> - Cycles per instruction: Number of CPU cycles used per instruction,
> named cpi.
> - Problem state ratio: Ratio of instructions executed in problem state
> compared to total number of instructions, named prbstate.
> - Level one instruction and data cache misses per 100 instructions,
> named l1mp.
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M cpi -- dd if=/dev/zero of=/dev/null bs=1M count=10K
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB, 10 GiB) copied, 1.30151 s, 8.2 GB/s
>
> Performance counter stats for 'dd if=/dev/zero of=/dev/null .....':
>
> 6,779,778,802 CPU_CYCLES # 1.96 cpi
> 3,461,975,090 INSTRUCTIONS
>
> 1.306873021 seconds time elapsed
>
> 0.001034000 seconds user
> 1.305677000 seconds sys
> #
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-by: Sumanth Korikkar <[email protected]>

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> ---
> .../pmu-events/arch/s390/cf_z13/transaction.json | 15 +++++++++++++++
> .../pmu-events/arch/s390/cf_z14/transaction.json | 15 +++++++++++++++
> .../pmu-events/arch/s390/cf_z15/transaction.json | 15 +++++++++++++++
> .../pmu-events/arch/s390/cf_z16/transaction.json | 15 +++++++++++++++
> 4 files changed, 60 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> index 1a0034f79f73..86bf83b4504e 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> @@ -3,5 +3,20 @@
> "BriefDescription": "Transaction count",
> "MetricName": "transaction",
> "MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
> + },
> + {
> + "BriefDescription": "Cycles per Instruction",
> + "MetricName": "cpi",
> + "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Problem State Instruction Ratio",
> + "MetricName": "prbstate",
> + "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Level One Miss per 100 Instructions",
> + "MetricName": "l1mp",
> + "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> index 1a0034f79f73..86bf83b4504e 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> @@ -3,5 +3,20 @@
> "BriefDescription": "Transaction count",
> "MetricName": "transaction",
> "MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
> + },
> + {
> + "BriefDescription": "Cycles per Instruction",
> + "MetricName": "cpi",
> + "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Problem State Instruction Ratio",
> + "MetricName": "prbstate",
> + "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Level One Miss per 100 Instructions",
> + "MetricName": "l1mp",
> + "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> index 1a0034f79f73..86bf83b4504e 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> @@ -3,5 +3,20 @@
> "BriefDescription": "Transaction count",
> "MetricName": "transaction",
> "MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
> + },
> + {
> + "BriefDescription": "Cycles per Instruction",
> + "MetricName": "cpi",
> + "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Problem State Instruction Ratio",
> + "MetricName": "prbstate",
> + "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Level One Miss per 100 Instructions",
> + "MetricName": "l1mp",
> + "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> index 1a0034f79f73..86bf83b4504e 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> @@ -3,5 +3,20 @@
> "BriefDescription": "Transaction count",
> "MetricName": "transaction",
> "MetricExpr": "TX_C_TEND + TX_NC_TEND + TX_NC_TABORT + TX_C_TABORT_SPECIAL + TX_C_TABORT_NO_SPECIAL"
> + },
> + {
> + "BriefDescription": "Cycles per Instruction",
> + "MetricName": "cpi",
> + "MetricExpr": "CPU_CYCLES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Problem State Instruction Ratio",
> + "MetricName": "prbstate",
> + "MetricExpr": "(PROBLEM_STATE_INSTRUCTIONS / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Level One Miss per 100 Instructions",
> + "MetricName": "l1mp",
> + "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> }
> ]
> --
> 2.39.1
>

2023-03-13 15:23:06

by Ian Rogers

Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
>
> Add metrics for s390 z16
> - Percentage sourced from Level 2 cache
> - Percentage sourced from Level 3 on same chip cache
> - Percentage sourced from Level 4 Local cache on same book
> - Percentage sourced from Level 4 Remote cache on different book
> - Percentage sourced from memory
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
> .... dd output deleted
>
> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
>
> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
> 431,866 L1I_DIR_WRITES
> 2,395 IDCW_OFF_DRAWER_IV
> 0 ICW_OFF_DRAWER
> 0 IDCW_OFF_DRAWER_DRAWER_HIT
> 1,437 DCW_OFF_DRAWER
> 425,960,793 L1D_DIR_WRITES
>
> 12.165030699 seconds time elapsed
>
> 0.001037000 seconds user
> 12.162140000 seconds sys
>
> #
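
As a quick sanity check (not part of the patch), the l4rp value can be recomputed from the counters quoted above with the z16 MetricExpr this patch adds. The variable names below are illustrative only. The sketch suggests that the displayed 0.00 is a rounding effect rather than a literal zero:

```python
# Counter values copied from the quoted perf stat output above.
counters = {
    "DCW_OFF_DRAWER": 1_437,
    "IDCW_OFF_DRAWER_IV": 2_395,
    "IDCW_OFF_DRAWER_CHIP_HIT": 0,
    "IDCW_OFF_DRAWER_DRAWER_HIT": 0,
    "ICW_OFF_DRAWER": 0,
    "L1I_DIR_WRITES": 431_866,
    "L1D_DIR_WRITES": 425_960_793,
}

# Numerator of the z16 l4rp MetricExpr: all off-drawer sourced writes.
off_drawer = (
    counters["DCW_OFF_DRAWER"]
    + counters["IDCW_OFF_DRAWER_IV"]
    + counters["IDCW_OFF_DRAWER_CHIP_HIT"]
    + counters["IDCW_OFF_DRAWER_DRAWER_HIT"]
    + counters["ICW_OFF_DRAWER"]
)

# Denominator: all level 1 directory writes (instruction + data).
l1_dir_writes = counters["L1I_DIR_WRITES"] + counters["L1D_DIR_WRITES"]

l4rp = off_drawer / l1_dir_writes * 100

# Two decimals as in the perf stat column, plus more digits for context.
print(f"{l4rp:.2f} l4rp (unrounded: {l4rp:.6f})")
```

With these inputs the percentage is on the order of 0.0009, so two decimal places show 0.00 even though 3,832 off-drawer writes were counted.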
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-By: Sumanth Korikkar <[email protected]>

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> ---
> .../arch/s390/cf_z16/transaction.json | 25 +++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> index 86bf83b4504e..dde0735a7d22 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> @@ -18,5 +18,30 @@
> "BriefDescription": "Level One Miss per 100 Instructions",
> "MetricName": "l1mp",
> "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 2 cache",
> + "MetricName": "l2p",
> + "MetricExpr": "((DCW_REQ + DCW_REQ_IV + ICW_REQ + ICW_REQ_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
> + "MetricName": "l3p",
> + "MetricExpr": "((DCW_REQ_CHIP_HIT + DCW_ON_CHIP + DCW_ON_CHIP_IV + DCW_ON_CHIP_CHIP_HIT + ICW_REQ_CHIP_HIT + ICW_ON_CHIP + ICW_ON_CHIP_IV + ICW_ON_CHIP_CHIP_HIT) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
> + "MetricName": "l4lp",
> + "MetricExpr": "((DCW_REQ_DRAWER_HIT + DCW_ON_CHIP_DRAWER_HIT + DCW_ON_MODULE + DCW_ON_DRAWER + IDCW_ON_MODULE_IV + IDCW_ON_MODULE_CHIP_HIT + IDCW_ON_MODULE_DRAWER_HIT + IDCW_ON_DRAWER_IV + IDCW_ON_DRAWER_CHIP_HIT + IDCW_ON_DRAWER_DRAWER_HIT + ICW_REQ_DRAWER_HIT + ICW_ON_CHIP_DRAWER_HIT + ICW_ON_MODULE + ICW_ON_DRAWER) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
> + "MetricName": "l4rp",
> + "MetricExpr": "((DCW_OFF_DRAWER + IDCW_OFF_DRAWER_IV + IDCW_OFF_DRAWER_CHIP_HIT + IDCW_OFF_DRAWER_DRAWER_HIT + ICW_OFF_DRAWER) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from memory",
> + "MetricName": "memp",
> + "MetricExpr": "((DCW_ON_CHIP_MEMORY + DCW_ON_MODULE_MEMORY + DCW_ON_DRAWER_MEMORY + DCW_OFF_DRAWER_MEMORY + ICW_ON_CHIP_MEMORY + ICW_ON_MODULE_MEMORY + ICW_ON_DRAWER_MEMORY + ICW_OFF_DRAWER_MEMORY) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> }
> ]
> --
> 2.39.1
>

2023-03-13 15:24:26

by Ian Rogers

Subject: Re: [PATCH 3/6] tools/perf/json: Add cache metrics for s390 z15

On Mon, Mar 13, 2023 at 1:31 AM Thomas Richter <[email protected]> wrote:
>
> Add metrics for s390 z15
> - Percentage sourced from Level 2 cache
> - Percentage sourced from Level 3 on same chip cache
> - Percentage sourced from Level 4 Local cache on same book
> - Percentage sourced from Level 4 Remote cache on different book
> - Percentage sourced from memory
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M l4rp -- find /
> .... find output deleted
>
> Performance counter stats for 'find /':
>
> 5 L1I_OFFDRAWER_L4_SOURCED_WRITES # 0.01 l4rp
> 187 L1D_OFFDRAWER_L4_SOURCED_WRITES
> 0 L1I_OFFDRAWER_L3_SOURCED_WRITES
> 231,333,165 L1I_DIR_WRITES
> 3,303 L1D_OFFDRAWER_L3_SOURCED_WRITES
> 47,461 L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
> 0 L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
> 126,706,244 L1D_DIR_WRITES
>
> 27.870355461 seconds time elapsed
>
> 0.521562000 seconds user
> 12.494503000 seconds sys
> #
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-By: Sumanth Korikkar <[email protected]>
> ---
> .../arch/s390/cf_z15/transaction.json | 25 +++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> index 86bf83b4504e..cca237bdb7ba 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> @@ -18,5 +18,30 @@
> "BriefDescription": "Level One Miss per 100 Instructions",
> "MetricName": "l1mp",
> "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 2 cache",
> + "MetricName": "l2p",
> + "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
> + "MetricName": "l3p",
> + "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
> + "MetricName": "l4lp",
> + "MetricExpr": "((L1D_ONCLUSTER_L3_SOURCED_WRITES + L1D_ONCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L4_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L4_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONCHIP_L3_SOURCED_WRITES_RO + L1I_OFFCLUSTER_L3_SOURCED_WRITES + L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"

It is more typical for percentages to change the ScaleUnit to "100%"
and not to do the "* 100". Otherwise these look good.
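
To illustrate the two options, here is a minimal sketch. It is a toy model, not perf's expression parser: `evaluate` is a hypothetical helper, and it assumes that a ScaleUnit of "100%" means "multiply the expression result by 100 and print a % suffix". The counter values are the ones from the quoted 'find /' run, applied to the l4rp formula of this patch:

```python
import re

# Counter values copied from the quoted 'find /' run for this patch.
COUNTERS = {
    "L1D_OFFDRAWER_L3_SOURCED_WRITES": 3_303,
    "L1D_OFFDRAWER_L3_SOURCED_WRITES_IV": 47_461,
    "L1D_OFFDRAWER_L4_SOURCED_WRITES": 187,
    "L1I_OFFDRAWER_L3_SOURCED_WRITES": 0,
    "L1I_OFFDRAWER_L3_SOURCED_WRITES_IV": 0,
    "L1I_OFFDRAWER_L4_SOURCED_WRITES": 5,
    "L1I_DIR_WRITES": 231_333_165,
    "L1D_DIR_WRITES": 126_706_244,
}

RATIO = (
    "(L1D_OFFDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_L3_SOURCED_WRITES_IV"
    " + L1D_OFFDRAWER_L4_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES"
    " + L1I_OFFDRAWER_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_L4_SOURCED_WRITES)"
    " / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
)

# Form used in the patch: the expression itself multiplies by 100.
as_posted = {"MetricName": "l4rp", "MetricExpr": f"({RATIO}) * 100"}
# Suggested form: plain ratio, with the scaling moved into ScaleUnit.
with_scale_unit = {"MetricName": "l4rp", "MetricExpr": RATIO, "ScaleUnit": "100%"}


def evaluate(metric, counters):
    """Toy evaluator (hypothetical helper, not perf's expression parser)."""
    value = eval(metric["MetricExpr"], {"__builtins__": {}}, dict(counters))
    # Assumption: ScaleUnit has the shape "<number><unit>", e.g. "100%".
    scale, unit = re.fullmatch(
        r"(\d+(?:\.\d+)?)(.*)", metric.get("ScaleUnit", "1")
    ).groups()
    return value * float(scale), unit


posted_value, posted_unit = evaluate(as_posted, COUNTERS)
scaled_value, scaled_unit = evaluate(with_scale_unit, COUNTERS)
print(f"{posted_value:.2f}{posted_unit} l4rp")
print(f"{scaled_value:.2f}{scaled_unit} l4rp")
```

Under that assumption both forms yield the same number (about 0.014, shown as 0.01), and the only visible difference is the unit suffix.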

Thanks,
Ian

> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
> + "MetricName": "l4rp",
> + "MetricExpr": "((L1D_OFFDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_L4_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_L4_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from memory",
> + "MetricName": "memp",
> + "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> }
> ]
> --
> 2.39.1
>

2023-03-13 15:25:35

by Ian Rogers

Subject: Re: [PATCH 4/6] tools/perf/json: Add cache metrics for s390 z14

On Mon, Mar 13, 2023 at 1:06 AM Thomas Richter <[email protected]> wrote:
>
> Add metrics for s390 z14
> - Percentage sourced from Level 2 cache
> - Percentage sourced from Level 3 on same chip cache
> - Percentage sourced from Level 4 Local cache on same book
> - Percentage sourced from Level 4 Remote cache on different book
> - Percentage sourced from memory
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M l4rp -- find /
> .... find output deleted
>
> Performance counter stats for 'find /':
>
> 0 L1I_OFFDRAWER_L4_SOURCED_WRITES # 0.01 l4rp
> 84 L1D_OFFDRAWER_L4_SOURCED_WRITES
> 0 L1I_OFFDRAWER_L3_SOURCED_WRITES
> 71,535,353 L1I_DIR_WRITES
> 219 L1D_OFFDRAWER_L3_SOURCED_WRITES
> 16,436 L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
> 0 L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
> 46,343,940 L1D_DIR_WRITES
>
> 10.530805537 seconds time elapsed
>
> 0.774396000 seconds user
> 1.602714000 seconds sys
>
> #
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-By: Sumanth Korikkar <[email protected]>
> ---
> .../arch/s390/cf_z14/transaction.json | 25 +++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> index 86bf83b4504e..cca237bdb7ba 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> @@ -18,5 +18,30 @@
> "BriefDescription": "Level One Miss per 100 Instructions",
> "MetricName": "l1mp",
> "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 2 cache",
> + "MetricName": "l2p",
> + "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"

Same comment as patch #3 wrt ScaleUnit of "100%" rather than "* 100".
I can see from the metric above that this way is consistent.

Thanks,
Ian

> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
> + "MetricName": "l3p",
> + "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
> + "MetricName": "l4lp",
> + "MetricExpr": "((L1D_ONCLUSTER_L3_SOURCED_WRITES + L1D_ONCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L4_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES + L1I_ONCLUSTER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L4_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES + L1D_OFFCLUSTER_L3_SOURCED_WRITES_IV + L1D_ONCHIP_L3_SOURCED_WRITES_RO + L1I_OFFCLUSTER_L3_SOURCED_WRITES + L1I_OFFCLUSTER_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
> + "MetricName": "l4rp",
> + "MetricExpr": "((L1D_OFFDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_L4_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_L4_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from memory",
> + "MetricName": "memp",
> + "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> }
> ]
> --
> 2.39.1
>

2023-03-13 15:28:40

by Ian Rogers

Subject: Re: [PATCH 5/6] tools/perf/json: Add cache metrics for s390 z13

On Mon, Mar 13, 2023 at 1:06 AM Thomas Richter <[email protected]> wrote:
>
> Add metrics for s390 z13
> - Percentage sourced from Level 2 cache
> - Percentage sourced from Level 3 on same chip cache
> - Percentage sourced from Level 4 Local cache on same book
> - Percentage sourced from Level 4 Remote cache on different book
> - Percentage sourced from memory
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
>
> # ./perf stat -M l4rp -- find /
> ...find output deleted
>
> Performance counter stats for 'find /':
>
> 2 L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES # 0.02 l4rp
> 252 L1D_ONDRAWER_L4_SOURCED_WRITES
> 3,465 L1D_ONDRAWER_L3_SOURCED_WRITES_IV
> 80 L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES
> 761 L1D_ONDRAWER_L3_SOURCED_WRITES
> 0 L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES
> 131,817,067 L1I_DIR_WRITES
> 1 L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES
> 447 L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES
> 22 L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES
> 7 L1I_ONDRAWER_L4_SOURCED_WRITES
> 0 L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES
> 1,071 L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES
> 3 L1I_ONDRAWER_L3_SOURCED_WRITES
> 13,352 L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
> 15,252 L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV
> 0 L1I_ONDRAWER_L3_SOURCED_WRITES_IV
> 0 L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
> 57,431,083 L1D_DIR_WRITES
> 0 L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV
>
> 15.386502874 seconds time elapsed
>
> 0.647348000 seconds user
> 3.537041000 seconds sys
>
> #
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-By: Sumanth Korikkar <[email protected]>
> ---
> .../arch/s390/cf_z13/transaction.json | 25 +++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> index 86bf83b4504e..71e2c7fa734c 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> @@ -18,5 +18,30 @@
> "BriefDescription": "Level One Miss per 100 Instructions",
> "MetricName": "l1mp",
> "MetricExpr": "((L1I_DIR_WRITES + L1D_DIR_WRITES) / INSTRUCTIONS) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 2 cache",
> + "MetricName": "l2p",
> + "MetricExpr": "((L1D_L2D_SOURCED_WRITES + L1I_L2I_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"

Looks good but the same comment about using ScaleUnit.

Thanks,
Ian

> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 3 on same chip cache",
> + "MetricName": "l3p",
> + "MetricExpr": "((L1D_ONCHIP_L3_SOURCED_WRITES + L1D_ONCHIP_L3_SOURCED_WRITES_IV + L1I_ONCHIP_L3_SOURCED_WRITES + L1I_ONCHIP_L3_SOURCED_WRITES_IV) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Local cache on same book",
> + "MetricName": "l4lp",
> + "MetricExpr": "((L1D_ONNODE_L4_SOURCED_WRITES + L1D_ONNODE_L3_SOURCED_WRITES_IV + L1D_ONNODE_L3_SOURCED_WRITES + L1I_ONNODE_L4_SOURCED_WRITES + L1I_ONNODE_L3_SOURCED_WRITES_IV + L1I_ONNODE_L3_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from Level 4 Remote cache on different book",
> + "MetricName": "l4rp",
> + "MetricExpr": "((L1D_ONDRAWER_L4_SOURCED_WRITES + L1D_ONDRAWER_L3_SOURCED_WRITES_IV + L1D_ONDRAWER_L3_SOURCED_WRITES + L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES + L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV + L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES + L1I_ONDRAWER_L4_SOURCED_WRITES + L1I_ONDRAWER_L3_SOURCED_WRITES_IV + L1I_ONDRAWER_L3_SOURCED_WRITES + L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES + L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV + L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Percentage sourced from memory",
> + "MetricName": "memp",
> + "MetricExpr": "((L1D_ONNODE_MEM_SOURCED_WRITES + L1D_ONDRAWER_MEM_SOURCED_WRITES + L1D_OFFDRAWER_MEM_SOURCED_WRITES + L1D_ONCHIP_MEM_SOURCED_WRITES + L1I_ONNODE_MEM_SOURCED_WRITES + L1I_ONDRAWER_MEM_SOURCED_WRITES + L1I_OFFDRAWER_MEM_SOURCED_WRITES + L1I_ONCHIP_MEM_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> }
> ]
> --
> 2.39.1
>

2023-03-13 15:32:44

by Ian Rogers

Subject: Re: [PATCH 6/6] tools/perf/json: Add metric for tlb and cache s390

On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
>
> Add metrics for tlb and cache statistics:
> - finite_cpi: Cycles per Instructions from Finite cache/memory
> - est_cpi: Estimated Instruction Complexity CPI infinite Level 1
> - scpl1m: Estimated Sourcing Cycles per Level 1 Miss
> - tlb_percent: Estimated TLB CPU percentage of Total CPU
> - tlb_miss: Estimated Cycles per TLB Miss
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M tlb_miss -- dd if=/dev/zero of=/dev/null bs=1M count=10K
> ... dd output removed
>
> Performance counter stats for
> 'dd if=/dev/zero of=/dev/null bs=1M count=10K':
>
> 667,726 DTLB2_MISSES # 440.96 tlb_miss
> 198 ITLB2_WRITES
> 795,170,260 L1C_TLB2_MISSES
> 9,478 ITLB2_MISSES
> 820 DTLB2_WRITES
> 1,197,126,869 L1D_PENALTY_CYCLES
> 2,457,447 L1I_PENALTY_CYCLES
>
> 1.249342187 seconds time elapsed
>
> 0.001030000 seconds user
> 1.248105000 seconds sys
>
> #
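
As a quick sanity check (not part of the patch), the tlb_miss value can be recomputed from the counters quoted above. The sketch assumes the TLB2 variant of the MetricExpr, as added by this patch for z14, z15 and z16, because the output lists TLB2 counters. The variable names are illustrative only:

```python
# Counter values copied from the quoted perf stat output above.
counters = {
    "DTLB2_MISSES": 667_726,
    "ITLB2_MISSES": 9_478,
    "DTLB2_WRITES": 820,
    "ITLB2_WRITES": 198,
    "L1C_TLB2_MISSES": 795_170_260,
    "L1D_PENALTY_CYCLES": 1_197_126_869,
    "L1I_PENALTY_CYCLES": 2_457_447,
}

# First factor: TLB miss counter sum per TLB write.
misses_per_write = (
    counters["DTLB2_MISSES"] + counters["ITLB2_MISSES"]
) / (counters["DTLB2_WRITES"] + counters["ITLB2_WRITES"])

# Second factor: L1C_TLB2_MISSES relative to the total L1 penalty cycles.
penalty_share = counters["L1C_TLB2_MISSES"] / (
    counters["L1I_PENALTY_CYCLES"] + counters["L1D_PENALTY_CYCLES"]
)

tlb_miss = misses_per_write * penalty_share
print(f"{tlb_miss:.2f} tlb_miss")
```

The product comes out at roughly 440.96, in line with the value perf stat printed.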
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-By: Sumanth Korikkar <[email protected]>
> ---
> .../arch/s390/cf_z13/transaction.json | 30 +++++++++++++++++++
> .../arch/s390/cf_z14/transaction.json | 25 ++++++++++++++++
> .../arch/s390/cf_z15/transaction.json | 25 ++++++++++++++++
> .../arch/s390/cf_z16/transaction.json | 25 ++++++++++++++++
> 4 files changed, 105 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> index 71e2c7fa734c..b941a7212a4d 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> @@ -43,5 +43,35 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONNODE_MEM_SOURCED_WRITES + L1D_ONDRAWER_MEM_SOURCED_WRITES + L1D_OFFDRAWER_MEM_SOURCED_WRITES + L1D_ONCHIP_MEM_SOURCED_WRITES + L1I_ONNODE_MEM_SOURCED_WRITES + L1I_ONDRAWER_MEM_SOURCED_WRITES + L1I_OFFDRAWER_MEM_SOURCED_WRITES + L1I_ONCHIP_MEM_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB1_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB1_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB1_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / CPU_CYCLES) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"

Looks good again but perhaps the ScaleUnit change. If you'd prefer to
keep as-is for consistency I'm happy to add my Acked-by.

Thanks,
Ian

> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / (DTLB1_WRITES + ITLB1_WRITES)) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> + },
> + {
> + "BriefDescription": "Page Table Entry misses",
> + "MetricName": "pte_miss",
> + "MetricExpr": "(TLB2_PTE_WRITES / (DTLB1_WRITES + ITLB1_WRITES)) * 100"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> index cca237bdb7ba..ce814ea93396 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> index cca237bdb7ba..ce814ea93396 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> index dde0735a7d22..ec2ff78e2b5f 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((DCW_ON_CHIP_MEMORY + DCW_ON_MODULE_MEMORY + DCW_ON_DRAWER_MEMORY + DCW_OFF_DRAWER_MEMORY + ICW_ON_CHIP_MEMORY + ICW_ON_MODULE_MEMORY + ICW_ON_DRAWER_MEMORY + ICW_OFF_DRAWER_MEMORY) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> --
> 2.39.1
>

2023-03-13 18:35:53

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers wrote:
> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
> >
> > Add metrics for s390 z16
> > - Percentage sourced from Level 2 cache
> > - Percentage sourced from Level 3 on same chip cache
> > - Percentage sourced from Level 4 Local cache on same book
> > - Percentage sourced from Level 4 Remote cache on different book
> > - Percentage sourced from memory
> >
> > For details about the formulas see this documentation:
> > https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
> >
> > Output after:
> > # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
> > .... dd output deleted
> >
> > Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
> >
> > 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
> > 431,866 L1I_DIR_WRITES
> > 2,395 IDCW_OFF_DRAWER_IV
> > 0 ICW_OFF_DRAWER
> > 0 IDCW_OFF_DRAWER_DRAWER_HIT
> > 1,437 DCW_OFF_DRAWER
> > 425,960,793 L1D_DIR_WRITES
> >
> > 12.165030699 seconds time elapsed
> >
> > 0.001037000 seconds user
> > 12.162140000 seconds sys
> >
> > #
> >
> > Signed-off-by: Thomas Richter <[email protected]>
> > Acked-By: Sumanth Korikkar <[email protected]>
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied the first two patches, please address the review
suggestions for patches 3-6 and resubmit only those.

The patches will be in the public perf-tools-next branch later today.

- Arnaldo


2023-03-14 08:22:40

by Thomas Richter

Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
> On Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers wrote:
>> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
>>>
>>> Add metrics for s390 z16
>>> - Percentage sourced from Level 2 cache
>>> - Percentage sourced from Level 3 on same chip cache
>>> - Percentage sourced from Level 4 Local cache on same book
>>> - Percentage sourced from Level 4 Remote cache on different book
>>> - Percentage sourced from memory
>>>
>>> For details about the formulas see this documentation:
>>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>>>
>>> Output after:
>>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
>>> .... dd output deleted
>>>
>>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
>>>
>>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
>>> 431,866 L1I_DIR_WRITES
>>> 2,395 IDCW_OFF_DRAWER_IV
>>> 0 ICW_OFF_DRAWER
>>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
>>> 1,437 DCW_OFF_DRAWER
>>> 425,960,793 L1D_DIR_WRITES
>>>
>>> 12.165030699 seconds time elapsed
>>>
>>> 0.001037000 seconds user
>>> 12.162140000 seconds sys
>>>
>>> #
>>>
>>> Signed-off-by: Thomas Richter <[email protected]>
>>> Acked-By: Sumanth Korikkar <[email protected]>
>>
>> Acked-by: Ian Rogers <[email protected]>
>
> Thanks, applied the first two patches, please address the review
> suggestions for patches 3-6 and resubmit only those.
>
> The patches will be in the public perf-tools-next branch later today.
>
> - Arnaldo
>

I would really prefer the current implementation without using "ScaleUnit": "100%".
The reason is that these formulas were given to me by the s390 Performance team.
They want to use the exact same formulas on all platforms running on s390
which includes z/OS and z/VM. This way they are sure to get the same numbers.

Hope this background info helps.

Thanks a lot.
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Böblingen / Registration court: Amtsgericht Stuttgart, HRB 243294


2023-03-14 16:35:49

by Ian Rogers

Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Tue, Mar 14, 2023 at 1:20 AM Thomas Richter <[email protected]> wrote:
>
> On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers escreveu:
> >> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
> >>>
> >>> Add metrics for s390 z16
> >>> - Percentage sourced from Level 2 cache
> >>> - Percentage sourced from Level 3 on same chip cache
> >>> - Percentage sourced from Level 4 Local cache on same book
> >>> - Percentage sourced from Level 4 Remote cache on different book
> >>> - Percentage sourced from memory
> >>>
> >>> For details about the formulas see this documentation:
> >>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
> >>>
> >>> Output after:
> >>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
> >>> .... dd output deleted
> >>>
> >>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
> >>>
> >>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
> >>> 431,866 L1I_DIR_WRITES
> >>> 2,395 IDCW_OFF_DRAWER_IV
> >>> 0 ICW_OFF_DRAWER
> >>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
> >>> 1,437 DCW_OFF_DRAWER
> >>> 425,960,793 L1D_DIR_WRITES
> >>>
> >>> 12.165030699 seconds time elapsed
> >>>
> >>> 0.001037000 seconds user
> >>> 12.162140000 seconds sys
> >>>
> >>> #
> >>>
> >>> Signed-off-by: Thomas Richter <[email protected]>
> >>> Acked-By: Sumanth Korikkar <[email protected]>
> >>
> >> Acked-by: Ian Rogers <[email protected]>
> >
> > Thanks, applied the first two patches, please address the review
> > suggestions for patches 3-6 and resubmit only those.
> >
> > The patches will be in the public perf-tools-next branch later today.
> >
> > - Arnaldo
> >
>
> I would really prefer the current implementation without using "ScaleUnit": "100%"
> The reason is that these formulas are given to me by the s390 Performance team.
> They want to use the exact same formulas on all platforms running on s390
> which includes z/OS and z/VM. This way they are sure to get the same numbers.
>
> Hope this background info helps.

For the series:
Acked-by: Ian Rogers <[email protected]>

Using ScaleUnit won't change the result. A ScaleUnit of "100%" means
scale the result up by multiplying by 100 and then append the % after
the value. Another nit: metrics that place their units in the name,
like _percent, are usually a sign that the name can be better. Perhaps
we can follow up with some clean up.
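
To illustrate the point about ScaleUnit, here is a minimal Python sketch of
the semantics described above. It is not perf's implementation; the helper
name apply_scale_unit and the ratio 0.25 are made up for illustration. It
only shows that multiplying by 100 inside the expression and scaling via a
"100%" ScaleUnit print the same number, the latter with a % suffix.

```python
import re


def apply_scale_unit(value, scale_unit):
    """Split e.g. "100%" into factor 100.0 and suffix "%", then format value.

    Hypothetical helper: mimics the behaviour described in the mail,
    not the code perf actually uses.
    """
    match = re.fullmatch(
        r"\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*(.*)", scale_unit
    )
    if match is None:
        raise ValueError(f"no numeric prefix in {scale_unit!r}")
    factor = float(match.group(1))
    unit = match.group(2)
    return f"{value * factor:.2f}{unit}"


ratio = 0.25  # illustrative metric value before any scaling

# Variant 1: the expression yields a plain ratio, ScaleUnit does the scaling.
with_scale_unit = apply_scale_unit(ratio, "100%")

# Variant 2: the expression itself multiplies by 100, no ScaleUnit.
in_expression = f"{100 * ratio:.2f}"

print(with_scale_unit, in_expression)
```

Both variants carry the same numeric value; only the trailing unit differs.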

Thanks,
Ian

> Thanks a lot.
> --
> Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
> --
> Vorsitzender des Aufsichtsrats: Gregor Pillen
> Geschäftsführung: David Faller
> Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
>

2023-03-14 21:36:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Tue, Mar 14, 2023 at 09:34:46AM -0700, Ian Rogers wrote:
> On Tue, Mar 14, 2023 at 1:20 AM Thomas Richter <[email protected]> wrote:
> >
> > On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers escreveu:
> > >> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
> > >>>
> > >>> Add metrics for s390 z16
> > >>> - Percentage sourced from Level 2 cache
> > >>> - Percentage sourced from Level 3 on same chip cache
> > >>> - Percentage sourced from Level 4 Local cache on same book
> > >>> - Percentage sourced from Level 4 Remote cache on different book
> > >>> - Percentage sourced from memory
> > >>>
> > >>> For details about the formulas see this documentation:
> > >>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
> > >>>
> > >>> Output after:
> > >>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
> > >>> .... dd output deleted
> > >>>
> > >>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
> > >>>
> > >>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
> > >>> 431,866 L1I_DIR_WRITES
> > >>> 2,395 IDCW_OFF_DRAWER_IV
> > >>> 0 ICW_OFF_DRAWER
> > >>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
> > >>> 1,437 DCW_OFF_DRAWER
> > >>> 425,960,793 L1D_DIR_WRITES
> > >>>
> > >>> 12.165030699 seconds time elapsed
> > >>>
> > >>> 0.001037000 seconds user
> > >>> 12.162140000 seconds sys
> > >>>
> > >>> #
> > >>>
> > >>> Signed-off-by: Thomas Richter <[email protected]>
> > >>> Acked-By: Sumanth Korikkar <[email protected]>
> > >>
> > >> Acked-by: Ian Rogers <[email protected]>
> > >
> > > Thanks, applied the first two patches, please address the review
> > > suggestions for patches 3-6 and resubmit only those.
> > >
> > > The patches will be in the public perf-tools-next branch later today.
> > >
> > > - Arnaldo
> > >
> >
> > I would really prefer the current implementation without using "ScaleUnit": "100%"
> > The reason is that these formulas are given to me by the s390 Performance team.
> > They want to use the exact same formulas on all platforms running on s390
> > which includes z/OS and z/VM. This way they are sure to get the same numbers.
> >
> > Hope this background info helps.
>
> For the series:
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied.

- Arnaldo


> Using ScaleUnit won't change the result. A ScaleUnit of "100%" means
> scale the result up by multiplying by 100 and then apply the % after
> the value. Another nit is having metrics that place their units in the
> name, like _percent, is usually a sign the name can be better. Perhaps
> we can follow up with some clean up.
>
> Thanks,
> Ian
>
> > Thanks a lot.
> > --
> > Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
> > --
> > Vorsitzender des Aufsichtsrats: Gregor Pillen
> > Geschäftsführung: David Faller
> > Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
> >

--

- Arnaldo

2023-03-15 07:22:13

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On 3/14/23 17:34, Ian Rogers wrote:
> On Tue, Mar 14, 2023 at 1:20 AM Thomas Richter <[email protected]> wrote:
>>
>> On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers escreveu:
>>>> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
>>>>>
>>>>> Add metrics for s390 z16
>>>>> - Percentage sourced from Level 2 cache
>>>>> - Percentage sourced from Level 3 on same chip cache
>>>>> - Percentage sourced from Level 4 Local cache on same book
>>>>> - Percentage sourced from Level 4 Remote cache on different book
>>>>> - Percentage sourced from memory
>>>>>
>>>>> For details about the formulas see this documentation:
>>>>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>>>>>
>>>>> Output after:
>>>>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
>>>>> .... dd output deleted
>>>>>
>>>>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
>>>>>
>>>>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
>>>>> 431,866 L1I_DIR_WRITES
>>>>> 2,395 IDCW_OFF_DRAWER_IV
>>>>> 0 ICW_OFF_DRAWER
>>>>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
>>>>> 1,437 DCW_OFF_DRAWER
>>>>> 425,960,793 L1D_DIR_WRITES
>>>>>
>>>>> 12.165030699 seconds time elapsed
>>>>>
>>>>> 0.001037000 seconds user
>>>>> 12.162140000 seconds sys
>>>>>
>>>>> #
>>>>>
>>>>> Signed-off-by: Thomas Richter <[email protected]>
>>>>> Acked-By: Sumanth Korikkar <[email protected]>
>>>>
>>>> Acked-by: Ian Rogers <[email protected]>
>>>
>>> Thanks, applied the first two patches, please address the review
>>> suggestions for patches 3-6 and resubmit only those.
>>>
>>> The patches will be in the public perf-tools-next branch later today.
>>>
>>> - Arnaldo
>>>
>>
>> I would really prefer the current implementation without using "ScaleUnit": "100%"
>> The reason is that these formulas are given to me by the s390 Performance team.
>> They want to use the exact same formulas on all platforms running on s390
>> which includes z/OS and z/VM. This way they are sure to get the same numbers.
>>
>> Hope this background info helps.
>
> For the series:
> Acked-by: Ian Rogers <[email protected]>
>
> Using ScaleUnit won't change the result. A ScaleUnit of "100%" means
> scale the result up by multiplying by 100 and then apply the % after
> the value. Another nit is having metrics that place their units in the
> name, like _percent, is usually a sign the name can be better. Perhaps
> we can follow up with some clean up.
>
> Thanks,
> Ian

Thanks Ian,
I put the ScaleUnit thing on my todo list and will provide a clean up...

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


2023-03-22 21:00:23

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Tue, Mar 14, 2023 at 06:36:24PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 14, 2023 at 09:34:46AM -0700, Ian Rogers escreveu:
> > On Tue, Mar 14, 2023 at 1:20 AM Thomas Richter <[email protected]> wrote:
> > >
> > > On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
> > > > Em Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers escreveu:
> > > >> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
> > > >>>
> > > >>> Add metrics for s390 z16
> > > >>> - Percentage sourced from Level 2 cache
> > > >>> - Percentage sourced from Level 3 on same chip cache
> > > >>> - Percentage sourced from Level 4 Local cache on same book
> > > >>> - Percentage sourced from Level 4 Remote cache on different book
> > > >>> - Percentage sourced from memory
> > > >>>
> > > >>> For details about the formulas see this documentation:
> > > >>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
> > > >>>
> > > >>> Output after:
> > > >>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
> > > >>> .... dd output deleted
> > > >>>
> > > >>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
> > > >>>
> > > >>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
> > > >>> 431,866 L1I_DIR_WRITES
> > > >>> 2,395 IDCW_OFF_DRAWER_IV
> > > >>> 0 ICW_OFF_DRAWER
> > > >>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
> > > >>> 1,437 DCW_OFF_DRAWER
> > > >>> 425,960,793 L1D_DIR_WRITES
> > > >>>
> > > >>> 12.165030699 seconds time elapsed
> > > >>>
> > > >>> 0.001037000 seconds user
> > > >>> 12.162140000 seconds sys
> > > >>>
> > > >>> #
> > > >>>
> > > >>> Signed-off-by: Thomas Richter <[email protected]>
> > > >>> Acked-By: Sumanth Korikkar <[email protected]>
> > > >>
> > > >> Acked-by: Ian Rogers <[email protected]>
> > > >
> > > > Thanks, applied the first two patches, please address the review
> > > > suggestions for patches 3-6 and resubmit only those.
> > > >
> > > > The patches will be in the public perf-tools-next branch later today.
> > > >
> > > > - Arnaldo
> > > >
> > >
> > > I would really prefer the current implementation without using "ScaleUnit": "100%"
> > > The reason is that these formulas are given to me by the s390 Performance team.
> > > They want to use the exact same formulas on all platforms running on s390
> > > which includes z/OS and z/VM. This way they are sure to get the same numbers.
> > >
> > > Hope this background info helps.
> >
> > For the series:
> > Acked-by: Ian Rogers <[email protected]>
>
> Thanks, applied.
>
> - Arnaldo

While trying to cross build to s390 on:

ubuntu:18.04

using python3


CC /tmp/build/perf/tests/parse-events.o
Exception processing pmu-events/arch/s390/cf_z16/extended.json
Traceback (most recent call last):
  File "pmu-events/jevents.py", line 997, in <module>
    main()
  File "pmu-events/jevents.py", line 979, in main
    ftw(arch_path, [], preprocess_one_file)
  File "pmu-events/jevents.py", line 935, in ftw
    ftw(item.path, parents + [item.name], action)
  File "pmu-events/jevents.py", line 933, in ftw
    action(parents, item)
  File "pmu-events/jevents.py", line 514, in preprocess_one_file
    for event in read_json_events(item.path, topic):
  File "pmu-events/jevents.py", line 388, in read_json_events
    events = json.load(open(path), object_hook=JsonEvent)
  File "/usr/lib/python3.6/json/__init__.py", line 296, in load
    return loads(fp.read(),
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4271: ordinal not in range(128)
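
For reference, the failure mode can be reproduced in isolation. The traceback
shows the file being decoded with the ascii codec, which is what open(path)
without an encoding argument does when the locale's preferred encoding is
ASCII (presumably a C/POSIX locale in the build container; that part is an
assumption). Byte 0xe2 is the lead byte of a three-byte UTF-8 sequence such
as an en dash. The JSON fragment below is made up and is not taken from
extended.json.

```python
import json

# Hypothetical JSON fragment containing an en dash (U+2013), which UTF-8
# encodes as the three bytes e2 80 93.
raw = '[{"BriefDescription": "Level\u20131 cache"}]'.encode("utf-8")

# Decoding as ASCII fails on the first byte >= 0x80, as in the traceback.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = str(err)

# Decoding the same bytes as UTF-8 works and the JSON parses normally.
events = json.loads(raw.decode("utf-8"))

print(ascii_error)
print(events[0]["BriefDescription"])
```

Opening the file with an explicit encoding, e.g. open(path, encoding='utf-8'),
would make the read independent of the locale; this is only one possible
approach, not something stated in the report above.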

>
> > Using ScaleUnit won't change the result. A ScaleUnit of "100%" means
> > scale the result up by multiplying by 100 and then apply the % after
> > the value. Another nit is having metrics that place their units in the
> > name, like _percent, is usually a sign the name can be better. Perhaps
> > we can follow up with some clean up.
> >
> > Thanks,
> > Ian
> >
> > > Thanks a lot.
> > > --
> > > Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
> > > --
> > > Vorsitzender des Aufsichtsrats: Gregor Pillen
> > > Geschäftsführung: David Faller
> > > Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
> > >
>
> --
>
> - Arnaldo

--

- Arnaldo

2023-03-23 09:54:36

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On 3/22/23 21:59, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 14, 2023 at 06:36:24PM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Tue, Mar 14, 2023 at 09:34:46AM -0700, Ian Rogers escreveu:
>>> On Tue, Mar 14, 2023 at 1:20 AM Thomas Richter <[email protected]> wrote:
>>>>
>>>> On 3/13/23 19:33, Arnaldo Carvalho de Melo wrote:
>>>>> Em Mon, Mar 13, 2023 at 08:22:44AM -0700, Ian Rogers escreveu:
>>>>>> On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <[email protected]> wrote:
>>>>>>>
>>>>>>> Add metrics for s390 z16
>>>>>>> - Percentage sourced from Level 2 cache
>>>>>>> - Percentage sourced from Level 3 on same chip cache
>>>>>>> - Percentage sourced from Level 4 Local cache on same book
>>>>>>> - Percentage sourced from Level 4 Remote cache on different book
>>>>>>> - Percentage sourced from memory
>>>>>>>
>>>>>>> For details about the formulas see this documentation:
>>>>>>> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>>>>>>>
>>>>>>> Output after:
>>>>>>> # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
>>>>>>> .... dd output deleted
>>>>>>>
>>>>>>> Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':
>>>>>>>
>>>>>>> 0 IDCW_OFF_DRAWER_CHIP_HIT # 0.00 l4rp
>>>>>>> 431,866 L1I_DIR_WRITES
>>>>>>> 2,395 IDCW_OFF_DRAWER_IV
>>>>>>> 0 ICW_OFF_DRAWER
>>>>>>> 0 IDCW_OFF_DRAWER_DRAWER_HIT
>>>>>>> 1,437 DCW_OFF_DRAWER
>>>>>>> 425,960,793 L1D_DIR_WRITES
>>>>>>>
>>>>>>> 12.165030699 seconds time elapsed
>>>>>>>
>>>>>>> 0.001037000 seconds user
>>>>>>> 12.162140000 seconds sys
>>>>>>>
>>>>>>> #
>>>>>>>
>>>>>>> Signed-off-by: Thomas Richter <[email protected]>
>>>>>>> Acked-By: Sumanth Korikkar <[email protected]>
>>>>>>
>>>>>> Acked-by: Ian Rogers <[email protected]>
>>>>>
>>>>> Thanks, applied the first two patches, please address the review
>>>>> suggestions for patches 3-6 and resubmit only those.
>>>>>
>>>>> The patches will be in the public perf-tools-next branch later today.
>>>>>
>>>>> - Arnaldo
>>>>>
>>>>
>>>> I would really prefer the current implementation without using "ScaleUnit": "100%"
>>>> The reason is that these formulas are given to me by the s390 Performance team.
>>>> They want to use the exact same formulas on all platforms running on s390
>>>> which includes z/OS and z/VM. This way they are sure to get the same numbers.
>>>>
>>>> Hope this background info helps.
>>>
>>> For the series:
>>> Acked-by: Ian Rogers <[email protected]>
>>
>> Thanks, applied.
>>
>> - Arnaldo
>
> While trying to cross build to s390 on:
>
> ubuntu:18.04
>
> using python3
>
>
> CC /tmp/build/perf/tests/parse-events.o
> Exception processing pmu-events/arch/s390/cf_z16/extended.json
> Traceback (most recent call last):
> File "pmu-events/jevents.py", line 997, in <module>
> main()
> File "pmu-events/jevents.py", line 979, in main
> ftw(arch_path, [], preprocess_one_file)
> File "pmu-events/jevents.py", line 935, in ftw
> ftw(item.path, parents + [item.name], action)
> File "pmu-events/jevents.py", line 933, in ftw
> action(parents, item)
> File "pmu-events/jevents.py", line 514, in preprocess_one_file
> for event in read_json_events(item.path, topic):
> File "pmu-events/jevents.py", line 388, in read_json_events
> events = json.load(open(path), object_hook=JsonEvent)
> File "/usr/lib/python3.6/json/__init__.py", line 296, in load
> return loads(fp.read(),
> File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4271: ordinal not in range(128)
>
>

Hmmm, this is very strange. After reading this mail I installed Ubuntu 18.04
on my s390 system. The build works fine, no errors at all.


# pmu-events/jevents.py s390 all pmu-events/arch pmu-events/pmu-events.c
# ll pmu-events/pmu-events.c
-rw-r--r-- 1 root root 317284 Mar 23 10:46 pmu-events/pmu-events.c
#

The file has the correct contents and the build works fine too.
# make
....
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... libbfd: [ on ]
... libbfd-buildid: [ on ]
... libcap: [ OFF ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ OFF ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ OFF ]

INSTALL libsubcmd_headers
INSTALL libsymbol_headers
INSTALL libperf_headers
INSTALL libapi_headers
INSTALL libbpf_headers
CC pmu-events/pmu-events.o
LD pmu-events/pmu-events-in.o
LINK perf
# ./perf list | grep -A 20 basic:
basic:
CPU_CYCLES
[Cycle Count. Unit: cpum_cf]
INSTRUCTIONS
[Instruction Count. Unit: cpum_cf]
L1D_DIR_WRITES
[Level-1 D-Cache Directory Write Count. Unit: cpum_cf]
L1D_PENALTY_CYCLES
[Level-1 D-Cache Penalty Cycle Count. Unit: cpum_cf]
L1I_DIR_WRITES
[Level-1 I-Cache Directory Write Count. Unit: cpum_cf]
L1I_PENALTY_CYCLES
[Level-1 I-Cache Penalty Cycle Count. Unit: cpum_cf]
PROBLEM_STATE_CPU_CYCLES
[Problem-State Cycle Count. Unit: cpum_cf]
PROBLEM_STATE_INSTRUCTIONS
[Problem-State Instruction Count. Unit: cpum_cf]


So everything works as usual.
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

2023-03-23 10:14:29

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Thu, Mar 23, 2023 at 10:51:16AM +0100, Thomas Richter wrote:
> On 3/22/23 21:59, Arnaldo Carvalho de Melo wrote:
> > While trying to cross build to s390 on:
> >
> > ubuntu:18.04
> >
> > using python3
> >
> >
> > CC /tmp/build/perf/tests/parse-events.o
> > Exception processing pmu-events/arch/s390/cf_z16/extended.json
> > Traceback (most recent call last):
> > File "pmu-events/jevents.py", line 997, in <module>
> > main()
> > File "pmu-events/jevents.py", line 979, in main
> > ftw(arch_path, [], preprocess_one_file)
> > File "pmu-events/jevents.py", line 935, in ftw
> > ftw(item.path, parents + [item.name], action)
> > File "pmu-events/jevents.py", line 933, in ftw
> > action(parents, item)
> > File "pmu-events/jevents.py", line 514, in preprocess_one_file
> > for event in read_json_events(item.path, topic):
> > File "pmu-events/jevents.py", line 388, in read_json_events
> > events = json.load(open(path), object_hook=JsonEvent)
> > File "/usr/lib/python3.6/json/__init__.py", line 296, in load
> > return loads(fp.read(),
> > File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> > return codecs.ascii_decode(input, self.errors)[0]
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4271: ordinal not in range(128)
> >
> >
>
> Hmmm, this is very strange. After reading this mail I installed Ubuntu 18.04
> on my s390 system. The build works fine, no errors at all.
>
>
> # pmu-events/jevents.py s390 all pmu-events/arch pmu-events/pmu-events.c
> # ll pmu-events/pmu-events.c
> -rw-r--r-- 1 root root 317284 Mar 23 10:46 pmu-events/pmu-events.c
> #
>
> The file has the correct contents and the build works fine too.
> # make

The file contains UTF-8 characters, which were already present before
your patch. I guess you need to provide an add-on patch that converts
them to plain ASCII.
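
A small Python sketch for locating the characters in question before
converting them. The helper name find_non_ascii and the sample text are
hypothetical; on a real file one would read it with encoding='utf-8' and
pass the resulting string in.

```python
def find_non_ascii(text):
    """Return (line, column, code point) for every non-ASCII character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


# Made-up sample with a non-breaking hyphen (U+2011) and an en dash (U+2013).
sample = "line one\nLevel\u20111 cache \u2013 misses\n"

for lineno, col, codepoint in find_non_ascii(sample):
    print(f"line {lineno}, column {col}: {codepoint}")
```

Each reported position can then be replaced by its plain ASCII equivalent,
for example '-' for the dashes.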

2023-03-23 13:23:45

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/6] tools/perf/json: Add cache metrics for s390 z16

On Thu, Mar 23, 2023 at 11:06:15AM +0100, Heiko Carstens wrote:
> On Thu, Mar 23, 2023 at 10:51:16AM +0100, Thomas Richter wrote:
> > On 3/22/23 21:59, Arnaldo Carvalho de Melo wrote:
> > > While trying to cross build to s390 on:
> > >
> > > ubuntu:18.04
> > >
> > > using python3
> > >
> > >
> > > CC /tmp/build/perf/tests/parse-events.o
> > > Exception processing pmu-events/arch/s390/cf_z16/extended.json
> > > Traceback (most recent call last):
> > > File "pmu-events/jevents.py", line 997, in <module>
> > > main()
> > > File "pmu-events/jevents.py", line 979, in main
> > > ftw(arch_path, [], preprocess_one_file)
> > > File "pmu-events/jevents.py", line 935, in ftw
> > > ftw(item.path, parents + [item.name], action)
> > > File "pmu-events/jevents.py", line 933, in ftw
> > > action(parents, item)
> > > File "pmu-events/jevents.py", line 514, in preprocess_one_file
> > > for event in read_json_events(item.path, topic):
> > > File "pmu-events/jevents.py", line 388, in read_json_events
> > > events = json.load(open(path), object_hook=JsonEvent)
> > > File "/usr/lib/python3.6/json/__init__.py", line 296, in load
> > > return loads(fp.read(),
> > > File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
> > > return codecs.ascii_decode(input, self.errors)[0]
> > > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4271: ordinal not in range(128)
> > >
> > >
> >
> > Hmmm, this is very strange. After reading this mail I installed Ubuntu 18.04
> > on my s390 system. The build works fine, no errors at all.
> >
> >
> > # pmu-events/jevents.py s390 all pmu-events/arch pmu-events/pmu-events.c
> > # ll pmu-events/pmu-events.c
> > -rw-r--r-- 1 root root 317284 Mar 23 10:46 pmu-events/pmu-events.c
> > #
> >
> > The file has the correct contents and the build works fine too.
> > # make
>
> The file contains UTF-8 characters, which were already present before
> your patch. Guess you need to provide an addon patch which converts to
> plain ASCII.

Yeah, and in the past this s390 perf test build container didn't have
the python3-dev package needed to build jevents.py, so it was being
disabled and the problem went unnoticed.

Now that it is an opt-out feature, I installed python3-dev, jevents.py
got built and then this failure surfaced.

- Arnaldo