2024-02-29 00:19:41

by Ian Rogers

Subject: [PATCH v1 00/20] Python generated Intel metrics

Generate twenty sets of additional metrics for Intel models. The Rapl
and Idle metrics aren't specific to Intel but are placed here for ease
and convenience. The Smi and Tsx metrics are added so that they can be
dropped from the per-model JSON files. There are four sets of uncore
metrics and twelve sets of core metrics.

The cstate metrics require the event encoding fix of:
https://lore.kernel.org/lkml/[email protected]/

The patches should be applied on top of:
https://lore.kernel.org/lkml/[email protected]/

Ian Rogers (20):
perf jevents: Add RAPL metrics for all Intel models
perf jevents: Add idle metric for Intel models
perf jevents: Add smi metric group for Intel models
perf jevents: Add tsx metric group for Intel models
perf jevents: Add br metric group for branch statistics on Intel
perf jevents: Add software prefetch (swpf) metric group for Intel
perf jevents: Add ports metric group giving utilization on Intel
perf jevents: Add L2 metrics for Intel
perf jevents: Add load store breakdown metrics ldst for Intel
perf jevents: Add ILP metrics for Intel
perf jevents: Add context switch metrics for Intel
perf jevents: Add FPU metrics for Intel
perf jevents: Add cycles breakdown metric for Intel
perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
perf jevents: Add mem_bw metric for Intel
perf jevents: Add local/remote "mem" breakdown metrics for Intel
perf jevents: Add dir breakdown metrics for Intel
perf jevents: Add C-State metrics from the PCU PMU for Intel
perf jevents: Add local/remote miss latency metrics for Intel
perf jevents: Add upi_bw metric for Intel

tools/perf/pmu-events/intel_metrics.py | 1040 +++++++++++++++++++++++-
1 file changed, 1037 insertions(+), 3 deletions(-)

--
2.44.0.278.ge034bb2e1d-goog



2024-02-29 00:19:49

by Ian Rogers

Subject: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

Use the msr PMU to compute the percentage of wallclock cycles for
which the CPUs are in a low power state.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 5827f555005f..46866a25b166 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,7 +1,8 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
import argparse
import json
import math
@@ -17,6 +18,16 @@ LoadEvents(directory)

interval_sec = Event("duration_time")

+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor socket power consumption estimate.

@@ -52,6 +63,7 @@ def Rapl() -> MetricGroup:


all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])

--
2.44.0.278.ge034bb2e1d-goog
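As a sanity check on the arithmetic the generated idle metric encodes, it reduces to max(tsc - mperf, 0) / tsc. A plain-Python sketch (not part of the patch; the counter values are made up, not real msr readings):

```python
def idle_percent(mperf: int, tsc: int) -> float:
    """Fraction of wallclock (TSC) cycles spent in a low power state.

    mperf only counts while the CPU is in C0, so tsc - mperf approximates
    cycles spent asleep; the max() guards against counter skew driving
    the difference negative.
    """
    low = max(tsc - mperf, 0)
    return low / tsc

# Hypothetical sample: the CPU was in C0 for 25% of the interval.
sample = idle_percent(mperf=250_000, tsc=1_000_000)
```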


2024-02-29 00:20:05

by Ian Rogers

Subject: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models

Allow the duplicated metric to be dropped from the per-model JSON files.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 46866a25b166..20c25d142f24 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ MetricGroup, MetricRef, Select)
import argparse
import json
import math
@@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
description="Processor socket power consumption estimates")


+def Smi() -> MetricGroup:
+ aperf = Event('msr/aperf/')
+ cycles = Event('cycles')
+ smi_num = Event('msr/smi/')
+ smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
+ return MetricGroup('smi', [
+ Metric('smi_num', 'Number of SMI interrupts.',
+ smi_num, 'SMI#'),
+ # Note, the smi_cycles "Event" is really a reference to the metric.
+ Metric('smi_cycles',
+ 'Percentage of cycles spent in System Management Interrupts.',
+ smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
+ ])
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
+ Smi(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
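The smi_cycles metric above relies on aperf counting all unhalted cycles while the cycles event stops counting during System Management Mode, so their gap approximates SMI time. A hand-computed sketch of that Select expression, with made-up counter values:

```python
def smi_cycles(aperf: int, cycles: int, smi_num: int) -> float:
    """Fraction of cycles spent in SMIs.

    Mirrors Select((aperf - cycles) / aperf, smi_num > 0, 0):
    report 0 when no SMI fired during the interval.
    """
    return (aperf - cycles) / aperf if smi_num > 0 else 0.0

with_smi = smi_cycles(aperf=1_000_000, cycles=900_000, smi_num=3)
without_smi = smi_cycles(aperf=1_000_000, cycles=900_000, smi_num=0)
```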


2024-02-29 00:20:20

by Ian Rogers

Subject: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

Allow the duplicated metric to be dropped from the per-model JSON files.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
1 file changed, 51 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 20c25d142f24..1096accea2aa 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -7,6 +7,7 @@ import argparse
import json
import math
import os
+from typing import Optional

parser = argparse.ArgumentParser(description="Intel perf json generator")
parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
@@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
])


+def Tsx() -> Optional[MetricGroup]:
+ if args.model not in [
+ 'alderlake',
+ 'cascadelakex',
+ 'icelake',
+ 'icelakex',
+ 'rocketlake',
+ 'sapphirerapids',
+ 'skylake',
+ 'skylakex',
+ 'tigerlake',
+ ]:
+ return None
+
+ pmu = "cpu_core" if args.model == "alderlake" else "cpu"
+ cycles = Event('cycles')
+ cycles_in_tx = Event(f'{pmu}/cycles\-t/')
+ transaction_start = Event(f'{pmu}/tx\-start/')
+ cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
+ metrics = [
+ Metric('tsx_transactional_cycles',
+ 'Percentage of cycles within a transaction region.',
+ Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
+ '100%'),
+ Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
+ Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
+ has_event(cycles_in_tx),
+ 0),
+ '100%'),
+ Metric('tsx_cycles_per_transaction',
+ 'Number of cycles within a transaction divided by the number of transactions.',
+ Select(cycles_in_tx / transaction_start,
+ has_event(cycles_in_tx),
+ 0),
+ "cycles / transaction"),
+ ]
+ if args.model != 'sapphirerapids':
+ elision_start = Event(f'{pmu}/el\-start/')
+ metrics += [
+ Metric('tsx_cycles_per_elision',
+ 'Number of cycles within a transaction divided by the number of elisions.',
+ Select(cycles_in_tx / elision_start,
+ has_event(elision_start),
+ 0),
+ "cycles / elision"),
+ ]
+ return MetricGroup('transaction', metrics)
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
+ Tsx(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
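To illustrate tsx_aborted_cycles: cycles-ct only counts transactional cycles that went on to commit, so cycles-t minus cycles-ct approximates the cycles thrown away by aborts. A plain-Python check with hypothetical counts:

```python
def tsx_aborted_cycles(cycles: int, cycles_in_tx: int,
                       cycles_in_tx_cp: int) -> float:
    """Fraction of all cycles spent in transactions that later aborted.

    cycles_in_tx_cp ("checkpointed") covers committed transactional
    cycles; max() guards against the subtraction going negative.
    """
    return max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles

aborted = tsx_aborted_cycles(cycles=1_000_000, cycles_in_tx=200_000,
                             cycles_in_tx_cp=150_000)
```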


2024-02-29 00:20:30

by Ian Rogers

Subject: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel

The br metric group for branches comprises metric groups for total,
taken, conditional, fused and far branch statistics built from JSON
events. Conditional taken and not-taken metrics are specific to
Icelake and later generations, so a model-to-generation lookup is
added.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
1 file changed, 139 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 1096accea2aa..bee5da19d19d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -19,6 +19,7 @@ LoadEvents(directory)

interval_sec = Event("duration_time")

+
def Idle() -> Metric:
cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
@@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
return MetricGroup('transaction', metrics)


+def IntelBr():
+ ins = Event("instructions")
+
+ def Total() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
+ "BR_INST_RETIRED.MISPRED",
+ "BR_MISP_EXEC.ANY")
+ br_clr = None
+ try:
+ br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
+ except:
+ pass
+
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_all, br_all)
+ clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
+
+ return MetricGroup("br_total", [
+ Metric("br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+ "not correctly predicted as a percentage of all branch instructions.",
+ misp_r, "100%"),
+ Metric("br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s"
+ ) if clr_r else None
+ ])
+
+ def Taken() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_tk = None
+ try:
+ br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
+ "BR_MISP_RETIRED.TAKEN_JCC",
+ "BR_INST_RETIRED.MISPRED_TAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
+ return MetricGroup("br_taken", [
+ Metric("br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%") if misp_r else None,
+ Metric(
+ "br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ try:
+ br_cond = Event("BR_INST_RETIRED.COND",
+ "BR_INST_RETIRED.CONDITIONAL",
+ "BR_INST_RETIRED.TAKEN_JCC")
+ br_m_cond = Event("BR_MISP_RETIRED.COND",
+ "BR_MISP_RETIRED.CONDITIONAL",
+ "BR_MISP_RETIRED.TAKEN_JCC")
+ except:
+ return None
+
+ br_cond_nt = None
+ br_m_cond_nt = None
+ try:
+ br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
+ br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_cond, interval_sec)
+ ins_r = d_ratio(ins, br_cond)
+ misp_r = d_ratio(br_m_cond, br_cond)
+ taken_metrics = [
+ Metric("br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ Metric("br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+ if not br_m_cond_nt:
+ return MetricGroup("br_cond", taken_metrics)
+
+ br_r = d_ratio(br_cond_nt, interval_sec)
+ ins_r = d_ratio(ins, br_cond_nt)
+ misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
+
+ not_taken_metrics = [
+ Metric("br_cond_nt_retired", "Retired conditional not taken branch instructions.",
+ br_r, "insn/s"),
+ Metric("br_cond_nt_insn_between_branches",
+ "The number of instructions divided by the number of not taken conditional "
+ "branches.", ins_r, "insn"),
+ Metric("br_cond_nt_mispred",
+ "Retired not taken conditional branch instructions mispredicted as a "
+ "percentage of all not taken conditional branches.", misp_r, "100%"),
+ ]
+ return MetricGroup("br_cond", [
+ MetricGroup("br_cond_nt", not_taken_metrics),
+ MetricGroup("br_cond_tkn", taken_metrics),
+ ])
+
+ def Far() -> Optional[MetricGroup]:
+ try:
+ br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
+ except:
+ return None
+
+ br_r = d_ratio(br_far, interval_sec)
+ ins_r = d_ratio(ins, br_far)
+ return MetricGroup("br_far", [
+ Metric("br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
+ description="breakdown of retired branch instructions")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
Tsx(),
+ IntelBr(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
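Two of the br_total metrics above are simple ratios of retired-branch counts. A hand-computed sketch, using invented counter values rather than real event readings:

```python
def br_total_stats(instructions: int, br_all: int, br_mispred: int) -> dict:
    """br_total_mispred and br_total_insn_between_branches, computed
    directly: mispredicts as a share of all retired branches, and
    instructions retired per branch."""
    return {
        "br_total_mispred": br_mispred / br_all,
        "br_total_insn_between_branches": instructions / br_all,
    }

stats = br_total_stats(instructions=5_000_000, br_all=1_000_000,
                       br_mispred=20_000)
```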


2024-02-29 00:20:40

by Ian Rogers

Subject: [PATCH v1 06/20] perf jevents: Add software prefetch (swpf) metric group for Intel

Add metrics that break down software prefetch instruction use.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 65 ++++++++++++++++++++++++++
1 file changed, 65 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index bee5da19d19d..f11273e9935c 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -265,12 +265,77 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelSwpf() -> Optional[MetricGroup]:
+ ins = Event("instructions")
+ try:
+ s_ld = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ s_nta = Event("SW_PREFETCH_ACCESS.NTA")
+ s_t0 = Event("SW_PREFETCH_ACCESS.T0")
+ s_t1 = Event("SW_PREFETCH_ACCESS.T1_T2")
+ s_w = Event("SW_PREFETCH_ACCESS.PREFETCHW")
+ except:
+ return None
+
+ all_sw = s_nta + s_t0 + s_t1 + s_w
+ swp_r = d_ratio(all_sw, interval_sec)
+ ins_r = d_ratio(ins, all_sw)
+ ld_r = d_ratio(s_ld, all_sw)
+
+ return MetricGroup("swpf", [
+ MetricGroup("swpf_totals", [
+ Metric("swpf_totals_exec", "Software prefetch instructions per second",
+ swp_r, "swpf/s"),
+ Metric("swpf_totals_insn_per_pf",
+ "Average number of instructions between software prefetches",
+ ins_r, "insn/swpf"),
+ Metric("swpf_totals_loads_per_pf",
+ "Average number of loads between software prefetches",
+ ld_r, "loads/swpf"),
+ ]),
+ MetricGroup("swpf_bkdwn", [
+ MetricGroup("swpf_bkdwn_nta", [
+ Metric("swpf_bkdwn_nta_per_swpf",
+ "Software prefetch NTA instructions as a percent of all prefetch instructions",
+ d_ratio(s_nta, all_sw), "100%"),
+ Metric("swpf_bkdwn_nta_rate",
+ "Software prefetch NTA instructions per second",
+ d_ratio(s_nta, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_t0", [
+ Metric("swpf_bkdwn_t0_per_swpf",
+ "Software prefetch T0 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t0, all_sw), "100%"),
+ Metric("swpf_bkdwn_t0_rate",
+ "Software prefetch T0 instructions per second",
+ d_ratio(s_t0, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_t1_t2", [
+ Metric("swpf_bkdwn_t1_t2_per_swpf",
+ "Software prefetch T1 or T2 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t1, all_sw), "100%"),
+ Metric("swpf_bkdwn_t1_t2_rate",
+ "Software prefetch T1 or T2 instructions per second",
+ d_ratio(s_t1, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_w", [
+ Metric("swpf_bkdwn_w_per_swpf",
+ "Software prefetch W instructions as a percent of all prefetch instructions",
+ d_ratio(s_w, all_sw), "100%"),
+ Metric("swpf_bkdwn_w_rate",
+ "Software prefetch W instructions per second",
+ d_ratio(s_w, interval_sec), "insn/s"),
+ ]),
+ ]),
+ ], description="Software prefetch instruction breakdown")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
Tsx(),
IntelBr(),
+ IntelSwpf(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
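The swpf_bkdwn_*_per_swpf metrics divide each prefetch flavor by the sum of all four, so the shares total 100%. A quick plain-Python illustration with made-up counts:

```python
def swpf_breakdown(nta: int, t0: int, t1_t2: int, w: int) -> dict:
    """Share of each software-prefetch flavor out of all software
    prefetches, mirroring d_ratio(flavor, all_sw) in the patch."""
    total = nta + t0 + t1_t2 + w
    counts = {"nta": nta, "t0": t0, "t1_t2": t1_t2, "w": w}
    return {name: count / total for name, count in counts.items()}

shares = swpf_breakdown(nta=400, t0=300, t1_t2=200, w=100)
```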


2024-02-29 00:20:55

by Ian Rogers

Subject: [PATCH v1 07/20] perf jevents: Add ports metric group giving utilization on Intel

The ports metric group contains a metric for each port giving its
utilization as a ratio of cycles. The metrics are created by looking
for UOPS_DISPATCHED.PORT events.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 33 ++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f11273e9935c..63d46ee1dca9 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,13 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, MetricRef, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, MetricRef, Select)
import argparse
import json
import math
import os
+import re
from typing import Optional

parser = argparse.ArgumentParser(description="Intel perf json generator")
@@ -18,6 +19,11 @@ directory = f"{os.path.dirname(os.path.realpath(__file__))}/arch/x86/{args.model
LoadEvents(directory)

interval_sec = Event("duration_time")
+core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+# Number of CPU cycles scaled for SMT.
+smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)


def Idle() -> Metric:
@@ -265,6 +271,28 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelPorts() -> Optional[MetricGroup]:
+ pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{args.model}/pipeline.json"))
+
+ metrics = []
+ for x in pipeline_events:
+ if "EventName" in x and re.search("^UOPS_DISPATCHED.PORT", x["EventName"]):
+ name = x["EventName"]
+ port = re.search(r"(PORT_[0-9].*)", name).group(0).lower()
+ if name.endswith("_CORE"):
+ cyc = core_cycles
+ else:
+ cyc = smt_cycles
+ metrics.append(Metric(port, f"{port} utilization (higher is better)",
+ d_ratio(Event(name), cyc), "100%"))
+ if len(metrics) == 0:
+ return None
+
+ return MetricGroup("ports", metrics, "functional unit (port) utilization -- "
+ "fraction of cycles each port is utilized (higher is better)")
+
+
def IntelSwpf() -> Optional[MetricGroup]:
ins = Event("instructions")
try:
@@ -335,6 +363,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelPorts(),
IntelSwpf(),
])

--
2.44.0.278.ge034bb2e1d-goog
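The SMT handling above can be restated directly: when SMT is on, a port is shared by two hyperthreads, so per-thread cycles are half the core cycles, which is what Select(core_cycles / 2, Literal("#smt_on"), core_cycles) expresses. A sketch with hypothetical counts:

```python
def port_utilization(port_uops: int, core_cycles: int, smt_on: bool) -> float:
    """Utilization of one execution port as a fraction of cycles,
    halving cycles when SMT shares the core between two threads."""
    cyc = core_cycles / 2 if smt_on else core_cycles
    return port_uops / cyc

util_smt = port_utilization(port_uops=300_000, core_cycles=1_000_000,
                            smt_on=True)
util_no_smt = port_utilization(port_uops=300_000, core_cycles=1_000_000,
                               smt_on=False)
```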


2024-02-29 00:21:11

by Ian Rogers

Subject: [PATCH v1 08/20] perf jevents: Add L2 metrics for Intel

Give a breakdown of various L2 counters as metrics, including totals,
reads, hardware prefetcher, RFO, code and evictions.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 158 +++++++++++++++++++++++++
1 file changed, 158 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 63d46ee1dca9..d22a1abca8d9 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,163 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelL2() -> Optional[MetricGroup]:
+ try:
+ DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
+ except:
+ return None
+ try:
+ DC_MISS = Event("L2_RQSTS.DEMAND_DATA_RD_MISS")
+ l2_dmnd_miss = DC_MISS
+ l2_dmnd_rd_all = DC_MISS + DC_HIT
+ except:
+ DC_ALL = Event("L2_RQSTS.ALL_DEMAND_DATA_RD")
+ l2_dmnd_miss = DC_ALL - DC_HIT
+ l2_dmnd_rd_all = DC_ALL
+ l2_dmnd_mrate = d_ratio(l2_dmnd_miss, interval_sec)
+ l2_dmnd_rrate = d_ratio(l2_dmnd_rd_all, interval_sec)
+
+ DC_PFH = None
+ DC_PFM = None
+ l2_pf_all = None
+ l2_pf_mrate = None
+ l2_pf_rrate = None
+ try:
+ DC_PFH = Event("L2_RQSTS.PF_HIT")
+ DC_PFM = Event("L2_RQSTS.PF_MISS")
+ l2_pf_all = DC_PFH + DC_PFM
+ l2_pf_mrate = d_ratio(DC_PFM, interval_sec)
+ l2_pf_rrate = d_ratio(l2_pf_all, interval_sec)
+ except:
+ pass
+
+ DC_RFOH = Event("L2_RQSTS.RFO_HIT")
+ DC_RFOM = Event("L2_RQSTS.RFO_MISS")
+ l2_rfo_all = DC_RFOH + DC_RFOM
+ l2_rfo_mrate = d_ratio(DC_RFOM, interval_sec)
+ l2_rfo_rrate = d_ratio(l2_rfo_all, interval_sec)
+
+ DC_CH = Event("L2_RQSTS.CODE_RD_HIT")
+ DC_CM = Event("L2_RQSTS.CODE_RD_MISS")
+ DC_IN = Event("L2_LINES_IN.ALL")
+ DC_OUT_NS = None
+ DC_OUT_S = None
+ l2_lines_out = None
+ l2_out_rate = None
+ wbn = None
+ isd = None
+ try:
+ DC_OUT_NS = Event("L2_LINES_OUT.NON_SILENT",
+ "L2_LINES_OUT.DEMAND_DIRTY",
+ "L2_LINES_IN.S")
+ DC_OUT_S = Event("L2_LINES_OUT.SILENT",
+ "L2_LINES_OUT.DEMAND_CLEAN",
+ "L2_LINES_IN.I")
+ if DC_OUT_S.name == "L2_LINES_OUT.SILENT" and (
+ args.model.startswith("skylake") or
+ args.model == "cascadelakex"):
+ DC_OUT_S.name = "L2_LINES_OUT.SILENT/any/"
+ # bring it back to per-CPU
+ l2_s = Select(DC_OUT_S / 2, Literal("#smt_on"), DC_OUT_S)
+ l2_ns = DC_OUT_NS
+ l2_lines_out = l2_s + l2_ns
+ l2_out_rate = d_ratio(l2_lines_out, interval_sec)
+ nlr = max(l2_ns - DC_WB_U - DC_WB_D, 0)
+ wbn = d_ratio(nlr, interval_sec)
+ isd = d_ratio(l2_s, interval_sec)
+ except:
+ pass
+ DC_OUT_U = None
+ l2_pf_useless = None
+ l2_useless_rate = None
+ try:
+ DC_OUT_U = Event("L2_LINES_OUT.USELESS_HWPF")
+ l2_pf_useless = DC_OUT_U
+ l2_useless_rate = d_ratio(l2_pf_useless, interval_sec)
+ except:
+ pass
+ DC_WB_U = None
+ DC_WB_D = None
+ wbu = None
+ wbd = None
+ try:
+ DC_WB_U = Event("IDI_MISC.WB_UPGRADE")
+ DC_WB_D = Event("IDI_MISC.WB_DOWNGRADE")
+ wbu = d_ratio(DC_WB_U, interval_sec)
+ wbd = d_ratio(DC_WB_D, interval_sec)
+ except:
+ pass
+
+ l2_lines_in = DC_IN
+ l2_code_all = DC_CH + DC_CM
+ l2_code_rate = d_ratio(l2_code_all, interval_sec)
+ l2_code_miss_rate = d_ratio(DC_CM, interval_sec)
+ l2_in_rate = d_ratio(l2_lines_in, interval_sec)
+
+ return MetricGroup("l2", [
+ MetricGroup("l2_totals", [
+ Metric("l2_totals_in", "L2 cache total in per second",
+ l2_in_rate, "In/s"),
+ Metric("l2_totals_out", "L2 cache total out per second",
+ l2_out_rate, "Out/s") if l2_out_rate else None,
+ ]),
+ MetricGroup("l2_rd", [
+ Metric("l2_rd_hits", "L2 cache data read hits",
+ d_ratio(DC_HIT, l2_dmnd_rd_all), "100%"),
+ Metric("l2_rd_misses", "L2 cache data read misses",
+ d_ratio(l2_dmnd_miss, l2_dmnd_rd_all), "100%"),
+ Metric("l2_rd_requests", "L2 cache data read requests per second",
+ l2_dmnd_rrate, "requests/s"),
+ Metric("l2_rd_misses", "L2 cache data read misses per second",
+ l2_dmnd_mrate, "misses/s"),
+ ]),
+ MetricGroup("l2_hwpf", [
+ Metric("l2_hwpf_hits", "L2 cache hardware prefetcher hits",
+ d_ratio(DC_PFH, l2_pf_all), "100%"),
+ Metric("l2_hwpf_misses", "L2 cache hardware prefetcher misses",
+ d_ratio(DC_PFM, l2_pf_all), "100%"),
+ Metric("l2_hwpf_useless", "L2 cache hardware prefetcher useless prefetches per second",
+ l2_useless_rate, "requests/s") if l2_useless_rate else None,
+ Metric("l2_hwpf_requests", "L2 cache hardware prefetcher requests per second",
+ l2_pf_rrate, "requests/s"),
+ Metric("l2_hwpf_misses", "L2 cache hardware prefetcher misses per second",
+ l2_pf_mrate, "misses/s"),
+ ]) if DC_PFH else None,
+ MetricGroup("l2_rfo", [
+ Metric("l2_rfo_hits", "L2 cache request for ownership (RFO) hits",
+ d_ratio(DC_RFOH, l2_rfo_all), "100%"),
+ Metric("l2_rfo_misses", "L2 cache request for ownership (RFO) misses",
+ d_ratio(DC_RFOM, l2_rfo_all), "100%"),
+ Metric("l2_rfo_requests", "L2 cache request for ownership (RFO) requests per second",
+ l2_rfo_rrate, "requests/s"),
+ Metric("l2_rfo_misses", "L2 cache request for ownership (RFO) misses per second",
+ l2_rfo_mrate, "misses/s"),
+ ]),
+ MetricGroup("l2_code", [
+ Metric("l2_code_hits", "L2 cache code hits",
+ d_ratio(DC_CH, l2_code_all), "100%"),
+ Metric("l2_code_misses", "L2 cache code misses",
+ d_ratio(DC_CM, l2_code_all), "100%"),
+ Metric("l2_code_requests", "L2 cache code requests per second",
+ l2_code_rate, "requests/s"),
+ Metric("l2_code_misses", "L2 cache code misses per second",
+ l2_code_miss_rate, "misses/s"),
+ ]),
+ MetricGroup("l2_evict", [
+ MetricGroup("l2_evict_mef_lines", [
+ Metric("l2_evict_mef_lines_l3_hot_lru", "L2 evictions M/E/F lines L3 hot LRU per second",
+ wbu, "HotLRU/s") if wbu else None,
+ Metric("l2_evict_mef_lines_l3_norm_lru", "L2 evictions M/E/F lines L3 normal LRU per second",
+ wbn, "NormLRU/s") if wbn else None,
+ Metric("l2_evict_mef_lines_dropped", "L2 evictions M/E/F lines dropped per second",
+ wbd, "dropped/s") if wbd else None,
+ Metric("l2_evict_is_lines_dropped", "L2 evictions I/S lines dropped per second",
+ isd, "dropped/s") if isd else None,
+ ]),
+ ]),
+ ], description = "L2 data cache analysis")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
f"/arch/x86/{args.model}/pipeline.json"))
@@ -363,6 +520,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelL2(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog
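The patch prefers L2_RQSTS.DEMAND_DATA_RD_MISS when the model has it and otherwise falls back to deriving misses from ALL_DEMAND_DATA_RD minus hits. Both paths should yield the same (misses, total reads) pair, which a small sketch with invented counts can check:

```python
def l2_demand_reads(hit: int, miss: int = None, all_reads: int = None):
    """Return (demand read misses, total demand reads) using either the
    explicit miss count or the all-reads-minus-hits fallback, as the
    try/except in IntelL2() does."""
    if miss is not None:
        return miss, miss + hit
    return all_reads - hit, all_reads

via_miss_event = l2_demand_reads(hit=700, miss=300)
via_fallback = l2_demand_reads(hit=700, all_reads=1000)
```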


2024-02-29 00:22:04

by Ian Rogers

Subject: [PATCH v1 11/20] perf jevents: Add context switch metrics for Intel

Add metrics that break down context switches by the different kinds of
instruction retired between them.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 55 ++++++++++++++++++++++++++
1 file changed, 55 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0ca72aeec1ea..6ee708e84863 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,60 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelCtxSw() -> MetricGroup:
+ cs = Event("context\-switches")
+ metrics = [
+ Metric("cs_rate", "Context switches per second", d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ metrics.append(Metric("cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_STORES", "MEM_UOPS_RETIRED.ALL_STORES")
+ metrics.append(Metric("cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("BR_INST_RETIRED.NEAR_TAKEN", "BR_INST_RETIRED.TAKEN_JCC")
+ metrics.append(Metric("cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+ except:
+ pass
+
+ try:
+ l2_misses = (Event("L2_RQSTS.DEMAND_DATA_RD_MISS") +
+ Event("L2_RQSTS.RFO_MISS") +
+ Event("L2_RQSTS.CODE_RD_MISS"))
+ try:
+ l2_misses += Event("L2_RQSTS.HWPF_MISS", "L2_RQSTS.L2_PF_MISS", "L2_RQSTS.PF_MISS")
+ except:
+ pass
+
+ metrics.append(Metric("cs_l2_misses", "L2 misses per context switch",
+ d_ratio(l2_misses, cs), "l2_misses/cs"))
+ except:
+ pass
+
+ return MetricGroup("cs", metrics,
+ description = ("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -632,6 +686,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelCtxSw(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.44.0.278.ge034bb2e1d-goog
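The first three cs metrics are plain ratios against the context-switch count. Computed by hand with made-up sample numbers (not real perf output):

```python
def cs_metrics(context_switches: int, instructions: int, cycles: int,
               seconds: float) -> dict:
    """cs_rate, cs_instr and cs_cycles as defined in IntelCtxSw():
    switches per second, and instructions/cycles per switch."""
    return {
        "cs_rate": context_switches / seconds,
        "cs_instr": instructions / context_switches,
        "cs_cycles": cycles / context_switches,
    }

m = cs_metrics(context_switches=2_000, instructions=10_000_000,
               cycles=4_000_000, seconds=2.0)
```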


2024-02-29 00:22:44

by Ian Rogers

Subject: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel

Break down cycles into user, kernel and guest.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index dae44d296861..fef40969a4b8 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)


+def Cycles() -> MetricGroup:
+ cyc_k = Event("cycles:kHh")
+ cyc_g = Event("cycles:G")
+ cyc_u = Event("cycles:uH")
+ cyc = cyc_k + cyc_g + cyc_u
+
+ return MetricGroup("cycles", [
+ Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
+ Metric("cycles_user", "User cycles as a percentage of all cycles",
+ d_ratio(cyc_u, cyc), "100%"),
+ Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
+ d_ratio(cyc_k, cyc), "100%"),
+ Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
+ d_ratio(cyc_g, cyc), "100%"),
+ ], description = "cycles breakdown per privilege level (users, kernel, guest)")
+
+
def Idle() -> Metric:
cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
@@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:


all_metrics = MetricGroup("", [
+ Cycles(),
Idle(),
Rapl(),
Smi(),
--
2.44.0.278.ge034bb2e1d-goog
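The Cycles() group sums the three modifier-filtered counts back into a total and reports each privilege level as a share of it. A plain-Python sketch of that arithmetic, with hypothetical counts:

```python
def cycles_breakdown(user: int, kernel: int, guest: int) -> dict:
    """Privilege-level shares of all cycles, as in cycles_user,
    cycles_kernel and cycles_guest; the total is the sum of the parts."""
    total = user + kernel + guest
    return {
        "user": user / total,
        "kernel": kernel / total,
        "guest": guest / total,
    }

shares = cycles_breakdown(user=600, kernel=300, guest=100)
```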


2024-02-29 00:22:59

by Ian Rogers

Subject: [PATCH v1 14/20] perf jevents: Add Miss Level Parallelism (MLP) metric for Intel

Add a metric computing the number of outstanding load misses per cycle.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index fef40969a4b8..e373f87d499d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -617,6 +617,20 @@ def IntelL2() -> Optional[MetricGroup]:
], description = "L2 data cache analysis")


+def IntelMlp() -> Optional[Metric]:
+ try:
+ l1d = Event("L1D_PEND_MISS.PENDING")
+ l1dc = Event("L1D_PEND_MISS.PENDING_CYCLES")
+ except:
+ return None
+
+ l1dc = Select(l1dc / 2, Literal("#smt_on"), l1dc)
+ ml = d_ratio(l1d, l1dc)
+ return Metric("mlp",
+ "Miss level parallelism - number of outstanding load misses per cycle (higher is better)",
+ ml, "load_miss_pending/cycle")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
f"/arch/x86/{args.model}/pipeline.json"))
@@ -798,6 +812,7 @@ all_metrics = MetricGroup("", [
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMlp(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog
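The MLP metric divides total pending-miss occupancy by cycles with at least one miss pending, halving the cycle count when SMT is on since the counter is shared between sibling threads. A sketch with invented counts:

```python
def mlp(pending: int, pending_cycles: int, smt_on: bool) -> float:
    """Average outstanding L1D load misses per miss-pending cycle,
    mirroring d_ratio(L1D_PEND_MISS.PENDING, PENDING_CYCLES) with the
    Select-based SMT adjustment."""
    cyc = pending_cycles / 2 if smt_on else pending_cycles
    return pending / cyc

mlp_value = mlp(pending=8_000, pending_cycles=2_000, smt_on=False)
```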


2024-02-29 00:23:00

by Ian Rogers

Subject: [PATCH v1 09/20] perf jevents: Add load store breakdown metrics ldst for Intel

Give a breakdown of the number of load and store instructions. Use the
counter mask (cmask) to show the number of cycles taken to retire them.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 86 +++++++++++++++++++++++++-
1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d22a1abca8d9..0035e2441d6b 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricGroup, MetricRef, Select)
+ Metric, MetricConstraint, MetricGroup, MetricRef, Select)
import argparse
import json
import math
@@ -514,6 +514,89 @@ def IntelSwpf() -> Optional[MetricGroup]:
], description="Sofware prefetch instruction breakdown")


+def IntelLdSt() -> Optional[MetricGroup]:
+ if args.model in [
+ "bonnell",
+ "nehalemep",
+ "nehalemex",
+ "westmereep-dp",
+ "westmereep-sp",
+ "westmereex",
+ ]:
+ return None
+ LDST_LD = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ LDST_ST = Event("MEM_INST_RETIRED.ALL_STORES", "MEM_UOPS_RETIRED.ALL_STORES")
+ LDST_LDC1 = Event(f"{LDST_LD.name}/cmask=1/")
+ LDST_STC1 = Event(f"{LDST_ST.name}/cmask=1/")
+ LDST_LDC2 = Event(f"{LDST_LD.name}/cmask=2/")
+ LDST_STC2 = Event(f"{LDST_ST.name}/cmask=2/")
+ LDST_LDC3 = Event(f"{LDST_LD.name}/cmask=3/")
+ LDST_STC3 = Event(f"{LDST_ST.name}/cmask=3/")
+ ins = Event("instructions")
+ LDST_CYC = Event("CPU_CLK_UNHALTED.THREAD",
+ "CPU_CLK_UNHALTED.CORE_P",
+ "CPU_CLK_UNHALTED.THREAD_P")
+ LDST_PRE = None
+ try:
+ LDST_PRE = Event("LOAD_HIT_PREFETCH.SWPF", "LOAD_HIT_PRE.SW_PF")
+ except:
+ pass
+ LDST_AT = None
+ try:
+ LDST_AT = Event("MEM_INST_RETIRED.LOCK_LOADS")
+ except:
+ pass
+ cyc = LDST_CYC
+
+ ld_rate = d_ratio(LDST_LD, interval_sec)
+ st_rate = d_ratio(LDST_ST, interval_sec)
+ pf_rate = d_ratio(LDST_PRE, interval_sec) if LDST_PRE else None
+ at_rate = d_ratio(LDST_AT, interval_sec) if LDST_AT else None
+
+ ldst_ret_constraint = MetricConstraint.GROUPED_EVENTS
+ if LDST_LD.name == "MEM_UOPS_RETIRED.ALL_LOADS":
+ ldst_ret_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ return MetricGroup("ldst", [
+ MetricGroup("ldst_total", [
+ Metric("ldst_total_loads", "Load/store instructions total loads",
+ ld_rate, "loads"),
+ Metric("ldst_total_stores", "Load/store instructions total stores",
+ st_rate, "stores"),
+ ]),
+ MetricGroup("ldst_prcnt", [
+ Metric("ldst_prcnt_loads", "Percent of all instructions that are loads",
+ d_ratio(LDST_LD, ins), "100%"),
+ Metric("ldst_prcnt_stores", "Percent of all instructions that are stores",
+ d_ratio(LDST_ST, ins), "100%"),
+ ]),
+ MetricGroup("ldst_ret_lds", [
+ Metric("ldst_ret_lds_1", "Retired loads in 1 cycle",
+ d_ratio(max(LDST_LDC1 - LDST_LDC2, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_lds_2", "Retired loads in 2 cycles",
+ d_ratio(max(LDST_LDC2 - LDST_LDC3, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_lds_3", "Retired loads in 3 or more cycles",
+ d_ratio(LDST_LDC3, cyc), "100%"),
+ ]),
+ MetricGroup("ldst_ret_sts", [
+ Metric("ldst_ret_sts_1", "Retired stores in 1 cycle",
+ d_ratio(max(LDST_STC1 - LDST_STC2, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_sts_2", "Retired stores in 2 cycles",
+ d_ratio(max(LDST_STC2 - LDST_STC3, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_sts_3", "Retired stores in 3 more cycles",
+ d_ratio(LDST_STC3, cyc), "100%"),
+ ]),
+ Metric("ldst_ld_hit_swpf", "Load hit software prefetches per second",
+ pf_rate, "swpf/s") if pf_rate else None,
+ Metric("ldst_atomic_lds", "Atomic loads per second",
+ at_rate, "loads/s") if at_rate else None,
+ ], description = "Breakdown of load/store instructions")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
@@ -521,6 +604,7 @@ all_metrics = MetricGroup("", [
Tsx(),
IntelBr(),
IntelL2(),
+ IntelLdSt(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:15

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 15/20] perf jevents: Add mem_bw metric for Intel

Break down memory bandwidth using uncore counters. For many models
this matches the memory_bandwidth_* metrics, but these metrics aren't
made available on all models. Add support for free running counters.
Query the event json when determining which events/counters are
available.
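The free-running counter discovery can be sketched as follows. The JSON entries below are invented stand-ins for a model's uncore-memory.json; only the event-name patterns are taken from the patch:

```python
import re

# Invented stand-in for arch/x86/<model>/uncore-memory.json entries.
mem_events = [
    {"EventName": "UNC_MC0_RDCAS_COUNT_FREERUN"},
    {"EventName": "UNC_MC1_RDCAS_COUNT_FREERUN"},
    {"EventName": "UNC_MC0_WRCAS_COUNT_FREERUN"},
    {"EventName": "UNC_M_SOMETHING_ELSE"},
]

rds, wrs = [], []
for x in mem_events:
    name = x.get("EventName", "")
    # One free-running CAS counter per memory controller channel.
    if re.search(r"^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
        rds.append(name)  # the patch sums Event(name) objects instead
    elif re.search(r"^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
        wrs.append(name)

print(len(rds), len(wrs))  # 2 1
```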

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 62 ++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index e373f87d499d..8d02be83b491 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,67 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreMemBw() -> Optional[MetricGroup]:
+ mem_events = []
+ try:
+ mem_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{args.model}/uncore-memory.json"))
+ except:
+ pass
+
+ ddr_rds = 0
+ ddr_wrs = 0
+ ddr_total = 0
+ for x in mem_events:
+ if "EventName" in x:
+ name = x["EventName"]
+ if re.search("^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
+ ddr_rds += Event(name)
+ elif re.search("^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
+ ddr_wrs += Event(name)
+ #elif re.search("^UNC_MC[0-9]+_TOTAL_REQCOUNT_FREERUN", name):
+ # ddr_total += Event(name)
+
+ if ddr_rds == 0:
+ try:
+ ddr_rds = Event("UNC_M_CAS_COUNT.RD")
+ ddr_wrs = Event("UNC_M_CAS_COUNT.WR")
+ except:
+ return None
+
+ ddr_total = ddr_rds + ddr_wrs
+
+ pmm_rds = 0
+ pmm_wrs = 0
+ try:
+ pmm_rds = Event("UNC_M_PMM_RPQ_INSERTS")
+ pmm_wrs = Event("UNC_M_PMM_WPQ_INSERTS")
+ except:
+ pass
+
+ pmm_total = pmm_rds + pmm_wrs
+
+ scale = 64 / 1_000_000
+ return MetricGroup("mem_bw", [
+ MetricGroup("mem_bw_ddr", [
+ Metric("mem_bw_ddr_read", "DDR memory read bandwidth",
+ d_ratio(ddr_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_ddr_write", "DDR memory write bandwidth",
+ d_ratio(ddr_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_ddr_total", "DDR memory write bandwidth",
+ d_ratio(ddr_total, interval_sec), f"{scale}MB/s"),
+ ], description = "DDR Memory Bandwidth"),
+ MetricGroup("mem_bw_pmm", [
+ Metric("mem_bw_pmm_read", "PMM memory read bandwidth",
+ d_ratio(pmm_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_pmm_write", "PMM memory write bandwidth",
+ d_ratio(pmm_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_pmm_total", "PMM memory write bandwidth",
+ d_ratio(pmm_total, interval_sec), f"{scale}MB/s"),
+ ], description = "PMM Memory Bandwidth") if pmm_rds != 0 else None,
+ ], description = "Memory Bandwidth")
+
+
all_metrics = MetricGroup("", [
Cycles(),
Idle(),
@@ -815,6 +876,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMemBw(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:15

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 10/20] perf jevents: Add ILP metrics for Intel

Use the counter mask (cmask) to count cycles by how many instructions
retired in them. Present as a set of ILP metrics.
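The resulting ILP distribution can be sketched in plain Python: the cmask=1..5 counters give cycles with at least N retirements, adjacent differences give exactly-N shares, and the zero-retirement share is whatever remains, mirroring the `ilp0 = 1 - sum` computation in the patch. Toy numbers only:

```python
def ilp_distribution(cmask_counts, core_cycles):
    """cmask_counts[i] = cycles with at least i+1 instructions retired
    (cmask=1..5 in the patch); returns shares for 0,1,2,3,4,5+."""
    ilp = [max(cmask_counts[i] - cmask_counts[i + 1], 0) / core_cycles
           for i in range(len(cmask_counts) - 1)]
    ilp.append(cmask_counts[-1] / core_cycles)  # 5-or-more tail
    ilp0 = 1 - sum(ilp)  # cycles in which nothing retired
    return [ilp0] + ilp

dist = ilp_distribution([800, 500, 300, 150, 50], 1000)
print(dist)  # first entry is the 0-retired share; entries sum to 1
```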

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 30 ++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0035e2441d6b..0ca72aeec1ea 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,35 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelIlp() -> MetricGroup:
+ tsc = Event("msr/tsc/")
+ c0 = Event("msr/mperf/")
+ low = tsc - c0
+ inst_ret = Event("INST_RETIRED.ANY_P")
+ inst_ret_c = [Event(f"{inst_ret.name}/cmask={x}/") for x in range(1, 6)]
+ ilp = [d_ratio(max(inst_ret_c[x] - inst_ret_c[x + 1], 0), core_cycles) for x in range(0, 4)]
+ ilp.append(d_ratio(inst_ret_c[4], core_cycles))
+ ilp0 = 1
+ for x in ilp:
+ ilp0 -= x
+ return MetricGroup("ilp", [
+ Metric("ilp_idle", "Lower power cycles as a percentage of all cycles",
+ d_ratio(low, tsc), "100%"),
+ Metric("ilp_inst_ret_0", "Instructions retired in 0 cycles as a percentage of all cycles",
+ ilp0, "100%"),
+ Metric("ilp_inst_ret_1", "Instructions retired in 1 cycles as a percentage of all cycles",
+ ilp[0], "100%"),
+ Metric("ilp_inst_ret_2", "Instructions retired in 2 cycles as a percentage of all cycles",
+ ilp[1], "100%"),
+ Metric("ilp_inst_ret_3", "Instructions retired in 3 cycles as a percentage of all cycles",
+ ilp[2], "100%"),
+ Metric("ilp_inst_ret_4", "Instructions retired in 4 cycles as a percentage of all cycles",
+ ilp[3], "100%"),
+ Metric("ilp_inst_ret_5", "Instructions retired in 5 or more cycles as a percentage of all cycles",
+ ilp[4], "100%"),
+ ])
+
+
def IntelL2() -> Optional[MetricGroup]:
try:
DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
@@ -603,6 +632,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelIlp(),
IntelL2(),
IntelLdSt(),
IntelPorts(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:47

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 17/20] perf jevents: Add dir breakdown metrics for Intel

Break down directory hits, misses and update requests. The implementation
uses the M2M and CHA PMUs present in server models broadwellde, broadwellx,
cascadelakex, emeraldrapids, icelakex, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 36 ++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 82fd23cf5500..07aafdf77f79 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,41 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreDir() -> Optional[MetricGroup]:
+ try:
+ m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
+ m2m_hits = Event("UNC_M2M_DIRECTORY_HIT.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_hits.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_HIT.ANY/"
+ m2m_miss = Event("UNC_M2M_DIRECTORY_MISS.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_miss.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_MISS.ANY/"
+ cha_upd = Event("UNC_CHA_DIR_UPDATE.HA")
+ # Turn the umask into an ANY rather than HA filter.
+ cha_upd.name += "/umask=3,name=UNC_CHA_DIR_UPDATE.ANY/"
+ except:
+ return None
+
+ m2m_total = m2m_hits + m2m_miss
+ upd = m2m_upd + cha_upd # in cache lines
+ upd_r = upd / interval_sec
+ look_r = m2m_total / interval_sec
+
+ scale = 64 / 1_000_000 # Cache lines to MB
+ return MetricGroup("dir", [
+ Metric("dir_lookup_rate", "",
+ d_ratio(m2m_total, interval_sec), "requests/s"),
+ Metric("dir_lookup_hits", "",
+ d_ratio(m2m_hits, m2m_total), "100%"),
+ Metric("dir_lookup_misses", "",
+ d_ratio(m2m_miss, m2m_total), "100%"),
+ Metric("dir_update_requests", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), "requests/s"),
+ Metric("dir_update_bw", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), f"{scale}MB/s"),
+ ])
+
+
def UncoreMem() -> Optional[MetricGroup]:
try:
loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL", "UNC_H_REQUESTS.READS_LOCAL")
@@ -902,6 +937,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreDir(),
UncoreMem(),
UncoreMemBw(),
])
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:00

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 18/20] perf jevents: Add C-State metrics from the PCU PMU for Intel

Use occupancy events fixed in:
https://lore.kernel.org/lkml/[email protected]/

Metrics are at the socket level referring to cores, not hyperthreads.
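The fused-off-core correction in this patch can be sketched in plain Python: when C0+C3+C6 occupancy exceeds the PCU ticks times the advertised core count, the excess (fused-off cores parked in C6/C7) is subtracted from the C6 count, clamped at zero, mirroring the `Select` expression in the diff. Toy values only:

```python
def corrected_c6(c0, c3, c6, pcu_ticks, cores_per_socket):
    # cores_per_socket corresponds to #num_cores / #num_packages.
    max_cycles = pcu_ticks * cores_per_socket
    total = c0 + c3 + c6
    if total > max_cycles:
        # Fused-off cores show up as extra C6/C7 occupancy.
        return max(c6 - (total - max_cycles), 0)
    return c6

# 8 advertised cores, but occupancy sums to 9 cores' worth of ticks:
print(corrected_c6(c0=3000, c3=1000, c6=5000,
                   pcu_ticks=1000, cores_per_socket=8))  # 4000
```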

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 27 ++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 07aafdf77f79..1b9f7cd3b789 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,32 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreCState() -> Optional[MetricGroup]:
+ try:
+ pcu_ticks = Event("UNC_P_CLOCKTICKS")
+ c0 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C0")
+ c3 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C3")
+ c6 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C6")
+ except:
+ return None
+
+ num_cores = Literal("#num_cores") / Literal("#num_packages")
+
+ max_cycles = pcu_ticks * num_cores
+ total_cycles = c0 + c3 + c6
+
+ # Remove fused-off cores, which show up in C6/C7.
+ c6 = Select(max(c6 - (total_cycles - max_cycles), 0),
+ total_cycles > max_cycles,
+ c6)
+
+ return MetricGroup("cstate", [
+ Metric("cstate_c0", "C-State cores in C0/C1", d_ratio(c0, pcu_ticks), "cores"),
+ Metric("cstate_c3", "C-State cores in C3", d_ratio(c3, pcu_ticks), "cores"),
+ Metric("cstate_c6", "C-State cores in C6/C7", d_ratio(c6, pcu_ticks), "cores"),
+ ])
+
+
def UncoreDir() -> Optional[MetricGroup]:
try:
m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
@@ -937,6 +963,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreCState(),
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:19

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 12/20] perf jevents: Add FPU metrics for Intel

Add metrics breaking down floating point operations.
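The FLOP weighting in this patch can be sketched in plain Python: each packed event contributes vector-width divided by element-width operations per instruction (e.g. a 256-bit packed-single op counts 8 FLOPs). The weights below match the patch's arithmetic; the counts are invented:

```python
# (event, flops-per-op) weights matching the patch's flop expression.
FLOP_WEIGHTS = {
    "scalar_single": 1, "scalar_double": 1,
    "128b_single": 4, "128b_double": 2,
    "256b_single": 8, "256b_double": 4,
    "512b_single": 16, "512b_double": 8,
}

def total_flops(counts):
    """Weighted sum of per-event instruction counts."""
    return sum(FLOP_WEIGHTS[k] * v for k, v in counts.items())

counts = {"scalar_double": 100, "256b_single": 10, "512b_double": 5}
print(total_flops(counts))  # 100*1 + 10*8 + 5*8 = 220
```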

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 90 ++++++++++++++++++++++++++
1 file changed, 90 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 6ee708e84863..dae44d296861 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -325,6 +325,95 @@ def IntelCtxSw() -> MetricGroup:
"retired & core cycles between context switches"))


+def IntelFpu() -> Optional[MetricGroup]:
+ cyc = Event("cycles")
+ try:
+ s_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
+ "SIMD_INST_RETIRED.SCALAR_SINGLE")
+ except:
+ return None
+ d_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
+ "SIMD_INST_RETIRED.SCALAR_DOUBLE")
+ s_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+ "SIMD_INST_RETIRED.PACKED_SINGLE")
+
+ flop = s_64 + d_64 + 4 * s_128
+
+ d_128 = None
+ s_256 = None
+ d_256 = None
+ s_512 = None
+ d_512 = None
+ try:
+ d_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE")
+ flop += 2 * d_128
+ s_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE")
+ flop += 8 * s_256
+ d_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE")
+ flop += 4 * d_256
+ s_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE")
+ flop += 16 * s_512
+ d_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE")
+ flop += 8 * d_512
+ except:
+ pass
+
+ f_assist = Event("ASSISTS.FP", "FP_ASSIST.ANY", "FP_ASSIST.S")
+ if f_assist.name in [
+ "ASSISTS.FP",
+ "FP_ASSIST.S",
+ ]:
+ f_assist += "/cmask=1/"
+
+ flop_r = d_ratio(flop, interval_sec)
+ flop_c = d_ratio(flop, cyc)
+ nmi_constraint = MetricConstraint.GROUPED_EVENTS
+ if f_assist.name == "ASSISTS.FP": # Icelake+
+ nmi_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+ def FpuMetrics(group: str, fl: Optional[Event], mult: int, desc: str) -> Optional[MetricGroup]:
+ if not fl:
+ return None
+
+ f = fl * mult
+ fl_r = d_ratio(f, interval_sec)
+ r_s = d_ratio(fl, interval_sec)
+ return MetricGroup(group, [
+ Metric(f"{group}_of_total", desc + " floating point operations per second",
+ d_ratio(f, flop), "100%"),
+ Metric(f"{group}_flops", desc + " floating point operations per second",
+ fl_r, "flops/s"),
+ Metric(f"{group}_ops", desc + " operations per second",
+ r_s, "ops/s"),
+ ])
+
+ return MetricGroup("fpu", [
+ MetricGroup("fpu_total", [
+ Metric("fpu_total_flops", "Floating point operations per second",
+ flop_r, "flops/s"),
+ Metric("fpu_total_flopc", "Floating point operations per cycle",
+ flop_c, "flops/cycle", constraint=nmi_constraint),
+ ]),
+ MetricGroup("fpu_64", [
+ FpuMetrics("fpu_64_single", s_64, 1, "64-bit single"),
+ FpuMetrics("fpu_64_double", d_64, 1, "64-bit double"),
+ ]),
+ MetricGroup("fpu_128", [
+ FpuMetrics("fpu_128_single", s_128, 4, "128-bit packed single"),
+ FpuMetrics("fpu_128_double", d_128, 2, "128-bit packed double"),
+ ]),
+ MetricGroup("fpu_256", [
+ FpuMetrics("fpu_256_single", s_256, 8, "128-bit packed single"),
+ FpuMetrics("fpu_256_double", d_256, 4, "128-bit packed double"),
+ ]),
+ MetricGroup("fpu_512", [
+ FpuMetrics("fpu_512_single", s_512, 16, "128-bit packed single"),
+ FpuMetrics("fpu_512_double", d_512, 8, "128-bit packed double"),
+ ]),
+ Metric("fpu_assists", "FP assists as a percentage of cycles",
+ d_ratio(f_assist, cyc), "100%"),
+ ])
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -687,6 +776,7 @@ all_metrics = MetricGroup("", [
Tsx(),
IntelBr(),
IntelCtxSw(),
+ IntelFpu(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:31

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 20/20] perf jevents: Add upi_bw metric for Intel

Break down UPI read and write bandwidth using uncore_upi counters.
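The 64/9 scale used here follows the uncore manual's rule that data bandwidth is .ALL_DATA / 9 * 64B, i.e. nine ALL_DATA flits carry one 64-byte cache line of payload. A minimal sketch of the conversion, with an invented flit rate:

```python
def upi_data_mb_per_s(flits_per_sec):
    # 9 ALL_DATA flits per 64B cache line of payload; 1e6 bytes per MB.
    return flits_per_sec * (64 / 9) / 1_000_000

# 9M flits/s is exactly one million cache lines/s = 64 MB/s of payload:
print(round(upi_data_mb_per_s(9_000_000), 3))  # 64.0
```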

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index cdeb58e17c5e..219541a30450 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1006,6 +1006,27 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description = "Memory Bandwidth")


+def UncoreUpiBw() -> Optional[MetricGroup]:
+ try:
+ upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
+ upi_wrs = Event("UNC_UPI_TxL_FLITS.ALL_DATA")
+ except:
+ return None
+
+ upi_total = upi_rds + upi_wrs
+
+ # From "Uncore Performance Monitoring": When measuring the amount of
+ # bandwidth consumed by transmission of the data (i.e. NOT including
+ # the header), it should be .ALL_DATA / 9 * 64B.
+ scale = (64 / 9) / 1_000_000
+ return MetricGroup("upi_bw", [
+ Metric("upi_bw_read", "UPI read bandwidth",
+ d_ratio(upi_rds, interval_sec), f"{scale}MB/s"),
+ Metric("upi_bw_write", "DDR memory write bandwidth",
+ d_ratio(upi_wrs, interval_sec), f"{scale}MB/s"),
+ ], description = "UPI Bandwidth")
+
+
all_metrics = MetricGroup("", [
Cycles(),
Idle(),
@@ -1026,6 +1047,7 @@ all_metrics = MetricGroup("", [
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreUpiBw(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:25:38

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 16/20] perf jevents: Add local/remote "mem" breakdown metrics for Intel

Break down local and remote memory bandwidth, reads and writes. The
implementation uses the HA and CHA PMUs present in server models
broadwellde, broadwellx, cascadelakex, emeraldrapids, haswellx,
icelakex, ivytown, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 27 ++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 8d02be83b491..82fd23cf5500 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,32 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreMem() -> Optional[MetricGroup]:
+ try:
+ loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL", "UNC_H_REQUESTS.READS_LOCAL")
+ rem_rds = Event("UNC_CHA_REQUESTS.READS_REMOTE", "UNC_H_REQUESTS.READS_REMOTE")
+ loc_wrs = Event("UNC_CHA_REQUESTS.WRITES_LOCAL", "UNC_H_REQUESTS.WRITES_LOCAL")
+ rem_wrs = Event("UNC_CHA_REQUESTS.WRITES_REMOTE", "UNC_H_REQUESTS.WRITES_REMOTE")
+ except:
+ return None
+
+ scale = 64 / 1_000_000
+ return MetricGroup("mem", [
+ MetricGroup("mem_local", [
+ Metric("mem_local_read", "Local memory read bandwidth not including directory updates",
+ d_ratio(loc_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_local_write", "Local memory write bandwidth not including directory updates",
+ d_ratio(loc_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ MetricGroup("mem_remote", [
+ Metric("mem_remote_read", "Remote memory read bandwidth not including directory updates",
+ d_ratio(rem_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_remote_write", "Remote memory write bandwidth not including directory updates",
+ d_ratio(rem_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ ], description = "Memory Bandwidth breakdown local vs. remote (remote requests in). directory updates not included")
+
+
def UncoreMemBw() -> Optional[MetricGroup]:
mem_events = []
try:
@@ -876,6 +902,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMem(),
UncoreMemBw(),
])

--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:26:42

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 19/20] perf jevents: Add local/remote miss latency metrics for Intel

Derive the average miss latency from CBOX/CHA occupancy and inserts
counters, as described in Intel's uncore performance monitoring reference.
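One way to read the latency formula in this patch is as Little's law applied per CHA tick: average outstanding misses (occupancy / clockticks) divided by the miss arrival rate (inserts / wall time) gives the average residency time, and the 1e9 converts seconds to nanoseconds. A minimal sketch with invented counter values:

```python
def avg_miss_latency_ns(occupancy, inserts, ticks, interval_sec):
    # Little's law: L = lambda * W  =>  W = L / lambda, with
    # L = occupancy / ticks (average outstanding misses) and
    # lambda = inserts / interval_sec (miss arrival rate).
    return interval_sec * 1e9 * occupancy / (ticks * inserts)

# 2e9 CHA ticks over 1s, 10M inserts, average 4 misses outstanding:
print(avg_miss_latency_ns(occupancy=8e9, inserts=10e6,
                          ticks=2e9, interval_sec=1.0))  # 400.0
```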

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 59 ++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 1b9f7cd3b789..cdeb58e17c5e 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -617,6 +617,64 @@ def IntelL2() -> Optional[MetricGroup]:
], description = "L2 data cache analysis")


+def IntelMissLat() -> Optional[MetricGroup]:
+ try:
+ ticks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ data_rd_loc_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.MISS_OPCODE")
+ data_rd_loc_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_INSERTS.MISS_OPCODE")
+ data_rd_rem_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.NID_MISS_OPCODE")
+ data_rd_rem_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_INSERTS.NID_MISS_OPCODE")
+ except:
+ return None
+
+ if (data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE" or
+ data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_OPCODE"):
+ data_rd = 0x182
+ for e in [data_rd_loc_occ, data_rd_loc_ins, data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += f"/filter_opc={hex(data_rd)}/"
+ elif data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS":
+ # Demand Data Read - Full cache-line read requests from core for
+ # lines to be cached in S or E, typically for data
+ demand_data_rd = 0x202
+ # LLC Prefetch Data - Uncore will first look up the line in the
+ # LLC; for a cache hit, the LRU will be updated, on a miss, the
+ # DRd will be initiated
+ llc_prefetch_data = 0x25a
+ local_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_loc,filter_nm,filter_not_nm/")
+ remote_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_rem,filter_nm,filter_not_nm/")
+ for e in [data_rd_loc_occ, data_rd_loc_ins]:
+ e.name += local_filter
+ for e in [data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += remote_filter
+ else:
+ assert data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL", data_rd_loc_occ
+
+ loc_lat = interval_sec * 1e9 * data_rd_loc_occ / (ticks * data_rd_loc_ins)
+ rem_lat = interval_sec * 1e9 * data_rd_rem_occ / (ticks * data_rd_rem_ins)
+ return MetricGroup("miss_lat", [
+ Metric("miss_lat_loc", "Local to a socket miss latency in nanoseconds",
+ loc_lat, "ns"),
+ Metric("miss_lat_rem", "Remote to a socket miss latency in nanoseconds",
+ rem_lat, "ns"),
+ ])
+
+
def IntelMlp() -> Optional[Metric]:
try:
l1d = Event("L1D_PEND_MISS.PENDING")
@@ -960,6 +1018,7 @@ all_metrics = MetricGroup("", [
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMissLat(),
IntelMlp(),
IntelPorts(),
IntelSwpf(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 21:13:32

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Allow duplicated metric to be dropped from json files.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 46866a25b166..20c25d142f24 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -2,7 +2,7 @@
> # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
> JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
> - MetricGroup, Select)
> + MetricGroup, MetricRef, Select)
> import argparse
> import json
> import math
> @@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
> description="Processor socket power consumption estimates")
>
>
> +def Smi() -> MetricGroup:
> + aperf = Event('msr/aperf/')

There is CPUID enumeration for aperf and mperf. I believe they should
always be available on newer bare metal, but they may not be enumerated
in a virtualization environment. Should we add a has_event() check
before using them?

Thanks,
Kan

> + cycles = Event('cycles')
> + smi_num = Event('msr/smi/')
> + smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
> + return MetricGroup('smi', [
> + Metric('smi_num', 'Number of SMI interrupts.',
> + smi_num, 'SMI#'),
> + # Note, the smi_cycles "Event" is really a reference to the metric.
> + Metric('smi_cycles',
> + 'Percentage of cycles spent in System Management Interrupts.',
> + smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
> + ])
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> + Smi(),
> ])
>
> if args.metricgroups:

2024-02-29 21:16:09

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Allow duplicated metric to be dropped from json files.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> 1 file changed, 51 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 20c25d142f24..1096accea2aa 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -7,6 +7,7 @@ import argparse
> import json
> import math
> import os
> +from typing import Optional
>
> parser = argparse.ArgumentParser(description="Intel perf json generator")
> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> ])
>
>
> +def Tsx() -> Optional[MetricGroup]:
> + if args.model not in [
> + 'alderlake',
> + 'cascadelakex',
> + 'icelake',
> + 'icelakex',
> + 'rocketlake',
> + 'sapphirerapids',
> + 'skylake',
> + 'skylakex',
> + 'tigerlake',
> + ]:

Can we get rid of the model list? Otherwise, we have to keep updating
the list.

> + return None
> +
> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"

Is it possible to change the check to the existence of the "cpu" PMU
here? has_pmu("cpu") ? "cpu" : "cpu_core"

> + cycles = Event('cycles')
> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> + transaction_start = Event(f'{pmu}/tx\-start/')
> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> + metrics = [
> + Metric('tsx_transactional_cycles',
> + 'Percentage of cycles within a transaction region.',
> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> + '100%'),
> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> + has_event(cycles_in_tx),
> + 0),
> + '100%'),
> + Metric('tsx_cycles_per_transaction',
> + 'Number of cycles within a transaction divided by the number of transactions.',
> + Select(cycles_in_tx / transaction_start,
> + has_event(cycles_in_tx),
> + 0),
> + "cycles / transaction"),
> + ]
> + if args.model != 'sapphirerapids':

Add the "tsx_cycles_per_elision" metric only if
has_event(f'{pmu}/el\-start/')?

Thanks,
Kan

> + elision_start = Event(f'{pmu}/el\-start/')
> + metrics += [
> + Metric('tsx_cycles_per_elision',
> + 'Number of cycles within a transaction divided by the number of elisions.',
> + Select(cycles_in_tx / elision_start,
> + has_event(elision_start),
> + 0),
> + "cycles / elision"),
> + ]
> + return MetricGroup('transaction', metrics)
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> Smi(),
> + Tsx(),
> ])
>
> if args.metricgroups:

2024-02-29 22:24:45

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> The br metric group for branches itself comprises metric groups for
> total, taken, conditional, fused and far metric groups using json
> events. Condtional taken and not taken metrics are specific to Icelake
> and later generations, so a model to generation look up is added.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
> 1 file changed, 139 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 1096accea2aa..bee5da19d19d 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -19,6 +19,7 @@ LoadEvents(directory)
>
> interval_sec = Event("duration_time")
>
> +

Unnecessary empty line.

Thanks,
Kan

> def Idle() -> Metric:
> cyc = Event("msr/mperf/")
> tsc = Event("msr/tsc/")
> @@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
> return MetricGroup('transaction', metrics)
>
>
> +def IntelBr():
> + ins = Event("instructions")
> +
> + def Total() -> MetricGroup:
> + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> + br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
> + "BR_INST_RETIRED.MISPRED",
> + "BR_MISP_EXEC.ANY")
> + br_clr = None
> + try:
> + br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
> + except:
> + pass
> +
> + br_r = d_ratio(br_all, interval_sec)
> + ins_r = d_ratio(ins, br_all)
> + misp_r = d_ratio(br_m_all, br_all)
> + clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
> +
> + return MetricGroup("br_total", [
> + Metric("br_total_retired",
> + "The number of branch instructions retired per second.", br_r,
> + "insn/s"),
> + Metric(
> + "br_total_mispred",
> + "The number of branch instructions retired, of any type, that were "
> + "not correctly predicted as a percentage of all branch instrucions.",
> + misp_r, "100%"),
> + Metric("br_total_insn_between_branches",
> + "The number of instructions divided by the number of branches.",
> + ins_r, "insn"),
> + Metric("br_total_insn_fe_resteers",
> + "The number of resync branches per second.", clr_r, "req/s"
> + ) if clr_r else None
> + ])
> +
> + def Taken() -> MetricGroup:
> + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> + br_m_tk = None
> + try:
> + br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
> + "BR_MISP_RETIRED.TAKEN_JCC",
> + "BR_INST_RETIRED.MISPRED_TAKEN")
> + except:
> + pass
> + br_r = d_ratio(br_all, interval_sec)
> + ins_r = d_ratio(ins, br_all)
> + misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
> + return MetricGroup("br_taken", [
> + Metric("br_taken_retired",
> + "The number of taken branches that were retired per second.",
> + br_r, "insn/s"),
> + Metric(
> + "br_taken_mispred",
> + "The number of retired taken branch instructions that were "
> + "mispredicted as a percentage of all taken branches.", misp_r,
> + "100%") if misp_r else None,
> + Metric(
> + "br_taken_insn_between_branches",
> + "The number of instructions divided by the number of taken branches.",
> + ins_r, "insn"),
> + ])
> +
> + def Conditional() -> Optional[MetricGroup]:
> + try:
> + br_cond = Event("BR_INST_RETIRED.COND",
> + "BR_INST_RETIRED.CONDITIONAL",
> + "BR_INST_RETIRED.TAKEN_JCC")
> + br_m_cond = Event("BR_MISP_RETIRED.COND",
> + "BR_MISP_RETIRED.CONDITIONAL",
> + "BR_MISP_RETIRED.TAKEN_JCC")
> + except:
> + return None
> +
> + br_cond_nt = None
> + br_m_cond_nt = None
> + try:
> + br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
> + br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
> + except:
> + pass
> + br_r = d_ratio(br_cond, interval_sec)
> + ins_r = d_ratio(ins, br_cond)
> + misp_r = d_ratio(br_m_cond, br_cond)
> + taken_metrics = [
> + Metric("br_cond_retired", "Retired conditional branch instructions.",
> + br_r, "insn/s"),
> + Metric("br_cond_insn_between_branches",
> + "The number of instructions divided by the number of conditional "
> + "branches.", ins_r, "insn"),
> + Metric("br_cond_mispred",
> + "Retired conditional branch instructions mispredicted as a "
> + "percentage of all conditional branches.", misp_r, "100%"),
> + ]
> + if not br_m_cond_nt:
> + return MetricGroup("br_cond", taken_metrics)
> +
> + br_r = d_ratio(br_cond_nt, interval_sec)
> + ins_r = d_ratio(ins, br_cond_nt)
> + misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
> +
> + not_taken_metrics = [
> + Metric("br_cond_retired", "Retired conditional not taken branch instructions.",
> + br_r, "insn/s"),
> + Metric("br_cond_insn_between_branches",
> + "The number of instructions divided by the number of not taken conditional "
> + "branches.", ins_r, "insn"),
> + Metric("br_cond_mispred",
> + "Retired not taken conditional branch instructions mispredicted as a "
> + "percentage of all not taken conditional branches.", misp_r, "100%"),
> + ]
> + return MetricGroup("br_cond", [
> + MetricGroup("br_cond_nt", not_taken_metrics),
> + MetricGroup("br_cond_tkn", taken_metrics),
> + ])
> +
> + def Far() -> Optional[MetricGroup]:
> + try:
> + br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
> + except:
> + return None
> +
> + br_r = d_ratio(br_far, interval_sec)
> + ins_r = d_ratio(ins, br_far)
> + return MetricGroup("br_far", [
> + Metric("br_far_retired", "Retired far control transfers per second.",
> + br_r, "insn/s"),
> + Metric(
> + "br_far_insn_between_branches",
> + "The number of instructions divided by the number of far branches.",
> + ins_r, "insn"),
> + ])
> +
> + return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
> + description="breakdown of retired branch instructions")
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> Smi(),
> Tsx(),
> + IntelBr(),
> ])
>
> if args.metricgroups:

2024-03-01 00:35:19

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Breakdown cycles to user, kernel and guest.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index dae44d296861..fef40969a4b8 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
> smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
>
>
> +def Cycles() -> MetricGroup:
> + cyc_k = Event("cycles:kHh")
> + cyc_g = Event("cycles:G")
> + cyc_u = Event("cycles:uH")
> + cyc = cyc_k + cyc_g + cyc_u
> +
> + return MetricGroup("cycles", [
> + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
> + Metric("cycles_user", "User cycles as a percentage of all cycles",
> + d_ratio(cyc_u, cyc), "100%"),
> + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
> + d_ratio(cyc_k, cyc), "100%"),
> + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
> + d_ratio(cyc_g, cyc), "100%"),
> > + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
> +
> +
> def Idle() -> Metric:
> cyc = Event("msr/mperf/")
> tsc = Event("msr/tsc/")
> @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
>
>
> all_metrics = MetricGroup("", [
> + Cycles(),

The metric group seems exactly the same on AMD and ARM. Maybe we can have
tools/perf/pmu-events/common_metrics.py for all the common metrics.

Thanks,
Kan

> Idle(),
> Rapl(),
> Smi(),

2024-03-01 00:48:38

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel

On Thu, Feb 29, 2024 at 1:30 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Breakdown cycles to user, kernel and guest.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index dae44d296861..fef40969a4b8 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
> > smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
> >
> >
> > +def Cycles() -> MetricGroup:
> > + cyc_k = Event("cycles:kHh")
> > + cyc_g = Event("cycles:G")
> > + cyc_u = Event("cycles:uH")
> > + cyc = cyc_k + cyc_g + cyc_u
> > +
> > + return MetricGroup("cycles", [
> > + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
> > + Metric("cycles_user", "User cycles as a percentage of all cycles",
> > + d_ratio(cyc_u, cyc), "100%"),
> > + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
> > + d_ratio(cyc_k, cyc), "100%"),
> > + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
> > + d_ratio(cyc_g, cyc), "100%"),
> > + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
> > +
> > +
> > def Idle() -> Metric:
> > cyc = Event("msr/mperf/")
> > tsc = Event("msr/tsc/")
> > @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
> >
> >
> > all_metrics = MetricGroup("", [
> > + Cycles(),
>
> The metric group seems exactly the same on AMD and ARM. Maybe we can have
> tools/perf/pmu-events/common_metrics.py for all the common metrics.

Agreed. I think we can drop cycles in the three sets and then
do the common_metrics.py as a follow up.
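
For illustration, here is a rough, self-contained sketch of what a shared
Cycles() in a common_metrics.py might look like. The Event/Metric/d_ratio
stand-ins below are simplified string-building stubs, not the real helpers
from tools/perf/pmu-events/metric.py:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Stand-in for metric.py's Event; real Events build perf expressions."""
    name: str

    def __add__(self, other: "Event") -> "Event":
        return Event(f"{self.name} + {other.name}")


@dataclass
class Metric:
    """Stand-in carrying just the fields used below."""
    name: str
    description: str
    expr: str
    unit: str


def d_ratio(num: Event, den: Event) -> str:
    # metric.py's d_ratio guards against division by zero.
    return f"{num.name} / ({den.name} if {den.name} else 1)"


def Cycles() -> List[Metric]:
    cyc_k = Event("cycles:kHh")
    cyc_g = Event("cycles:G")
    cyc_u = Event("cycles:uH")
    cyc = cyc_k + cyc_g + cyc_u
    return [
        Metric("cycles_total", "Total number of cycles", cyc.name, "cycles"),
        Metric("cycles_user", "User cycles as a percentage of all cycles",
               d_ratio(cyc_u, cyc), "100%"),
        Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
               d_ratio(cyc_k, cyc), "100%"),
        Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
               d_ratio(cyc_g, cyc), "100%"),
    ]


if __name__ == "__main__":
    for m in Cycles():
        print(f"{m.name}: {m.expr} ({m.unit})")
```

The intel/amd/arm generators would then just import Cycles() and append it
to their all_metrics group.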

Thanks,
Ian

> Thanks,
> Kan
>
> > Idle(),
> > Rapl(),
> > Smi(),

2024-03-01 00:54:28

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models

On Thu, Feb 29, 2024 at 1:09 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Allow duplicated metric to be dropped from json files.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
> > 1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 46866a25b166..20c25d142f24 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -2,7 +2,7 @@
> > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
> > JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
> > - MetricGroup, Select)
> > + MetricGroup, MetricRef, Select)
> > import argparse
> > import json
> > import math
> > @@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
> > description="Processor socket power consumption estimates")
> >
> >
> > +def Smi() -> MetricGroup:
> > + aperf = Event('msr/aperf/')
>
> There is CPUID enumeration for aperf and mperf. I believe they
> should always be available on newer bare metal, but they may not be
> enumerated in a virtualization env. Should we add a has_event() check
> before using it?

It would make sense to have the has_event so that the metric doesn't
fail in perf test. I'll add it.
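
Roughly the guarded form I have in mind (string-building stand-ins for
Select/has_event so the snippet runs standalone; the real helpers live in
tools/perf/pmu-events/metric.py):

```python
# Sketch: wrap the aperf-based smi_cycles in has_event() so the metric
# degrades to 0 when the msr PMU isn't enumerated (e.g. under
# virtualization). Stand-in helpers mimic metric.py's expression builders.
def has_event(event: str) -> str:
    return f"has_event({event})"


def Select(true_val: str, cond: str, false_val: str) -> str:
    return f"({true_val} if {cond} else {false_val})"


aperf = "msr/aperf/"
cycles = "cycles"
smi_num = "msr/smi/"

smi_cycles = Select(
    Select(f"({aperf} - {cycles}) / {aperf}", f"{smi_num} > 0", "0"),
    has_event(aperf),  # only use aperf when the msr PMU exposes it
    "0")

print(smi_cycles)
```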

Thanks,
Ian

> Thanks,
> Kan
>
> > + cycles = Event('cycles')
> > + smi_num = Event('msr/smi/')
> > + smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
> > + return MetricGroup('smi', [
> > + Metric('smi_num', 'Number of SMI interrupts.',
> > + smi_num, 'SMI#'),
> > + # Note, the smi_cycles "Event" is really a reference to the metric.
> > + Metric('smi_cycles',
> > + 'Percentage of cycles spent in System Management Interrupts.',
> > + smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
> > + ])
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > + Smi(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 01:01:56

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Allow duplicated metric to be dropped from json files.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> > 1 file changed, 51 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 20c25d142f24..1096accea2aa 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -7,6 +7,7 @@ import argparse
> > import json
> > import math
> > import os
> > +from typing import Optional
> >
> > parser = argparse.ArgumentParser(description="Intel perf json generator")
> > parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> > @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> > ])
> >
> >
> > +def Tsx() -> Optional[MetricGroup]:
> > + if args.model not in [
> > + 'alderlake',
> > + 'cascadelakex',
> > + 'icelake',
> > + 'icelakex',
> > + 'rocketlake',
> > + 'sapphirerapids',
> > + 'skylake',
> > + 'skylakex',
> > + 'tigerlake',
> > + ]:
>
> Can we get rid of the model list? Otherwise, we have to keep updating
> the list.

Do we expect the list to update? :-) The issue is the events are in
sysfs and not the json. If we added the tsx events to json then this
list wouldn't be necessary, but it also would mean the events would be
present in "perf list" even when TSX is disabled.

> > + return None
> > +
> > + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>
> Is it possible to change the check to the existence of the "cpu" PMU
> here? has_pmu("cpu") ? "cpu" : "cpu_core"

The "Unit" on "cpu" events in json is always just blank. On hybrid it is
either "cpu_core" or "cpu_atom", so I can make this something like:

pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"

which would be a build time test.
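
A sketch of how that build-time check could work, assuming the generator can
see the model's event json; the directory layout and the check_pmu/pick_pmu
names below are illustrative, not the real API:

```python
# Sketch: decide between "cpu" and "cpu_core" at jevents build time by
# scanning the model's event json for a matching "Unit" field (hybrid
# models tag events with cpu_core/cpu_atom; non-hybrid leave it blank).
import glob
import json


def check_pmu(events_dir: str, pmu: str) -> bool:
    for path in glob.glob(f"{events_dir}/*.json"):
        with open(path) as f:
            events = json.load(f)
        for ev in events:
            if isinstance(ev, dict) and ev.get("Unit") == pmu:
                return True
    return False


def pick_pmu(events_dir: str) -> str:
    return "cpu_core" if check_pmu(events_dir, "cpu_core") else "cpu"
```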


> > + cycles = Event('cycles')
> > + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> > + transaction_start = Event(f'{pmu}/tx\-start/')
> > + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> > + metrics = [
> > + Metric('tsx_transactional_cycles',
> > + 'Percentage of cycles within a transaction region.',
> > + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> > + '100%'),
> > + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> > + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> > + has_event(cycles_in_tx),
> > + 0),
> > + '100%'),
> > + Metric('tsx_cycles_per_transaction',
> > + 'Number of cycles within a transaction divided by the number of transactions.',
> > + Select(cycles_in_tx / transaction_start,
> > + has_event(cycles_in_tx),
> > + 0),
> > + "cycles / transaction"),
> > + ]
> > + if args.model != 'sapphirerapids':
>
> Add the "tsx_cycles_per_elision" metric only if
> has_event(f'{pmu}/el\-start/')?

It's a sysfs event, so this wouldn't work :-(

Thanks,
Ian

> Thanks,
> Kan
>
> > + elision_start = Event(f'{pmu}/el\-start/')
> > + metrics += [
> > + Metric('tsx_cycles_per_elision',
> > + 'Number of cycles within a transaction divided by the number of elisions.',
> > + Select(cycles_in_tx / elision_start,
> > + has_event(elision_start),
> > + 0),
> > + "cycles / elision"),
> > + ]
> > + return MetricGroup('transaction', metrics)
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > Smi(),
> > + Tsx(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 01:02:47

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel

On Thu, Feb 29, 2024 at 1:17 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > The br metric group for branches itself comprises metric groups for
> > total, taken, conditional, fused and far metric groups using json
> > events. Conditional taken and not taken metrics are specific to Icelake
> > and later generations, so a model to generation look up is added.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
> > 1 file changed, 139 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 1096accea2aa..bee5da19d19d 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -19,6 +19,7 @@ LoadEvents(directory)
> >
> > interval_sec = Event("duration_time")
> >
> > +
>
> Unnecessary empty line.

Ack. Will fix in v2.

Thanks,
Ian

> Thanks,
> Kan
>
> > def Idle() -> Metric:
> > cyc = Event("msr/mperf/")
> > tsc = Event("msr/tsc/")
> > @@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
> > return MetricGroup('transaction', metrics)
> >
> >
> > +def IntelBr():
> > + ins = Event("instructions")
> > +
> > + def Total() -> MetricGroup:
> > + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> > + br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
> > + "BR_INST_RETIRED.MISPRED",
> > + "BR_MISP_EXEC.ANY")
> > + br_clr = None
> > + try:
> > + br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
> > + except:
> > + pass
> > +
> > + br_r = d_ratio(br_all, interval_sec)
> > + ins_r = d_ratio(ins, br_all)
> > + misp_r = d_ratio(br_m_all, br_all)
> > + clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
> > +
> > + return MetricGroup("br_total", [
> > + Metric("br_total_retired",
> > + "The number of branch instructions retired per second.", br_r,
> > + "insn/s"),
> > + Metric(
> > + "br_total_mispred",
> > + "The number of branch instructions retired, of any type, that were "
> > + "not correctly predicted as a percentage of all branch instructions.",
> > + misp_r, "100%"),
> > + Metric("br_total_insn_between_branches",
> > + "The number of instructions divided by the number of branches.",
> > + ins_r, "insn"),
> > + Metric("br_total_insn_fe_resteers",
> > + "The number of resync branches per second.", clr_r, "req/s"
> > + ) if clr_r else None
> > + ])
> > +
> > + def Taken() -> MetricGroup:
> > + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> > + br_m_tk = None
> > + try:
> > + br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
> > + "BR_MISP_RETIRED.TAKEN_JCC",
> > + "BR_INST_RETIRED.MISPRED_TAKEN")
> > + except:
> > + pass
> > + br_r = d_ratio(br_all, interval_sec)
> > + ins_r = d_ratio(ins, br_all)
> > + misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
> > + return MetricGroup("br_taken", [
> > + Metric("br_taken_retired",
> > + "The number of taken branches that were retired per second.",
> > + br_r, "insn/s"),
> > + Metric(
> > + "br_taken_mispred",
> > + "The number of retired taken branch instructions that were "
> > + "mispredicted as a percentage of all taken branches.", misp_r,
> > + "100%") if misp_r else None,
> > + Metric(
> > + "br_taken_insn_between_branches",
> > + "The number of instructions divided by the number of taken branches.",
> > + ins_r, "insn"),
> > + ])
> > +
> > + def Conditional() -> Optional[MetricGroup]:
> > + try:
> > + br_cond = Event("BR_INST_RETIRED.COND",
> > + "BR_INST_RETIRED.CONDITIONAL",
> > + "BR_INST_RETIRED.TAKEN_JCC")
> > + br_m_cond = Event("BR_MISP_RETIRED.COND",
> > + "BR_MISP_RETIRED.CONDITIONAL",
> > + "BR_MISP_RETIRED.TAKEN_JCC")
> > + except:
> > + return None
> > +
> > + br_cond_nt = None
> > + br_m_cond_nt = None
> > + try:
> > + br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
> > + br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
> > + except:
> > + pass
> > + br_r = d_ratio(br_cond, interval_sec)
> > + ins_r = d_ratio(ins, br_cond)
> > + misp_r = d_ratio(br_m_cond, br_cond)
> > + taken_metrics = [
> > + Metric("br_cond_retired", "Retired conditional branch instructions.",
> > + br_r, "insn/s"),
> > + Metric("br_cond_insn_between_branches",
> > + "The number of instructions divided by the number of conditional "
> > + "branches.", ins_r, "insn"),
> > + Metric("br_cond_mispred",
> > + "Retired conditional branch instructions mispredicted as a "
> > + "percentage of all conditional branches.", misp_r, "100%"),
> > + ]
> > + if not br_m_cond_nt:
> > + return MetricGroup("br_cond", taken_metrics)
> > +
> > + br_r = d_ratio(br_cond_nt, interval_sec)
> > + ins_r = d_ratio(ins, br_cond_nt)
> > + misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
> > +
> > + not_taken_metrics = [
> > + Metric("br_cond_retired", "Retired conditional not taken branch instructions.",
> > + br_r, "insn/s"),
> > + Metric("br_cond_insn_between_branches",
> > + "The number of instructions divided by the number of not taken conditional "
> > + "branches.", ins_r, "insn"),
> > + Metric("br_cond_mispred",
> > + "Retired not taken conditional branch instructions mispredicted as a "
> > + "percentage of all not taken conditional branches.", misp_r, "100%"),
> > + ]
> > + return MetricGroup("br_cond", [
> > + MetricGroup("br_cond_nt", not_taken_metrics),
> > + MetricGroup("br_cond_tkn", taken_metrics),
> > + ])
> > +
> > + def Far() -> Optional[MetricGroup]:
> > + try:
> > + br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
> > + except:
> > + return None
> > +
> > + br_r = d_ratio(br_far, interval_sec)
> > + ins_r = d_ratio(ins, br_far)
> > + return MetricGroup("br_far", [
> > + Metric("br_far_retired", "Retired far control transfers per second.",
> > + br_r, "insn/s"),
> > + Metric(
> > + "br_far_insn_between_branches",
> > + "The number of instructions divided by the number of far branches.",
> > + ins_r, "insn"),
> > + ])
> > +
> > + return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
> > + description="breakdown of retired branch instructions")
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > Smi(),
> > Tsx(),
> > + IntelBr(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 13:53:56

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel



On 2024-02-29 7:48 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:30 PM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Breakdown cycles to user, kernel and guest.
>>>
>>> Signed-off-by: Ian Rogers <[email protected]>
>>> ---
>>> tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
>>> 1 file changed, 18 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index dae44d296861..fef40969a4b8 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
>>> smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
>>>
>>>
>>> +def Cycles() -> MetricGroup:
>>> + cyc_k = Event("cycles:kHh")
>>> + cyc_g = Event("cycles:G")
>>> + cyc_u = Event("cycles:uH")
>>> + cyc = cyc_k + cyc_g + cyc_u
>>> +
>>> + return MetricGroup("cycles", [
>>> + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
>>> + Metric("cycles_user", "User cycles as a percentage of all cycles",
>>> + d_ratio(cyc_u, cyc), "100%"),
>>> + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
>>> + d_ratio(cyc_k, cyc), "100%"),
>>> + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
>>> + d_ratio(cyc_g, cyc), "100%"),
>>> + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
>>> +
>>> +
>>> def Idle() -> Metric:
>>> cyc = Event("msr/mperf/")
>>> tsc = Event("msr/tsc/")
>>> @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
>>>
>>>
>>> all_metrics = MetricGroup("", [
>>> + Cycles(),
>>
>> The metric group seems exactly the same on AMD and ARM. Maybe we can have
>> tools/perf/pmu-events/common_metrics.py for all the common metrics.
>
> Agreed. I think we can drop cycles in the three sets and then
> do the common_metrics.py as a follow up.
>

Sounds good to me.

Thanks,
Kan

> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>> Idle(),
>>> Rapl(),
>>> Smi(),
>

2024-03-01 14:52:43

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Allow duplicated metric to be dropped from json files.
>>>
>>> Signed-off-by: Ian Rogers <[email protected]>
>>> ---
>>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>> 1 file changed, 51 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index 20c25d142f24..1096accea2aa 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -7,6 +7,7 @@ import argparse
>>> import json
>>> import math
>>> import os
>>> +from typing import Optional
>>>
>>> parser = argparse.ArgumentParser(description="Intel perf json generator")
>>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>> ])
>>>
>>>
>>> +def Tsx() -> Optional[MetricGroup]:
>>> + if args.model not in [
>>> + 'alderlake',
>>> + 'cascadelakex',
>>> + 'icelake',
>>> + 'icelakex',
>>> + 'rocketlake',
>>> + 'sapphirerapids',
>>> + 'skylake',
>>> + 'skylakex',
> >>> + 'tigerlake',
> >>> + ]:
>>
>> Can we get rid of the model list? Otherwise, we have to keep updating
>> the list.
>
> Do we expect the list to update? :-)

Yes, at least for the meteorlake and graniterapids. They should be the
same as alderlake and sapphirerapids. I'm not sure about the future
platforms.

Maybe we can have an "if args.model in list" check here to include all the
non-hybrid models which don't support TSX. I think the list should not
need to change soon.

> The issue is the events are in
> sysfs and not the json. If we added the tsx events to json then this
> list wouldn't be necessary, but it also would mean the events would be
> present in "perf list" even when TSX is disabled.

I think there may be an alternative way: check the RTM events, e.g. the
RTM_RETIRED.START event. We only need to generate the metrics for the
platforms which support the RTM_RETIRED.START event.


>
>>> + return None
> >>> +
> >>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>
>> Is it possible to change the check to the existence of the "cpu" PMU
>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>
> The "Unit" on "cpu" events in json is always just blank. On hybrid it is
> either "cpu_core" or "cpu_atom", so I can make this something like:
>
> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>
> which would be a build time test.

Yes, I think using the "Unit" is good enough.

>
>
>>> + cycles = Event('cycles')
>>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>> + transaction_start = Event(f'{pmu}/tx\-start/')
>>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>> + metrics = [
>>> + Metric('tsx_transactional_cycles',
>>> + 'Percentage of cycles within a transaction region.',
>>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>> + '100%'),
>>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + '100%'),
>>> + Metric('tsx_cycles_per_transaction',
>>> + 'Number of cycles within a transaction divided by the number of transactions.',
>>> + Select(cycles_in_tx / transaction_start,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + "cycles / transaction"),
>>> + ]
>>> + if args.model != 'sapphirerapids':
>>
>> Add the "tsx_cycles_per_elision" metric only if
>> has_event(f'{pmu}/el\-start/')?
>
> It's a sysfs event, so this wouldn't work :-(

The below is the definition of el-start in the kernel.
EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");

The corresponding event in the event list should be HLE_RETIRED.START
"EventCode": "0xC8",
"UMask": "0x01",
"EventName": "HLE_RETIRED.START",

I think we may check the HLE_RETIRED.START instead. If the
HLE_RETIRED.START doesn't exist, I don't see a reason why the
tsx_cycles_per_elision should be supported.

Again, in the virtualization world, it's possible that the
HLE_RETIRED.START exists in the event list but el_start isn't available
in the sysfs. I think it has to be specially handled in the test as well.

Thanks,
Kan

>
> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>> + elision_start = Event(f'{pmu}/el\-start/')
>>> + metrics += [
>>> + Metric('tsx_cycles_per_elision',
>>> + 'Number of cycles within a transaction divided by the number of elisions.',
>>> + Select(cycles_in_tx / elision_start,
>>> + has_event(elision_start),
>>> + 0),
>>> + "cycles / elision"),
>>> + ]
>>> + return MetricGroup('transaction', metrics)
>>> +
>>> +
>>> all_metrics = MetricGroup("", [
>>> Idle(),
>>> Rapl(),
>>> Smi(),
>>> + Tsx(),
>>> ])
>>>
>>> if args.metricgroups:
>

2024-03-01 16:48:00

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> > On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
> >>
> >>
> >>
> >> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> >>> Allow duplicated metric to be dropped from json files.
> >>>
> >>> Signed-off-by: Ian Rogers <[email protected]>
> >>> ---
> >>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> >>> 1 file changed, 51 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> >>> index 20c25d142f24..1096accea2aa 100755
> >>> --- a/tools/perf/pmu-events/intel_metrics.py
> >>> +++ b/tools/perf/pmu-events/intel_metrics.py
> >>> @@ -7,6 +7,7 @@ import argparse
> >>> import json
> >>> import math
> >>> import os
> >>> +from typing import Optional
> >>>
> >>> parser = argparse.ArgumentParser(description="Intel perf json generator")
> >>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> >>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> >>> ])
> >>>
> >>>
> >>> +def Tsx() -> Optional[MetricGroup]:
> >>> + if args.model not in [
> >>> + 'alderlake',
> >>> + 'cascadelakex',
> >>> + 'icelake',
> >>> + 'icelakex',
> >>> + 'rocketlake',
> >>> + 'sapphirerapids',
> >>> + 'skylake',
> >>> + 'skylakex',
> >>> + 'tigerlake',
> >>> + ]:
> >>
> >> Can we get rid of the model list? Otherwise, we have to keep updating
> >> the list.
> >
> > Do we expect the list to update? :-)
>
> Yes, at least for the meteorlake and graniterapids. They should be the
> same as alderlake and sapphirerapids. I'm not sure about the future
> platforms.
>
> Maybe we can have an "if args.model in list" check here to include all the
> non-hybrid models which don't support TSX. I think the list should not
> need to change soon.
>
> > The issue is the events are in
> > sysfs and not the json. If we added the tsx events to json then this
> > list wouldn't be necessary, but it also would mean the events would be
> > present in "perf list" even when TSX is disabled.
>
> I think there may be an alternative way: check the RTM events, e.g. the
> RTM_RETIRED.START event. We only need to generate the metrics for the
> platforms which support the RTM_RETIRED.START event.
>
>
> >
> >>> + return None
> >>> +
> >>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
> >>
> >> Is it possible to change the check to the existence of the "cpu" PMU
> >> here? has_pmu("cpu") ? "cpu" : "cpu_core"
> >
> > The "Unit" on "cpu" events in json is always just blank. On hybrid it is
> > either "cpu_core" or "cpu_atom", so I can make this something like:
> >
> > pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
> >
> > which would be a build time test.
>
> Yes, I think using the "Unit" is good enough.
>
> >
> >
> >>> + cycles = Event('cycles')
> >>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> >>> + transaction_start = Event(f'{pmu}/tx\-start/')
> >>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> >>> + metrics = [
> >>> + Metric('tsx_transactional_cycles',
> >>> + 'Percentage of cycles within a transaction region.',
> >>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> >>> + '100%'),
> >>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> >>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> >>> + has_event(cycles_in_tx),
> >>> + 0),
> >>> + '100%'),
> >>> + Metric('tsx_cycles_per_transaction',
> >>> + 'Number of cycles within a transaction divided by the number of transactions.',
> >>> + Select(cycles_in_tx / transaction_start,
> >>> + has_event(cycles_in_tx),
> >>> + 0),
> >>> + "cycles / transaction"),
> >>> + ]
> >>> + if args.model != 'sapphirerapids':
> >>
> >> Add the "tsx_cycles_per_elision" metric only if
> >> has_event(f'{pmu}/el\-start/')?
> >
> > It's a sysfs event, so this wouldn't work :-(
>
> The below is the definition of el-start in the kernel.
> EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");
>
> The corresponding event in the event list should be HLE_RETIRED.START
> "EventCode": "0xC8",
> "UMask": "0x01",
> "EventName": "HLE_RETIRED.START",
>
> I think we may check the HLE_RETIRED.START instead. If the
> HLE_RETIRED.START doesn't exist, I don't see a reason why the
> tsx_cycles_per_elision should be supported.
>
> Again, in the virtualization world, it's possible that the
> HLE_RETIRED.START exists in the event list but el_start isn't available
> in the sysfs. I think it has to be specially handled in the test as well.

So we keep the has_event test on the sysfs event to handle the
virtualization and disabled case. We use HLE_RETIRED.START to detect
whether the model supports TSX. Should the event be the sysfs or json
version? i.e.

"MetricExpr": "(cycles\\-t / el\\-start if has_event(el\\-start) else 0)",

or

"MetricExpr": "(cycles\\-t / HLE_RETIRED.START if has_event(el\\-start) else 0)",

I think I favor the former for some consistency with the has_event.

Using HLE_RETIRED.START means the set of TSX models goes from:
'alderlake',
'cascadelakex',
'icelake',
'icelakex',
'rocketlake',
'sapphirerapids',
'skylake',
'skylakex',
'tigerlake',

To:
broadwell
broadwellde
broadwellx
cascadelakex
haswell
haswellx
icelake
rocketlake
skylake
skylakex

Using RTM_RETIRED.START it goes to:
broadwell
broadwellde
broadwellx
cascadelakex
emeraldrapids
graniterapids
haswell
haswellx
icelake
icelakex
rocketlake
sapphirerapids
skylake
skylakex
tigerlake

So I'm not sure it is working equivalently to what we have today,
which may be good or bad. Here is what I think the code should look
like:

def Tsx() -> Optional[MetricGroup]:
  pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
  cycles = Event('cycles')
  cycles_in_tx = Event(f'{pmu}/cycles\-t/')
  cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
  try:
    # Test if the tsx event is present in the json, prefer the
    # sysfs version so that we can detect its presence at runtime.
    transaction_start = Event("RTM_RETIRED.START")
    transaction_start = Event(f'{pmu}/tx\-start/')
  except:
    return None

  elision_start = None
  try:
    # Elision start isn't supported by all models, but we'll not
    # generate the tsx_cycles_per_elision metric in that
    # case. Again, prefer the sysfs encoding of the event.
    elision_start = Event("HLE_RETIRED.START")
    elision_start = Event(f'{pmu}/el\-start/')
  except:
    pass

  return MetricGroup('transaction', [
      Metric('tsx_transactional_cycles',
             'Percentage of cycles within a transaction region.',
             Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
             '100%'),
      Metric('tsx_aborted_cycles',
             'Percentage of cycles in aborted transactions.',
             Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
                    has_event(cycles_in_tx),
                    0),
             '100%'),
      Metric('tsx_cycles_per_transaction',
             'Number of cycles within a transaction divided by the number of transactions.',
             Select(cycles_in_tx / transaction_start,
                    has_event(cycles_in_tx),
                    0),
             "cycles / transaction"),
      Metric('tsx_cycles_per_elision',
             'Number of cycles within a transaction divided by the number of elisions.',
             Select(cycles_in_tx / elision_start,
                    has_event(elision_start),
                    0),
             "cycles / elision") if elision_start else None,
  ], description="Breakdown of transactional memory statistics")

Wdyt?

Thanks,
Ian

> Thanks,
> Kan
>
> >
> > Thanks,
> > Ian
> >
> >> Thanks,
> >> Kan
> >>
> >>> + elision_start = Event(f'{pmu}/el\-start/')
> >>> + metrics += [
> >>> + Metric('tsx_cycles_per_elision',
> >>> + 'Number of cycles within a transaction divided by the number of elisions.',
> >>> + Select(cycles_in_tx / elision_start,
> >>> + has_event(elision_start),
> >>> + 0),
> >>> + "cycles / elision"),
> >>> + ]
> >>> + return MetricGroup('transaction', metrics)
> >>> +
> >>> +
> >>> all_metrics = MetricGroup("", [
> >>> Idle(),
> >>> Rapl(),
> >>> Smi(),
> >>> + Tsx(),
> >>> ])
> >>>
> >>> if args.metricgroups:
> >

2024-03-01 17:50:09

by Andi Kleen

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

> +def Idle() -> Metric:
> + cyc = Event("msr/mperf/")
> + tsc = Event("msr/tsc/")
> + low = max(tsc - cyc, 0)
> + return Metric(
> + "idle",
> + "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
> + d_ratio(low, tsc), "100%")

TBH I fail to see the advantage over the JSON. That's much more verbose
and we don't expect to have really complex metrics anyways.

And then we have a gigantic patch kit for what gain?

The motivation was the lack of comments in JSON? We could just add some
to the parser (e.g. with /* */). And we could allow a JSON array for the
expression to get multiple lines.
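For example a multi-line expression might be encoded like this (purely
hypothetical, the current parser supports none of it, and the metric
name is made up):

```json
{
  "MetricName": "example_metric",  /* comments would need parser support */
  "MetricExpr": [
    "(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)",
    " / CPU_CLK_UNHALTED.THREAD"
  ]
}
```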


-Andi

2024-03-01 18:18:31

by Ian Rogers

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

On Fri, Mar 1, 2024 at 9:49 AM Andi Kleen <[email protected]> wrote:
>
> > +def Idle() -> Metric:
> > + cyc = Event("msr/mperf/")
> > + tsc = Event("msr/tsc/")
> > + low = max(tsc - cyc, 0)
> > + return Metric(
> > + "idle",
> > + "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
> > + d_ratio(low, tsc), "100%")
>
> TBH I fail to see the advantage over the JSON. That's much more verbose
> and we don't expect to have really complex metrics anyways.

Are you saying this is more verbose or the json? Here is an example of
a json metric:

https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json?h=perf-tools-next#n652
```
{
"BriefDescription": "Probability of Core Bound bottleneck
hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - tma_core_bound /
(((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound *
RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD *
(CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) /
CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD +
(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring *
EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if
ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL -
CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL +
tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD)
if tma_core_bound < (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound
* RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD *
(CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) /
CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD +
(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring *
EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if
ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL -
CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL +
tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD)
else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0)",
"MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
"MetricName": "tma_info_botlnk_core_bound_likely",
"MetricgroupNoGroup": "TopdownL1"
},
```

Even with common metrics like tma_core_bound, tma_retiring and
tma_info_system_smt_2t_utilization replacing sections of the metric, I
think anyone has to admit the expression is pretty unintelligible
because of its size/verbosity. Just understanding the metric would, as
a first step, involve adding newlines. Comments would be nice, etc.
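As a toy illustration of that point (plain python, with made-up numbers
standing in for event counts, nothing measured):

```python
# Made-up numbers standing in for event counts; only the naming matters.
stalls_total = 900.0    # e.g. CYCLE_ACTIVITY.STALLS_TOTAL
stalls_mem_any = 300.0  # e.g. CYCLE_ACTIVITY.STALLS_MEM_ANY
clks = 2000.0           # e.g. CPU_CLK_UNHALTED.THREAD

# How a flat MetricExpr string reads, everything inline:
flat = (stalls_total - stalls_mem_any) / clks

# The same computation with a named, commented sub-expression:
core_stalls = stalls_total - stalls_mem_any  # stalls not waiting on memory
named = core_stalls / clks

assert flat == named
print(named)  # 0.3
```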

> And then we have a gigantic patch kit for what gain?

I see some of the gains as:
- metrics that are human intelligible,
- metrics for models that are no longer being updated,
- removing copy-paste of metrics like tsx and smi across each model's
metric json (less lines-of-code),
- validation of events in a metric expression being in the event json
for a model,
- removal of forward porting metrics to a new model if the event
names of the new model line up with those of previous,
- in this patch kit there are metrics added that don't currently
exist (more metrics should be better for users - yes there can always
be bugs).

I also hope the metric grouping is clearer, etc, etc.

> The motivation was the lack of comments in JSON? We could just add some
> to the parser (e.g. with /* */). And we could allow a JSON array for the
> expression to get multiple lines.

Imo, a non-json variant of json would just be taking on a tech debt
burden for something that is inferior to this approach and a wasted
effort. We already generate the json from other more intelligible
sources using python - so I don't find the approach controversial:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

The goal here has been to make a bunch of inhouse metrics public. It
also gives a foundation for vendors and other concerned people to add
metrics in a concise, documented and safe (broken events cause
compile-time failures) way. There are some similar things like common
events/metrics on ARM:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/arm64/arm/cmn/sys?h=perf-tools-next
but this lacks the structure, validation, documentation, etc. that's
present here so my preference would be for more of the common things
to be done in the python way started here.

Thanks,
Ian

> -Andi

2024-03-01 21:34:13

by Andi Kleen

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

>
> I see some of the gains as:
> - metrics that are human intelligible,
> - metrics for models that are no longer being updated,
> - removing copy-paste of metrics like tsx and smi across each model's
> metric json (less lines-of-code),
> - validation of events in a metric expression being in the event json
> for a model,
> - removal of forward porting metrics to a new model if the event
> names of the new model line up with those of previous,
> - in this patch kit there are metrics added that don't currently
> exist (more metrics should be better for users - yes there can always
> be bugs).

But then we have two ways to do things, and we already have a lot
of problems with regressions from complexity and a growing
bug backlog that nobody fixes.

Multiple ways to do basic operations seem just a recipe for
more and more fragmentation and similar problems.

The JSON format is certainly not perfect and has its share
of issues, but at least it's a standard now that is supported
by many vendors and creating new standards just because
you don't like some minor aspects doesn't seem like
a good approach. I'm sure the next person will come around
who wants Ruby metrics and the third would prefer to write
them in Rust. Who knows where it will stop.

Also in my experience this python stuff is unreliable because
half the people who build perf forget to install the python
libraries. Json at least works always.

Incremental improvements are usually the way to do these
things.

-Andi

2024-03-01 22:32:24

by Liang, Kan

Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-03-01 11:37 a.m., Ian Rogers wrote:
> On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
>>> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>>>> Allow duplicated metric to be dropped from json files.
>>>>>
>>>>> Signed-off-by: Ian Rogers <[email protected]>
>>>>> ---
>>>>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>>>> 1 file changed, 51 insertions(+)
>>>>>
>>>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>>>> index 20c25d142f24..1096accea2aa 100755
>>>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>>>> @@ -7,6 +7,7 @@ import argparse
>>>>> import json
>>>>> import math
>>>>> import os
>>>>> +from typing import Optional
>>>>>
>>>>> parser = argparse.ArgumentParser(description="Intel perf json generator")
>>>>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>>>> ])
>>>>>
>>>>>
>>>>> +def Tsx() -> Optional[MetricGroup]:
>>>>> + if args.model not in [
>>>>> + 'alderlake',
>>>>> + 'cascadelakex',
>>>>> + 'icelake',
>>>>> + 'icelakex',
>>>>> + 'rocketlake',
>>>>> + 'sapphirerapids',
>>>>> + 'skylake',
>>>>> + 'skylakex',
> >>>> + 'tigerlake',
> >>>> + ]:
>>>>
> >>>> Can we get rid of the model list? Otherwise, we have to keep updating
>>>> the list.
>>>
>>> Do we expect the list to update? :-)
>>
>> Yes, at least for the meteorlake and graniterapids. They should be the
>> same as alderlake and sapphirerapids. I'm not sure about the future
>> platforms.
>>
>> Maybe we can have a if args.model in list here to include all the
>> non-hybrid models which doesn't support TSX. I think the list should not
>> be changed shortly.
>>
>>> The issue is the events are in
>>> sysfs and not the json. If we added the tsx events to json then this
>>> list wouldn't be necessary, but it also would mean the events would be
>>> present in "perf list" even when TSX is disabled.
>>
> >> I think there may be an alternative way to check the RTM events, e.g.,
>> RTM_RETIRED.START event. We only need to generate the metrics for the
>> platform which supports the RTM_RETIRED.START event.
>>
>>
>>>
>>>>> + return None
> >>>>> +
> >>>>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>>>
>>>> Is it possible to change the check to the existence of the "cpu" PMU
>>>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>>>
>>> The "Unit" on "cpu" events in json always just blank. On hybrid it is
>>> either "cpu_core" or "cpu_atom", so I can make this something like:
>>>
>>> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>>>
>>> which would be a build time test.
>>
>> Yes, I think using the "Unit" is good enough.
>>
>>>
>>>
>>>>> + cycles = Event('cycles')
>>>>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>>>> + transaction_start = Event(f'{pmu}/tx\-start/')
>>>>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>>>> + metrics = [
>>>>> + Metric('tsx_transactional_cycles',
>>>>> + 'Percentage of cycles within a transaction region.',
>>>>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>>>> + '100%'),
>>>>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>>>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>>>> + has_event(cycles_in_tx),
>>>>> + 0),
>>>>> + '100%'),
>>>>> + Metric('tsx_cycles_per_transaction',
>>>>> + 'Number of cycles within a transaction divided by the number of transactions.',
>>>>> + Select(cycles_in_tx / transaction_start,
>>>>> + has_event(cycles_in_tx),
>>>>> + 0),
>>>>> + "cycles / transaction"),
>>>>> + ]
>>>>> + if args.model != 'sapphirerapids':
>>>>
>>>> Add the "tsx_cycles_per_elision" metric only if
>>>> has_event(f'{pmu}/el\-start/')?
>>>
>>> It's a sysfs event, so this wouldn't work :-(
>>
>> The below is the definition of el-start in the kernel.
>> EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");
>>
>> The corresponding event in the event list should be HLE_RETIRED.START
>> "EventCode": "0xC8",
>> "UMask": "0x01",
>> "EventName": "HLE_RETIRED.START",
>>
>> I think we may check the HLE_RETIRED.START instead. If the
>> HLE_RETIRED.START doesn't exist, I don't see a reason why the
>> tsx_cycles_per_elision should be supported.
>>
>> Again, in the virtualization world, it's possible that the
>> HLE_RETIRED.START exists in the event list but el_start isn't available
>> in the sysfs. I think it has to be specially handled in the test as well.
>
> So we keep the has_event test on the sysfs event to handle the
> virtualization and disabled case. We use HLE_RETIRED.START to detect
> whether the model supports TSX.

Yes. I think the JSON event list always keeps the latest status of an
event. If an event is deprecated someday, I don't think there is a
reason to keep any metrics including that event. So we should use it to
check whether to generate a metric.

The sysfs event tells if the current kernel supports the event. It
should be used to check whether a metric should be used/enabled.

> Should the event be the sysfs or json
> version? i.e.
>
> "MetricExpr": "(cycles\\-t / el\\-start if
> has_event(el\\-start) else 0)",
>
> or
>
> "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if
> has_event(el\\-start) else 0)",
>
> I think I favor the former for some consistency with the has_event.
>

Agree, the former looks good to me too.


> Using HLE_RETIRED.START means the set of TSX models goes from:
> 'alderlake',
> 'cascadelakex',
> 'icelake',
> 'icelakex',
> 'rocketlake',
> 'sapphirerapids',
> 'skylake',
> 'skylakex',
> 'tigerlake',
>
> To:
> broadwell
> broadwellde
> broadwellx
> cascadelakex
> haswell
> haswellx
> icelake
> rocketlake
> skylake
> skylakex
>
> Using RTM_RETIRED.START it goes to:
> broadwell
> broadwellde
> broadwellx
> cascadelakex
> emeraldrapids
> graniterapids
> haswell
> haswellx
> icelake
> icelakex
> rocketlake
> sapphirerapids
> skylake
> skylakex
> tigerlake
>
> So I'm not sure it is working equivalently to what we have today,
> which may be good or bad. Here is what I think the code should look
> like:

Yes, there should be some changes. But I think the changes should be good.

For icelakex, the HLE_RETIRED.START has been deprecated. I don't see a
reason why perf should keep the tsx_cycles_per_elision metric.

For alderlake, TSX is deprecated. Perf should drop the related
metrics as well.
https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/001/deprecated-technologies/

>
> def Tsx() -> Optional[MetricGroup]:
>   pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
>   cycles = Event('cycles')
>   cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>   cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>   try:
>     # Test if the tsx event is present in the json, prefer the
>     # sysfs version so that we can detect its presence at runtime.
>     transaction_start = Event("RTM_RETIRED.START")
>     transaction_start = Event(f'{pmu}/tx\-start/')
>   except:
>     return None
>
>   elision_start = None
>   try:
>     # Elision start isn't supported by all models, but we'll not
>     # generate the tsx_cycles_per_elision metric in that
>     # case. Again, prefer the sysfs encoding of the event.
>     elision_start = Event("HLE_RETIRED.START")
>     elision_start = Event(f'{pmu}/el\-start/')
>   except:
>     pass
>
>   return MetricGroup('transaction', [
>       Metric('tsx_transactional_cycles',
>              'Percentage of cycles within a transaction region.',
>              Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>              '100%'),
>       Metric('tsx_aborted_cycles',
>              'Percentage of cycles in aborted transactions.',
>              Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>                     has_event(cycles_in_tx),
>                     0),
>              '100%'),
>       Metric('tsx_cycles_per_transaction',
>              'Number of cycles within a transaction divided by the number of transactions.',
>              Select(cycles_in_tx / transaction_start,
>                     has_event(cycles_in_tx),
>                     0),
>              "cycles / transaction"),
>       Metric('tsx_cycles_per_elision',
>              'Number of cycles within a transaction divided by the number of elisions.',
>              Select(cycles_in_tx / elision_start,
>                     has_event(elision_start),
>                     0),
>              "cycles / elision") if elision_start else None,
>   ], description="Breakdown of transactional memory statistics")
>
> Wdyt?

Looks good to me.

Thanks,
Kan
>
> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>>
>>> Thanks,
>>> Ian
>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> + elision_start = Event(f'{pmu}/el\-start/')
>>>>> + metrics += [
>>>>> + Metric('tsx_cycles_per_elision',
>>>>> + 'Number of cycles within a transaction divided by the number of elisions.',
>>>>> + Select(cycles_in_tx / elision_start,
>>>>> + has_event(elision_start),
>>>>> + 0),
>>>>> + "cycles / elision"),
>>>>> + ]
>>>>> + return MetricGroup('transaction', metrics)
>>>>> +
>>>>> +
>>>>> all_metrics = MetricGroup("", [
>>>>> Idle(),
>>>>> Rapl(),
>>>>> Smi(),
>>>>> + Tsx(),
>>>>> ])
>>>>>
>>>>> if args.metricgroups:
>>>

2024-03-01 23:09:50

by Ian Rogers

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

On Fri, Mar 1, 2024 at 1:34 PM Andi Kleen <[email protected]> wrote:
>
> >
> > I see some of the gains as:
> > - metrics that are human intelligible,
> > - metrics for models that are no longer being updated,
> > - removing copy-paste of metrics like tsx and smi across each model's
> > metric json (less lines-of-code),
> > - validation of events in a metric expression being in the event json
> > for a model,
> > - removal of forward porting metrics to a new model if the event
> > names of the new model line up with those of previous,
> > - in this patch kit there are metrics added that don't currently
> > exist (more metrics should be better for users - yes there can always
> > be bugs).
>
> But then we have two ways to do things, and we already have a lot
> of problems with regressions from complexity and a growing
> bug backlog that nobody fixes.

If you want something to work you put a test on it. We have a number
of both event and metric tests. I'm not sure what the bug backlog you
are mentioning is, but as far as I can see the tool is in the best
condition it has ever been. All tests passing with address sanitizer
was a particular milestone last year.

> Multiple ways to do basic operations seems just a recipe for
> more and more fragmentation and similar problems.
>
> The JSON format is certainly not perfect and has its share
> of issues, but at least it's a standard now that is supported
> by many vendors and creating new standards just because
> you don't like some minor aspects doesn't seem like
> a good approach. I'm sure the next person will come around
> who wants Ruby metrics and the third would prefer to write
> them in Rust. Who knows where it will stop.

These patches don't make the json format disappear; we use python to
generate the json metrics, as json strings are a poor programming
language.

I agree we have too many formats, but json is part of the problem
there not the solution. I would like to make the only format the sysfs
one, and then we can do like a unionfs type thing in the perf tool
where we can have sysfs, a sysfs layer built into the tool (out of the
json) and possibly user specified layers. This would allow
customizability once the binary is built, but it would also allow us
to test with a sysfs for a machine we don't have. Linux on M1 Macs is
a particular issue, but we recently had an issue with the layout of
the format directory for Intel uncore_pcu pre-Skylake which doesn't
have a umask. Finding such machines to test on is the challenge, and
abstracting sysfs as a unionfs type thing is, I think, the correct
approach.
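To sketch the layering idea (a toy dict-based illustration, nothing
like the eventual perf implementation; the paths and format values are
made up):

```python
from collections import ChainMap

# Toy layers mimicking sysfs key/value pairs. Lookup order: a
# user-specified layer, then the layer built into the tool (out of the
# json), then the real sysfs.
real_sysfs = {"cpu/events/cycles": "event=0x3c"}
builtin = {
    "cpu/events/cycles": "event=0x3c",
    "uncore_pcu/format/event": "config:0-7",
}
user_layer = {"uncore_pcu/format/event": "config:0-21"}

layered = ChainMap(user_layer, builtin, real_sysfs)
print(layered["cpu/events/cycles"])        # event=0x3c
print(layered["uncore_pcu/format/event"])  # config:0-21 (user layer wins)
```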

I don't think the Linux build has tooling around Ruby, and there are
no host tools written in Rust yet. Will it happen? Probably, and I
think it is good the codebase keeps moving forward. Before the C
reference count checking implementation, we were talking about
rewriting at least pieces like libperf in Rust - the code was leaking
memory and it seemed unsolvable as reasonable fixes would yield
use-after-frees and crashes. I've even mentioned this in LWN comments
on articles around Rust, nobody stepped up with a fix until I did the
reference count checking.

Python is a good choice for reading json as the inbuilt library is of
a reasonable quality. Python is good for what I've done here as the
operator overloading makes the expressions readable. We can read in
and out of the python tree format, and do so in jevents.py to validate
the metrics can parse (we still have the C parse test). We haven't
written a full expression parser in python, although it wouldn't be
hard, we just ack the string and pretty much call eval. It'd be
relatively easy to add an output function to the python code to make
it convert the expressions to a different programming language, for
example the ToPython code here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/metric.py?h=perf-tools-next#n17
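To give a flavor of the operator overloading (a stripped-down toy, not
the actual metric.py classes; the ToPerfJson name here is just
illustrative):

```python
class Expr:
    """Toy expression node; the real metric.py classes are richer."""
    def __init__(self, s: str):
        self.s = s

    def __sub__(self, other: "Expr") -> "Expr":
        # Parenthesize so operator precedence survives in the string.
        return Expr(f"({self.s} - {other.s})")

    def __truediv__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.s} / {other.s})")

    def ToPerfJson(self) -> str:
        return self.s


def Event(name: str) -> Expr:
    # The real Event also validates the name against the model's json.
    return Expr(name)


cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
idle = (tsc - cyc) / tsc  # reads like math, serializes to a MetricExpr
print(idle.ToPerfJson())  # ((msr/tsc/ - msr/mperf/) / msr/tsc/)
```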

> Also in my experience this python stuff is unreliable because
> half the people who build perf forget to install the python
> libraries. Json at least works always.

It has been the case for about a year (v6.4):
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=175f9315f76345e88e3abdc947c1e0030ab99da3
that if we can't build jevents because of python then the build fails.
You can explicitly request not to use python/jevents with
NO_JEVENTS=1, but it is an explicit opt-out.
I don't think this is unreliable. We've recently made BPF opt-out
rather than opt-in in a similar way, and that requires clang, etc. It
has been a problem in the past that implicit opt-in and opt-out could
give you a perf tool that a distribution couldn't ship (mixed GPLv2
and v3 code) or that was missing useful things. I think we've fixed
the bug by making the build fail unless you explicitly opt-out of
options we think you should have.

Fwiw, there is a similar bug that BTF support in the kernel is opt-in
rather than opt-out, meaning distributions ship BPF tools that can't
work for the kernel they've built. If there were more time I'd be
looking to make BTF opt-out rather than opt-in, I reported the issue
on the BPF mailing list.

> Incremental improvements are usually the way to do these
> things.

We've had jevents as python for nearly 2 years. metric.py that this
code is building off has been in the tree for 15 months. I wrote the
code and there is a version of it for:
https://github.com/intel/perfmon/commits/main/scripts/create_perf_json.py
which is 2 years old. I don't see anything non-incremental, if
anything things have been slow to move forward. It's true vendors
haven't really adopted the code outside of Intel's perfmon; I've at
least discussed it with them face-to-face at events like LPC. Hopefully
this work is a foundation for vendors to write more metrics; it should
be little more effort than documenting the metric in their manuals.

ARM have a python based json tool for perf (similar to the perfmon one) here:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/tools/perf_json_generator
So I'd say that python and perf json is a standard approach. ARM's
converter is just over a year old.

Thanks,
Ian

> -Andi