2024-02-17 00:58:36

by Atish Patra

Subject: [PATCH RFC 00/20] Add Counter delegation ISA extension support

This series adds counter delegation extension support. It is based on
very early PoC work done by Kevin Xue and was mostly rewritten after that.
The counter delegation ISA extension (Smcdeleg/Ssccfg) actually depends
on multiple other ISA extensions.

1. S[m|s]csrind : The indirect CSR extension[1], which defines five
additional registers ([M|S|VS]IREG2-[M|S|VS]IREG6) to address the size
limitation of the RISC-V CSR address space.
2. Smstateen: Bit[60] of the stateen CSRs controls access to the
indirect access registers defined above.
3. Smcdeleg/Ssccfg: The counter delegation extensions[2]

The counter delegation extension allows supervisor mode to program the
hpmevent and hpmcounter CSRs directly, without needing assistance from
M-mode via SBI calls. This results in faster perf profiling and far
fewer traps. The extension also introduces a scountinhibit CSR, which
allows S-mode to stop/start any counter directly. As the counter
delegation extension can potentially involve more than 100 CSRs, the
specification leverages the indirect CSR extension to conserve the
scarce CSR address space.
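
As a rough illustration (this is not code from this series: the
siselect value range for delegated counters and the scountinhibit CSR
number come from the Smcdeleg/Ssccfg specification[2], so the constants
below are placeholders), programming a delegated counter entirely from
S-mode looks like:

    #define SISELECT_CTR_BASE  0x40   /* placeholder select base, see [2] */
    #define CSR_SCOUNTINHIBIT  0x120  /* placeholder CSR number, see [2] */

    static void deleg_counter_program(int ctr_idx, u64 event, u64 ival)
    {
            /* Stop the counter, program it via the window, restart it. */
            csr_set(CSR_SCOUNTINHIBIT, BIT(ctr_idx));
            csr_write(CSR_SISELECT, SISELECT_CTR_BASE + ctr_idx);
            csr_write(CSR_SIREG, event);  /* event selector slot, see [2] */
            csr_write(CSR_SIREG2, ival);  /* counter value slot, see [2] */
            csr_clear(CSR_SCOUNTINHIBIT, BIT(ctr_idx));
    }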

Because of these dependencies, the following extensions must be
enabled in QEMU to use the counter delegation feature in S-mode:

"smstateen=true,sscofpmf=true,ssccfg=true,smcdeleg=true,smcsrind=true,sscsrind=true"

When we access the counters directly in S-mode, we also need to solve the
following problems.

1. Event to counter mapping
2. Event encoding discovery

The RISC-V ISA doesn't define any standard for event encodings or for
the event-to-counter mapping rules.

Until now, the SBI PMU implementation has relied on a device tree
binding[3], parsed by the firmware, to discover the event-to-counter
mapping on a RISC-V platform. The SBI PMU specification[4] defines event
encodings for standard perf events as well. Thus, the kernel can query
the firmware for the appropriate counter for a given event.

However, the kernel doesn't need any firmware interaction for hardware
counters if counter delegation is available in the hardware. Thus, the
driver needs to discover the above mappings/encodings by itself, without
any assistance from firmware. One of the options considered was to
extend the PMU DT parsing support to the kernel as well. However, that
requires additional support on ACPI-based systems and more
infrastructure for virtualization as well.

This patch series solves problem #1 above by extending the perf tool so
that the event json file can specify the counter constraints of each
event, which can then be passed to the driver to choose the best counter
for a given event. The perf stat metric series[5] from Weilin already
extends the perf tool to parse the "Counter" property that specifies the
hardware counter restrictions. I have included the patch from Weilin in
this series for verification purposes only. I will rebase as that series
evolves.

This series extends that support by converting the comma-separated
string to a bitmap. The counter constraint bitmap is passed to the perf
driver via the newly introduced "counterid_mask" property set in
"config2". Even though this is a generic perf tool change, it should not
affect any other architecture if "counterid_mask" is not mapped.

@Weilin: Please let me know if there is a better way to solve the problem I
described.

Problem #2 is solved by defining an architecture-specific override
function that replaces the standard perf event encoding with the
encoding specified in the json file under the same event name. The
alternative considered was to specify the encodings in the driver.
However, these encodings are vendor specific in the absence of ISA
guidelines and would become unmanageable with so many RISC-V vendors
touching the driver for their encodings.

The override is only required when counter delegation is available in
the platform, which is detected at runtime. The SBI PMU (the current
implementation) doesn't require any override as it defines the standard
event encodings. The hwprobe syscall defined for RISC-V is used for this
detection in this series. A sysfs-based property could be explored to do
the same, but we may require hwprobe in the future anyway given the
churn of extensions in RISC-V. That's why I went with hwprobe. Let me
know if anybody thinks that's a bad idea.
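
For reference, the userspace side of such a detection looks roughly
like the following. The riscv_hwprobe syscall and
RISCV_HWPROBE_KEY_IMA_EXT_0 are existing UAPI; the bit name below is
only a placeholder for whatever key/bit patch 18 actually defines:

    #include <stdbool.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <asm/hwprobe.h>

    #define RISCV_HWPROBE_EXT_CDELEG (1ULL << 63) /* placeholder bit */

    static bool have_counter_delegation(void)
    {
            struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 };

            if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0))
                    return false;

            return pair.value & RISCV_HWPROBE_EXT_CDELEG;
    }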

The perf tool hook also allows RISC-V platform vendors to define their
encodings for any standard perf or ISA event. I have tried to cover all
the use cases that I am aware of (stat, record, top). Please let me know
if I have missed any particular use case where the architecture hook
must be invoked. I am also open to any other idea for solving the above
problems.

PATCH organization:
PATCH 1 is from the perf metric series[5].
PATCH 2-5 define and implement the indirect CSR extension.
PATCH 6-10 define the other required ISA extensions.
PATCH 11 is an overall restructure of the RISC-V PMU driver.
PATCH 12-14 implement the counter delegation extension and the new perf
tool plumbing to solve #1 and #2.
PATCH 15-16 improve the perf tool support to solve #1 and #2.
PATCH 17 adds a perf json file for the qemu virt machine.
PATCH 18-20 add the hwprobe mechanism so that perf can detect whether
the platform supports the delegation extensions.

There is no change in the process to run perf stat/record; it will
continue to work as-is as long as the relevant extensions have been
enabled in QEMU.

However, the perf tool needs to be recompiled as it requires new kernel
headers.

The Qemu patches can be found here:
https://github.com/atishp04/qemu/tree/counter_delegation_rfc

The opensbi patch can be found here:
https://github.com/atishp04/opensbi/tree/counter_delegation_v1

The Linux kernel patches can be found here:
https://github.com/atishp04/linux/tree/counter_delegation_rfc

[1] https://github.com/riscv/riscv-indirect-csr-access
[2] https://github.com/riscv/riscv-smcdeleg-ssccfg
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/perf/riscv%2Cpmu.yaml
[4] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc
[5] https://lore.kernel.org/all/[email protected]/

Atish Patra (17):
RISC-V: Add Sxcsrind ISA extension definition and parsing
dt-bindings: riscv: add Sxcsrind ISA extension description
RISC-V: Define indirect CSR access helpers
RISC-V: Add Ssccfg ISA extension definition and parsing
dt-bindings: riscv: add Ssccfg ISA extension description
RISC-V: Add Smcntrpmf extension parsing
dt-bindings: riscv: add Smcntrpmf ISA extension description
RISC-V: perf: Restructure the SBI PMU code
RISC-V: perf: Modify the counter discovery mechanism
RISC-V: perf: Implement supervisor counter delegation support
RISC-V: perf: Use config2 for event to counter mapping
tools/perf: Add arch hooks to override perf standard events
tools/perf: Pass the Counter constraint values in the pmu events
perf: Add json file for virt machine supported events
tools arch uapi: Sync the unistd.h header file for RISC-V
RISC-V: Add hwprobe support for Counter delegation extensions
tools/perf: Detect if platform supports counter delegation

Kaiwen Xue (2):
RISC-V: Add Sxcsrind ISA extension CSR definitions
RISC-V: Add Ssccfg extension CSR definition

Weilin Wang (1):
perf pmu-events: Add functions in jevent.py to parse counter and event
info for hardware aware grouping

Documentation/arch/riscv/hwprobe.rst | 10 +
../devicetree/bindings/riscv/extensions.yaml | 34 +
MAINTAINERS | 4 +-
arch/riscv/include/asm/csr.h | 47 ++
arch/riscv/include/asm/csr_ind.h | 42 ++
arch/riscv/include/asm/hwcap.h | 5 +
arch/riscv/include/asm/sbi.h | 2 +-
arch/riscv/include/uapi/asm/hwprobe.h | 4 +
arch/riscv/kernel/cpufeature.c | 5 +
arch/riscv/kernel/sys_hwprobe.c | 3 +
arch/riscv/kvm/vcpu_pmu.c | 2 +-
drivers/perf/Kconfig | 16 +-
drivers/perf/Makefile | 4 +-
../perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
../perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 654 ++++++++++++++----
include/linux/perf/riscv_pmu.h | 13 +-
tools/arch/riscv/include/uapi/asm/unistd.h | 14 +-
tools/perf/arch/riscv/util/Build | 2 +
tools/perf/arch/riscv/util/evlist.c | 60 ++
tools/perf/arch/riscv/util/pmu.c | 41 ++
tools/perf/arch/riscv/util/pmu.h | 11 +
tools/perf/builtin-record.c | 3 +
tools/perf/builtin-stat.c | 2 +
tools/perf/builtin-top.c | 3 +
../pmu-events/arch/riscv/arch-standard.json | 10 +
tools/perf/pmu-events/arch/riscv/mapfile.csv | 1 +
../pmu-events/arch/riscv/qemu/virt/cpu.json | 30 +
../arch/riscv/qemu/virt/firmware.json | 68 ++
tools/perf/pmu-events/jevents.py | 186 ++++-
tools/perf/pmu-events/pmu-events.h | 25 +-
tools/perf/util/evlist.c | 6 +
tools/perf/util/evlist.h | 6 +
32 files changed, 1167 insertions(+), 146 deletions(-)
create mode 100644 arch/riscv/include/asm/csr_ind.h
rename drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} (100%)
rename drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} (61%)
create mode 100644 tools/perf/arch/riscv/util/evlist.c
create mode 100644 tools/perf/arch/riscv/util/pmu.c
create mode 100644 tools/perf/arch/riscv/util/pmu.h
create mode 100644 tools/perf/pmu-events/arch/riscv/arch-standard.json
create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json
create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json

--
2.34.1



2024-02-17 00:58:53

by Atish Patra

Subject: [PATCH RFC 01/20] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping

From: Weilin Wang <[email protected]>

These functions are added to parse the event counter restrictions and
counter availability info from the json files, so that the metric
grouping method can do grouping based on the counter restrictions of
events and the counters that are available on the system.
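
For example, a consumer of the new table can iterate the parsed layouts
like this (the iterator, the lookup function and the struct fields are
from this patch; the callback body is illustrative and includes are
omitted):

    static int print_layout_cb(const struct pmu_layout *pm, void *data)
    {
            printf("pmu=%s counters=%d fixed=%d\n",
                   pm->pmu, pm->size, pm->fixed_size);
            return 0;
    }

    /* ... for the PMU of interest: */
    const struct pmu_layouts_table *table = perf_pmu__find_layouts_table(pmu);

    if (table)
            pmu_layouts_table__for_each_layout(table, print_layout_cb, NULL);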

Signed-off-by: Weilin Wang <[email protected]>
---
tools/perf/pmu-events/jevents.py | 171 ++++++++++++++++++++++++++++-
tools/perf/pmu-events/pmu-events.h | 25 ++++-
2 files changed, 188 insertions(+), 8 deletions(-)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 53ab050c8fa4..81e465a43c75 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -23,6 +23,8 @@ _metric_tables = []
_sys_metric_tables = []
# Mapping between sys event table names and sys metric table names.
_sys_event_table_to_metric_table_mapping = {}
+# List of regular PMU counter layout tables.
+_pmu_layouts_tables = []
# Map from an event name to an architecture standard
# JsonEvent. Architecture standard events are in json files in the top
# f'{_args.starting_dir}/{_args.arch}' directory.
@@ -31,6 +33,9 @@ _arch_std_events = {}
_pending_events = []
# Name of events table to be written out
_pending_events_tblname = None
+# PMU counter layout to write out when the table is closed
+_pending_pmu_counts = []
+_pending_pmu_counts_tblname = None # Name of PMU counter layout table to be written out
# Metrics to write out when the table is closed
_pending_metrics = []
# Name of metrics table to be written out
@@ -47,10 +52,17 @@ _json_event_attributes = [
'event',
# Short things in alphabetical order.
'compat', 'deprecated', 'perpkg', 'unit',
+ # Counter this event could use
+ 'counter',
# Longer things (the last won't be iterated over during decompress).
'long_desc'
]

+# Attributes that are in pmu_unit_layout.
+_json_layout_attributes = [
+ 'pmu', 'desc', 'size', 'fixed_size'
+]
+
# Attributes that are in pmu_metric rather than pmu_event.
_json_metric_attributes = [
'metric_name', 'metric_group', 'metric_expr', 'metric_threshold',
@@ -58,7 +70,9 @@ _json_metric_attributes = [
'default_metricgroup_name', 'aggr_mode', 'event_grouping'
]
# Attributes that are bools or enum int values, encoded as '0', '1',...
-_json_enum_attributes = ['aggr_mode', 'deprecated', 'event_grouping', 'perpkg']
+_json_enum_attributes = ['aggr_mode', 'deprecated', 'event_grouping', 'perpkg',
+ 'size', 'fixed_size'
+]

def removesuffix(s: str, suffix: str) -> str:
"""Remove the suffix from a string
@@ -317,6 +331,9 @@ class JsonEvent:
if 'Errata' in jd:
extra_desc += ' Spec update: ' + jd['Errata']
self.pmu = unit_to_pmu(jd.get('Unit'))
+ self.counter = jd.get('Counter')
+ self.size = jd.get('Size')
+ self.fixed_size = jd.get('FixedSize')
filter = jd.get('Filter')
self.unit = jd.get('ScaleUnit')
self.perpkg = jd.get('PerPkg')
@@ -388,8 +405,16 @@ class JsonEvent:
s += f'\t{attr} = {value},\n'
return s + '}'

- def build_c_string(self, metric: bool) -> str:
+ def build_c_string(self, metric: bool, layout: bool = False) -> str:
s = ''
+ if layout:
+ for attr in _json_layout_attributes:
+ x = getattr(self, attr)
+ if attr in _json_enum_attributes:
+ s += x if x else '0'
+ else:
+ s += f'{x}\\000' if x else '\\000'
+ return s
for attr in _json_metric_attributes if metric else _json_event_attributes:
x = getattr(self, attr)
if metric and x and attr == 'metric_expr':
@@ -404,10 +429,10 @@ class JsonEvent:
s += f'{x}\\000' if x else '\\000'
return s

- def to_c_string(self, metric: bool) -> str:
+ def to_c_string(self, metric: bool, layout: bool = False) -> str:
"""Representation of the event as a C struct initializer."""

- s = self.build_c_string(metric)
+ s = self.build_c_string(metric, layout)
return f'{{ { _bcs.offsets[s] } }}, /* {s} */\n'


@@ -444,6 +469,8 @@ def preprocess_arch_std_files(archpath: str) -> None:
_arch_std_events[event.name.lower()] = event
if event.metric_name:
_arch_std_events[event.metric_name.lower()] = event
+ if event.size:
+ _arch_std_events[event.pmu.lower()] = event


def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
@@ -453,6 +480,8 @@ def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
_pending_events.append(e)
if e.metric_name:
_pending_metrics.append(e)
+ if e.size:
+ _pending_pmu_counts.append(e)


def print_pending_events() -> None:
@@ -566,6 +595,33 @@ const struct pmu_table_entry {_pending_metrics_tblname}[] = {{
""")
_args.output_file.write('};\n\n')

+def print_pending_pmu_counts() -> None:
+
+ def pmu_counts_cmp_key(j: JsonEvent) -> Tuple[bool, str, str]:
+ def fix_none(s: Optional[str]) -> str:
+ if s is None:
+ return ''
+ return s
+
+ return (j.desc is not None, fix_none(j.pmu), fix_none(j.size))
+
+ global _pending_pmu_counts
+ if not _pending_pmu_counts:
+ return
+
+ global _pending_pmu_counts_tblname
+ global _pmu_layouts_tables
+ _pmu_layouts_tables.append(_pending_pmu_counts_tblname)
+
+ _args.output_file.write(
+ f'static const struct compact_pmu_event {_pending_pmu_counts_tblname}[] = {{\n')
+
+ for pmu_layout in sorted(_pending_pmu_counts, key=pmu_counts_cmp_key):
+ _args.output_file.write(pmu_layout.to_c_string(metric=False, layout=True))
+ _pending_pmu_counts = []
+
+ _args.output_file.write('};\n\n')
+
def get_topic(topic: str) -> str:
if topic.endswith('metrics.json'):
return 'metrics'
@@ -606,6 +662,8 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
if event.metric_name:
_bcs.add(pmu_name, metric=True)
_bcs.add(event.build_c_string(metric=True), metric=True)
+ if event.size:
+ _bcs.add(event.build_c_string(metric=False, layout=True), metric=False)

def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
"""Process a JSON file during the main walk."""
@@ -619,11 +677,14 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
if item.is_dir() and is_leaf_dir(item.path):
print_pending_events()
print_pending_metrics()
+ print_pending_pmu_counts()

global _pending_events_tblname
_pending_events_tblname = file_name_to_table_name('pmu_events_', parents, item.name)
global _pending_metrics_tblname
_pending_metrics_tblname = file_name_to_table_name('pmu_metrics_', parents, item.name)
+ global _pending_pmu_counts_tblname
+ _pending_pmu_counts_tblname = file_name_to_table_name('pmu_layouts_', parents, item.name)

if item.name == 'sys':
_sys_event_table_to_metric_table_mapping[_pending_events_tblname] = _pending_metrics_tblname
@@ -657,6 +718,12 @@ struct pmu_metrics_table {
uint32_t num_pmus;
};

+/* Struct used to make the PMU counter layout table implementation opaque to callers. */
+struct pmu_layouts_table {
+ const struct compact_pmu_event *entries;
+ size_t length;
+};
+
/*
* Map a CPU to its table of PMU events. The CPU is identified by the
* cpuid field, which is an arch-specific identifier for the CPU.
@@ -670,6 +737,7 @@ struct pmu_events_map {
const char *cpuid;
struct pmu_events_table event_table;
struct pmu_metrics_table metric_table;
+ struct pmu_layouts_table layout_table;
};

/*
@@ -714,6 +782,12 @@ const struct pmu_events_map pmu_events_map[] = {
metric_size = '0'
if event_size == '0' and metric_size == '0':
continue
+ layout_tblname = file_name_to_table_name('pmu_layouts_', [], row[2].replace('/', '_'))
+ if layout_tblname in _pmu_layouts_tables:
+ layout_size = f'ARRAY_SIZE({layout_tblname})'
+ else:
+ layout_tblname = 'NULL'
+ layout_size = '0'
cpuid = row[0].replace('\\', '\\\\')
_args.output_file.write(f"""{{
\t.arch = "{arch}",
@@ -725,6 +799,10 @@ const struct pmu_events_map pmu_events_map[] = {
\t.metric_table = {{
\t\t.pmus = {metric_tblname},
\t\t.num_pmus = {metric_size}
+\t}},
+\t.layout_table = {{
+\t\t.entries = {layout_tblname},
+\t\t.length = {layout_size}
\t}}
}},
""")
@@ -735,6 +813,7 @@ const struct pmu_events_map pmu_events_map[] = {
\t.cpuid = 0,
\t.event_table = { 0, 0 },
\t.metric_table = { 0, 0 },
+\t.layout_table = { 0, 0 },
}
};
""")
@@ -823,6 +902,24 @@ static void decompress_metric(int offset, struct pmu_metric *pm)
_args.output_file.write('\twhile (*p++);')
_args.output_file.write("""}

+static void decompress_layout(int offset, struct pmu_layout *pm)
+{
+\tconst char *p = &big_c_string[offset];
+""")
+ for attr in _json_layout_attributes:
+ _args.output_file.write(f'\n\tpm->{attr} = ')
+ if attr in _json_enum_attributes:
+ _args.output_file.write("*p - '0';\n")
+ else:
+ _args.output_file.write("(*p == '\\0' ? NULL : p);\n")
+ if attr == _json_layout_attributes[-1]:
+ continue
+ if attr in _json_enum_attributes:
+ _args.output_file.write('\tp++;')
+ else:
+ _args.output_file.write('\twhile (*p++);')
+ _args.output_file.write("""}
+
static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table,
const struct pmu_table_entry *pmu,
pmu_event_iter_fn fn,
@@ -978,6 +1075,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table,
return 0;
}

+int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
+ pmu_layout_iter_fn fn,
+ void *data) {
+ for (size_t i = 0; i < table->length; i++) {
+ struct pmu_layout pm;
+ int ret;
+
+ decompress_layout(table->entries[i].offset, &pm);
+ ret = fn(&pm, data);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
static const struct pmu_events_map *map_for_pmu(struct perf_pmu *pmu)
{
static struct {
@@ -1073,6 +1185,33 @@ const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu *pm
return NULL;
}

+const struct pmu_layouts_table *perf_pmu__find_layouts_table(struct perf_pmu *pmu)
+{
+ const struct pmu_layouts_table *table = NULL;
+ char *cpuid = perf_pmu__getcpuid(pmu);
+ int i;
+
+ /* on some platforms which uses cpus map, cpuid can be NULL for
+ * PMUs other than CORE PMUs.
+ */
+ if (!cpuid)
+ return NULL;
+
+ i = 0;
+ for (;;) {
+ const struct pmu_events_map *map = &pmu_events_map[i++];
+ if (!map->arch)
+ break;
+
+ if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+ table = &map->layout_table;
+ break;
+ }
+ }
+ free(cpuid);
+ return table;
+}
+
const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
{
for (const struct pmu_events_map *tables = &pmu_events_map[0];
@@ -1094,6 +1233,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const
}
return NULL;
}
+const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid))
+ return &tables->layout_table;
+ }
+ return NULL;
+}

int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
{
@@ -1122,6 +1271,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
return 0;
}

+int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data);
+
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
const struct pmu_events_table *find_sys_events_table(const char *name)
{
for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0];
@@ -1278,6 +1440,7 @@ struct pmu_table_entry {
ftw(arch_path, [], process_one_file)
print_pending_events()
print_pending_metrics()
+ print_pending_pmu_counts()

print_mapping_table(archs)
print_system_mapping_table()
diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index f5aa96f1685c..65e0c5dd8bb4 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -45,6 +45,7 @@ struct pmu_event {
const char *desc;
const char *topic;
const char *long_desc;
+ const char *counter;
const char *pmu;
const char *unit;
bool perpkg;
@@ -67,8 +68,16 @@ struct pmu_metric {
enum metric_event_groups event_grouping;
};

+struct pmu_layout {
+ const char *pmu;
+ const char *desc;
+ int size;
+ int fixed_size;
+};
+
struct pmu_events_table;
struct pmu_metrics_table;
+struct pmu_layouts_table;

typedef int (*pmu_event_iter_fn)(const struct pmu_event *pe,
const struct pmu_events_table *table,
@@ -78,15 +87,20 @@ typedef int (*pmu_metric_iter_fn)(const struct pmu_metric *pm,
const struct pmu_metrics_table *table,
void *data);

+typedef int (*pmu_layout_iter_fn)(const struct pmu_layout *pm,
+ void *data);
+
int pmu_events_table__for_each_event(const struct pmu_events_table *table,
struct perf_pmu *pmu,
pmu_event_iter_fn fn,
void *data);
int pmu_events_table__find_event(const struct pmu_events_table *table,
- struct perf_pmu *pmu,
- const char *name,
- pmu_event_iter_fn fn,
- void *data);
+ struct perf_pmu *pmu,
+ const char *name,
+ pmu_event_iter_fn fn,
+ void *data);
+int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table, pmu_layout_iter_fn fn,
+ void *data);
size_t pmu_events_table__num_events(const struct pmu_events_table *table,
struct perf_pmu *pmu);

@@ -95,10 +109,13 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, pm

const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu);
const struct pmu_metrics_table *perf_pmu__find_metrics_table(struct perf_pmu *pmu);
+const struct pmu_layouts_table *perf_pmu__find_layouts_table(struct perf_pmu *pmu);
const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid);
const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const char *cpuid);
+const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid);
int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data);
int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data);
+int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data);

const struct pmu_events_table *find_sys_events_table(const char *name);
const struct pmu_metrics_table *find_sys_metrics_table(const char *name);
--
2.34.1


2024-02-17 00:59:05

by Atish Patra

Subject: [PATCH RFC 02/20] RISC-V: Add Sxcsrind ISA extension CSR definitions

From: Kaiwen Xue <[email protected]>

This adds the definitions of the new CSRs and bits defined in the
Sxcsrind ISA extension. These CSRs enable an indirect access mechanism
to reach any CSR in M-, S-, and VS-mode. The range of select values and
the ireg* registers used is defined by each ISA extension that uses the
Sxcsrind extension.
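
The resulting access pattern is: select the target register group via
[M|S|VS]ISELECT, then read/write it through the window registers. A
minimal S-mode sketch using the definitions below (the dedicated
helpers arrive later in this series):

    csr_write(CSR_SISELECT, sel);    /* choose the register group */
    val = csr_read(CSR_SIREG);       /* first register of the group */
    csr_write(CSR_SIREG2, new_val);  /* second register of the group */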

Signed-off-by: Kaiwen Xue <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/csr.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 510014051f5d..0a54856fd807 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -302,6 +302,12 @@
/* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_SISELECT 0x150
#define CSR_SIREG 0x151
+/* Supervisor-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_SIREG2 0x152
+#define CSR_SIREG3 0x153
+#define CSR_SIREG4 0x155
+#define CSR_SIREG5 0x156
+#define CSR_SIREG6 0x157

/* Supervisor-Level Interrupts (AIA) */
#define CSR_STOPEI 0x15c
@@ -349,6 +355,14 @@
/* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */
#define CSR_VSISELECT 0x250
#define CSR_VSIREG 0x251
+/*
+ * VS-Level Window to Indirectly Accessed Registers (H-extension with Sxcsrind)
+ */
+#define CSR_VSIREG2 0x252
+#define CSR_VSIREG3 0x253
+#define CSR_VSIREG4 0x255
+#define CSR_VSIREG5 0x256
+#define CSR_VSIREG6 0x257

/* VS-Level Interrupts (H-extension with AIA) */
#define CSR_VSTOPEI 0x25c
@@ -389,6 +403,12 @@
/* Machine-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_MISELECT 0x350
#define CSR_MIREG 0x351
+/* Machine-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_MIREG2 0x352
+#define CSR_MIREG3 0x353
+#define CSR_MIREG4 0x355
+#define CSR_MIREG5 0x356
+#define CSR_MIREG6 0x357

/* Machine-Level Interrupts (AIA) */
#define CSR_MTOPEI 0x35c
--
2.34.1


2024-02-17 00:59:30

by Atish Patra

Subject: [PATCH RFC 03/20] RISC-V: Add Sxcsrind ISA extension definition and parsing

The S[m|s]csrind extension extends the indirect CSR access mechanism
defined by the Smaia/Ssaia extensions.

This patch just adds the definitions and parsing.
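
With this in place, later patches can gate their code paths on the
extension at runtime, e.g. (sketch; the consumer function is
hypothetical):

    if (riscv_isa_extension_available(NULL, SSCSRIND))
            use_siselect_window();  /* hypothetical consumer */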

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/kernel/cpufeature.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 5340f818746b..44df259cc815 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -80,6 +80,8 @@
#define RISCV_ISA_EXT_ZFA 71
#define RISCV_ISA_EXT_ZTSO 72
#define RISCV_ISA_EXT_ZACAS 73
+#define RISCV_ISA_EXT_SSCSRIND 74
+#define RISCV_ISA_EXT_SMCSRIND 75

#define RISCV_ISA_EXT_MAX 128
#define RISCV_ISA_EXT_INVALID U32_MAX
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 89920f84d0a3..52ec88dfb004 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -301,7 +301,9 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
+ __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
+ __RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
--
2.34.1


2024-02-17 01:00:11

by Atish Patra

Subject: [PATCH RFC 04/20] dt-bindings: riscv: add Sxcsrind ISA extension description

Add the S[m|s]csrind ISA extension description.

Signed-off-by: Atish Patra <[email protected]>
---
.../devicetree/bindings/riscv/extensions.yaml | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 63d81dc895e5..77a9f867e36b 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -134,6 +134,20 @@ properties:
added by other RISC-V extensions in H/S/VS/U/VU modes and as
ratified at commit a28bfae (Ratified (#7)) of riscv-state-enable.

+ - const: smcsrind
+ description: |
+ The standard Smcsrind machine-level extension extends the
+ indirect CSR access mechanism defined by the Smaia extension. This
+ extension allows other ISA extensions to use the indirect CSR access
+ mechanism in M-mode.
+
+ - const: sscsrind
+ description: |
+ The standard Sscsrind supervisor-level extension extends the
+ indirect CSR access mechanism defined by the Ssaia extension. This
+ extension allows other ISA extensions to use the indirect CSR access
+ mechanism in S-mode.
+
- const: ssaia
description: |
The standard Ssaia supervisor-level extension for the advanced
--
2.34.1


2024-02-17 01:01:01

by Atish Patra

Subject: [PATCH RFC 07/20] RISC-V: Add Ssccfg ISA extension definition and parsing

Ssccfg (‘Ss’ for Privileged architecture and Supervisor-level
extension, ‘ccfg’ for Counter Configuration) provides access to
delegated counters and new supervisor-level state.

This patch just adds the definitions and enables parsing.

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/kernel/cpufeature.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 44df259cc815..5f4401e221ee 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -82,6 +82,8 @@
#define RISCV_ISA_EXT_ZACAS 73
#define RISCV_ISA_EXT_SSCSRIND 74
#define RISCV_ISA_EXT_SMCSRIND 75
+#define RISCV_ISA_EXT_SSCCFG 76
+#define RISCV_ISA_EXT_SMCDELEG 77

#define RISCV_ISA_EXT_MAX 128
#define RISCV_ISA_EXT_INVALID U32_MAX
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 52ec88dfb004..77cc5dbd73bf 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -300,10 +300,12 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts),
__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(smcdeleg, RISCV_ISA_EXT_SMCDELEG),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
+ __RISCV_ISA_EXT_DATA(ssccfg, RISCV_ISA_EXT_SSCCFG),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
--
2.34.1


2024-02-17 01:01:18

by Atish Patra

Subject: [PATCH RFC 08/20] dt-bindings: riscv: add Ssccfg ISA extension description

Add description for the Ssccfg extension.

Signed-off-by: Atish Patra <[email protected]>
---
.../devicetree/bindings/riscv/extensions.yaml | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 77a9f867e36b..15adeb60441b 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -128,6 +128,13 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.

+ - const: smcdeleg
+ description: |
+ The standard Smcdeleg machine-level extension allowing the machine mode
+ to delegate the hpmcounters to supervisor mode so that they are
+ directly accessible in supervisor mode. This extension depends
+ on the Sscsrind, Zihpm and Zicntr extensions.
+
- const: smstateen
description: |
The standard Smstateen extension for controlling access to CSRs
@@ -154,6 +161,12 @@ properties:
interrupt architecture for supervisor-mode-visible csr and
behavioural changes to interrupts as frozen at commit ccbddab
("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: ssccfg
+ description: |
+ The standard Ssccfg supervisor-level extension for configuring
+ the delegated hpmcounters to be accessible directly in supervisor
+ mode. This extension depends on the Sscsrind, Smcdeleg, Zihpm and
+ Zicntr extensions.

- const: sscofpmf
description: |
--
2.34.1


2024-02-17 01:01:32

by Atish Patra

Subject: [PATCH RFC 09/20] RISC-V: Add Smcntrpmf extension parsing

The Smcntrpmf extension allows M-mode to enable privilege mode
filtering for the cycle/instret counters. However, the
cyclecfg/instretcfg CSRs are accessible through Ssccfg only if
Smcntrpmf is present.

That's why the kernel needs to detect the presence of the Smcntrpmf
extension before enabling privilege mode filtering for the
cycle/instret counters.
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 5f4401e221ee..b82a8d7a9b3b 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -84,6 +84,7 @@
#define RISCV_ISA_EXT_SMCSRIND 75
#define RISCV_ISA_EXT_SSCCFG 76
#define RISCV_ISA_EXT_SMCDELEG 77
+#define RISCV_ISA_EXT_SMCNTRPMF 78

#define RISCV_ISA_EXT_MAX 128
#define RISCV_ISA_EXT_INVALID U32_MAX
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 77cc5dbd73bf..c30be2c924e7 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -302,6 +302,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
__RISCV_ISA_EXT_DATA(smcdeleg, RISCV_ISA_EXT_SMCDELEG),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
+ __RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
--
2.34.1


2024-02-17 01:01:56

by Atish Patra

Subject: [PATCH RFC 10/20] dt-bindings: riscv: add Smcntrpmf ISA extension description

Add the description for the Smcntrpmf ISA extension.

Signed-off-by: Atish Patra <[email protected]>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 15adeb60441b..149ecf2a8af3 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -135,6 +135,13 @@ properties:
directly accessible in supervisor mode. This extension depends
on the Sscsrind, Zihpm and Zicntr extensions.

+ - const: smcntrpmf
+ description: |
+ The standard Smcntrpmf machine-level extension allowing the machine
+ mode to enable privilege mode filtering for the cycle and instret
+ counters. The Ssccfg extension depends on this, as the *cfg CSRs are
+ available only if Smcntrpmf is present.
+
- const: smstateen
description: |
The standard Smstateen extension for controlling access to CSRs
--
2.34.1


2024-02-17 01:02:40

by Atish Patra

Subject: [PATCH RFC 11/20] RISC-V: perf: Restructure the SBI PMU code

With Ssccfg/Smcdeleg, we no longer need the SBI PMU extension to
program/access the hpmcounter/hpmevent CSRs. However, we do need it for
firmware counters. Rename the driver and its related code to a generic
name that covers both the SBI and ISA mechanisms for hpmcounter related
operations. Take this opportunity to update the Kconfig names to
closely match the new driver name.

No functional change intended.
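
The rvpmu_*() wrappers introduced here are where the delegation path
plugs in later in the series. The intended shape is roughly the
following (sketch; cdeleg_available arrives in the next patch and
rvpmu_deleg_ctr_start() is a placeholder):

    static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
    {
            if (cdeleg_available && !pmu_sbi_is_fw_event(event))
                    rvpmu_deleg_ctr_start(event, ival);  /* placeholder */
            else
                    rvpmu_sbi_ctr_start(event, ival);
    }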

Signed-off-by: Atish Patra <[email protected]>
---
MAINTAINERS | 4 +-
drivers/perf/Kconfig | 16 +-
drivers/perf/Makefile | 4 +-
.../perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
.../perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 170 +++++++++++-------
include/linux/perf/riscv_pmu.h | 8 +-
6 files changed, 123 insertions(+), 79 deletions(-)
rename drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} (100%)
rename drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} (87%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 73d898383e51..6adb24d6cc0a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18856,9 +18856,9 @@ M: Atish Patra <[email protected]>
R: Anup Patel <[email protected]>
L: [email protected]
S: Supported
-F: drivers/perf/riscv_pmu.c
+F: drivers/perf/riscv_pmu_common.c
+F: drivers/perf/riscv_pmu_dev.c
F: drivers/perf/riscv_pmu_legacy.c
-F: drivers/perf/riscv_pmu_sbi.c

RISC-V THEAD SoC SUPPORT
M: Jisheng Zhang <[email protected]>
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index ec6e0d9194a1..86aaa1c1161b 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -56,7 +56,7 @@ config ARM_PMU
Say y if you want to use CPU performance monitors on ARM-based
systems.

-config RISCV_PMU
+config RISCV_PMU_COMMON
depends on RISCV
bool "RISC-V PMU framework"
default y
@@ -67,7 +67,7 @@ config RISCV_PMU
can reuse it.

config RISCV_PMU_LEGACY
- depends on RISCV_PMU
+ depends on RISCV_PMU_COMMON
bool "RISC-V legacy PMU implementation"
default y
help
@@ -76,15 +76,15 @@ config RISCV_PMU_LEGACY
of cycle/instruction counter and doesn't support counter overflow,
or programmable counters. It will be removed in future.

-config RISCV_PMU_SBI
- depends on RISCV_PMU && RISCV_SBI
- bool "RISC-V PMU based on SBI PMU extension"
+config RISCV_PMU
+ depends on RISCV_PMU_COMMON && RISCV_SBI
+ bool "RISC-V PMU based on SBI PMU extension and/or Counter delegation extension"
default y
help
Say y if you want to use the CPU performance monitor
- using SBI PMU extension on RISC-V based systems. This option provides
- full perf feature support i.e. counter overflow, privilege mode
- filtering, counter configuration.
+ using SBI PMU extension or counter delegation ISA extension on RISC-V
+ based systems. This option provides full perf feature support i.e.
+ counter overflow, privilege mode filtering, counter configuration.

config ARM_PMU_ACPI
depends on ARM_PMU && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index a06338e3401c..f2c72915e11d 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -12,9 +12,9 @@ obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
-obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
+obj-$(CONFIG_RISCV_PMU_COMMON) += riscv_pmu_common.o
obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
-obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
+obj-$(CONFIG_RISCV_PMU) += riscv_pmu_dev.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu_common.c
similarity index 100%
rename from drivers/perf/riscv_pmu.c
rename to drivers/perf/riscv_pmu_common.c
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_dev.c
similarity index 87%
rename from drivers/perf/riscv_pmu_sbi.c
rename to drivers/perf/riscv_pmu_dev.c
index 16acd4dcdb96..3d27bd65f140 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -8,7 +8,7 @@
* sparc64 and x86 code.
*/

-#define pr_fmt(fmt) "riscv-pmu-sbi: " fmt
+#define pr_fmt(fmt) "riscv-pmu-dev: " fmt

#include <linux/mod_devicetable.h>
#include <linux/perf/riscv_pmu.h>
@@ -55,6 +55,8 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;

/*
+ * This structure is SBI specific but counter delegation also requires the
+ * counter width and CSR mapping. Reuse it for now.
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -265,12 +267,12 @@ static const struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_M
},
};

-static int pmu_sbi_ctr_get_width(int idx)
+static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
}

-static bool pmu_sbi_ctr_is_fw(int cidx)
+static bool rvpmu_ctr_is_fw(int cidx)
{
union sbi_pmu_ctr_info *info;

@@ -312,12 +314,12 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
}
EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);

-static uint8_t pmu_sbi_csr_index(struct perf_event *event)
+static uint8_t rvpmu_csr_index(struct perf_event *event)
{
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}

-static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
+static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
bool guest_events = false;
@@ -338,7 +340,7 @@ static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
return cflags;
}

-static int pmu_sbi_ctr_get_idx(struct perf_event *event)
+static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
@@ -348,7 +350,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
uint64_t cbase = 0, cmask = rvpmu->cmask;
unsigned long cflags = 0;

- cflags = pmu_sbi_get_filter_flags(event);
+ cflags = rvpmu_sbi_get_filter_flags(event);

/*
* In legacy mode, we have to force the fixed counters for those events
@@ -385,7 +387,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;

/* Additional sanity check for the counter id */
- if (pmu_sbi_ctr_is_fw(idx)) {
+ if (rvpmu_ctr_is_fw(idx)) {
if (!test_and_set_bit(idx, cpuc->used_fw_ctrs))
return idx;
} else {
@@ -396,7 +398,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;
}

-static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
+static void rvpmu_ctr_clear_idx(struct perf_event *event)
{

struct hw_perf_event *hwc = &event->hw;
@@ -404,7 +406,7 @@ static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
int idx = hwc->idx;

- if (pmu_sbi_ctr_is_fw(idx))
+ if (rvpmu_ctr_is_fw(idx))
clear_bit(idx, cpuc->used_fw_ctrs);
else
clear_bit(idx, cpuc->used_hw_ctrs);
@@ -442,7 +444,7 @@ static bool pmu_sbi_is_fw_event(struct perf_event *event)
return false;
}

-static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig)
+static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
u64 config = event->attr.config;
@@ -483,7 +485,7 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig)
return ret;
}

-static u64 pmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -506,25 +508,25 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event)
return val;
}

-static void pmu_sbi_set_scounteren(void *arg)
+static void rvpmu_set_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;

if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) | (1 << pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) | (1 << rvpmu_csr_index(event)));
}

-static void pmu_sbi_reset_scounteren(void *arg)
+static void rvpmu_reset_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;

if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) & ~(1 << pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) & ~(1 << rvpmu_csr_index(event)));
}

-static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
+static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
@@ -543,17 +545,17 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)

if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_set_scounteren((void *)event);
+ rvpmu_set_scounteren((void *)event);
}

-static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
+static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;

if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_reset_scounteren((void *)event);
+ rvpmu_reset_scounteren((void *)event);

ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, hwc->idx, 1, flag, 0, 0, 0);
if (ret.error && (ret.error != SBI_ERR_ALREADY_STOPPED) &&
@@ -562,7 +564,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
hwc->idx, sbi_err_map_linux_errno(ret.error));
}

-static int pmu_sbi_find_num_ctrs(void)
+static int rvpmu_sbi_find_num_ctrs(void)
{
struct sbiret ret;

@@ -573,7 +575,7 @@ static int pmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}

-static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
{
struct sbiret ret;
int i, num_hw_ctr = 0, num_fw_ctr = 0;
@@ -604,7 +606,7 @@ static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
return 0;
}

-static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu)
{
/*
* No need to check the error because we are disabling all the counters
@@ -614,7 +616,7 @@ static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
0, pmu->cmask, 0, 0, 0, 0);
}

-static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
{
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);

@@ -629,7 +631,7 @@ static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
* while the overflowed counters need to be started with updated initialization
* value.
*/
-static inline void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
+static inline void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
unsigned long ctr_ovf_mask)
{
int idx = 0;
@@ -668,7 +670,7 @@ static inline void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
}
}

-static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
+static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
{
struct perf_sample_data data;
struct pt_regs *regs;
@@ -699,7 +701,7 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
}

pmu = to_riscv_pmu(event->pmu);
- pmu_sbi_stop_hw_ctrs(pmu);
+ rvpmu_sbi_stop_hw_ctrs(pmu);

/* Overflow status register should only be read after counter are stopped */
ALT_SBI_PMU_OVERFLOW(overflow);
@@ -755,13 +757,55 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
}
}

- pmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);

return IRQ_HANDLED;
}

-static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
+static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
+{
+ rvpmu_sbi_ctr_start(event, ival);
+ /* TODO: Counter delegation implementation */
+}
+
+static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
+{
+ rvpmu_sbi_ctr_stop(event, flag);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_find_num_ctrs(void)
+{
+ return rvpmu_sbi_find_num_ctrs();
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
+{
+ return rvpmu_sbi_get_ctrinfo(nctr, mask);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
+{
+ return rvpmu_sbi_event_map(event, econfig);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_ctr_get_idx(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_get_idx(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static u64 rvpmu_ctr_read(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_read(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
{
struct riscv_pmu *pmu = hlist_entry_safe(node, struct riscv_pmu, node);
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
@@ -776,7 +820,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);

/* Stop all the counters so that they can be enabled from perf */
- pmu_sbi_stop_all(pmu);
+ rvpmu_sbi_stop_all(pmu);

if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -788,7 +832,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}

-static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
+static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node)
{
if (riscv_pmu_use_irq) {
disable_percpu_irq(riscv_pmu_irq);
@@ -801,7 +845,7 @@ static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}

-static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
+static int rvpmu_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
{
int ret;
struct cpu_hw_events __percpu *hw_events = pmu->hw_events;
@@ -834,7 +878,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
return -ENODEV;
}

- ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events);
+ ret = request_percpu_irq(riscv_pmu_irq, rvpmu_ovf_handler, "riscv-pmu", hw_events);
if (ret) {
pr_err("registering percpu irq failed [%d]\n", ret);
return ret;
@@ -904,7 +948,7 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}

-static void pmu_sbi_event_init(struct perf_event *event)
+static void rvpmu_event_init(struct perf_event *event)
{
/*
* The permissions are set at event_init so that we do not depend
@@ -918,7 +962,7 @@ static void pmu_sbi_event_init(struct perf_event *event)
event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
}

-static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -946,14 +990,14 @@ static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
* that it is possible to do so to avoid any race.
* And we must notify all cpus here because threads that currently run
* on other cpus will try to directly access the counter too without
- * calling pmu_sbi_ctr_start.
+ * calling rvpmu_sbi_ctr_start.
*/
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_set_scounteren, (void *)event, 1);
+ rvpmu_set_scounteren, (void *)event, 1);
}

-static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -975,7 +1019,7 @@ static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *m

if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_reset_scounteren, (void *)event, 1);
+ rvpmu_reset_scounteren, (void *)event, 1);
}

static void riscv_pmu_update_counter_access(void *info)
@@ -1019,7 +1063,7 @@ static struct ctl_table sbi_pmu_sysctl_table[] = {
{ }
};

-static int pmu_sbi_device_probe(struct platform_device *pdev)
+static int rvpmu_device_probe(struct platform_device *pdev)
{
struct riscv_pmu *pmu = NULL;
int ret = -ENODEV;
@@ -1030,7 +1074,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
if (!pmu)
return -ENOMEM;

- num_counters = pmu_sbi_find_num_ctrs();
+ num_counters = rvpmu_find_num_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1043,10 +1087,10 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
}

/* cache all the information about counters now */
- if (pmu_sbi_get_ctrinfo(num_counters, &cmask))
+ if (rvpmu_get_ctrinfo(num_counters, &cmask))
goto out_free;

- ret = pmu_sbi_setup_irqs(pmu, pdev);
+ ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
pr_info("Perf sampling/filtering is not supported as sscof extension is not available\n");
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
@@ -1055,17 +1099,17 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)

pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->cmask = cmask;
- pmu->ctr_start = pmu_sbi_ctr_start;
- pmu->ctr_stop = pmu_sbi_ctr_stop;
- pmu->event_map = pmu_sbi_event_map;
- pmu->ctr_get_idx = pmu_sbi_ctr_get_idx;
- pmu->ctr_get_width = pmu_sbi_ctr_get_width;
- pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx;
- pmu->ctr_read = pmu_sbi_ctr_read;
- pmu->event_init = pmu_sbi_event_init;
- pmu->event_mapped = pmu_sbi_event_mapped;
- pmu->event_unmapped = pmu_sbi_event_unmapped;
- pmu->csr_index = pmu_sbi_csr_index;
+ pmu->ctr_start = rvpmu_ctr_start;
+ pmu->ctr_stop = rvpmu_ctr_stop;
+ pmu->event_map = rvpmu_event_map;
+ pmu->ctr_get_idx = rvpmu_ctr_get_idx;
+ pmu->ctr_get_width = rvpmu_ctr_get_width;
+ pmu->ctr_clear_idx = rvpmu_ctr_clear_idx;
+ pmu->ctr_read = rvpmu_ctr_read;
+ pmu->event_init = rvpmu_event_init;
+ pmu->event_mapped = rvpmu_event_mapped;
+ pmu->event_unmapped = rvpmu_event_unmapped;
+ pmu->csr_index = rvpmu_csr_index;

ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
if (ret)
@@ -1091,14 +1135,14 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
return ret;
}

-static struct platform_driver pmu_sbi_driver = {
- .probe = pmu_sbi_device_probe,
+static struct platform_driver rvpmu_driver = {
+ .probe = rvpmu_device_probe,
.driver = {
- .name = RISCV_PMU_SBI_PDEV_NAME,
+ .name = RISCV_PMU_PDEV_NAME,
},
};

-static int __init pmu_sbi_devinit(void)
+static int __init rvpmu_devinit(void)
{
int ret;
struct platform_device *pdev;
@@ -1110,20 +1154,20 @@ static int __init pmu_sbi_devinit(void)

ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING,
"perf/riscv/pmu:starting",
- pmu_sbi_starting_cpu, pmu_sbi_dying_cpu);
+ rvpmu_starting_cpu, rvpmu_dying_cpu);
if (ret) {
pr_err("CPU hotplug notifier could not be registered: %d\n",
ret);
return ret;
}

- ret = platform_driver_register(&pmu_sbi_driver);
+ ret = platform_driver_register(&rvpmu_driver);
if (ret)
return ret;

- pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0);
+ pdev = platform_device_register_simple(RISCV_PMU_PDEV_NAME, -1, NULL, 0);
if (IS_ERR(pdev)) {
- platform_driver_unregister(&pmu_sbi_driver);
+ platform_driver_unregister(&rvpmu_driver);
return PTR_ERR(pdev);
}

@@ -1132,4 +1176,4 @@ static int __init pmu_sbi_devinit(void)

return ret;
}
-device_initcall(pmu_sbi_devinit)
+device_initcall(rvpmu_devinit)
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 43282e22ebe1..3d2b1d7913f3 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -13,7 +13,7 @@
#include <linux/ptrace.h>
#include <linux/interrupt.h>

-#ifdef CONFIG_RISCV_PMU
+#ifdef CONFIG_RISCV_PMU_COMMON

/*
* The RISCV_MAX_COUNTERS parameter should be specified.
@@ -21,7 +21,7 @@

#define RISCV_MAX_COUNTERS 64
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
-#define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi"
+#define RISCV_PMU_PDEV_NAME "riscv-pmu"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"

#define RISCV_PMU_STOP_FLAG_RESET 1
@@ -79,10 +79,10 @@ void riscv_pmu_legacy_skip_init(void);
static inline void riscv_pmu_legacy_skip_init(void) {};
#endif
struct riscv_pmu *riscv_pmu_alloc(void);
-#ifdef CONFIG_RISCV_PMU_SBI
+#ifdef CONFIG_RISCV_PMU
int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
#endif

-#endif /* CONFIG_RISCV_PMU */
+#endif /* CONFIG_RISCV_PMU_COMMON */

#endif /* _RISCV_PMU_H */
--
2.34.1


2024-02-17 01:03:09

by Atish Patra

Subject: [PATCH RFC 12/20] RISC-V: perf: Modify the counter discovery mechanism

If both counter delegation and the SBI PMU are present, counter
delegation will be used for hardware PMU counters while the SBI PMU
will be used for firmware counters. Thus, the driver has to probe the
counter info via the SBI PMU to distinguish the firmware counters.

The hybrid scheme also requires improved informational log messages to
tell the user which underlying interface is used for each use case.

Signed-off-by: Atish Patra <[email protected]>
---
drivers/perf/riscv_pmu_dev.c | 120 ++++++++++++++++++++++++++---------
1 file changed, 90 insertions(+), 30 deletions(-)

diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 3d27bd65f140..dfc0ddee9da4 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -35,6 +35,11 @@
PMU_FORMAT_ATTR(event, "config:0-47");
PMU_FORMAT_ATTR(firmware, "config:63");

+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
+static bool cdeleg_available;
+static bool sbi_available;
+
static struct attribute *riscv_arch_formats_attr[] = {
&format_attr_event.attr,
&format_attr_firmware.attr,
@@ -56,7 +61,8 @@ static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;

/*
* This structure is SBI specific but counter delegation also requires counter
- * width, csr mapping. Reuse it for now.
+ * width, csr mapping. Reuse it for now as we can have firmware counters for
+ * platforms with counter delegation support.
* RISC-V doesn't have heterogeneous harts yet. This needs to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -67,6 +73,8 @@ static unsigned int riscv_pmu_irq;

/* Cache the available counters in a bitmask */
static unsigned long cmask;
+/* Cache the available firmware counters in another bitmask */
+static unsigned long firmware_cmask;

struct sbi_pmu_event_data {
union {
@@ -575,35 +583,49 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}

-static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static int rvpmu_deleg_find_ctrs(void)
+{
+ /* TODO */
+ return -1;
+}
+
+static int rvpmu_sbi_get_ctrinfo(int nsbi_ctr, int ndeleg_ctr)
{
struct sbiret ret;
- int i, num_hw_ctr = 0, num_fw_ctr = 0;
+ int i, num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0;
union sbi_pmu_ctr_info cinfo;

- pmu_ctr_list = kcalloc(nctr, sizeof(*pmu_ctr_list), GFP_KERNEL);
- if (!pmu_ctr_list)
- return -ENOMEM;
-
- for (i = 0; i < nctr; i++) {
+ for (i = 0; i < nsbi_ctr; i++) {
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_GET_INFO, i, 0, 0, 0, 0, 0);
if (ret.error)
/* The logical counter ids are not expected to be contiguous */
continue;

- *mask |= BIT(i);
-
cinfo.value = ret.value;
if (cinfo.type == SBI_PMU_CTR_TYPE_FW)
num_fw_ctr++;
- else
+
+ if (!cdeleg_available) {
num_hw_ctr++;
- pmu_ctr_list[i].value = cinfo.value;
+ cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ } else if (cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ /* Track firmware counters in a different mask */
+ firmware_cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ }
+
}

- pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, num_hw_ctr);
+ if (cdeleg_available) {
+ pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, ndeleg_ctr);
+ num_ctr = num_fw_ctr + ndeleg_ctr;
+ } else {
+ pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, num_hw_ctr);
+ num_ctr = nsbi_ctr;
+ }

- return 0;
+ return num_ctr;
}

static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu)
@@ -775,16 +797,33 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
/* TODO: Counter delegation implementation */
}

-static int rvpmu_find_num_ctrs(void)
+static int rvpmu_find_ctrs(void)
{
- return rvpmu_sbi_find_num_ctrs();
- /* TODO: Counter delegation implementation */
-}
+ int num_sbi_counters = 0, num_deleg_counters = 0, num_counters = 0;

-static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
-{
- return rvpmu_sbi_get_ctrinfo(nctr, mask);
- /* TODO: Counter delegation implementation */
+ /*
+ * We don't know how many firmware counters are available. Just allocate
+ * room for the maximum counters the driver can support. The default is 64 anyway.
+ */
+ pmu_ctr_list = kcalloc(RISCV_MAX_COUNTERS, sizeof(*pmu_ctr_list),
+ GFP_KERNEL);
+ if (!pmu_ctr_list)
+ return -ENOMEM;
+
+ if (cdeleg_available)
+ num_deleg_counters = rvpmu_deleg_find_ctrs();
+
+ /* SBI probing is still required for firmware counters even when delegation is available */
+ if (sbi_available)
+ num_sbi_counters = rvpmu_sbi_find_num_ctrs();
+
+ if (num_sbi_counters >= RISCV_MAX_COUNTERS || num_deleg_counters >= RISCV_MAX_COUNTERS)
+ return -ENOSPC;
+
+ /* cache all the information about counters now */
+ num_counters = rvpmu_sbi_get_ctrinfo(num_sbi_counters, num_deleg_counters);
+
+ return num_counters;
}

static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
@@ -1069,12 +1108,21 @@ static int rvpmu_device_probe(struct platform_device *pdev)
int ret = -ENODEV;
int num_counters;

- pr_info("SBI PMU extension is available\n");
+ if (cdeleg_available) {
+ pr_info("hpmcounters will use the counter delegation ISA extension\n");
+ if (sbi_available)
+ pr_info("Firmware counters will be use SBI PMU extension\n");
+ else
+ pr_info("Firmware counters will be not available as SBI PMU extension is not present\n");
+ } else if (sbi_available) {
+ pr_info("Both hpmcounters and firmware counters will use SBI PMU extension\n");
+ }
+
pmu = riscv_pmu_alloc();
if (!pmu)
return -ENOMEM;

- num_counters = rvpmu_find_num_ctrs();
+ num_counters = rvpmu_find_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1086,9 +1134,6 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pr_info("SBI returned more than maximum number of counters. Limiting the number of counters to %d\n", num_counters);
}

- /* cache all the information about counters now */
- if (rvpmu_get_ctrinfo(num_counters, &cmask))
- goto out_free;

ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
@@ -1147,11 +1192,26 @@ static int __init rvpmu_devinit(void)
int ret;
struct platform_device *pdev;

- if (sbi_spec_version < sbi_mk_version(0, 3) ||
- !sbi_probe_extension(SBI_EXT_PMU)) {
- return 0;
+ if (sbi_spec_version >= sbi_mk_version(0, 3) &&
+ sbi_probe_extension(SBI_EXT_PMU)) {
+ static_branch_enable(&riscv_pmu_sbi_available);
+ sbi_available = true;
+ }
+
+ /*
+ * We need all three extensions to be present to access the counters
+ * in S-mode via Supervisor Counter delegation.
+ */
+ if (riscv_isa_extension_available(NULL, SSCCFG) &&
+ riscv_isa_extension_available(NULL, SMCDELEG) &&
+ riscv_isa_extension_available(NULL, SSCSRIND)) {
+ static_branch_enable(&riscv_pmu_cdeleg_available);
+ cdeleg_available = true;
}

+ if (!(sbi_available || cdeleg_available))
+ return 0;
+
ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING,
"perf/riscv/pmu:starting",
rvpmu_starting_cpu, rvpmu_dying_cpu);
--
2.34.1


2024-02-17 01:04:22

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 13/20] RISC-V: perf: Implement supervisor counter delegation support

There are a few new RISC-V ISA extensions (ssccfg, sscsrind, smcntrpmf)
which allow the hpmcounters/hpmevents to be programmed directly from
S-mode. The implementation detects these ISA extensions at runtime and
uses them, if available, instead of the SBI PMU extension. The SBI PMU
extension will still be used for firmware counters if the user requests it.

The current Linux driver relies on the event encoding defined by the
SBI PMU specification for standard perf events. However, there is no
standard event encoding available in the ISA. In the future, we may
want to decouple the counter delegation and SBI PMU completely. In
that case, platforms with counter delegation support must rely on the
event encoding defined in the perf json file only.

For firmware events, the driver will continue to use the SBI PMU
encoding, as firmware events cannot be supported without SBI PMU.
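
The delegated hpmcounters are discovered with a WARL write/read on
scountinhibit. Condensed from rvpmu_deleg_find_ctrs() below, the core
of the detection is:

  /*
   * M-mode only lets the inhibit bits of delegated counters stick, so
   * writing all ones and reading back yields the delegated counter map.
   */
  scountinhibit_old = csr_read(CSR_SCOUNTINHIBIT);
  csr_write(CSR_SCOUNTINHIBIT, -1);
  cmask = csr_read(CSR_SCOUNTINHIBIT);
  csr_write(CSR_SCOUNTINHIBIT, scountinhibit_old);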

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/csr.h | 1 +
arch/riscv/include/asm/sbi.h | 2 +-
arch/riscv/kvm/vcpu_pmu.c | 2 +-
drivers/perf/riscv_pmu_dev.c | 428 ++++++++++++++++++++++++++++-----
include/linux/perf/riscv_pmu.h | 3 +
5 files changed, 369 insertions(+), 67 deletions(-)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index e1bf1466f32e..efcd956c517a 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -239,6 +239,7 @@
#endif

#define SISELECT_SSCCFG_BASE 0x40
+#define HPMEVENT_MASK GENMASK_ULL(63, 56)

/* symbolic CSR names: */
#define CSR_CYCLE 0xc00
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 6e68f8dff76b..ad0c8a686d6c 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -147,7 +147,7 @@ union sbi_pmu_ctr_info {
};
};

-#define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0)
+#define RISCV_PMU_SBI_RAW_EVENT_MASK GENMASK_ULL(47, 0)
#define RISCV_PMU_RAW_EVENT_IDX 0x20000

/** General pmu event codes specified in SBI PMU extension */
diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c
index 86391a5061dd..4c9502a79a54 100644
--- a/arch/riscv/kvm/vcpu_pmu.c
+++ b/arch/riscv/kvm/vcpu_pmu.c
@@ -125,7 +125,7 @@ static u64 kvm_pmu_get_perf_event_config(unsigned long eidx, uint64_t evt_data)
config = kvm_pmu_get_perf_event_cache_config(ecode);
break;
case SBI_PMU_EVENT_TYPE_RAW:
- config = evt_data & RISCV_PMU_RAW_EVENT_MASK;
+ config = evt_data & RISCV_PMU_SBI_RAW_EVENT_MASK;
break;
case SBI_PMU_EVENT_TYPE_FW:
if (ecode < SBI_PMU_FW_MAX)
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index dfc0ddee9da4..0cdad0dafb71 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -23,6 +23,8 @@
#include <asm/errata_list.h>
#include <asm/sbi.h>
#include <asm/cpufeature.h>
+#include <asm/hwcap.h>
+#include <asm/csr_ind.h>

#define SYSCTL_NO_USER_ACCESS 0
#define SYSCTL_USER_ACCESS 1
@@ -32,7 +34,20 @@
#define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS)
#define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY)

-PMU_FORMAT_ATTR(event, "config:0-47");
+#define RVPMU_SBI_PMU_FORMAT_ATTR "config:0-47"
+#define RVPMU_CDELEG_PMU_FORMAT_ATTR "config:0-55"
+
+static ssize_t __maybe_unused rvpmu_format_show(struct device *dev,
+ struct device_attribute *attr, char *buf);
+
+#define RVPMU_ATTR_ENTRY(_name, _func, _config) ( \
+ &((struct dev_ext_attribute[]) { \
+ { __ATTR(_name, 0444, _func, NULL), (void *)_config } \
+ })[0].attr.attr)
+
+#define RVPMU_FORMAT_ATTR_ENTRY(_name, _config) \
+ RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
+
PMU_FORMAT_ATTR(firmware, "config:63");

static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
@@ -40,19 +55,35 @@ static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
static bool cdeleg_available;
static bool sbi_available;

-static struct attribute *riscv_arch_formats_attr[] = {
- &format_attr_event.attr,
+static struct attribute *riscv_sbi_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_SBI_PMU_FORMAT_ATTR),
+ &format_attr_firmware.attr,
+ NULL,
+};
+
+static struct attribute_group riscv_sbi_pmu_format_group = {
+ .name = "format",
+ .attrs = riscv_sbi_pmu_formats_attr,
+};
+
+static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
+ &riscv_sbi_pmu_format_group,
+ NULL,
+};
+
+static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
NULL,
};

-static struct attribute_group riscv_pmu_format_group = {
+static struct attribute_group riscv_cdeleg_pmu_format_group = {
.name = "format",
- .attrs = riscv_arch_formats_attr,
+ .attrs = riscv_cdeleg_pmu_formats_attr,
};

-static const struct attribute_group *riscv_pmu_attr_groups[] = {
- &riscv_pmu_format_group,
+static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
+ &riscv_cdeleg_pmu_format_group,
NULL,
};

@@ -275,6 +306,14 @@ static const struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_M
},
};

+static ssize_t rvpmu_format_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dev_ext_attribute *eattr = container_of(attr,
+ struct dev_ext_attribute, attr);
+ return sysfs_emit(buf, "%s\n", (char *)eattr->var);
+}
+
static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
@@ -327,6 +366,39 @@ static uint8_t rvpmu_csr_index(struct perf_event *event)
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}

+static uint64_t get_deleg_priv_filter_bits(struct perf_event *event)
+{
+ uint64_t priv_filter_bits = 0;
+ bool guest_events = false;
+
+ if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)
+ guest_events = true;
+ if (event->attr.exclude_kernel)
+ priv_filter_bits |= guest_events ? HPMEVENT_VSINH : HPMEVENT_SINH;
+ if (event->attr.exclude_user)
+ priv_filter_bits |= guest_events ? HPMEVENT_VUINH : HPMEVENT_UINH;
+ if (guest_events && event->attr.exclude_hv)
+ priv_filter_bits |= HPMEVENT_SINH;
+ if (event->attr.exclude_host)
+ priv_filter_bits |= HPMEVENT_UINH | HPMEVENT_SINH;
+ if (event->attr.exclude_guest)
+ priv_filter_bits |= HPMEVENT_VSINH | HPMEVENT_VUINH;
+
+
+ return priv_filter_bits;
+}
+
+static bool pmu_sbi_is_fw_event(struct perf_event *event)
+{
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+
+ if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1))
+ return true;
+ else
+ return false;
+}
+
static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
@@ -355,7 +427,8 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
struct sbiret ret;
int idx;
- uint64_t cbase = 0, cmask = rvpmu->cmask;
+ uint64_t cbase = 0;
+ unsigned long ctr_mask = rvpmu->cmask;
unsigned long cflags = 0;

cflags = rvpmu_sbi_get_filter_flags(event);
@@ -368,21 +441,24 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
if (hwc->flags & PERF_EVENT_FLAG_LEGACY) {
if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = 1;
+ ctr_mask = 1;
} else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = 1UL << (CSR_INSTRET - CSR_CYCLE);
+ ctr_mask = 1UL << (CSR_INSTRET - CSR_CYCLE);
}
}

+ if (pmu_sbi_is_fw_event(event) && cdeleg_available)
+ ctr_mask = firmware_cmask;
+
/* retrieve the available counter index */
#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config,
+ ctr_mask, cflags, hwc->event_base, hwc->config,
hwc->config >> 32);
#else
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config, 0);
+ ctr_mask, cflags, hwc->event_base, hwc->config, 0);
#endif
if (ret.error) {
pr_debug("Not able to find a counter for event %lx config %llx\n",
@@ -391,7 +467,7 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
}

idx = ret.value;
- if (!test_bit(idx, &rvpmu->cmask) || !pmu_ctr_list[idx].value)
+ if (!test_bit(idx, &ctr_mask) || !pmu_ctr_list[idx].value)
return -ENOENT;

/* Additional sanity check for the counter id */
@@ -441,15 +517,20 @@ static int pmu_event_find_cache(u64 config)
return ret;
}

-static bool pmu_sbi_is_fw_event(struct perf_event *event)
+static int rvpmu_deleg_event_map(struct perf_event *event, u64 *econfig)
{
- u32 type = event->attr.type;
u64 config = event->attr.config;

- if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1))
- return true;
- else
- return false;
+ /*
+ * The perf tool for RISC-V is expected to remap the standard perf event to
+ * the platform specific encoding if the counter delegation extension is
+ * present. Thus, the mapped value should be the event encoding value
+ * specified in userspace. No additional mapping/validation can be done in the driver.
+ */
+ *econfig = config & RISCV_PMU_DELEG_RAW_EVENT_MASK;
+
+ /* event_base is not used for counter delegation; config is sufficient for event encoding */
+ return 0;
}

static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
@@ -476,7 +557,7 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
* raw event and firmware events.
*/
bSoftware = config >> 63;
- raw_config_val = config & RISCV_PMU_RAW_EVENT_MASK;
+ raw_config_val = config & RISCV_PMU_SBI_RAW_EVENT_MASK;
if (bSoftware) {
ret = (raw_config_val & 0xFFFF) |
(SBI_PMU_EVENT_TYPE_FW << 16);
@@ -493,7 +574,7 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
return ret;
}

-static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -550,10 +631,6 @@ static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
pr_err("Starting counter idx %d failed with error %d\n",
hwc->idx, sbi_err_map_linux_errno(ret.error));
-
- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_set_scounteren((void *)event);
}

static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
@@ -561,10 +638,6 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;

- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_reset_scounteren((void *)event);
-
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, hwc->idx, 1, flag, 0, 0, 0);
if (ret.error && (ret.error != SBI_ERR_ALREADY_STOPPED) &&
flag != SBI_PMU_STOP_FLAG_RESET)
@@ -583,12 +656,6 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}

-static int rvpmu_deleg_find_ctrs(void)
-{
- /* TODO */
- return -1;
-}
-
static int rvpmu_sbi_get_ctrinfo(int nsbi_ctr, int ndeleg_ctr)
{
struct sbiret ret;
@@ -647,19 +714,85 @@ static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
cpu_hw_evt->used_hw_ctrs[0], 0, 0, 0, 0);
}

+static void rvpmu_deleg_ctr_start_mask(unsigned long mask)
+{
+ unsigned long scountinhibit_val = 0;
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event)
+{
+ unsigned long hpmevent_curr;
+ unsigned long of_mask;
+ struct hw_perf_event *hwc = &event->hw;
+ int counter_idx = hwc->idx;
+ unsigned long sip_val = csr_read(CSR_SIP);
+
+ if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP))
+ return;
+
+#if defined(CONFIG_32BIT)
+ hpmevent_curr = csr_ind_read(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = (u32)~HPMEVENTH_OF;
+#else
+ hpmevent_curr = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = ~HPMEVENT_OF;
+#endif
+
+ hpmevent_curr &= of_mask;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
+#endif
+}
+
+static void rvpmu_deleg_ctr_start(struct perf_event *event, u64 ival)
+{
+ unsigned long scountinhibit_val = 0;
+ struct hw_perf_event *hwc = &event->hw;
+
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival & 0xFFFFFFFF);
+ csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, hwc->idx, ival >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival);
+#endif
+
+ rvpmu_deleg_ctr_enable_irq(event);
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~(1 << hwc->idx);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_stop_mask(unsigned long mask)
+{
+ unsigned long scountinhibit_val = 0;
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val |= mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
/*
* This function starts all the used counters in a two step approach.
* Any counter that did not overflow can be started in a single step,
* while the overflowed counters need to be started with updated initialization
* value.
*/
-static inline void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
+static inline void rvpmu_start_overflow_mask(struct riscv_pmu *pmu,
unsigned long ctr_ovf_mask)
{
int idx = 0;
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
struct perf_event *event;
- unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
unsigned long ctr_start_mask = 0;
uint64_t max_period;
struct hw_perf_event *hwc;
@@ -667,9 +800,12 @@ static inline void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,

ctr_start_mask = cpu_hw_evt->used_hw_ctrs[0] & ~ctr_ovf_mask;

- /* Start all the counters that did not overflow in a single shot */
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, 0, ctr_start_mask,
- 0, 0, 0, 0);
+ if (static_branch_likely(&riscv_pmu_cdeleg_available))
+ rvpmu_deleg_ctr_start_mask(ctr_start_mask);
+ else
+ /* Start all the counters that did not overflow in a single shot */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, 0, ctr_start_mask,
+ 0, 0, 0, 0);

/* Reinitialize and start all the counter that overflowed */
while (ctr_ovf_mask) {
@@ -678,13 +814,10 @@ static inline void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
hwc = &event->hw;
max_period = riscv_pmu_ctr_get_width_mask(event);
init_val = local64_read(&hwc->prev_count) & max_period;
-#if defined(CONFIG_32BIT)
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, init_val >> 32, 0);
-#else
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, 0, 0);
-#endif
+ if (static_branch_likely(&riscv_pmu_cdeleg_available))
+ rvpmu_deleg_ctr_start(event, init_val);
+ else
+ rvpmu_sbi_ctr_start(event, init_val);
perf_event_update_userpage(event);
}
ctr_ovf_mask = ctr_ovf_mask >> 1;
@@ -723,7 +856,10 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
}

pmu = to_riscv_pmu(event->pmu);
- rvpmu_sbi_stop_hw_ctrs(pmu);
+ if (static_branch_likely(&riscv_pmu_cdeleg_available))
+ rvpmu_deleg_ctr_stop_mask(cpu_hw_evt->used_hw_ctrs[0]);
+ else
+ rvpmu_sbi_stop_hw_ctrs(pmu);

/* Overflow status register should only be read after counters are stopped */
ALT_SBI_PMU_OVERFLOW(overflow);
@@ -779,22 +915,174 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
}
}

- rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);

return IRQ_HANDLED;
}

+static int get_deleg_hw_ctr_width(int counter_offset)
+{
+ unsigned long hpm_warl;
+ int num_bits;
+
+ if (counter_offset < 3 || counter_offset > 31)
+ return 0;
+
+ hpm_warl = csr_ind_warl(CSR_SIREG, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ num_bits = __fls(hpm_warl);
+
+#if defined(CONFIG_32BIT)
+ hpm_warl = csr_ind_warl(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ num_bits += __fls(hpm_warl);
+#endif
+ return num_bits;
+}
+
+static int rvpmu_deleg_find_ctrs(void)
+{
+ int i, num_hw_ctr = 0;
+ union sbi_pmu_ctr_info cinfo;
+ unsigned long scountinhibit_old = 0;
+
+ /* Do a WARL write/read to detect which hpmcounters have been delegated */
+ scountinhibit_old = csr_read(CSR_SCOUNTINHIBIT);
+ csr_write(CSR_SCOUNTINHIBIT, -1);
+ cmask = csr_read(CSR_SCOUNTINHIBIT);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_old);
+
+ for_each_set_bit(i, &cmask, RISCV_MAX_HW_COUNTERS) {
+ if (unlikely(i == 1))
+ continue; /* This should never happen as TM is read only */
+ cinfo.value = 0;
+ cinfo.type = SBI_PMU_CTR_TYPE_HW;
+ /*
+ * If counter delegation is enabled, the csr stored to the cinfo will
+ * be a virtual counter that the delegation attempts to read.
+ */
+ cinfo.csr = CSR_CYCLE + i;
+ if (i == 0 || i == 2)
+ cinfo.width = 63;
+ else
+ cinfo.width = get_deleg_hw_ctr_width(i) - 1;
+
+ num_hw_ctr++;
+ pmu_ctr_list[i].value = cinfo.value;
+ }
+
+ return num_hw_ctr;
+}
+
+static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ return -EINVAL;
+}
+
+static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ unsigned long hw_ctr_mask = 0;
+
+ /*
+ * TODO: Treat every hpmcounter can monitor every event for now.
+ * The event to counter mapping should come from the json file.
+ * The mapping should also tell if sampling is supported or not.
+ */
+
+ /* Select only hpmcounters */
+ hw_ctr_mask = cmask & (~0x7);
+ hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
+ return __ffs(hw_ctr_mask);
+}
+
+static void update_deleg_hpmevent(int counter_idx, uint64_t event_value, uint64_t filter_bits)
+{
+ uint64_t hpmevent_value = 0;
+
+ /* The OF bit should be enabled during the start if sampling is requested */
+ hpmevent_value = (event_value & ~HPMEVENT_MASK) | filter_bits | HPMEVENT_OF;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value & 0xFFFFFFFF);
+ if (riscv_isa_extension_available(NULL, SSCOFPMF))
+ csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx,
+ hpmevent_value >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value);
+#endif
+}
+
+static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+ unsigned long hw_ctr_max_id;
+ uint64_t priv_filter;
+ int idx;
+
+ /*
+ * TODO: We should not rely on SBI Perf encoding to check if the event
+ * is a fixed one or not.
+ */
+ if (!is_sampling_event(event)) {
+ idx = get_deleg_fixed_hw_idx(cpuc, event);
+ if (idx == 0 || idx == 2) {
+ /* Priv mode filter bits are only available if smcntrpmf is present */
+ if (riscv_isa_extension_available(NULL, SMCNTRPMF))
+ goto found_idx;
+ else
+ goto skip_update;
+ }
+ }
+
+ hw_ctr_max_id = __fls(cmask);
+ idx = get_deleg_next_hpm_hw_idx(cpuc, event);
+ if (idx < 3 || idx > hw_ctr_max_id)
+ goto out_err;
+found_idx:
+ priv_filter = get_deleg_priv_filter_bits(event);
+ update_deleg_hpmevent(idx, hwc->config, priv_filter);
+skip_update:
+ if (!test_and_set_bit(idx, cpuc->used_hw_ctrs))
+ return idx;
+out_err:
+ return -ENOENT;
+}
+
static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
{
- rvpmu_sbi_ctr_start(event, ival);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (static_branch_likely(&riscv_pmu_cdeleg_available) && !pmu_sbi_is_fw_event(event))
+ rvpmu_deleg_ctr_start(event, ival);
+ else
+ rvpmu_sbi_ctr_start(event, ival);
+
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_set_scounteren((void *)event);
}

static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
{
- rvpmu_sbi_ctr_stop(event, flag);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_reset_scounteren((void *)event);
+
+ if (static_branch_likely(&riscv_pmu_cdeleg_available) &&
+ !pmu_sbi_is_fw_event(event)) {
+ /*
+ * The counter is already stopped. No need to stop again. Counter
+ * mapping will be reset in clear_idx function.
+ */
+ if (flag != RISCV_PMU_STOP_FLAG_RESET)
+ rvpmu_deleg_ctr_stop_mask((1 << hwc->idx));
+ } else {
+ rvpmu_sbi_ctr_stop(event, flag);
+ }
}

static int rvpmu_find_ctrs(void)
@@ -828,20 +1116,18 @@ static int rvpmu_find_ctrs(void)

static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
{
- return rvpmu_sbi_event_map(event, econfig);
- /* TODO: Counter delegation implementation */
+ if (static_branch_likely(&riscv_pmu_cdeleg_available) && !pmu_sbi_is_fw_event(event))
+ return rvpmu_deleg_event_map(event, econfig);
+ else
+ return rvpmu_sbi_event_map(event, econfig);
}

static int rvpmu_ctr_get_idx(struct perf_event *event)
{
- return rvpmu_sbi_ctr_get_idx(event);
- /* TODO: Counter delegation implementation */
-}
-
-static u64 rvpmu_ctr_read(struct perf_event *event)
-{
- return rvpmu_sbi_ctr_read(event);
- /* TODO: Counter delegation implementation */
+ if (static_branch_likely(&riscv_pmu_cdeleg_available) && !pmu_sbi_is_fw_event(event))
+ return rvpmu_deleg_ctr_get_idx(event);
+ else
+ return rvpmu_sbi_ctr_get_idx(event);
}

static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
@@ -859,7 +1145,16 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);

/* Stop all the counters so that they can be enabled from perf */
- rvpmu_sbi_stop_all(pmu);
+ if (static_branch_likely(&riscv_pmu_cdeleg_available)) {
+ rvpmu_deleg_ctr_stop_mask(cmask);
+ if (static_branch_likely(&riscv_pmu_sbi_available)) {
+ /* Stop the firmware counters as well */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, 0, firmware_cmask,
+ 0, 0, 0, 0);
+ }
+ } else {
+ rvpmu_sbi_stop_all(pmu);
+ }

if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -1142,7 +1437,10 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
}

- pmu->pmu.attr_groups = riscv_pmu_attr_groups;
+ if (cdeleg_available)
+ pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups;
+ else
+ pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
pmu->cmask = cmask;
pmu->ctr_start = rvpmu_ctr_start;
pmu->ctr_stop = rvpmu_ctr_stop;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 3d2b1d7913f3..f878369fecc8 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -20,6 +20,7 @@
*/

#define RISCV_MAX_COUNTERS 64
+#define RISCV_MAX_HW_COUNTERS 32
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
#define RISCV_PMU_PDEV_NAME "riscv-pmu"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"
@@ -28,6 +29,8 @@

#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1

+#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+
struct cpu_hw_events {
/* currently enabled events */
int n_events;
--
2.34.1


2024-02-17 01:04:23

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 14/20] RISC-V: perf: Use config2 for event to counter mapping

The counter restriction specified in the json file is passed to the
driver via the config2 parameter in the perf attributes. This allows
any platform vendor to define a custom mapping between events and
hpmcounters without any rules defined in the ISA.

However, the cycle and instruction counters are fixed (0 and 2
respectively) by the ISA. The platform vendor must specify these in
the json file if they are intended to be used while profiling.
Otherwise, the vendor can just specify the alternate hpmcounters that
may monitor and/or sample the cycle/instruction counts.
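
For example, a json entry carrying "Counter": "3,4,5,6,7,8,9,10" is
translated by the perf tool into counterid_mask=0x7f8 in config2, and
the driver only has to intersect that mask with the delegated
hpmcounters. A condensed sketch of the selection, mirroring
get_deleg_next_hpm_hw_idx() below:

  hw_ctr_mask = cmask & ~0x7;                  /* hpmcounters only */
  if (event->attr.config2)
          hw_ctr_mask &= event->attr.config2;  /* json 'Counter' mask */
  hw_ctr_mask &= ~cpuc->used_hw_ctrs[0];       /* skip busy counters */
  idx = __ffs(hw_ctr_mask);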

Signed-off-by: Atish Patra <[email protected]>
---
drivers/perf/riscv_pmu_dev.c | 36 +++++++++++++++++++++++-----------
include/linux/perf/riscv_pmu.h | 2 ++
2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 0cdad0dafb71..5bad4131e920 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -49,6 +49,7 @@ static ssize_t __maybe_unused rvpmu_format_show(struct device *dev,
RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)

PMU_FORMAT_ATTR(firmware, "config:63");
+PMU_FORMAT_ATTR(counterid_mask, "config2:0-31");

static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
@@ -74,6 +75,7 @@ static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
+ &format_attr_counterid_mask.attr,
NULL,
};

@@ -974,23 +976,39 @@ static int rvpmu_deleg_find_ctrs(void)
return num_hw_ctr;
}

+/* The json file must correctly specify whether counter 0 or counter 2 is
+ * available in the counter list for cycle/instret events. Otherwise, the
+ * driver has no way to figure out if a fixed counter must be used, and it
+ * picks a programmable counter if available.
+ */
static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
- return -EINVAL;
+ if (!event->attr.config2)
+ return -EINVAL;
+
+ if (event->attr.config2 & RISCV_PMU_CYCLE_FIXED_CTR_MASK)
+ return 0; /* CY counter */
+ else if (event->attr.config2 & RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK)
+ return 2; /* IR counter */
+ else
+ return -EINVAL;
}

static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
unsigned long hw_ctr_mask = 0;

- /*
- * TODO: Treat every hpmcounter can monitor every event for now.
- * The event to counter mapping should come from the json file.
- * The mapping should also tell if sampling is supported or not.
- */
-
/* Select only hpmcounters */
hw_ctr_mask = cmask & (~0x7);
+
+ /*
+ * Mask off the counters that can't monitor this event (specified via json)
+ * The counter mask for this event is set in config2 via the property 'Counter'
+ * in the json file or manual configuration of config2. If the config2 is not set, it
+ * is assumed all the available hpmcounters can monitor this event.
+ */
+ if (event->attr.config2)
+ hw_ctr_mask = hw_ctr_mask & event->attr.config2;
hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
return __ffs(hw_ctr_mask);
}
@@ -1020,10 +1038,6 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
uint64_t priv_filter;
int idx;

- /*
- * TODO: We should not rely on SBI Perf encoding to check if the event
- * is a fixed one or not.
- */
if (!is_sampling_event(event)) {
idx = get_deleg_fixed_hw_idx(cpuc, event);
if (idx == 0 || idx == 2) {
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index f878369fecc8..425edd6685a9 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -30,6 +30,8 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1

#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+#define RISCV_PMU_CYCLE_FIXED_CTR_MASK 0x01
+#define RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK 0x04

struct cpu_hw_events {
/* currently enabled events */
--
2.34.1


2024-02-17 01:04:57

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 15/20] tools/perf: Add arch hooks to override perf standard events

RISC-V doesn't have any standard event encoding defined in the ISA.
The cycle/instruction events are defined in the ISA, but the lack of
event encoding allows vendors to choose their own encoding scheme.
These events directly map to perf cycle/instruction events, which get
decoded as per the perf definitions. An arch hook allows the RISC-V
implementation to override the encodings at runtime if a vendor has
specified them via a json file.

The alternate solution would be to define vendor specific encodings in
the driver, similar to other architectures. However, these would grow
over time and become unmaintainable, as the number of RISC-V vendors
can be huge. For example, with the qemu/virt json added later in this
series, the generic "cycles" event is rewritten to event=0x01 before
the event is opened.

Signed-off-by: Atish Patra <[email protected]>
---
tools/perf/arch/riscv/util/Build | 1 +
tools/perf/arch/riscv/util/evlist.c | 59 +++++++++++++++++++
tools/perf/builtin-record.c | 3 +
tools/perf/builtin-stat.c | 2 +
tools/perf/builtin-top.c | 3 +
.../pmu-events/arch/riscv/arch-standard.json | 10 ++++
tools/perf/pmu-events/jevents.py | 6 ++
tools/perf/util/evlist.c | 6 ++
tools/perf/util/evlist.h | 6 ++
9 files changed, 96 insertions(+)
create mode 100644 tools/perf/arch/riscv/util/evlist.c
create mode 100644 tools/perf/pmu-events/arch/riscv/arch-standard.json

diff --git a/tools/perf/arch/riscv/util/Build b/tools/perf/arch/riscv/util/Build
index 603dbb5ae4dc..b581fb3d8677 100644
--- a/tools/perf/arch/riscv/util/Build
+++ b/tools/perf/arch/riscv/util/Build
@@ -1,5 +1,6 @@
perf-y += perf_regs.o
perf-y += header.o
+perf-y += evlist.o

perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/riscv/util/evlist.c b/tools/perf/arch/riscv/util/evlist.c
new file mode 100644
index 000000000000..9ad287c6f396
--- /dev/null
+++ b/tools/perf/arch/riscv/util/evlist.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include "util/pmu.h"
+#include "util/pmus.h"
+#include "util/evlist.h"
+#include "util/parse-events.h"
+#include "util/event.h"
+#include "evsel.h"
+
+static int pmu_update_cpu_stdevents_callback(const struct pmu_event *pe,
+ const struct pmu_events_table *table __maybe_unused,
+ void *vdata)
+{
+ struct evsel *evsel = vdata;
+ struct parse_events_terms terms;
+ int err;
+ struct perf_pmu *pmu = perf_pmus__find("cpu");
+
+ if (!pe->event)
+ return 0;
+
+ parse_events_terms__init(&terms);
+ err = parse_events_terms(&terms, pe->event, NULL);
+ if (err)
+ goto out_free;
+ err = perf_pmu__config_terms(pmu, &evsel->core.attr, &terms,
+ /*zero=*/true, /*err=*/NULL);
+ if (err)
+ goto out_free;
+
+out_free:
+ parse_events_terms__exit(&terms);
+ return 0;
+}
+
+int arch_evlist__override_default_attrs(struct evlist *evlist, const char *pmu_name)
+{
+ struct evsel *evsel;
+ struct perf_pmu *pmu = perf_pmus__find(pmu_name);
+ static const char *const overriden_event_arr[] = {"cycles", "instructions",
+ "dTLB-load-misses", "dTLB-store-misses",
+ "iTLB-load-misses"};
+ unsigned int i, len = sizeof(overriden_event_arr) / sizeof(char *);
+
+ if (!pmu)
+ return 0;
+
+ for (i = 0; i < len; i++) {
+ if (perf_pmus__have_event(pmu_name, overriden_event_arr[i])) {
+ evsel = evlist__find_evsel_by_str(evlist, overriden_event_arr[i]);
+ if (!evsel)
+ continue;
+ pmu_events_table__find_event(pmu->events_table, pmu,
+ overriden_event_arr[i],
+ pmu_update_cpu_stdevents_callback, evsel);
+ }
+ }
+
+ return 0;
+}
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 86c910125172..305c2c030208 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -4152,6 +4152,9 @@ int cmd_record(int argc, const char **argv)
goto out;
}

+ if (arch_evlist__override_default_attrs(rec->evlist, "cpu"))
+ goto out;
+
if (rec->opts.target.tid && !rec->opts.no_inherit_set)
rec->opts.no_inherit = true;

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 5fe9abc6a524..a0feafc5be2c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2713,6 +2713,8 @@ int cmd_stat(int argc, const char **argv)

if (add_default_attributes())
goto out;
+ if (arch_evlist__override_default_attrs(evsel_list, "cpu"))
+ goto out;

if (stat_config.cgroup_list) {
if (nr_cgroups > 0) {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5301d1badd43..7e268f239df0 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1672,6 +1672,9 @@ int cmd_top(int argc, const char **argv)
goto out_delete_evlist;
}

+ if (arch_evlist__override_default_attrs(top.evlist, "cpu"))
+ goto out_delete_evlist;
+
status = evswitch__init(&top.evswitch, top.evlist, stderr);
if (status)
goto out_delete_evlist;
diff --git a/tools/perf/pmu-events/arch/riscv/arch-standard.json b/tools/perf/pmu-events/arch/riscv/arch-standard.json
new file mode 100644
index 000000000000..96e21f088558
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/arch-standard.json
@@ -0,0 +1,10 @@
+[
+ {
+ "EventName": "cycles",
+ "BriefDescription": "cycle executed"
+ },
+ {
+ "EventName": "instructions",
+ "BriefDescription": "instruction retired"
+ }
+]
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 81e465a43c75..30934a490109 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -7,6 +7,7 @@ from functools import lru_cache
import json
import metric
import os
+import re
import sys
from typing import (Callable, Dict, Optional, Sequence, Set, Tuple)
import collections
@@ -388,6 +389,11 @@ class JsonEvent:
if arch_std:
if arch_std.lower() in _arch_std_events:
event = _arch_std_events[arch_std.lower()].event
+ if eventcode:
+ event = re.sub(r'event=\d+', f'event={llx(eventcode)}', event)
+ if configcode:
+ event = re.sub(r'config=\d+', f'config={llx(configcode)}', event)
+
# Copy from the architecture standard event to self for undefined fields.
for attr, value in _arch_std_events[arch_std.lower()].__dict__.items():
if hasattr(self, attr) and not getattr(self, attr):
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 55a300a0977b..f8a5640cf4fa 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -357,6 +357,12 @@ __weak int arch_evlist__add_default_attrs(struct evlist *evlist,
return __evlist__add_default_attrs(evlist, attrs, nr_attrs);
}

+__weak int arch_evlist__override_default_attrs(struct evlist *evlist __maybe_unused,
+ const char *pmu_name __maybe_unused)
+{
+ return 0;
+}
+
struct evsel *evlist__find_tracepoint_by_id(struct evlist *evlist, int id)
{
struct evsel *evsel;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index cb91dc9117a2..705b6643b558 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -109,9 +109,15 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs,
size_t nr_attrs);

+
#define evlist__add_default_attrs(evlist, array) \
arch_evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))

+int arch_evlist__override_default_attrs(struct evlist *evlist, const char *pmu_name);
+
+#define evlist__override_default_attrs(evlist, pmu_name) \
+ arch_evlist__override_default_attrs(evlist, pmu_name)
+
int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs);

int evlist__add_dummy(struct evlist *evlist);
--
2.34.1


2024-02-17 01:05:06

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 16/20] tools/perf: Pass the Counter constraint values in the pmu events

RISC-V doesn't have any standard event to counter mapping discovery
mechanism in the ISA. The ISA defines 29 programmable counters, and
platforms can choose to implement any number of them and map any
events to any counters. Thus, the perf tool needs to inform the
driver about the counter mapping of each event.

The current perf infrastructure only parses the 'Counter' constraints
in metrics. This patch extends that to pass the constraint in the pmu
events so that any driver can retrieve those values via the perf
attributes if defined accordingly. For example, a "Counter" value of
"2,3,4" is appended to the event string as counterid_mask=0x1c.

Signed-off-by: Atish Patra <[email protected]>
---
tools/perf/pmu-events/jevents.py | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 30934a490109..f1e320077695 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -278,6 +278,11 @@ class JsonEvent:
return fixed[name.lower()]
return event

+ def counter_list_to_bitmask(counterlist):
+ counter_ids = list(map(int, counterlist.split(',')))
+ bitmask = sum(1 << pos for pos in counter_ids)
+ return bitmask
+
def unit_to_pmu(unit: str) -> Optional[str]:
"""Convert a JSON Unit to Linux PMU name."""
if not unit:
@@ -401,6 +406,10 @@ class JsonEvent:
else:
raise argparse.ArgumentTypeError('Cannot find arch std event:', arch_std)

+ if self.counter:
+ bitmask = counter_list_to_bitmask(self.counter)
+ event += f',counterid_mask={bitmask:#x}'
+
self.event = real_event(self.name, event)

def __repr__(self) -> str:
--
2.34.1


2024-02-17 01:05:32

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 17/20] perf: Add json file for virt machine supported events

The Linux driver will use the event encodings specified in the
platform specific json file only on platforms with counter delegation
support.

Use the perf json infrastructure to encode those events and let the
driver use them if counter delegation is available.

Signed-off-by: Atish Patra <[email protected]>
---
tools/perf/pmu-events/arch/riscv/mapfile.csv | 1 +
.../pmu-events/arch/riscv/qemu/virt/cpu.json | 30 ++++++++
.../arch/riscv/qemu/virt/firmware.json | 68 +++++++++++++++++++
3 files changed, 99 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json
create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json

diff --git a/tools/perf/pmu-events/arch/riscv/mapfile.csv b/tools/perf/pmu-events/arch/riscv/mapfile.csv
index cfc449b19810..b3e7d544e29c 100644
--- a/tools/perf/pmu-events/arch/riscv/mapfile.csv
+++ b/tools/perf/pmu-events/arch/riscv/mapfile.csv
@@ -17,3 +17,4 @@
0x489-0x8000000000000007-0x[[:xdigit:]]+,v1,sifive/u74,core
0x5b7-0x0-0x0,v1,thead/c900-legacy,core
0x67e-0x80000000db0000[89]0-0x[[:xdigit:]]+,v1,starfive/dubhe-80,core
+0x0-0x0-0x0,v1,qemu/virt,core
diff --git a/tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json b/tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json
new file mode 100644
index 000000000000..9ab631723c88
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json
@@ -0,0 +1,30 @@
+[
+ {
+ "ArchStdEvent": "instructions",
+ "EventCode": "0x02",
+ "Counter":"2,3,4,5,6,7,8,9,10"
+ },
+ {
+ "ArchStdEvent": "cycles",
+ "EventCode": "0x01",
+ "Counter":"0,3,4,5,6,7,8,9,10"
+ },
+ {
+ "EventName": "dTLB-load-misses",
+ "EventCode": "0x10019",
+ "BriefDescription": "Data TLB load miss",
+ "Counter":"3,4,5,6,7,8,9,10"
+ },
+ {
+ "EventName": "dTLB-store-misses",
+ "EventCode": "0x1001B",
+ "BriefDescription": "Data TLB store miss",
+ "Counter":"3,4,5,6,7,8,9,10"
+ },
+ {
+ "EventName": "iTLB-load-misses",
+ "EventCode": "0x10021",
+ "BriefDescription": "Instruction fetch TLB load miss",
+ "Counter":"3,4,5,6,7,8,9,10"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json b/tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json
new file mode 100644
index 000000000000..9b4a032186a7
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json
@@ -0,0 +1,68 @@
+[
+ {
+ "ArchStdEvent": "FW_MISALIGNED_LOAD"
+ },
+ {
+ "ArchStdEvent": "FW_MISALIGNED_STORE"
+ },
+ {
+ "ArchStdEvent": "FW_ACCESS_LOAD"
+ },
+ {
+ "ArchStdEvent": "FW_ACCESS_STORE"
+ },
+ {
+ "ArchStdEvent": "FW_ILLEGAL_INSN"
+ },
+ {
+ "ArchStdEvent": "FW_SET_TIMER"
+ },
+ {
+ "ArchStdEvent": "FW_IPI_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_IPI_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_FENCE_I_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_FENCE_I_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_SFENCE_VMA_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_SFENCE_VMA_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_SFENCE_VMA_ASID_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_GVMA_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_GVMA_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_GVMA_VMID_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_GVMA_VMID_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_VVMA_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_VVMA_RECEIVED"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_VVMA_ASID_SENT"
+ },
+ {
+ "ArchStdEvent": "FW_HFENCE_VVMA_ASID_RECEIVED"
+ }
+]
--
2.34.1


2024-02-17 01:05:53

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 18/20] tools arch uapi: Sync the unistd.h header file for RISC-V

The unistd.h header has changed since its last sync. Update it so
that the perf tool can use the new RISC-V specific syscall.

Signed-off-by: Atish Patra <[email protected]>
---
tools/arch/riscv/include/uapi/asm/unistd.h | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/arch/riscv/include/uapi/asm/unistd.h b/tools/arch/riscv/include/uapi/asm/unistd.h
index f506cca520b0..950ab3fd4409 100644
--- a/tools/arch/riscv/include/uapi/asm/unistd.h
+++ b/tools/arch/riscv/include/uapi/asm/unistd.h
@@ -15,11 +15,14 @@
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*/

-#ifdef __LP64__
+#if defined(__LP64__) && !defined(__SYSCALL_COMPAT)
#define __ARCH_WANT_NEW_STAT
#define __ARCH_WANT_SET_GET_RLIMIT
#endif /* __LP64__ */

+#define __ARCH_WANT_SYS_CLONE3
+#define __ARCH_WANT_MEMFD_SECRET
+
#include <asm-generic/unistd.h>

/*
@@ -40,3 +43,12 @@
#define __NR_riscv_flush_icache (__NR_arch_specific_syscall + 15)
#endif
__SYSCALL(__NR_riscv_flush_icache, sys_riscv_flush_icache)
+
+/*
+ * Allows userspace to query the kernel for CPU architecture and
+ * microarchitecture details across a given set of CPUs.
+ */
+#ifndef __NR_riscv_hwprobe
+#define __NR_riscv_hwprobe (__NR_arch_specific_syscall + 14)
+#endif
+__SYSCALL(__NR_riscv_hwprobe, sys_riscv_hwprobe)
--
2.34.1


2024-02-17 01:06:16

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 19/20] RISC-V: Add hwprobe support for Counter delegation extensions

Even though the counter delegation extensions are all S-mode
extensions, the perf tool can use them to decide whether it needs to
remap standard events or not. Remapping is not required if the SBI
PMU is being used for hardware events.

Signed-off-by: Atish Patra <[email protected]>
---
Documentation/arch/riscv/hwprobe.rst | 10 ++++++++++
arch/riscv/include/uapi/asm/hwprobe.h | 4 ++++
arch/riscv/kernel/sys_hwprobe.c | 3 +++
3 files changed, 17 insertions(+)

diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst
index b2bcc9eed9aa..77fa0ed09366 100644
--- a/Documentation/arch/riscv/hwprobe.rst
+++ b/Documentation/arch/riscv/hwprobe.rst
@@ -188,6 +188,16 @@ The following keys are defined:
manual starting from commit 95cf1f9 ("Add changes requested by Ved
during signoff")

+ * :c:macro:`RISCV_HWPROBE_EXT_SMCDELEG`: The Smcdeleg extension is supported as
+ defined in the RISC-V Counter Delegation extension manual starting from
+ commit ff61c1f ("switch to v1.0.0 and frozen")
+ * :c:macro:`RISCV_HWPROBE_EXT_SSCCFG`: The Ssccfg extension is supported as
+ defined in the RISC-V Counter Delegation extension manual starting from
+ commit ff61c1f ("switch to v1.0.0 and frozen")
+ * :c:macro:`RISCV_HWPROBE_EXT_SSCSRIND`: The Sscsrind extension is supported as
+ defined in the RISC-V Indirect CSR extension manual starting from
+ commit a28625c ("mark spec as frozen")
+
* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance
information about the selected set of processors.

diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 9f2a8e3ff204..fb7c6bd6822a 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -59,6 +59,10 @@ struct riscv_hwprobe {
#define RISCV_HWPROBE_EXT_ZTSO (1ULL << 33)
#define RISCV_HWPROBE_EXT_ZACAS (1ULL << 34)
#define RISCV_HWPROBE_EXT_ZICOND (1ULL << 35)
+#define RISCV_HWPROBE_EXT_SSCSRIND (1ULL << 36)
+#define RISCV_HWPROBE_EXT_SMCDELEG (1ULL << 37)
+#define RISCV_HWPROBE_EXT_SSCCFG (1ULL << 38)
+
#define RISCV_HWPROBE_KEY_CPUPERF_0 5
#define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0)
#define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0)
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index a7c56b41efd2..befb6582b1ce 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -111,6 +111,9 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair,
EXT_KEY(ZTSO);
EXT_KEY(ZACAS);
EXT_KEY(ZICOND);
+ EXT_KEY(SSCSRIND);
+ EXT_KEY(SMCDELEG);
+ EXT_KEY(SSCCFG);

if (has_vector()) {
EXT_KEY(ZVBB);
--
2.34.1


2024-02-17 01:06:35

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 20/20] tools/perf: Detect if platform supports counter delegation

The perf tool currently remaps the standard events to the encoding
specified by the platform in the json file. We need that only if the
counter delegation extension is present. Otherwise, the SBI PMU
interface is used, which defines the encoding for all standard
events.

The hwprobe mechanism can be used to detect the presence of these
extensions and remap the encoding space only in that case.

Signed-off-by: Atish Patra <[email protected]>
---
tools/perf/arch/riscv/util/Build | 1 +
tools/perf/arch/riscv/util/evlist.c | 3 ++-
tools/perf/arch/riscv/util/pmu.c | 41 +++++++++++++++++++++++++++++
tools/perf/arch/riscv/util/pmu.h | 11 ++++++++
4 files changed, 55 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/arch/riscv/util/pmu.c
create mode 100644 tools/perf/arch/riscv/util/pmu.h

diff --git a/tools/perf/arch/riscv/util/Build b/tools/perf/arch/riscv/util/Build
index b581fb3d8677..2358f0666e8d 100644
--- a/tools/perf/arch/riscv/util/Build
+++ b/tools/perf/arch/riscv/util/Build
@@ -1,6 +1,7 @@
perf-y += perf_regs.o
perf-y += header.o
perf-y += evlist.o
+perf-y += pmu.o

perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/riscv/util/evlist.c b/tools/perf/arch/riscv/util/evlist.c
index 9ad287c6f396..aa7eef7280ca 100644
--- a/tools/perf/arch/riscv/util/evlist.c
+++ b/tools/perf/arch/riscv/util/evlist.c
@@ -6,6 +6,7 @@
#include "util/parse-events.h"
#include "util/event.h"
#include "evsel.h"
+#include "pmu.h"

static int pmu_update_cpu_stdevents_callback(const struct pmu_event *pe,
const struct pmu_events_table *table __maybe_unused,
@@ -41,7 +42,7 @@ int arch_evlist__override_default_attrs(struct evlist *evlist, const char *pmu_n
"iTLB-load-misses"};
unsigned int i, len = sizeof(overriden_event_arr) / sizeof(char *);

- if (!pmu)
+ if (!pmu || !perf_pmu_riscv_cdeleg_present())
return 0;

for (i = 0; i < len; i++) {
diff --git a/tools/perf/arch/riscv/util/pmu.c b/tools/perf/arch/riscv/util/pmu.c
new file mode 100644
index 000000000000..79f0974e27f8
--- /dev/null
+++ b/tools/perf/arch/riscv/util/pmu.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright Rivos Inc 2024
+ * Author(s): Atish Patra <[email protected]>
+ */
+
+#include <string.h>
+#include <stdio.h>
+#include <asm/hwprobe.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+
+#include "pmu.h"
+
+static bool counter_deleg_present;
+
+bool perf_pmu_riscv_cdeleg_present(void)
+{
+ return counter_deleg_present;
+}
+
+void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused)
+{
+ struct riscv_hwprobe isa_ext;
+ int ret;
+
+ isa_ext.key = RISCV_HWPROBE_KEY_IMA_EXT_0;
+
+ ret = syscall(__NR_riscv_hwprobe, &isa_ext, 1, 0, NULL, 0);
+ if (ret)
+ return;
+
+ if (isa_ext.key < 0)
+ return;
+
+ if ((isa_ext.value & RISCV_HWPROBE_EXT_SSCSRIND) &&
+ (isa_ext.value & RISCV_HWPROBE_EXT_SMCDELEG) &&
+ (isa_ext.value & RISCV_HWPROBE_EXT_SSCCFG))
+ counter_deleg_present = true;
+}
diff --git a/tools/perf/arch/riscv/util/pmu.h b/tools/perf/arch/riscv/util/pmu.h
new file mode 100644
index 000000000000..21f33f7d323d
--- /dev/null
+++ b/tools/perf/arch/riscv/util/pmu.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __RISCV_UTIL_PMU_H
+#define __RISCV_UTIL_PMU_H
+
+#include "../../../util/pmu.h"
+
+
+bool perf_pmu_riscv_cdeleg_present(void);
+
+#endif
--
2.34.1


2024-02-17 01:07:34

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 05/20] RISC-V: Define indirect CSR access helpers

The indirect CSR extension requires multiple instructions to read or
write a single CSR. Add a few helper macros for ease of use.
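
A minimal usage sketch (assuming ctr_idx is a caller-provided counter
offset, and using the Ssccfg siselect base and hpmevent bits added
elsewhere in this series) that clears the OF bit of a delegated
hpmevent through the sireg2 alias:

  unsigned long val;

  /* siselect = SISELECT_SSCCFG_BASE + ctr_idx, then access sireg2 */
  val = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, ctr_idx);
  val &= ~HPMEVENT_OF;
  csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, ctr_idx, val);

The helpers disable interrupts around the siselect/sireg access pair
so the selection cannot be clobbered by another user of the indirect
CSR window.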

Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/csr_ind.h | 42 ++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 arch/riscv/include/asm/csr_ind.h

diff --git a/arch/riscv/include/asm/csr_ind.h b/arch/riscv/include/asm/csr_ind.h
new file mode 100644
index 000000000000..9611c221eb6f
--- /dev/null
+++ b/arch/riscv/include/asm/csr_ind.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 Rivos Inc.
+ */
+
+#ifndef _ASM_RISCV_CSR_IND_H
+#define _ASM_RISCV_CSR_IND_H
+
+#include <asm/csr.h>
+
+#define csr_ind_read(iregcsr, iselbase, iseloff) ({ \
+ unsigned long value = 0; \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ value = csr_read(iregcsr); \
+ local_irq_restore(flags); \
+ value; \
+})
+
+#define csr_ind_write(iregcsr, iselbase, iseloff, value) ({ \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ csr_write(iregcsr, value); \
+ local_irq_restore(flags); \
+})
+
+#define csr_ind_warl(iregcsr, iselbase, iseloff, warl_val) ({ \
+ unsigned long old_val = 0, value = 0; \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ old_val = csr_read(iregcsr); \
+ csr_write(iregcsr, warl_val); \
+ value = csr_read(iregcsr); \
+ csr_write(iregcsr, old_val); \
+ local_irq_restore(flags); \
+ value; \
+})
+
+#endif
--
2.34.1


2024-02-17 01:07:44

by Atish Patra

[permalink] [raw]
Subject: [PATCH RFC 06/20] RISC-V: Add Ssccfg extension CSR definition

From: Kaiwen Xue <[email protected]>

This adds the scountinhibit CSR definition and the S-mode accessible
hpmevent bits defined by smcdeleg/ssccfg. scountinhibit allows S-mode
to start/stop counters directly without invoking SBI calls to M-mode.
It is also used to figure out which counters the M-mode has delegated
to S-mode.
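
As an illustrative sketch only (event_code is a placeholder for a
platform specific encoding, and the indirect CSR helpers come from
this series), stopping hpmcounter3, programming it to not count in
U-mode, and restarting it looks like:

  unsigned long inhibit = csr_read(CSR_SCOUNTINHIBIT);

  csr_write(CSR_SCOUNTINHIBIT, inhibit | BIT(3));   /* stop ctr3 */
  csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, 3,
                event_code | HPMEVENT_UINH);        /* inhibit U-mode counting */
  csr_write(CSR_SCOUNTINHIBIT, inhibit & ~BIT(3));  /* start ctr3 */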

Signed-off-by: Kaiwen Xue <[email protected]>
---
arch/riscv/include/asm/csr.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 0a54856fd807..e1bf1466f32e 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -214,6 +214,31 @@
#define SMSTATEEN0_HSENVCFG (_ULL(1) << SMSTATEEN0_HSENVCFG_SHIFT)
#define SMSTATEEN0_SSTATEEN0_SHIFT 63
#define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/* HPMEVENT bits. These are accessible in S-mode via Smcdeleg/Ssccfg */
+#ifdef CONFIG_64BIT
+#define HPMEVENT_OF (_UL(1) << 63)
+#define HPMEVENT_MINH (_UL(1) << 62)
+#define HPMEVENT_SINH (_UL(1) << 61)
+#define HPMEVENT_UINH (_UL(1) << 60)
+#define HPMEVENT_VSINH (_UL(1) << 59)
+#define HPMEVENT_VUINH (_UL(1) << 58)
+#else
+#define HPMEVENTH_OF (_ULL(1) << 31)
+#define HPMEVENTH_MINH (_ULL(1) << 30)
+#define HPMEVENTH_SINH (_ULL(1) << 29)
+#define HPMEVENTH_UINH (_ULL(1) << 28)
+#define HPMEVENTH_VSINH (_ULL(1) << 27)
+#define HPMEVENTH_VUINH (_ULL(1) << 26)
+
+#define HPMEVENT_OF (HPMEVENTH_OF << 32)
+#define HPMEVENT_MINH (HPMEVENTH_MINH << 32)
+#define HPMEVENT_SINH (HPMEVENTH_SINH << 32)
+#define HPMEVENT_UINH (HPMEVENTH_UINH << 32)
+#define HPMEVENT_VSINH (HPMEVENTH_VSINH << 32)
+#define HPMEVENT_VUINH (HPMEVENTH_VUINH << 32)
+#endif
+
+#define SISELECT_SSCCFG_BASE 0x40

/* symbolic CSR names: */
#define CSR_CYCLE 0xc00
@@ -289,6 +314,7 @@
#define CSR_SCOUNTEREN 0x106
#define CSR_SENVCFG 0x10a
#define CSR_SSTATEEN0 0x10c
+#define CSR_SCOUNTINHIBIT 0x120
#define CSR_SSCRATCH 0x140
#define CSR_SEPC 0x141
#define CSR_SCAUSE 0x142
--
2.34.1
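
As a hedged example of how these bits compose: to count an event only while in S-mode, the inhibit bits for the other privilege modes are set alongside the vendor event encoding. The event value below is made up for the sketch:

```
/* Sketch: 0x123 is a made-up vendor event encoding, not from the series.
 * The xINH bits inhibit counting in the corresponding privilege mode;
 * HPMEVENT_OF is the sscofpmf overflow/interrupt status bit. */
u64 hpmevent_cfg = 0x123 |
		   HPMEVENT_MINH | HPMEVENT_UINH |
		   HPMEVENT_VSINH | HPMEVENT_VUINH;
```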


2024-02-17 02:33:48

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH RFC 08/20] dt-bindings: riscv: add Ssccfg ISA extension description


On Fri, 16 Feb 2024 16:57:26 -0800, Atish Patra wrote:
> Add description for the Ssccfg extension.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> .../devicetree/bindings/riscv/extensions.yaml | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
/Documentation/devicetree/bindings/riscv/extensions.yaml:131:1: [error] syntax error: found character '\t' that cannot start any token (syntax)

dtschema/dtc warnings/errors:
Documentation/devicetree/bindings/riscv/extensions.yaml:131:1: found a tab character where an indentation space is expected
/Documentation/devicetree/bindings/riscv/extensions.yaml:131:1: found a tab character where an indentation space is expected
in "<unicode string>", line 125, column 24
in "<unicode string>", line 131, column 1

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/[email protected]

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.


2024-02-17 02:34:07

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH RFC 10/20] dt-bindings: riscv: add Smcntrpmf ISA extension description


On Fri, 16 Feb 2024 16:57:28 -0800, Atish Patra wrote:
> Add the description for Smcntrpmf ISA extension
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> Documentation/devicetree/bindings/riscv/extensions.yaml | 7 +++++++
> 1 file changed, 7 insertions(+)
>

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:


doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/[email protected]

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.


2024-02-18 12:48:07

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH RFC 04/20] dt-bindings: riscv: add Sxcsrind ISA extension description

On Fri, Feb 16, 2024 at 04:57:22PM -0800, Atish Patra wrote:
> Add the S[m|s]csrind ISA extension description.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> .../devicetree/bindings/riscv/extensions.yaml | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
> index 63d81dc895e5..77a9f867e36b 100644
> --- a/Documentation/devicetree/bindings/riscv/extensions.yaml
> +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
> @@ -134,6 +134,20 @@ properties:
> added by other RISC-V extensions in H/S/VS/U/VU modes and as
> ratified at commit a28bfae (Ratified (#7)) of riscv-state-enable.
>
> + - const: smcsrind
> + description: |
> + The standard Smcsrind supervisor-level extension extends the

The indentation here looks weird.

> + indirect CSR access mechanism defined by the Smaia extension. This
> + extension allows other ISA extension to use indirect CSR access
> + mechanism in M-mode.

For this, and the other patches in the series, I want a reference to the
frozen/ratified point for these extensions. See the rest of this file
for examples.

Cheers,
Conor.

> +
> + - const: sscsrind
> + description: |
> + The standard Sscsrind supervisor-level extension extends the
> + indirect CSR access mechanism defined by the Ssaia extension. This
> + extension allows other ISA extension to use indirect CSR access
> + mechanism in S-mode.
> +
> - const: ssaia
> description: |
> The standard Ssaia supervisor-level extension for the advanced
> --
> 2.34.1
>



2024-02-18 12:50:37

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH RFC 09/20] RISC-V: Add Smcntrpmf extension parsing

On Fri, Feb 16, 2024 at 04:57:27PM -0800, Atish Patra wrote:
> Smcntrpmf extension allows M-mode to enable privilege mode filtering
> for cycle/instret counters. However, the cyclecfg/instretcfg CSRs are
> only available only in Ssccfg only Smcntrpmf is present.

There are some typos in this opening paragraph that make it hard to
follow.

>
> That's why, kernel needs to detect presence of Smcntrpmf extension and
> enable privilege mode filtering for cycle/instret counters.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/hwcap.h | 1 +
> arch/riscv/kernel/cpufeature.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> index 5f4401e221ee..b82a8d7a9b3b 100644
> --- a/arch/riscv/include/asm/hwcap.h
> +++ b/arch/riscv/include/asm/hwcap.h
> @@ -84,6 +84,7 @@
> #define RISCV_ISA_EXT_SMCSRIND 75
> #define RISCV_ISA_EXT_SSCCFG 76
> #define RISCV_ISA_EXT_SMCDELEG 77
> +#define RISCV_ISA_EXT_SMCNTRPMF 78
>
> #define RISCV_ISA_EXT_MAX 128
> #define RISCV_ISA_EXT_INVALID U32_MAX
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 77cc5dbd73bf..c30be2c924e7 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -302,6 +302,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
> __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
> __RISCV_ISA_EXT_DATA(smcdeleg, RISCV_ISA_EXT_SMCDELEG),
> __RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
> + __RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
> __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
> __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
> __RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
> --
> 2.34.1
>



2024-02-29 01:25:57

by Atish Patra

[permalink] [raw]
Subject: Re: [PATCH RFC 00/20] Add Counter delegation ISA extension support

On 2/17/24 12:28, Ian Rogers wrote:
> On Fri, Feb 16, 2024 at 4:58 PM Atish Patra <[email protected]> wrote:
>>
>> This series adds the counter delegation extension support. It is based on
>> very early PoC work done by Kevin Xue and mostly rewritten after that.
>> The counter delegation ISA extension(Smcdeleg/Ssccfg) actually depends
>> on multiple ISA extensions.
>>
>> 1. S[m|s]csrind : The indirect CSR extension[1] which defines additional
>> 5 ([M|S|VS]IREG2-[M|S|VS]IREG6) register to address size limitation of
>> RISC-V CSR address space.
>> 2. Smstateen: The stateen bit[60] controls the access to the registers
>> indirectly via the above indirect registers.
>> 3. Smcdeleg/Ssccfg: The counter delegation extensions[2]
>>
>> The counter delegation extension allows Supervisor mode to program the
>> hpmevent and hpmcounters directly without needing the assistance from the
>> M-mode via SBI calls. This results in a faster perf profiling and very
>> few traps. This extension also introduces a scountinhibit CSR which allows
>> to stop/start any counter directly from the S-mode. As the counter
>> delegation extension potentially can have more than 100 CSRs, the specification
>> leverages the indirect CSR extension to save the precious CSR address range.
>>
>> Due to the dependency of these extensions, the following extensions must be
>> enabled in qemu to use the counter delegation feature in S-mode.
>>
>> "smstateen=true,sscofpmf=true,ssccfg=true,smcdeleg=true,smcsrind=true,sscsrind=true"
>>
>> When we access the counters directly in S-mode, we also need to solve the
>> following problems.
>>
>> 1. Event to counter mapping
>> 2. Event encoding discovery
>>
>> The RISC-V ISA doesn't define any standard either for event encoding or the
>> event to counter mapping rules.
>>
>> Until now, the SBI PMU implementation relies on device tree binding[3] to
>> discover the event to counter mapping in RISC-V platform in the firmware. The
>> SBI PMU specification[4] defines event encoding for standard perf events as well.
>> Thus, the kernel can query the appropriate counter for an given event from the
>> firmware.
>>
>> However, the kernel doesn't need any firmware interaction for hardware
>> counters if counter delegation is available in the hardware. Thus, the driver
>> needs to discover the above mappings/encodings by itself without any assistance
>> from firmware. One of the options considered was to extend the PMU DT parsing
>> support to kernel as well. However, that requires additional support in ACPI
>> based system. It also needs more infrastructure in the virtualization as well.
>>
>> This patch series solves the above problem #1 by extending the perf tool in a
>> way so that event json file can specify the counter constraints of each event
>> and that can be passed to the driver to choose the best counter for a given
>> event. The perf stat metric series[5] from Weilin already extend the perf tool
>> to parse "Counter" property to specify the hardware counter restriction.
>> I have included the patch from Weilin in this series for verification purposes
>> only. I will rebase as that series evolves.
>>
>> This series extends that support by converting comma separated string to a
>> bitmap. The counter constraint bitmap is passed to the perf driver via
>> newly introduced "counterid_mask" property set in "config2". Even though, this
>> is a generic perf tool change, this should not affect any other architecture
>> if "counterid_mask" is not mapped.
>>
>> @Weilin: Please let me know if there is a better way to solve the problem I
>> described.
>>
>> The problem #2 is solved by defining a architecture specific override function
>> that will replace the perf standard event encoding with an encoding specified
>> in the json file with the same event name. The alternate solution considered
>> was to specify the encodings in the driver. However, these encodings are vendor
>> specific in the absence of ISA guidelines and would become unmanageable with
>> so many RISC-V vendors touching the driver for their encodings.
>>
>> The override is only required when counter delegation is available in the
>> platform, which is detected at runtime. The SBI PMU (current implementation)
>> doesn't require any override as it defines the standard event encoding. The
>> hwprobe syscall defined for RISC-V is used for this detection in this series.
>> A sysfs based property could be explored to do the same, but we may require
>> hwprobe in future given the churn of extensions in RISC-V. That's why I went
>> with hwprobe. Let me know if anybody thinks that's a bad idea.
>>
>> The perf tool hook also allows RISC-V platform vendors to define their own
>> encoding for any standard perf or ISA event. I have tried to cover all the use
>> cases that I am aware of (stat, record, top). Please let me know if I have
>> missed any particular use case where the architecture hook must be invoked. I am
>> also open to any other idea to solve the problem described above.
>
> Hi Atish,
>
> Thank you for the work! I know how the perf tool discovers events is
> somewhat assumed knowledge, I thought I'd just go through it here and
> explain a difference that is landing in Linux 6.8, as well as recent
> heterogeneous/hybrid/big.little support changes, just so those who
> aren't up to speed can catch up for the sake of discussion on this
> approach - sorry for turning this into a longer email than it perhaps
> needs to be, and the historical take may lack accuracy that I
> apologize in advance for.
>
> The job of discovering events is to map a name like "cycles" into a
> set up for the perf_event_attr passed to perf_event_open. This sounds
> simple but "cycles" may be encoded differently for different PMUs on a
> heterogeneous system, it may also be an event on an accelerator like a
> GPU. So the first thing to recognize is that a name like "cycles" may
> map onto multiple struct perf_event_attr values. The behavior of how
> the perf tool does this lacks consistency, for example are all or just
> core PMUs considered, but this is deliberate for the sake of somewhat
> consistency by the tool over time. Perhaps in the future we'll
> change/fix this as things like accelerators dominate performance
> concerns.
>
> The next thing is that what "cycles" means has been evolving over
> Linux releases. Originally "cycles" was assumed to be a CPU event and
> there were other events like "page-faults" which were software events.
> In perf_event.h there are enums for the "type" of event (hardware,
> software, cache, etc.) and for the actual event itself - cycles is
> "config" value 0. In the code we tend to refer to this kind of
> encoding as legacy. An ability was added (maybe it was always there)
> to dynamically add PMUs and PMUs advertise the value for the struct
> perf_event_attr through sysfs in "/sys/devices/<pmu name>/type". On
> x86 the performance core typically has a type matching the legacy
> hardware number, but on ARM this isn't the case. So that legacy events
> can work on heterogeneous/hybrid/big.little systems where there should
> be multiple PMUs (looking at most Android devices for misconfiguring
> this), there is an extended type field in the top 32-bits of the
> struct perf_event_attr config. The extended type means I want this
> legacy event type on the extended type PMU.
>
> For non-legacy events there is a problem of how to map a name to a
> config value (I'll say singular config value but overtime it has
> actually become 4 64-bit values). The sysfs format directory
> "/sys/devices/<pmu name>/format" does this. The files in the format
> directory say that on x86 the event is encoded in the first byte of
> the config and the umask in the second byte. If there is an event like
> "assists.any" that has an event of 0xc1 and a umask of 7, then the
> perf tool knows to create a config value of 0x7c1 using the format
> encoding.
>
> To go from an event name like "cycles" to a format encoding there are
> two places to look, the first is "/sys/devices/<pmu name>/events/". In
> the events directory on x86 there is a "cpu-cycles" that contains
> "event=0x3c", i.e. a format style encoding. The second are the json
> files that are mapped to format style encodings for a specific cpuid
> by jevents.py. The easiest way to see the 2nd kind is to run "perf
> list --details":
> ```
> ...
> assists.any
> [Number of occurrences where a microcode assist is invoked by hardware]
> default_core/event=0xc1,period=0x186a3,umask=0x7/
> ...
> ```
> We can see there is a format style encoding that has been built into
> the perf tool.
>
> A place where ambiguity creeps in and is changing in Linux 6.8 is what
> to do when we have the same event in places like the legacy name,
> sysfs and the json? The behavior we have is:
> 1) "perf stat -e cycles ..." - the event was specified without PMUs,
> it is assumed a legacy encoding on all core PMUs is preferred (note
> non-core PMUs that have a cycles event are ignored, but this wouldn't
> be the case if the event weren't a legacy event name)
> 2) "perf stat -e cpu/cycles/" - the event was specified with a core
> PMU, prior to 6.8 (ie any current perf tool), a legacy encoding will
> be used. In 6.8 and after the json and sysfs encoding will have
> priority: https://lore.kernel.org/r/[email protected]
> 3) "perf stat -e pmu/cycles/" - event was specified with a non-core
> PMU so a legacy encoding won't be considered, only json and sysfs.
>
> As I understand it, the problem you are trying to fix in the perf tool
> is behavior 1 above; this is because the PMU driver wants the
> legacy event encodings to be in json so it needn't discover them.
> Behaviors 2 and 3 already prefer json encodings that are built into
> the perf tool.
>
> So given behavior 1 is kind of weird, it considers different PMUs
> dependent on whether the event is legacy or not, it doesn't override
> with a sysfs/json event if one is present, why don't we look to change
> behavior 1 so that it is more like behaviors 2 and 3? I believe this
> gives you the ability to override legacy events you want. At the same
> time I'd like to also remove the "only core PMUs" assumption.
>

Absolutely. Thanks for the detailed context and for walking through the
different cases for each event type.

> What would this look like? Well in the current code we take a legacy
> event and then create a perf_event_attr for each core PMU:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1348
> We'd need to change this so that we wild card all the PMUs and
> consider the sysfs/json events first, which is what we already do
> here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.y?h=perf-tools-next#n305
> with the sysfs/json fix up being here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1016
>
> As with the 6.8 change to prioritize sysfs/json over legacy the
> largest part of this change will be updating all the test
> expectations. Wdyt?
>

That's awesome. That's exactly what I want. Let me know if you are going
to push that change or want me to work on it while revising this series.

I am happy to test those changes either way.

> Things this patch series does that I don't like:
> - hardcoding the expected CPU's PMU to "cpu", this should almost
> certainly be an iterator over all core PMU types. This allows core

Agreed. I did not like this approach either, and I am not sure it is the
best way to fix the problem we have. I should have clarified that better
in the cover letter.

> PMUs not to be called "cpu" and for heterogeneous configurations to
> work.
> - doing things in an arch specific way. Test coverage is really hard


We do have an additional architecture specific issue at hand here.
As described, we need the event mapping on platforms with the newer ISA
extensions, but not on older platforms without them. In addition to
that, the new ISA extensions don't cover virtualization. Thus, KVM
guests continue to rely on the old way of discovering events through
the SBI (ecall) interface.

Thus, a platform (e.g. Qemu) can have both options present depending on
how it is configured.

To cover both cases, I introduced a RISC-V specific mechanism to detect
the presence of the ISA extension in the perf tool.

"PATCH 18-20 adds hwprobe mechanism to enable perf to detect if platform
supports delegation extensions."

If we don't want to keep any architecture specific hooks in this path,
can we use a sysfs-based way to enable/disable the mapping?

I believe the PMU driver could update sysfs for the perf tool to
decide in that case?

> and when something lives in arch directory we lose coverage unless we
> run on that machine type. Ugh, I'm just reminded that ARM
> heterogeneous is broken because of an arch override that doesn't
> consider >1 core PMU. Testing an ARM heterogenous PMU set up is hard
> but not doing so breaks people running Linux on M1 macs. Really we
> should just have PMU specific behaviors and the arch directory should
> disappear. This would also greatly help cross architecture work where
> you may record from perf on one architecture, but analyze the data on
> another.
> - we've been moving to have perf and the json have less special
> architecture knowledge. Weilin's patches aside, we've added things
> like "/sys/devices/<pmu name>/caps/slots" so that metrics can use
> "#slots" rather than hard coding the pipeline width in each metric. My
> hope for Weilin's patches is that we can make it less Intel specific
> and ultimately we may be able to advertise the specific features like
> number of fixed and generic counters via something like sysfs.

That's ideal for x86/ARM64, where the number of platform vendors is
small (x86) or the ISA defines an all-to-all event-to-counter mapping.

In RISC-V, we tried to push for an all-to-all mapping, but it was not
accepted within the RISC-V working groups. Every platform vendor will
most likely end up with a different event-to-counter mapping.

Thus, the event-to-counter mapping via the "Counter" field in json
solves this problem quite well. The only downside is that there is no
common way across architectures to pass that information to the PMU
driver.

Hence the RISC-V specific solution via config2:
https://lore.kernel.org/lkml/[email protected]/

Please let me know if this can be solved in an efficient way as well.
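
As a concrete illustration, a minimal sketch of the conversion described above, assuming a json "Counter" string like "0,1,2" (the property from Weilin's series) and the RISC-V-specific counterid_mask placed in config2; the parsing here is simplified:

```
#include <stdlib.h>
#include <string.h>

/* Sketch: "0,1,2" -> 0x7, which the tool would place in attr.config2. */
static unsigned long sketch_counters_to_mask(const char *counters)
{
	char *copy = strdup(counters), *save = copy, *tok;
	unsigned long mask = 0;

	while ((tok = strsep(&save, ",")) != NULL)
		mask |= 1UL << atoi(tok);
	free(copy);
	return mask;
}
```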

> However, the counters an event can go on is a property of the event so
> I see a need for the sysfs/json to add this.
>
> Congratulations if you got this far, sorry this email was so long. Thanks,
> Ian
>

Apologies for the delayed reply, and thanks again!

>> PATCH organization:
>> PATCH 1 is from the perf metric series[5]
>> PATCH 2-5 defines and implements the indirect CSR extension.
>> PATCH 6-10 defines the other required ISA extensions.
>> PATCH 11 just an overall restructure of the RISC-V PMU driver.
>> PATCH 12-14 implements the counter delegation extension and new perf tool
>> plumbings to solve #1 and #2.
>> PATCH 15-16 improves the perf tool support to solve #1 and #2.
>> PATCH 17 adds a perf json file for qemu virt machine.
>> PATCH 18-20 adds hwprobe mechanism to enable perf to detect if platform supports
>> delegation extensions.
>>
>> There is no change in process to run perf stat/record and will continue to work
>> as it is as long as the relevant extensions have been enabled in Qemu.
>>
>> However, the perf tool needs to be recompiled as it requires new kernel
>> headers.
>>
>> The Qemu patches can be found here:
>> https://github.com/atishp04/qemu/tree/counter_delegation_rfc
>>
>> The opensbi patch can be found here:
>> https://github.com/atishp04/opensbi/tree/counter_delegation_v1
>>
>> The Linux kernel patches can be found here:
>> https://github.com/atishp04/linux/tree/counter_delegation_rfc
>>
>> [1] https://github.com/riscv/riscv-indirect-csr-access
>> [2] https://github.com/riscv/riscv-smcdeleg-ssccfg
>> [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/perf/riscv%2Cpmu.yaml
>> [4] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc
>> [5] https://lore.kernel.org/all/[email protected]/
>>
>> Atish Patra (17):
>> RISC-V: Add Sxcsrind ISA extension definition and parsing
>> dt-bindings: riscv: add Sxcsrind ISA extension description
>> RISC-V: Define indirect CSR access helpers
>> RISC-V: Add Ssccfg ISA extension definition and parsing
>> dt-bindings: riscv: add Ssccfg ISA extension description
>> RISC-V: Add Smcntrpmf extension parsing
>> dt-bindings: riscv: add Smcntrpmf ISA extension description
>> RISC-V: perf: Restructure the SBI PMU code
>> RISC-V: perf: Modify the counter discovery mechanism
>> RISC-V: perf: Implement supervisor counter delegation support
>> RISC-V: perf: Use config2 for event to counter mapping
>> tools/perf: Add arch hooks to override perf standard events
>> tools/perf: Pass the Counter constraint values in the pmu events
>> perf: Add json file for virt machine supported events
>> tools arch uapi: Sync the uinstd.h header file for RISC-V
>> RISC-V: Add hwprobe support for Counter delegation extensions
>> tools/perf: Detect if platform supports counter delegation
>>
>> Kaiwen Xue (2):
>> RISC-V: Add Sxcsrind ISA extension CSR definitions
>> RISC-V: Add Sscfg extension CSR definition
>>
>> Weilin Wang (1):
>> perf pmu-events: Add functions in jevent.py to parse counter and event
>> info for hardware aware grouping
>>
>> Documentation/arch/riscv/hwprobe.rst | 10 +
>> .../devicetree/bindings/riscv/extensions.yaml | 34 +
>> MAINTAINERS | 4 +-
>> arch/riscv/include/asm/csr.h | 47 ++
>> arch/riscv/include/asm/csr_ind.h | 42 ++
>> arch/riscv/include/asm/hwcap.h | 5 +
>> arch/riscv/include/asm/sbi.h | 2 +-
>> arch/riscv/include/uapi/asm/hwprobe.h | 4 +
>> arch/riscv/kernel/cpufeature.c | 5 +
>> arch/riscv/kernel/sys_hwprobe.c | 3 +
>> arch/riscv/kvm/vcpu_pmu.c | 2 +-
>> drivers/perf/Kconfig | 16 +-
>> drivers/perf/Makefile | 4 +-
>> .../perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
>> .../perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 654 ++++++++++++++----
>> include/linux/perf/riscv_pmu.h | 13 +-
>> tools/arch/riscv/include/uapi/asm/unistd.h | 14 +-
>> tools/perf/arch/riscv/util/Build | 2 +
>> tools/perf/arch/riscv/util/evlist.c | 60 ++
>> tools/perf/arch/riscv/util/pmu.c | 41 ++
>> tools/perf/arch/riscv/util/pmu.h | 11 +
>> tools/perf/builtin-record.c | 3 +
>> tools/perf/builtin-stat.c | 2 +
>> tools/perf/builtin-top.c | 3 +
>> .../pmu-events/arch/riscv/arch-standard.json | 10 +
>> tools/perf/pmu-events/arch/riscv/mapfile.csv | 1 +
>> .../pmu-events/arch/riscv/qemu/virt/cpu.json | 30 +
>> .../arch/riscv/qemu/virt/firmware.json | 68 ++
>> tools/perf/pmu-events/jevents.py | 186 ++++-
>> tools/perf/pmu-events/pmu-events.h | 25 +-
>> tools/perf/util/evlist.c | 6 +
>> tools/perf/util/evlist.h | 6 +
>> 32 files changed, 1167 insertions(+), 146 deletions(-)
>> create mode 100644 arch/riscv/include/asm/csr_ind.h
>> rename drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} (100%)
>> rename drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} (61%)
>> create mode 100644 tools/perf/arch/riscv/util/evlist.c
>> create mode 100644 tools/perf/arch/riscv/util/pmu.c
>> create mode 100644 tools/perf/arch/riscv/util/pmu.h
>> create mode 100644 tools/perf/pmu-events/arch/riscv/arch-standard.json
>> create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/cpu.json
>> create mode 100644 tools/perf/pmu-events/arch/riscv/qemu/virt/firmware.json
>>
>> --
>> 2.34.1
>>