This patch series adds bindings and drivers for the Performance
Monitoring Units (PMUs) found in three SiFive cache controllers.
Composable Cache and Extensible Cache support system-wide profiling
with a single hardware instance. Private L2 Cache supports per-task
profiling with a separate hardware instance per core.

All three PMUs have a similar register interface and event encoding,
though the set of supported events differs. The Extensible Cache
additionally contains a pmCounterInhibit register, which allows a
group of counters to be started, stopped, and read atomically.

All three of these cache controllers (with PMUs) have been integrated in
SoCs by our customers. However, as none of those SoCs have been publicly
announced yet, I cannot include SoC-specific compatible strings in this
version of the devicetree bindings.

This series is a follow-up to Eric Lin's series "[PATCH v2 0/3] Add
SiFive Private L2 cache and PMU driver":
https://lore.kernel.org/linux-riscv/[email protected]/

Changes in v1:
- Add back select: clause to binding
- Make sifive,pl2cache1 the fallback for sifive,pl2cache0
- Fix the order of the reg property declaration
- Document the sifive,perfmon-counters property
- Drop the non-PMU part of the PL2 cache driver, as the config register
save/restore logic will be moved to M-mode
- Add missing events to PL2 sets 2, 4, and 5
- Use event_base and config_base to precompute register addresses
- Check event validity earlier, in the .event_init hook
- Implement .filter for systems where only some CPUs have a PL2
- Only allocate percpu data when probing each PL2 instance
- Reference count the `struct pmu` to fix unbind/bind crashes
- Probe via DT since the PMU driver is now the only PL2 driver
- Allow the driver to be built as a module
Eric Lin (4):
drivers/perf: Add SiFive Composable Cache PMU driver
dt-bindings: cache: Add SiFive Extensible Cache controller
drivers/perf: Add SiFive Extensible Cache PMU driver
dt-bindings: cache: Add SiFive Private L2 Cache controller
Greentime Hu (1):
drivers/perf: Add SiFive Private L2 Cache PMU driver
Samuel Holland (1):
dt-bindings: cache: Document the sifive,perfmon-counters property
.../bindings/cache/sifive,ccache0.yaml | 5 +
.../cache/sifive,extensiblecache0.yaml | 136 ++++
.../bindings/cache/sifive,pl2cache0.yaml | 81 ++
drivers/perf/Kconfig | 29 +
drivers/perf/Makefile | 3 +
drivers/perf/sifive_ccache_pmu.c | 577 ++++++++++++++
drivers/perf/sifive_ecache_pmu.c | 675 ++++++++++++++++
drivers/perf/sifive_pl2_pmu.c | 748 ++++++++++++++++++
include/linux/cpuhotplug.h | 2 +
9 files changed, 2256 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/sifive,extensiblecache0.yaml
create mode 100644 Documentation/devicetree/bindings/cache/sifive,pl2cache0.yaml
create mode 100644 drivers/perf/sifive_ccache_pmu.c
create mode 100644 drivers/perf/sifive_ecache_pmu.c
create mode 100644 drivers/perf/sifive_pl2_pmu.c
--
2.43.0
The SiFive Composable Cache controller contains an optional PMU with a
configurable number of event counters. Document a property which
describes the number of available counters.
Signed-off-by: Samuel Holland <[email protected]>
---
Documentation/devicetree/bindings/cache/sifive,ccache0.yaml | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml b/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
index 7e8cebe21584..100eda4345de 100644
--- a/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
+++ b/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
@@ -81,6 +81,11 @@ properties:
The reference to the reserved-memory for the L2 Loosely Integrated Memory region.
The reserved memory node should be defined as per the bindings in reserved-memory.txt.
+ sifive,perfmon-counters:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0
+ description: Number of PMU counter registers
+
allOf:
- $ref: /schemas/cache-controller.yaml#
--
2.43.0
From: Eric Lin <[email protected]>
Add a driver for the PMU found in the SiFive Composable Cache
controller. This PMU provides a configurable number of counters and a
variety of events. Events are grouped into sets. Each counter can count
events from only one set at a time; however, it can count any number of
events within that set simultaneously. The PMU hardware does not provide
an overflow interrupt or a way to atomically control groups of counters.

Some events can be filtered further by client ID (e.g. CPU or external
DMA master). That functionality is not supported by this driver.

This driver further assumes that a single Composable Cache instance is
shared by all CPUs in the system.

Example usage:
$ perf stat -a -e sifive_ccache_pmu/inner_acquire_block_btot/,
sifive_ccache_pmu/inner_acquire_block_hit/,
sifive_ccache_pmu/inner_acquire_block_ntob/ ls
Performance counter stats for 'system wide':
542 sifive_ccache_pmu/inner_acquire_block_btot/
22081 sifive_ccache_pmu/inner_acquire_block_hit/
22006 sifive_ccache_pmu/inner_acquire_block_ntob/
0.064672432 seconds time elapsed
Example using numeric event selectors:
$ perf stat -a -e sifive_ccache_pmu/event=0x10001/,
sifive_ccache_pmu/event=0x2002/,
sifive_ccache_pmu/event=0x4001/ ls
Performance counter stats for 'system wide':
478 sifive_ccache_pmu/event=0x10001/
4717 sifive_ccache_pmu/event=0x2002/
44966 sifive_ccache_pmu/event=0x4001/
0.111027326 seconds time elapsed
Signed-off-by: Eric Lin <[email protected]>
Co-developed-by: Samuel Holland <[email protected]>
Signed-off-by: Samuel Holland <[email protected]>
---
drivers/perf/Kconfig | 9 +
drivers/perf/Makefile | 1 +
drivers/perf/sifive_ccache_pmu.c | 577 +++++++++++++++++++++++++++++++
include/linux/cpuhotplug.h | 1 +
4 files changed, 588 insertions(+)
create mode 100644 drivers/perf/sifive_ccache_pmu.c
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index ec6e0d9194a1..b4e4db7424b4 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -155,6 +155,15 @@ config QCOM_L3_PMU
Adds the L3 cache PMU into the perf events subsystem for
monitoring L3 cache events.
+config SIFIVE_CCACHE_PMU
+ tristate "SiFive Composable Cache PMU"
+ depends on RISCV || COMPILE_TEST
+ help
+ Support for the Composable Cache performance monitoring unit (PMU) on
+ SiFive platforms. The Composable Cache PMU provides up to 64 counters
+ for measuring whole-system L2/L3 cache performance using the perf
+ events subsystem.
+
config THUNDERX2_PMU
tristate "Cavium ThunderX2 SoC PMU UNCORE"
depends on ARCH_THUNDER2 || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index a06338e3401c..51ef5f50ace4 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
+obj-$(CONFIG_SIFIVE_CCACHE_PMU) += sifive_ccache_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/sifive_ccache_pmu.c b/drivers/perf/sifive_ccache_pmu.c
new file mode 100644
index 000000000000..8c9ef0d09f48
--- /dev/null
+++ b/drivers/perf/sifive_ccache_pmu.c
@@ -0,0 +1,577 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SiFive Composable Cache PMU driver
+ *
+ * Copyright (C) 2022-2024 SiFive, Inc.
+ * Copyright (C) Eric Lin <[email protected]>
+ *
+ */
+
+#include <linux/cpuhotplug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/property.h>
+
+#define CCACHE_SELECT_OFFSET 0x2000
+#define CCACHE_CLIENT_FILTER_OFFSET 0x2800
+#define CCACHE_COUNTER_OFFSET 0x3000
+
+#define CCACHE_PMU_MAX_COUNTERS 64
+
+struct sifive_ccache_pmu {
+ struct pmu pmu;
+ struct hlist_node node;
+ struct notifier_block cpu_pm_nb;
+ void __iomem *base;
+ DECLARE_BITMAP(used_mask, CCACHE_PMU_MAX_COUNTERS);
+ unsigned int cpu;
+ int n_counters;
+ struct perf_event *events[] __counted_by(n_counters);
+};
+
+#define to_ccache_pmu(p) (container_of(p, struct sifive_ccache_pmu, pmu))
+
+#ifndef readq
+static inline u64 readq(void __iomem *addr)
+{
+ return readl(addr) | (((u64)readl(addr + 4)) << 32);
+}
+#endif
+
+#ifndef writeq
+static inline void writeq(u64 v, void __iomem *addr)
+{
+ writel(lower_32_bits(v), addr);
+ writel(upper_32_bits(v), addr + 4);
+}
+#endif
+
+/*
+ * sysfs attributes
+ *
+ * We export:
+ * - cpumask, used by perf user space and other tools to know on which CPUs to create events
+ * - events, used by perf user space and other tools to create events symbolically, e.g.:
+ * perf stat -a -e sifive_ccache_pmu/inner_put_partial_data_hit/ ls
+ * perf stat -a -e sifive_ccache_pmu/event=0x101/ ls
+ * - formats, used by perf user space and other tools to configure events
+ */
+
+/* cpumask */
+static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct sifive_ccache_pmu *ccache_pmu = dev_get_drvdata(dev);
+
+ if (ccache_pmu->cpu >= nr_cpu_ids)
+ return 0;
+
+ return sysfs_emit(buf, "%d\n", ccache_pmu->cpu);
+}
+
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *sifive_ccache_pmu_cpumask_attrs[] = {
+ &dev_attr_cpumask.attr,
+ NULL,
+};
+
+static const struct attribute_group sifive_ccache_pmu_cpumask_group = {
+ .attrs = sifive_ccache_pmu_cpumask_attrs,
+};
+
+/* events */
+static ssize_t sifive_ccache_pmu_event_show(struct device *dev, struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_attr *pmu_attr;
+
+ pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+ return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
+#define SET_EVENT_SELECT(_event, _set) (BIT_ULL((_event) + 8) | (_set))
+#define CCACHE_PMU_EVENT_ATTR(_name, _event, _set) \
+ PMU_EVENT_ATTR_ID(_name, sifive_ccache_pmu_event_show, SET_EVENT_SELECT(_event, _set))
+
+enum ccache_pmu_event_set1 {
+ INNER_PUT_FULL_DATA = 0,
+ INNER_PUT_PARTIAL_DATA,
+ INNER_ATOMIC_DATA,
+ INNER_GET,
+ INNER_PREFETCH_READ,
+ INNER_PREFETCH_WRITE,
+ INNER_ACQUIRE_BLOCK_NTOB,
+ INNER_ACQUIRE_BLOCK_NTOT,
+ INNER_ACQUIRE_BLOCK_BTOT,
+ INNER_ACQUIRE_PERM_NTOT,
+ INNER_ACQUIRE_PERM_BTOT,
+ INNER_RELEASE_TTOB,
+ INNER_RELEASE_TTON,
+ INNER_RELEASE_BTON,
+ INNER_RELEASE_DATA_TTOB,
+ INNER_RELEASE_DATA_TTON,
+ INNER_RELEASE_DATA_BTON,
+ OUTER_PROBE_BLOCK_TOT,
+ OUTER_PROBE_BLOCK_TOB,
+ OUTER_PROBE_BLOCK_TON,
+ CCACHE_PMU_MAX_EVENT1_IDX
+};
+
+enum ccache_pmu_event_set2 {
+ INNER_PUT_FULL_DATA_HIT = 0,
+ INNER_PUT_PARTIAL_DATA_HIT,
+ INNER_ATOMIC_DATA_HIT,
+ INNER_GET_HIT,
+ INNER_PREFETCH_HIT,
+ INNER_ACQUIRE_BLOCK_HIT,
+ INNER_ACQUIRE_PERM_HIT,
+ INNER_RELEASE_HIT,
+ INNER_RELEASE_DATA_HIT,
+ OUTER_PROBE_HIT,
+ INNER_PUT_FULL_DATA_HIT_SHARED,
+ INNER_PUT_PARTIAL_DATA_HIT_SHARED,
+ INNER_ATOMIC_DATA_HIT_SHARED,
+ INNER_GET_HIT_SHARED,
+ INNER_PREFETCH_HIT_SHARED,
+ INNER_ACQUIRE_BLOCK_HIT_SHARED,
+ INNER_ACQUIRE_PERM_HIT_SHARED,
+ OUTER_PROBE_HIT_SHARED,
+ OUTER_PROBE_HIT_DIRTY,
+ CCACHE_PMU_MAX_EVENT2_IDX
+};
+
+enum ccache_pmu_event_set3 {
+ OUTER_ACQUIRE_BLOCK_NTOB_MISS = 0,
+ OUTER_ACQUIRE_BLOCK_NTOT_MISS,
+ OUTER_ACQUIRE_BLOCK_BTOT_MISS,
+ OUTER_ACQUIRE_PERM_NTOT_MISS,
+ OUTER_ACQUIRE_PERM_BTOT_MISS,
+ OUTER_RELEASE_TTOB_EVICTION,
+ OUTER_RELEASE_TTON_EVICTION,
+ OUTER_RELEASE_BTON_EVICTION,
+ OUTER_RELEASE_DATA_TTOB_NOT_APPLICABLE,
+ OUTER_RELEASE_DATA_TTON_DIRTY_EVICTION,
+ OUTER_RELEASE_DATA_BTON_NOT_APPLICABLE,
+ INNER_PROBE_BLOCK_TOT_CODE_MISS_HITS_OTHER_HARTS,
+ INNER_PROBE_BLOCK_TOB_LOAD_MISS_HITS_OTHER_HARTS,
+ INNER_PROBE_BLOCK_TON_STORE_MISS_HITS_OTHER_HARTS,
+ CCACHE_PMU_MAX_EVENT3_IDX
+};
+
+enum ccache_pmu_event_set4 {
+ INNER_HINT_HITS_INFLIGHT_MISS = 0,
+ CCACHE_PMU_MAX_EVENT4_IDX
+};
+
+static struct attribute *sifive_ccache_pmu_events[] = {
+ /* pmEventSelect1 */
+ CCACHE_PMU_EVENT_ATTR(inner_put_full_data, INNER_PUT_FULL_DATA, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_put_partial_data, INNER_PUT_PARTIAL_DATA, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_atomic_data, INNER_ATOMIC_DATA, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_get, INNER_GET, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_prefetch_read, INNER_PREFETCH_READ, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_prefetch_write, INNER_PREFETCH_WRITE, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_block_ntob, INNER_ACQUIRE_BLOCK_NTOB, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_block_ntot, INNER_ACQUIRE_BLOCK_NTOT, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_block_btot, INNER_ACQUIRE_BLOCK_BTOT, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_ntot, INNER_ACQUIRE_PERM_NTOT, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_btot, INNER_ACQUIRE_PERM_BTOT, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_ttob, INNER_RELEASE_TTOB, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_tton, INNER_RELEASE_TTON, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_bton, INNER_RELEASE_BTON, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_data_ttob, INNER_RELEASE_DATA_TTOB, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_data_tton, INNER_RELEASE_DATA_TTON, 1),
+ CCACHE_PMU_EVENT_ATTR(inner_release_data_bton, INNER_RELEASE_DATA_BTON, 1),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_block_tot, OUTER_PROBE_BLOCK_TOT, 1),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_block_tob, OUTER_PROBE_BLOCK_TOB, 1),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_block_ton, OUTER_PROBE_BLOCK_TON, 1),
+
+ /* pmEventSelect2 */
+ CCACHE_PMU_EVENT_ATTR(inner_put_full_data_hit, INNER_PUT_FULL_DATA_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_put_partial_data_hit, INNER_PUT_PARTIAL_DATA_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_atomic_data_hit, INNER_ATOMIC_DATA_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_get_hit, INNER_GET_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_prefetch_hit, INNER_PREFETCH_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_block_hit, INNER_ACQUIRE_BLOCK_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_hit, INNER_ACQUIRE_PERM_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_release_hit, INNER_RELEASE_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_release_data_hit, INNER_RELEASE_DATA_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_hit, OUTER_PROBE_HIT, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_put_full_data_hit_shared, INNER_PUT_FULL_DATA_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_put_partial_data_hit_shared,
+ INNER_PUT_PARTIAL_DATA_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_atomic_data_hit_shared, INNER_ATOMIC_DATA_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_get_hit_shared, INNER_GET_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_prefetch_hit_shared, INNER_PREFETCH_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_block_hit_shared, INNER_ACQUIRE_BLOCK_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_hit_shared, INNER_ACQUIRE_PERM_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_hit_shared, OUTER_PROBE_HIT_SHARED, 2),
+ CCACHE_PMU_EVENT_ATTR(outer_probe_hit_dirty, OUTER_PROBE_HIT_DIRTY, 2),
+
+ /* pmEventSelect3 */
+ CCACHE_PMU_EVENT_ATTR(outer_acquire_block_ntob_miss, OUTER_ACQUIRE_BLOCK_NTOB_MISS, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_acquire_block_ntot_miss, OUTER_ACQUIRE_BLOCK_NTOT_MISS, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_acquire_block_btot_miss, OUTER_ACQUIRE_BLOCK_BTOT_MISS, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_acquire_perm_ntot_miss, OUTER_ACQUIRE_PERM_NTOT_MISS, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_acquire_perm_btot_miss, OUTER_ACQUIRE_PERM_BTOT_MISS, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_ttob_eviction, OUTER_RELEASE_TTOB_EVICTION, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_tton_eviction, OUTER_RELEASE_TTON_EVICTION, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_bton_eviction, OUTER_RELEASE_BTON_EVICTION, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_data_ttob_not_applicable,
+ OUTER_RELEASE_DATA_TTOB_NOT_APPLICABLE, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_data_tton_dirty_eviction,
+ OUTER_RELEASE_DATA_TTON_DIRTY_EVICTION, 3),
+ CCACHE_PMU_EVENT_ATTR(outer_release_data_bton_not_applicable,
+ OUTER_RELEASE_DATA_BTON_NOT_APPLICABLE, 3),
+ CCACHE_PMU_EVENT_ATTR(inner_probe_block_tot_code_miss_hits_other_harts,
+ INNER_PROBE_BLOCK_TOT_CODE_MISS_HITS_OTHER_HARTS, 3),
+ CCACHE_PMU_EVENT_ATTR(inner_probe_block_tob_load_miss_hits_other_harts,
+ INNER_PROBE_BLOCK_TOB_LOAD_MISS_HITS_OTHER_HARTS, 3),
+ CCACHE_PMU_EVENT_ATTR(inner_probe_block_ton_store_miss_hits_other_harts,
+ INNER_PROBE_BLOCK_TON_STORE_MISS_HITS_OTHER_HARTS, 3),
+
+ /* pmEventSelect4 */
+ CCACHE_PMU_EVENT_ATTR(inner_hint_hits_inflight_miss, INNER_HINT_HITS_INFLIGHT_MISS, 4),
+ NULL
+};
+
+static struct attribute_group sifive_ccache_pmu_events_group = {
+ .name = "events",
+ .attrs = sifive_ccache_pmu_events,
+};
+
+/* formats */
+PMU_FORMAT_ATTR(event, "config:0-63");
+
+static struct attribute *sifive_ccache_pmu_formats[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static struct attribute_group sifive_ccache_pmu_format_group = {
+ .name = "format",
+ .attrs = sifive_ccache_pmu_formats,
+};
+
+/*
+ * Per PMU device attribute groups
+ */
+
+static const struct attribute_group *sifive_ccache_pmu_attr_grps[] = {
+ &sifive_ccache_pmu_cpumask_group,
+ &sifive_ccache_pmu_events_group,
+ &sifive_ccache_pmu_format_group,
+ NULL,
+};
+
+/*
+ * Event Initialization
+ */
+
+static int sifive_ccache_pmu_event_init(struct perf_event *event)
+{
+ struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 config = event->attr.config;
+ u64 ev_type = config >> 8;
+ u64 set = config & 0xff;
+
+ /* Check if this is a valid set and event */
+ switch (set) {
+ case 1:
+ if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT1_IDX))
+ return -ENOENT;
+ break;
+ case 2:
+ if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT2_IDX))
+ return -ENOENT;
+ break;
+ case 3:
+ if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT3_IDX))
+ return -ENOENT;
+ break;
+ case 4:
+ if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT4_IDX))
+ return -ENOENT;
+ break;
+ default:
+ return -ENOENT;
+ }
+
+ /* Do not allocate the hardware counter yet */
+ hwc->idx = -1;
+ hwc->config = config;
+
+ event->cpu = ccache_pmu->cpu;
+
+ return 0;
+}
+
+/*
+ * pmu->read: read and update the counter
+ */
+static void sifive_ccache_pmu_read(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ u64 prev_raw_count, new_raw_count;
+ u64 oldval;
+
+ do {
+ prev_raw_count = local64_read(&hwc->prev_count);
+ new_raw_count = readq((void __iomem *)hwc->event_base);
+
+ oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count, new_raw_count);
+ } while (oldval != prev_raw_count);
+
+ local64_add(new_raw_count - prev_raw_count, &event->count);
+}
+
+/*
+ * State transition functions:
+ *
+ * start()/stop() & add()/del()
+ */
+
+/*
+ * pmu->start: start the event
+ */
+static void sifive_ccache_pmu_start(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
+ return;
+
+ hwc->state = 0;
+
+ /* Set initial value to 0 */
+ local64_set(&hwc->prev_count, 0);
+ writeq(0, (void __iomem *)hwc->event_base);
+
+ /* Enable this counter to count events */
+ writeq(hwc->config, (void __iomem *)hwc->config_base);
+}
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void sifive_ccache_pmu_stop(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->state & PERF_HES_STOPPED)
+ return;
+
+ /* Stop this counter from counting events */
+ writeq(0, (void __iomem *)hwc->config_base);
+ sifive_ccache_pmu_read(event);
+
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+/*
+ * pmu->add: add the event to the PMU
+ */
+static int sifive_ccache_pmu_add(struct perf_event *event, int flags)
+{
+ struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx;
+
+ /* Find an available counter idx to use for this event */
+ do {
+ idx = find_first_zero_bit(ccache_pmu->used_mask, ccache_pmu->n_counters);
+ if (idx >= ccache_pmu->n_counters)
+ return -EAGAIN;
+ } while (test_and_set_bit(idx, ccache_pmu->used_mask));
+
+ hwc->config_base = (unsigned long)ccache_pmu->base + CCACHE_SELECT_OFFSET + 8 * idx;
+ hwc->event_base = (unsigned long)ccache_pmu->base + CCACHE_COUNTER_OFFSET + 8 * idx;
+ hwc->idx = idx;
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+ ccache_pmu->events[idx] = event;
+
+ if (flags & PERF_EF_START)
+ sifive_ccache_pmu_start(event, PERF_EF_RELOAD);
+
+ perf_event_update_userpage(event);
+
+ return 0;
+}
+
+/*
+ * pmu->del: delete the event from the PMU
+ */
+static void sifive_ccache_pmu_del(struct perf_event *event, int flags)
+{
+ struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx = hwc->idx;
+
+ /* Stop and release this counter */
+ sifive_ccache_pmu_stop(event, PERF_EF_UPDATE);
+
+ ccache_pmu->events[idx] = NULL;
+ clear_bit(idx, ccache_pmu->used_mask);
+
+ perf_event_update_userpage(event);
+}
+
+/*
+ * Driver initialization
+ */
+
+static void sifive_ccache_pmu_hw_init(const struct sifive_ccache_pmu *ccache_pmu)
+{
+ /* Disable the client filter (not supported by this driver) */
+ writeq(0, ccache_pmu->base + CCACHE_CLIENT_FILTER_OFFSET);
+}
+
+static int sifive_ccache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+ struct sifive_ccache_pmu *ccache_pmu =
+ hlist_entry_safe(node, struct sifive_ccache_pmu, node);
+
+ if (ccache_pmu->cpu >= nr_cpu_ids)
+ ccache_pmu->cpu = cpu;
+
+ return 0;
+}
+
+static int sifive_ccache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+ struct sifive_ccache_pmu *ccache_pmu =
+ hlist_entry_safe(node, struct sifive_ccache_pmu, node);
+
+ /* Do nothing if this CPU does not own the events */
+ if (cpu != ccache_pmu->cpu)
+ return 0;
+
+ /* Pick any other online CPU */
+ ccache_pmu->cpu = cpumask_any_but(cpu_online_mask, cpu);
+ if (ccache_pmu->cpu >= nr_cpu_ids)
+ return 0;
+
+ /* Migrate PMU events from this CPU to the target CPU */
+ perf_pmu_migrate_context(&ccache_pmu->pmu, cpu, ccache_pmu->cpu);
+
+ return 0;
+}
+
+static int sifive_ccache_pmu_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct sifive_ccache_pmu *ccache_pmu;
+ u32 n_counters;
+ int ret;
+
+ /* Instances without a sifive,perfmon-counters property do not contain a PMU */
+ ret = device_property_read_u32(dev, "sifive,perfmon-counters", &n_counters);
+ if (ret || !n_counters)
+ return -ENODEV;
+
+ ccache_pmu = devm_kzalloc(dev, struct_size(ccache_pmu, events, n_counters), GFP_KERNEL);
+ if (!ccache_pmu)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, ccache_pmu);
+
+ ccache_pmu->pmu = (struct pmu) {
+ .parent = dev,
+ .attr_groups = sifive_ccache_pmu_attr_grps,
+ .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = sifive_ccache_pmu_event_init,
+ .add = sifive_ccache_pmu_add,
+ .del = sifive_ccache_pmu_del,
+ .start = sifive_ccache_pmu_start,
+ .stop = sifive_ccache_pmu_stop,
+ .read = sifive_ccache_pmu_read,
+ };
+ ccache_pmu->cpu = nr_cpu_ids;
+ ccache_pmu->n_counters = n_counters;
+
+ ccache_pmu->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(ccache_pmu->base))
+ return PTR_ERR(ccache_pmu->base);
+
+ sifive_ccache_pmu_hw_init(ccache_pmu);
+
+ ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to add CPU hotplug instance\n");
+
+ ret = perf_pmu_register(&ccache_pmu->pmu, "sifive_ccache_pmu", -1);
+ if (ret) {
+ dev_err_probe(dev, ret, "Failed to register PMU\n");
+ goto err_remove_instance;
+ }
+
+ return 0;
+
+err_remove_instance:
+ cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
+
+ return ret;
+}
+
+static void sifive_ccache_pmu_remove(struct platform_device *pdev)
+{
+ struct sifive_ccache_pmu *ccache_pmu = platform_get_drvdata(pdev);
+
+ perf_pmu_unregister(&ccache_pmu->pmu);
+ cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
+}
+
+static const struct of_device_id sifive_ccache_pmu_of_match[] = {
+ { .compatible = "sifive,ccache0" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, sifive_ccache_pmu_of_match);
+
+static struct platform_driver sifive_ccache_pmu_driver = {
+ .probe = sifive_ccache_pmu_probe,
+ .remove_new = sifive_ccache_pmu_remove,
+ .driver = {
+ .name = "sifive_ccache_pmu",
+ .of_match_table = sifive_ccache_pmu_of_match,
+ },
+};
+
+static void __exit sifive_ccache_pmu_exit(void)
+{
+ platform_driver_unregister(&sifive_ccache_pmu_driver);
+ cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE);
+}
+module_exit(sifive_ccache_pmu_exit);
+
+static int __init sifive_ccache_pmu_init(void)
+{
+ int ret;
+
+ ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
+ "perf/sifive/ccache:online",
+ sifive_ccache_pmu_online_cpu,
+ sifive_ccache_pmu_offline_cpu);
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&sifive_ccache_pmu_driver);
+ if (ret)
+ goto err_remove_state;
+
+ return 0;
+
+err_remove_state:
+ cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE);
+
+ return ret;
+}
+module_init(sifive_ccache_pmu_init);
+
+MODULE_LICENSE("GPL");
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 172d0a743e5d..be6361fdc8ba 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -230,6 +230,7 @@ enum cpuhp_state {
CPUHP_AP_PERF_POWERPC_TRACE_IMC_ONLINE,
CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE,
+ CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
CPUHP_AP_PERF_CSKY_ONLINE,
CPUHP_AP_WATCHDOG_ONLINE,
CPUHP_AP_WORKQUEUE_ONLINE,
--
2.43.0
From: Eric Lin <[email protected]>
Add YAML DT binding documentation for the SiFive Extensible Cache
controller. The Extensible Cache controller interleaves cache blocks
across a number of heterogeneous independently-programmed slices. Each
slice contains an MMIO interface for configuration, cache maintenance,
error reporting, and performance monitoring.
Signed-off-by: Eric Lin <[email protected]>
Co-developed-by: Samuel Holland <[email protected]>
Signed-off-by: Samuel Holland <[email protected]>
---
.../cache/sifive,extensiblecache0.yaml | 136 ++++++++++++++++++
1 file changed, 136 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/sifive,extensiblecache0.yaml
diff --git a/Documentation/devicetree/bindings/cache/sifive,extensiblecache0.yaml b/Documentation/devicetree/bindings/cache/sifive,extensiblecache0.yaml
new file mode 100644
index 000000000000..d027114dbdba
--- /dev/null
+++ b/Documentation/devicetree/bindings/cache/sifive,extensiblecache0.yaml
@@ -0,0 +1,136 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright (C) 2023-2024 SiFive, Inc.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cache/sifive,extensiblecache0.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SiFive Extensible Cache Controller
+
+maintainers:
+ - Eric Lin <[email protected]>
+
+description:
+ The SiFive Extensible Cache Controller provides a high-performance extensible
+ system (L2 or L3) cache. It is divided into several independent heterogeneous
+ slices, enabling a flexible topology and physical design.
+
+allOf:
+ - $ref: /schemas/cache-controller.yaml#
+
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - sifive,extensiblecache0
+
+ required:
+ - compatible
+
+properties:
+ compatible:
+ items:
+ - const: sifive,extensiblecache0
+ - const: cache
+
+ "#address-cells": true
+ "#size-cells": true
+ ranges: true
+
+ interrupts:
+ maxItems: 1
+
+ cache-block-size:
+ const: 64
+
+ cache-level: true
+ cache-sets: true
+ cache-size: true
+ cache-unified: true
+
+patternProperties:
+ "^cache-controller@[0-9a-f]+$":
+ type: object
+ additionalProperties: false
+ properties:
+ reg:
+ maxItems: 1
+
+ cache-block-size:
+ const: 64
+
+ cache-sets: true
+ cache-size: true
+ cache-unified: true
+
+ sifive,bm-event-counters:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0
+ description: Number of bucket monitor registers in this slice
+
+ sifive,cache-ways:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: Number of ways in this slice (independent of cache size)
+
+ sifive,perfmon-counters:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0
+ description: Number of PMU counter registers in this slice
+
+ required:
+ - reg
+ - cache-block-size
+ - cache-sets
+ - cache-size
+ - cache-unified
+ - sifive,cache-ways
+
+required:
+ - compatible
+ - ranges
+ - interrupts
+ - cache-block-size
+ - cache-level
+ - cache-sets
+ - cache-size
+ - cache-unified
+
+additionalProperties: false
+
+examples:
+ - |
+ cache-controller@30040000 {
+ compatible = "sifive,extensiblecache0", "cache";
+ ranges = <0x30040000 0x30040000 0x10000>;
+ interrupts = <0x4>;
+ cache-block-size = <0x40>;
+ cache-level = <3>;
+ cache-sets = <0x800>;
+ cache-size = <0x100000>;
+ cache-unified;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ cache-controller@30040000 {
+ reg = <0x30040000 0x4000>;
+ cache-block-size = <0x40>;
+ cache-sets = <0x400>;
+ cache-size = <0x80000>;
+ cache-unified;
+ sifive,bm-event-counters = <8>;
+ sifive,cache-ways = <9>;
+ sifive,perfmon-counters = <8>;
+ };
+
+ cache-controller@30044000 {
+ reg = <0x30044000 0x4000>;
+ cache-block-size = <0x40>;
+ cache-sets = <0x400>;
+ cache-size = <0x80000>;
+ cache-unified;
+ sifive,bm-event-counters = <8>;
+ sifive,cache-ways = <9>;
+ sifive,perfmon-counters = <8>;
+ };
+ };
--
2.43.0
From: Eric Lin <[email protected]>
Add YAML DT binding documentation for the SiFive Private L2 Cache
controller. Some functionality and the corresponding register bits were
removed in the sifive,pl2cache1 version of the hardware, which creates
the unusual situation where the newer hardware's compatible string is
the fallback for the older one.
Signed-off-by: Eric Lin <[email protected]>
Co-developed-by: Samuel Holland <[email protected]>
Signed-off-by: Samuel Holland <[email protected]>
---
Changes in v1:
- Add back select: clause to binding
- Make sifive,pl2cache1 the fallback for sifive,pl2cache0
- Fix the order of the reg property declaration
- Document the sifive,perfmon-counters property
.../bindings/cache/sifive,pl2cache0.yaml | 81 +++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/sifive,pl2cache0.yaml
diff --git a/Documentation/devicetree/bindings/cache/sifive,pl2cache0.yaml b/Documentation/devicetree/bindings/cache/sifive,pl2cache0.yaml
new file mode 100644
index 000000000000..d89e2e5d0a97
--- /dev/null
+++ b/Documentation/devicetree/bindings/cache/sifive,pl2cache0.yaml
@@ -0,0 +1,81 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright (C) 2023-2024 SiFive, Inc.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cache/sifive,pl2cache0.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SiFive Private L2 Cache Controller
+
+maintainers:
+ - Eric Lin <[email protected]>
+
+description:
+ The SiFive Private L2 Cache Controller is a per-core cache which communicates
+ with both the upstream L1 caches and downstream L3 cache or memory, enabling a
+ high-performance cache subsystem.
+
+allOf:
+ - $ref: /schemas/cache-controller.yaml#
+
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - sifive,pl2cache1
+
+ required:
+ - compatible
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - const: sifive,pl2cache0
+ - const: sifive,pl2cache1
+ - const: cache
+ - items:
+ - const: sifive,pl2cache1
+ - const: cache
+
+ reg:
+ maxItems: 1
+
+ cache-block-size: true
+ cache-level: true
+ cache-sets: true
+ cache-size: true
+ cache-unified: true
+
+ next-level-cache: true
+
+ sifive,perfmon-counters:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0
+ description: Number of PMU counter registers
+
+required:
+ - compatible
+ - reg
+ - cache-block-size
+ - cache-level
+ - cache-sets
+ - cache-size
+ - cache-unified
+
+additionalProperties: false
+
+examples:
+ - |
+ cache-controller@10104000 {
+ compatible = "sifive,pl2cache1", "cache";
+ reg = <0x10104000 0x4000>;
+ cache-block-size = <64>;
+ cache-level = <2>;
+ cache-sets = <512>;
+ cache-size = <262144>;
+ cache-unified;
+ next-level-cache = <&L4>;
+ sifive,perfmon-counters = <6>;
+ };
--
2.43.0
From: Eric Lin <[email protected]>
Add a driver for the PMU found in the SiFive Extensible Cache
controller. This PMU provides a configurable number of counters and a
variety of events. Events are grouped into sets. Each counter can count
events from only one set at a time; however, it can count any number of
events within that set simultaneously. The PMU hardware does not provide
an overflow interrupt.
The counter inhibit register is used to atomically start/stop/read a
group of counters so their values can be usefully compared.
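As a rough illustration of the inhibit logic, the value written to the
inhibit register can be modelled in plain C. This is a user-space sketch
that mirrors the driver's write_inhibit() helper; the function name
inhibit_value() is hypothetical and not part of the driver.

```c
#include <stdint.h>

/*
 * Hypothetical model of the value written to pmCounterInhibit.
 * A set bit stops the corresponding counter. Every counter that is not
 * currently allocated is inhibited, in addition to the counters of the
 * group being started, stopped, or read.
 */
static uint64_t inhibit_value(uint64_t used_mask, uint64_t group_mask)
{
	return group_mask | ~used_mask;
}
```

Writing this value with a group mask of 0 releases every allocated
counter in a single register write, which is what makes the group start
atomic.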
Some events can be filtered further by client ID (e.g. CPU or external
DMA master). That functionality is not supported by this driver.
This driver further assumes that a single Extensible Cache instance is
shared by all CPUs in the system.
Example usage:
$ perf stat -e sifive_ecache_pmu/inner_rd_request/,
sifive_ecache_pmu/inner_wr_request/,
sifive_ecache_pmu/inner_rd_request_hit/,
sifive_ecache_pmu/inner_wr_request_hit/ ls
Performance counter stats for 'system wide':
148001 sifive_ecache_pmu/inner_rd_request/
121064 sifive_ecache_pmu/inner_wr_request/
113124 sifive_ecache_pmu/inner_rd_request_hit/
120860 sifive_ecache_pmu/inner_wr_request_hit/
0.010643962 seconds time elapsed
Example combining the read/write events together within each counter:
$ perf stat -e sifive_ecache_pmu/event=0x601/,
sifive_ecache_pmu/event=0xc001/ ls
Performance counter stats for 'system wide':
262619 sifive_ecache_pmu/event=0x601/
224533 sifive_ecache_pmu/event=0xc001/
0.009794808 seconds time elapsed
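The raw event values above follow from the encoding used by the
driver's SET_EVENT_SELECT() macro: bits 7:0 hold the event set, and bit
(8 + n) selects event n within that set. A minimal user-space sketch of
that arithmetic follows; the helper ecache_config() is hypothetical, and
the event indices are copied from enum ecache_pmu_event_set1.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 7:0 select the event set; bit (8 + n) selects event n in that set. */
#define EVENT_SET_MASK	0xffULL
#define EVENT_BIT(n)	(1ULL << ((n) + 8))

/* Event indices within set 1, copied from enum ecache_pmu_event_set1. */
enum {
	INNER_RD_REQUEST = 1,
	INNER_WR_REQUEST = 2,
	INNER_RD_REQUEST_HIT = 6,
	INNER_WR_REQUEST_HIT = 7,
};

/* Hypothetical helper: OR together several events from a single set. */
static uint64_t ecache_config(unsigned int set, const unsigned int *events, int n)
{
	uint64_t config = set;

	assert(set <= EVENT_SET_MASK);
	for (int i = 0; i < n; i++) {
		assert(events[i] < 56);	/* only 56 bits remain above the set field */
		config |= EVENT_BIT(events[i]);
	}

	return config;
}
```

Combining INNER_RD_REQUEST and INNER_WR_REQUEST in set 1 gives 0x601,
and combining the two corresponding _HIT events gives 0xc001.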
Signed-off-by: Eric Lin <[email protected]>
Co-developed-by: Samuel Holland <[email protected]>
Signed-off-by: Samuel Holland <[email protected]>
---
drivers/perf/Kconfig | 10 +
drivers/perf/Makefile | 1 +
drivers/perf/sifive_ecache_pmu.c | 675 +++++++++++++++++++++++++++++++
include/linux/cpuhotplug.h | 1 +
4 files changed, 687 insertions(+)
create mode 100644 drivers/perf/sifive_ecache_pmu.c
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index b4e4db7424b4..8a3b2b88d8b5 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -164,6 +164,16 @@ config SIFIVE_CCACHE_PMU
for measuring whole-system L2/L3 cache performance using the perf
events subsystem.
+config SIFIVE_ECACHE_PMU
+ tristate "SiFive Extensible Cache PMU"
+ depends on RISCV || COMPILE_TEST
+ depends on OF
+ help
+ Support for the Extensible Cache performance monitoring unit (PMU) on
+ SiFive platforms. The Extensible Cache PMU provides up to 8 counters
+ for measuring whole-system L2/L3 cache performance using the perf
+ events subsystem.
+
config THUNDERX2_PMU
tristate "Cavium ThunderX2 SoC PMU UNCORE"
depends on ARCH_THUNDER2 || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 51ef5f50ace4..a51686b413f2 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
obj-$(CONFIG_SIFIVE_CCACHE_PMU) += sifive_ccache_pmu.o
+obj-$(CONFIG_SIFIVE_ECACHE_PMU) += sifive_ecache_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/sifive_ecache_pmu.c b/drivers/perf/sifive_ecache_pmu.c
new file mode 100644
index 000000000000..51b2fa3781c9
--- /dev/null
+++ b/drivers/perf/sifive_ecache_pmu.c
@@ -0,0 +1,675 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SiFive EC (Extensible Cache) PMU driver
+ *
+ * Copyright (C) 2023-2024 SiFive, Inc.
+ * Copyright (C) Eric Lin <[email protected]>
+ *
+ */
+
+#include <linux/cpuhotplug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define ECACHE_SELECT_OFFSET 0x2000
+#define ECACHE_CLIENT_FILTER_OFFSET 0x2200
+#define ECACHE_COUNTER_INHIBIT_OFFSET 0x2800
+#define ECACHE_COUNTER_OFFSET 0x3000
+
+#define ECACHE_PMU_MAX_COUNTERS 8
+
+struct sifive_ecache_pmu_slice {
+ void __iomem *base;
+};
+
+struct sifive_ecache_pmu {
+ struct pmu pmu;
+ struct hlist_node node;
+ struct notifier_block cpu_pm_nb;
+ struct perf_event *events[ECACHE_PMU_MAX_COUNTERS];
+ DECLARE_BITMAP(used_mask, ECACHE_PMU_MAX_COUNTERS);
+ unsigned int cpu;
+ unsigned int txn_flags;
+ int n_counters;
+ int n_slices;
+ struct sifive_ecache_pmu_slice slice[] __counted_by(n_slices);
+};
+
+#define to_ecache_pmu(p) (container_of(p, struct sifive_ecache_pmu, pmu))
+
+/* Store the counter mask for a group in the leader's extra_reg */
+#define event_group_mask(event) (event->group_leader->hw.extra_reg.config)
+
+#ifndef readq
+static inline u64 readq(void __iomem *addr)
+{
+ return readl(addr) | (((u64)readl(addr + 4)) << 32);
+}
+#endif
+
+#ifndef writeq
+static inline void writeq(u64 v, void __iomem *addr)
+{
+ writel(lower_32_bits(v), addr);
+ writel(upper_32_bits(v), addr + 4);
+}
+#endif
+
+/*
+ * sysfs attributes
+ *
+ * We export:
+ * - cpumask, used by perf user space and other tools to know on which CPUs to create events
+ * - events, used by perf user space and other tools to create events symbolically, e.g.:
+ * perf stat -a -e sifive_ecache_pmu/inner_rd_request_hit/ ls
+ * perf stat -a -e sifive_ecache_pmu/event=0x101/ ls
+ * - formats, used by perf user space and other tools to configure events
+ */
+
+/* cpumask */
+static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct sifive_ecache_pmu *ecache_pmu = dev_get_drvdata(dev);
+
+ if (ecache_pmu->cpu >= nr_cpu_ids)
+ return 0;
+
+ return sysfs_emit(buf, "%d\n", ecache_pmu->cpu);
+}
+
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *sifive_ecache_pmu_cpumask_attrs[] = {
+ &dev_attr_cpumask.attr,
+ NULL,
+};
+
+static const struct attribute_group sifive_ecache_pmu_cpumask_group = {
+ .attrs = sifive_ecache_pmu_cpumask_attrs,
+};
+
+/* events */
+static ssize_t sifive_ecache_pmu_event_show(struct device *dev, struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_attr *pmu_attr;
+
+ pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+ return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
+#define SET_EVENT_SELECT(_event, _set) (BIT_ULL((_event) + 8) | (_set))
+#define ECACHE_PMU_EVENT_ATTR(_name, _event, _set) \
+ PMU_EVENT_ATTR_ID(_name, sifive_ecache_pmu_event_show, SET_EVENT_SELECT(_event, _set))
+
+enum ecache_pmu_event_set1 {
+ INNER_REQUEST = 0,
+ INNER_RD_REQUEST,
+ INNER_WR_REQUEST,
+ INNER_PF_REQUEST,
+ OUTER_PRB_REQUEST,
+ INNER_REQUEST_HIT,
+ INNER_RD_REQUEST_HIT,
+ INNER_WR_REQUEST_HIT,
+ INNER_PF_REQUEST_HIT,
+ OUTER_PRB_REQUEST_HIT,
+ INNER_REQUEST_HITPF,
+ INNER_RD_REQUEST_HITPF,
+ INNER_WR_REQUEST_HITPF,
+ INNER_PF_REQUEST_HITPF,
+ OUTER_PRB_REQUEST_HITPF,
+ INNER_REQUEST_MISS,
+ INNER_RD_REQUEST_MISS,
+ INNER_WR_REQUEST_MISS,
+ INNER_PF_REQUEST_MISS,
+ OUTER_PRB_REQUEST_MISS,
+ ECACHE_PMU_MAX_EVENT1_IDX
+};
+
+enum ecache_pmu_event_set2 {
+ OUTER_REQUEST = 0,
+ OUTER_RD_REQUEST,
+ OUTER_PUT_REQUEST,
+ OUTER_EV_REQUEST,
+ OUTER_PF_REQUEST,
+ INNER_PRB_REQUEST,
+ INNER_REQUEST_WCYC,
+ INNER_RD_REQUEST_WCYC,
+ INNER_WR_REQUEST_WCYC,
+ INNER_PF_REQUEST_WCYC,
+ OUTER_PRB_REQUEST_WCYC,
+ OUTER_REQUEST_WCYC,
+ OUTER_RD_REQUEST_WCYC,
+ OUTER_PUT_REQUEST_WCYC,
+ OUTER_EV_REQUEST_WCYC,
+ OUTER_PF_REQUEST_WCYC,
+ INNER_PRB_REQUEST_WCYC,
+ INNER_AG_WCYC,
+ INNER_AP_WCYC,
+ INNER_AH_WCYC,
+ INNER_BP_WCYC,
+ INNER_CP_WCYC,
+ INNER_CX_WCYC,
+ INNER_DG_WCYC,
+ INNER_DP_WCYC,
+ INNER_DX_WCYC,
+ INNER_EG_WCYC,
+ OUTER_AG_WCYC,
+ OUTER_AP_WCYC,
+ OUTER_AH_WCYC,
+ OUTER_BP_WCYC,
+ OUTER_CP_WCYC,
+ OUTER_CX_WCYC,
+ OUTER_DG_WCYC,
+ OUTER_DP_WCYC,
+ OUTER_DX_WCYC,
+ OUTER_EG_WCYC,
+ ECACHE_PMU_MAX_EVENT2_IDX
+};
+
+static struct attribute *sifive_ecache_pmu_events[] = {
+ /* pmEventSelect1 */
+ ECACHE_PMU_EVENT_ATTR(inner_request, INNER_REQUEST, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_rd_request, INNER_RD_REQUEST, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_wr_request, INNER_WR_REQUEST, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_pf_request, INNER_PF_REQUEST, 1),
+ ECACHE_PMU_EVENT_ATTR(outer_prb_request, OUTER_PRB_REQUEST, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_request_hit, INNER_REQUEST_HIT, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_rd_request_hit, INNER_RD_REQUEST_HIT, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_wr_request_hit, INNER_WR_REQUEST_HIT, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_pf_request_hit, INNER_PF_REQUEST_HIT, 1),
+ ECACHE_PMU_EVENT_ATTR(outer_prb_request_hit, OUTER_PRB_REQUEST_HIT, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_request_hitpf, INNER_REQUEST_HITPF, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_rd_request_hitpf, INNER_RD_REQUEST_HITPF, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_wr_request_hitpf, INNER_WR_REQUEST_HITPF, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_pf_request_hitpf, INNER_PF_REQUEST_HITPF, 1),
+ ECACHE_PMU_EVENT_ATTR(outer_prb_request_hitpf, OUTER_PRB_REQUEST_HITPF, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_request_miss, INNER_REQUEST_MISS, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_rd_request_miss, INNER_RD_REQUEST_MISS, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_wr_request_miss, INNER_WR_REQUEST_MISS, 1),
+ ECACHE_PMU_EVENT_ATTR(inner_pf_request_miss, INNER_PF_REQUEST_MISS, 1),
+ ECACHE_PMU_EVENT_ATTR(outer_prb_request_miss, OUTER_PRB_REQUEST_MISS, 1),
+
+ /* pmEventSelect2 */
+ ECACHE_PMU_EVENT_ATTR(outer_request, OUTER_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_rd_request, OUTER_RD_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_put_request, OUTER_PUT_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_ev_request, OUTER_EV_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_pf_request, OUTER_PF_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_prb_request, INNER_PRB_REQUEST, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_request_wcyc, INNER_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_rd_request_wcyc, INNER_RD_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_wr_request_wcyc, INNER_WR_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_pf_request_wcyc, INNER_PF_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_prb_request_wcyc, OUTER_PRB_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_request_wcyc, OUTER_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_rd_request_wcyc, OUTER_RD_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_put_request_wcyc, OUTER_PUT_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_ev_request_wcyc, OUTER_EV_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_pf_request_wcyc, OUTER_PF_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_prb_request_wcyc, INNER_PRB_REQUEST_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_ag_wcyc, INNER_AG_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_ap_wcyc, INNER_AP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_ah_wcyc, INNER_AH_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_bp_wcyc, INNER_BP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_cp_wcyc, INNER_CP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_cx_wcyc, INNER_CX_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_dg_wcyc, INNER_DG_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_dp_wcyc, INNER_DP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_dx_wcyc, INNER_DX_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(inner_eg_wcyc, INNER_EG_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_ag_wcyc, OUTER_AG_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_ap_wcyc, OUTER_AP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_ah_wcyc, OUTER_AH_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_bp_wcyc, OUTER_BP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_cp_wcyc, OUTER_CP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_cx_wcyc, OUTER_CX_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_dg_wcyc, OUTER_DG_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_dp_wcyc, OUTER_DP_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_dx_wcyc, OUTER_DX_WCYC, 2),
+ ECACHE_PMU_EVENT_ATTR(outer_eg_wcyc, OUTER_EG_WCYC, 2),
+ NULL
+};
+
+static struct attribute_group sifive_ecache_pmu_events_group = {
+ .name = "events",
+ .attrs = sifive_ecache_pmu_events,
+};
+
+/* formats */
+PMU_FORMAT_ATTR(event, "config:0-63");
+
+static struct attribute *sifive_ecache_pmu_formats[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static struct attribute_group sifive_ecache_pmu_format_group = {
+ .name = "format",
+ .attrs = sifive_ecache_pmu_formats,
+};
+
+/*
+ * Per PMU device attribute groups
+ */
+
+static const struct attribute_group *sifive_ecache_pmu_attr_grps[] = {
+ &sifive_ecache_pmu_cpumask_group,
+ &sifive_ecache_pmu_events_group,
+ &sifive_ecache_pmu_format_group,
+ NULL,
+};
+
+/*
+ * Event Initialization
+ */
+
+static int sifive_ecache_pmu_event_init(struct perf_event *event)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 config = event->attr.config;
+ u64 ev_type = config >> 8;
+ u64 set = config & 0xff;
+
+ /* Check if this is a valid set and event */
+ switch (set) {
+ case 1:
+ if (ev_type >= BIT_ULL(ECACHE_PMU_MAX_EVENT1_IDX))
+ return -ENOENT;
+ break;
+ case 2:
+ if (ev_type >= BIT_ULL(ECACHE_PMU_MAX_EVENT2_IDX))
+ return -ENOENT;
+ break;
+ default:
+ return -ENOENT;
+ }
+
+ /* Do not allocate the hardware counter yet */
+ hwc->idx = -1;
+ hwc->config = config;
+
+ event->cpu = ecache_pmu->cpu;
+
+ return 0;
+}
+
+/*
+ * Low-level functions: reading and writing counters
+ */
+
+static void configure_counter(const struct sifive_ecache_pmu *ecache_pmu,
+ const struct hw_perf_event *hwc, u64 config)
+{
+ for (int i = 0; i < ecache_pmu->n_slices; i++) {
+ void __iomem *base = ecache_pmu->slice[i].base;
+
+ if (config)
+ writeq(0, base + hwc->event_base);
+ writeq(config, base + hwc->config_base);
+ }
+}
+
+static u64 read_counter(const struct sifive_ecache_pmu *ecache_pmu, const struct hw_perf_event *hwc)
+{
+ u64 value = 0;
+
+ for (int i = 0; i < ecache_pmu->n_slices; i++) {
+ void __iomem *base = ecache_pmu->slice[i].base;
+
+ value += readq(base + hwc->event_base);
+ }
+
+ return value;
+}
+
+static void write_inhibit(const struct sifive_ecache_pmu *ecache_pmu, u64 mask)
+{
+ u64 used_mask;
+
+ /* Inhibit all unused counters in addition to the provided mask */
+ bitmap_to_arr64(&used_mask, ecache_pmu->used_mask, ECACHE_PMU_MAX_COUNTERS);
+ mask |= ~used_mask;
+
+ for (int i = 0; i < ecache_pmu->n_slices; i++) {
+ void __iomem *base = ecache_pmu->slice[i].base;
+
+ writeq(mask, base + ECACHE_COUNTER_INHIBIT_OFFSET);
+ }
+}
+
+/*
+ * pmu->read: read and update the counter
+ */
+static void sifive_ecache_pmu_read(struct perf_event *event)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 prev_raw_count, new_raw_count;
+ u64 oldval;
+
+ /* Inhibit the entire group during a read transaction for atomicity */
+ if (ecache_pmu->txn_flags == PERF_PMU_TXN_READ && event->group_leader == event)
+ write_inhibit(ecache_pmu, event_group_mask(event));
+
+ do {
+ prev_raw_count = local64_read(&hwc->prev_count);
+ new_raw_count = read_counter(ecache_pmu, hwc);
+
+ oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count, new_raw_count);
+ } while (oldval != prev_raw_count);
+
+ local64_add(new_raw_count - prev_raw_count, &event->count);
+}
+
+/*
+ * State transition functions:
+ *
+ * start()/stop() & add()/del()
+ */
+
+/*
+ * pmu->start: start the event
+ */
+static void sifive_ecache_pmu_start(struct perf_event *event, int flags)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
+ return;
+
+ hwc->state = 0;
+
+ /* Set initial value to 0 */
+ local64_set(&hwc->prev_count, 0);
+
+ /* Enable this counter to count events */
+ configure_counter(ecache_pmu, hwc, hwc->config);
+}
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void sifive_ecache_pmu_stop(struct perf_event *event, int flags)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->state & PERF_HES_STOPPED)
+ return;
+
+ /* Stop this counter from counting events */
+ configure_counter(ecache_pmu, hwc, 0);
+ sifive_ecache_pmu_read(event);
+
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+/*
+ * pmu->add: add the event to the PMU
+ */
+static int sifive_ecache_pmu_add(struct perf_event *event, int flags)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx;
+
+ /* Find an available counter idx to use for this event */
+ do {
+ idx = find_first_zero_bit(ecache_pmu->used_mask, ecache_pmu->n_counters);
+ if (idx >= ecache_pmu->n_counters)
+ return -EAGAIN;
+ } while (test_and_set_bit(idx, ecache_pmu->used_mask));
+
+ event_group_mask(event) |= BIT_ULL(idx);
+ hwc->config_base = ECACHE_SELECT_OFFSET + 8 * idx;
+ hwc->event_base = ECACHE_COUNTER_OFFSET + 8 * idx;
+ hwc->idx = idx;
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+ ecache_pmu->events[idx] = event;
+
+ if (flags & PERF_EF_START)
+ sifive_ecache_pmu_start(event, PERF_EF_RELOAD);
+
+ perf_event_update_userpage(event);
+
+ return 0;
+}
+
+/*
+ * pmu->del: delete the event from the PMU
+ */
+static void sifive_ecache_pmu_del(struct perf_event *event, int flags)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx = hwc->idx;
+
+ /* Stop and release this counter */
+ sifive_ecache_pmu_stop(event, PERF_EF_UPDATE);
+
+ ecache_pmu->events[idx] = NULL;
+ clear_bit(idx, ecache_pmu->used_mask);
+
+ perf_event_update_userpage(event);
+}
+
+/*
+ * Transaction synchronization
+ */
+
+static void sifive_ecache_pmu_start_txn(struct pmu *pmu, unsigned int txn_flags)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(pmu);
+
+ ecache_pmu->txn_flags = txn_flags;
+
+ /* Inhibit any counters that were deleted since the last transaction */
+ if (txn_flags == PERF_PMU_TXN_ADD)
+ write_inhibit(ecache_pmu, 0);
+}
+
+static int sifive_ecache_pmu_commit_txn(struct pmu *pmu)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(pmu);
+
+ ecache_pmu->txn_flags = 0;
+
+ /* Successful transaction: atomically uninhibit the counters in this group */
+ write_inhibit(ecache_pmu, 0);
+
+ return 0;
+}
+
+static void sifive_ecache_pmu_cancel_txn(struct pmu *pmu)
+{
+ struct sifive_ecache_pmu *ecache_pmu = to_ecache_pmu(pmu);
+
+ ecache_pmu->txn_flags = 0;
+
+ /* Failed transaction: leave the counters in this group inhibited */
+}
+
+/*
+ * Driver initialization
+ */
+
+static void sifive_ecache_pmu_hw_init(const struct sifive_ecache_pmu *ecache_pmu)
+{
+ for (int i = 0; i < ecache_pmu->n_slices; i++) {
+ void __iomem *base = ecache_pmu->slice[i].base;
+
+ /* Disable the client filter (not supported by this driver) */
+ writeq(0, base + ECACHE_CLIENT_FILTER_OFFSET);
+ }
+}
+
+static int sifive_ecache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+ struct sifive_ecache_pmu *ecache_pmu =
+ hlist_entry_safe(node, struct sifive_ecache_pmu, node);
+
+ if (ecache_pmu->cpu >= nr_cpu_ids)
+ ecache_pmu->cpu = cpu;
+
+ return 0;
+}
+
+static int sifive_ecache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+ struct sifive_ecache_pmu *ecache_pmu =
+ hlist_entry_safe(node, struct sifive_ecache_pmu, node);
+
+ /* Do nothing if this CPU does not own the events */
+ if (cpu != ecache_pmu->cpu)
+ return 0;
+
+ /* Pick any other online CPU */
+ ecache_pmu->cpu = cpumask_any_but(cpu_online_mask, cpu);
+ if (ecache_pmu->cpu >= nr_cpu_ids)
+ return 0;
+
+ /* Migrate PMU events from this CPU to the target CPU */
+ perf_pmu_migrate_context(&ecache_pmu->pmu, cpu, ecache_pmu->cpu);
+
+ return 0;
+}
+
+static int sifive_ecache_pmu_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct device_node *ecache_node = dev_of_node(dev);
+ struct sifive_ecache_pmu *ecache_pmu;
+ struct device_node *slice_node;
+ u32 slice_counters;
+ int n_slices, ret;
+ int i = 0;
+
+ n_slices = of_get_available_child_count(ecache_node);
+ if (!n_slices)
+ return -ENODEV;
+
+ ecache_pmu = devm_kzalloc(dev, struct_size(ecache_pmu, slice, n_slices), GFP_KERNEL);
+ if (!ecache_pmu)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, ecache_pmu);
+
+ ecache_pmu->pmu = (struct pmu) {
+ .parent = dev,
+ .attr_groups = sifive_ecache_pmu_attr_grps,
+ .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+ .task_ctx_nr = perf_invalid_context,
+ .event_init = sifive_ecache_pmu_event_init,
+ .add = sifive_ecache_pmu_add,
+ .del = sifive_ecache_pmu_del,
+ .start = sifive_ecache_pmu_start,
+ .stop = sifive_ecache_pmu_stop,
+ .read = sifive_ecache_pmu_read,
+ .start_txn = sifive_ecache_pmu_start_txn,
+ .commit_txn = sifive_ecache_pmu_commit_txn,
+ .cancel_txn = sifive_ecache_pmu_cancel_txn,
+ };
+ ecache_pmu->cpu = nr_cpu_ids;
+ ecache_pmu->n_counters = ECACHE_PMU_MAX_COUNTERS;
+ ecache_pmu->n_slices = n_slices;
+
+ for_each_available_child_of_node(ecache_node, slice_node) {
+ struct sifive_ecache_pmu_slice *slice = &ecache_pmu->slice[i++];
+
+ slice->base = devm_of_iomap(dev, slice_node, 0, NULL);
+ if (IS_ERR(slice->base))
+ return PTR_ERR(slice->base);
+
+ /* Get number of counters from slice node */
+ ret = of_property_read_u32(slice_node, "sifive,perfmon-counters", &slice_counters);
+ if (ret)
+ return dev_err_probe(dev, ret,
+ "Slice %pOF missing sifive,perfmon-counters property\n",
+ slice_node);
+
+ ecache_pmu->n_counters = min_t(u32, slice_counters, ecache_pmu->n_counters);
+ }
+
+ sifive_ecache_pmu_hw_init(ecache_pmu);
+
+ ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE, &ecache_pmu->node);
+ if (ret)
+ return dev_err_probe(dev, ret, "Failed to add CPU hotplug instance\n");
+
+ ret = perf_pmu_register(&ecache_pmu->pmu, "sifive_ecache_pmu", -1);
+ if (ret) {
+ dev_err_probe(dev, ret, "Failed to register PMU\n");
+ goto err_remove_instance;
+ }
+
+ return 0;
+
+err_remove_instance:
+ cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE, &ecache_pmu->node);
+
+ return ret;
+}
+
+static void sifive_ecache_pmu_remove(struct platform_device *pdev)
+{
+ struct sifive_ecache_pmu *ecache_pmu = platform_get_drvdata(pdev);
+
+ perf_pmu_unregister(&ecache_pmu->pmu);
+ cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE, &ecache_pmu->node);
+}
+
+static const struct of_device_id sifive_ecache_pmu_of_match[] = {
+ { .compatible = "sifive,extensiblecache0" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, sifive_ecache_pmu_of_match);
+
+static struct platform_driver sifive_ecache_pmu_driver = {
+ .probe = sifive_ecache_pmu_probe,
+ .remove_new = sifive_ecache_pmu_remove,
+ .driver = {
+ .name = "sifive_ecache_pmu",
+ .of_match_table = sifive_ecache_pmu_of_match,
+ },
+};
+
+static void __exit sifive_ecache_pmu_exit(void)
+{
+ platform_driver_unregister(&sifive_ecache_pmu_driver);
+ cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE);
+}
+module_exit(sifive_ecache_pmu_exit);
+
+static int __init sifive_ecache_pmu_init(void)
+{
+ int ret;
+
+ ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE,
+ "perf/sifive/ecache:online",
+ sifive_ecache_pmu_online_cpu,
+ sifive_ecache_pmu_offline_cpu);
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&sifive_ecache_pmu_driver);
+ if (ret)
+ goto err_remove_state;
+
+ return 0;
+
+err_remove_state:
+ cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE);
+
+ return ret;
+}
+module_init(sifive_ecache_pmu_init);
+
+MODULE_LICENSE("GPL");
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index be6361fdc8ba..55bd3a5e0033 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -231,6 +231,7 @@ enum cpuhp_state {
CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE,
CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
+ CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE,
CPUHP_AP_PERF_CSKY_ONLINE,
CPUHP_AP_WATCHDOG_ONLINE,
CPUHP_AP_WORKQUEUE_ONLINE,
--
2.43.0
From: Greentime Hu <[email protected]>
Add a driver for the PMU found in the SiFive Private L2 Cache
controller. This PMU provides a configurable number of counters and a
variety of events. Events are grouped into sets. Each counter can count
events from only one set at a time; however, it can count any number of
events within that set simultaneously. The PMU hardware does not provide
an overflow interrupt or a way to atomically control groups of counters.
A separate Private L2 Cache instance exists for each core, so this
driver supports per-core and per-task profiling.
Some events can be filtered further by client ID (e.g. CPU or external
DMA master). That functionality is not supported by this driver.
Example usage:
$ perf stat -e sifive_pl2_pmu/inner_get/,sifive_pl2_pmu/outer_get/ ls
Performance counter stats for 'ls':
95041 sifive_pl2_pmu/inner_get/
3 sifive_pl2_pmu/outer_get/
0.003971538 seconds time elapsed
0.000000000 seconds user
0.006315000 seconds sys
Example combining multiple events together within each counter:
$ perf stat -e sifive_pl2_pmu/event=0x301/, # inner_put_*_data
sifive_pl2_pmu/event=0x303/ ls # outer_put_*_data
Performance counter stats for 'ls':
6828 sifive_pl2_pmu/event=0x301/
11 sifive_pl2_pmu/event=0x303/
0.005696538 seconds time elapsed
0.000000000 seconds user
0.006337000 seconds sys
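For readers decoding raw values such as 0x301 and 0x303, the following
user-space sketch splits a config value into its set and event bitmask.
The struct and function names are hypothetical. The validity check is
modelled on the .event_init hook of the Extensible Cache driver in this
series, on the assumption that the PL2 driver applies the same rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a perf config value. */
struct pl2_event_sel {
	unsigned int set;	/* config bits 7:0 */
	uint64_t events;	/* config bits 63:8, one bit per event */
};

static struct pl2_event_sel pl2_decode(uint64_t config)
{
	struct pl2_event_sel sel = {
		.set = (unsigned int)(config & 0xff),
		.events = config >> 8,
	};

	return sel;
}

/* True if every selected event index is below n_events (n_events < 56). */
static bool pl2_events_valid(struct pl2_event_sel sel, unsigned int n_events)
{
	return sel.events < (1ULL << n_events);
}
```

Here 0x301 decodes to set 1 with events 0 and 1 selected, and 0x303
decodes to set 3 with the same two event indices.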
Signed-off-by: Greentime Hu <[email protected]>
Co-developed-by: Eric Lin <[email protected]>
Signed-off-by: Eric Lin <[email protected]>
Co-developed-by: Samuel Holland <[email protected]>
Signed-off-by: Samuel Holland <[email protected]>
---
Changes in v1:
- Add missing events to PL2 sets 2, 4, and 5
- Use event_base and config_base to precompute register addresses
- Check event validity earlier, in the .event_init hook
- Implement .filter for systems where only some CPUs have a PL2
- Only allocate percpu data when probing each PL2 instance
- Reference count the `struct pmu` to fix unbind/bind crashes
- Probe via DT since the PMU driver is now the only PL2 driver
- Allow the driver to be built as a module
drivers/perf/Kconfig | 10 +
drivers/perf/Makefile | 1 +
drivers/perf/sifive_pl2_pmu.c | 748 ++++++++++++++++++++++++++++++++++
3 files changed, 759 insertions(+)
create mode 100644 drivers/perf/sifive_pl2_pmu.c
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 8a3b2b88d8b5..bd5ebed8630b 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -174,6 +174,16 @@ config SIFIVE_ECACHE_PMU
for measuring whole-system L2/L3 cache performance using the perf
events subsystem.
+config SIFIVE_PL2_PMU
+ tristate "SiFive Private L2 Cache PMU"
+ depends on RISCV || COMPILE_TEST
+ depends on OF
+ help
+ Support for the Private L2 Cache performance monitoring unit (PMU) on
+ SiFive platforms. The Private L2 Cache PMU provides up to 64 counters
+ for measuring per-program or per-hart L2 cache performance using the
+ perf events subsystem.
+
config THUNDERX2_PMU
tristate "Cavium ThunderX2 SoC PMU UNCORE"
depends on ARCH_THUNDER2 || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index a51686b413f2..d5501196dcd8 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
obj-$(CONFIG_SIFIVE_CCACHE_PMU) += sifive_ccache_pmu.o
obj-$(CONFIG_SIFIVE_ECACHE_PMU) += sifive_ecache_pmu.o
+obj-$(CONFIG_SIFIVE_PL2_PMU) += sifive_pl2_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
diff --git a/drivers/perf/sifive_pl2_pmu.c b/drivers/perf/sifive_pl2_pmu.c
new file mode 100644
index 000000000000..d0bbac0dec06
--- /dev/null
+++ b/drivers/perf/sifive_pl2_pmu.c
@@ -0,0 +1,748 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SiFive Private L2 Cache PMU driver
+ *
+ * Copyright (C) 2018-2024 SiFive, Inc.
+ */
+
+#include <linux/cpu_pm.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/refcount.h>
+
+#define PL2_SELECT_OFFSET 0x2000
+#define PL2_CLIENT_FILTER_OFFSET 0x2800
+#define PL2_COUNTER_OFFSET 0x3000
+
+#define PL2_PMU_MAX_COUNTERS 64
+
+struct sifive_pl2_pmu_event {
+ void __iomem *base;
+ DECLARE_BITMAP(used_mask, PL2_PMU_MAX_COUNTERS);
+ unsigned int cpu;
+ int n_counters;
+ struct perf_event *events[] __counted_by(n_counters);
+};
+
+struct sifive_pl2_pmu {
+ struct pmu pmu;
+ struct notifier_block cpu_pm_nb;
+ refcount_t refcount;
+ struct sifive_pl2_pmu_event *__percpu *event;
+};
+
+#define to_pl2_pmu(p) (container_of(p, struct sifive_pl2_pmu, pmu))
+
+static DEFINE_MUTEX(g_mutex);
+static struct sifive_pl2_pmu *g_pl2_pmu;
+
+#ifndef readq
+static inline u64 readq(void __iomem *addr)
+{
+ return readl(addr) | (((u64)readl(addr + 4)) << 32);
+}
+#endif
+
+#ifndef writeq
+static inline void writeq(u64 v, void __iomem *addr)
+{
+ writel(lower_32_bits(v), addr);
+ writel(upper_32_bits(v), addr + 4);
+}
+#endif
+
+/*
+ * sysfs attributes
+ *
+ * We export:
+ * - events, used by perf user space and other tools to create events symbolically, e.g.:
+ * perf stat -a -e sifive_pl2_pmu/inner_put_partial_data_hit/ ls
+ * perf stat -a -e sifive_pl2_pmu/event=0x101/ ls
+ * - formats, used by perf user space and other tools to configure events
+ */
+
+/* events */
+static ssize_t sifive_pl2_pmu_event_show(struct device *dev, struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_attr *pmu_attr;
+
+ pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+ return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
+}
+
+#define SET_EVENT_SELECT(_event, _set) (BIT_ULL((_event) + 8) | (_set))
+#define PL2_PMU_EVENT_ATTR(_name, _event, _set) \
+ PMU_EVENT_ATTR_ID(_name, sifive_pl2_pmu_event_show, SET_EVENT_SELECT(_event, _set))
+
+enum pl2_pmu_event_set1 {
+ INNER_PUT_FULL_DATA = 0,
+ INNER_PUT_PARTIAL_DATA,
+ INNER_ATOMIC_DATA,
+ INNER_GET,
+ INNER_PREFETCH_READ,
+ INNER_PREFETCH_WRITE,
+ INNER_ACQUIRE_BLOCK_NTOB,
+ INNER_ACQUIRE_BLOCK_NTOT,
+ INNER_ACQUIRE_BLOCK_BTOT,
+ INNER_ACQUIRE_PERM_NTOT,
+ INNER_ACQUIRE_PERM_BTOT,
+ INNER_RELEASE_TTOB,
+ INNER_RELEASE_TTON,
+ INNER_RELEASE_BTON,
+ INNER_RELEASE_DATA_TTOB,
+ INNER_RELEASE_DATA_TTON,
+ INNER_RELEASE_DATA_BTON,
+ INNER_RELEASE_DATA_TTOT,
+ INNER_PROBE_BLOCK_TOT,
+ INNER_PROBE_BLOCK_TOB,
+ INNER_PROBE_BLOCK_TON,
+ INNER_PROBE_PERM_TON,
+ INNER_PROBE_ACK_TTOB,
+ INNER_PROBE_ACK_TTON,
+ INNER_PROBE_ACK_BTON,
+ INNER_PROBE_ACK_TTOT,
+ INNER_PROBE_ACK_BTOB,
+ INNER_PROBE_ACK_NTON,
+ INNER_PROBE_ACK_DATA_TTOB,
+ INNER_PROBE_ACK_DATA_TTON,
+ INNER_PROBE_ACK_DATA_TTOT,
+ PL2_PMU_MAX_EVENT1_IDX
+};
+
+enum pl2_pmu_event_set2 {
+ INNER_PUT_FULL_DATA_HIT = 0,
+ INNER_PUT_PARTIAL_DATA_HIT,
+ INNER_ATOMIC_DATA_HIT,
+ INNER_GET_HIT,
+ INNER_PREFETCH_READ_HIT,
+ INNER_ACQUIRE_BLOCK_NTOB_HIT,
+ INNER_ACQUIRE_PERM_NTOT_HIT,
+ INNER_RELEASE_TTOB_HIT,
+ INNER_RELEASE_DATA_TTOB_HIT,
+ OUTER_PROBE_BLOCK_TOT_HIT,
+ INNER_PUT_FULL_DATA_HIT_SHARED,
+ INNER_PUT_PARTIAL_DATA_HIT_SHARED,
+ INNER_ATOMIC_DATA_HIT_SHARED,
+ INNER_GET_HIT_SHARED,
+ INNER_PREFETCH_READ_HIT_SHARED,
+ INNER_ACQUIRE_BLOCK_NTOB_HIT_SHARED,
+ INNER_ACQUIRE_PERM_NTOT_HIT_SHARED,
+ INNER_RELEASE_TTOB_HIT_SHARED,
+ INNER_RELEASE_DATA_TTOB_HIT_SHARED,
+ OUTER_PROBE_BLOCK_TOT_HIT_SHARED,
+ OUTER_PROBE_BLOCK_TOT_HIT_DIRTY,
+ PL2_PMU_MAX_EVENT2_IDX
+};
+
+enum pl2_pmu_event_set3 {
+ OUTER_PUT_FULL_DATA = 0,
+ OUTER_PUT_PARTIAL_DATA,
+ OUTER_ATOMIC_DATA,
+ OUTER_GET,
+ OUTER_PREFETCH_READ,
+ OUTER_PREFETCH_WRITE,
+ OUTER_ACQUIRE_BLOCK_NTOB,
+ OUTER_ACQUIRE_BLOCK_NTOT,
+ OUTER_ACQUIRE_BLOCK_BTOT,
+ OUTER_ACQUIRE_PERM_NTOT,
+ OUTER_ACQUIRE_PERM_BTOT,
+ OUTER_RELEARE_TTOB,
+ OUTER_RELEARE_TTON,
+ OUTER_RELEARE_BTON,
+ OUTER_RELEARE_DATA_TTOB,
+ OUTER_RELEARE_DATA_TTON,
+ OUTER_RELEARE_DATA_BTON,
+ OUTER_RELEARE_DATA_TTOT,
+ OUTER_PROBE_BLOCK_TOT,
+ OUTER_PROBE_BLOCK_TOB,
+ OUTER_PROBE_BLOCK_TON,
+ OUTER_PROBE_PERM_TON,
+ OUTER_PROBE_ACK_TTOB,
+ OUTER_PROBE_ACK_TTON,
+ OUTER_PROBE_ACK_BTON,
+ OUTER_PROBE_ACK_TTOT,
+ OUTER_PROBE_ACK_BTOB,
+ OUTER_PROBE_ACK_NTON,
+ OUTER_PROBE_ACK_DATA_TTOB,
+ OUTER_PROBE_ACK_DATA_TTON,
+ OUTER_PROBE_ACK_DATA_TTOT,
+ PL2_PMU_MAX_EVENT3_IDX
+};
+
+enum pl2_pmu_event_set4 {
+ INNER_HINT_HITS_MSHR = 0,
+ INNER_READ_HITS_MSHR,
+ INNER_WRITE_HITS_MSHR,
+ INNER_READ_REPLAY,
+ INNER_WRITE_REPLAY,
+ OUTER_PROBE_REPLAY,
+ REPLAY,
+ SLEEP_BY_MISS_QUEUE,
+ SLEEP_BY_EVICT_QUEUE,
+ SLEEP_FOR_BACK_PROBE,
+ SLEEP,
+ PL2_PMU_MAX_EVENT4_IDX
+};
+
+enum pl2_pmu_event_set5 {
+ READ_SLEEP_TIMER_EXPIRE = 0,
+ READ_OLDEST_TIMER_EXPIRE,
+ WRITE_SLEEP_TIMER_EXPIRE,
+ WRITE_OLDEST_TIMER_EXPIRE,
+ READ_SLEEP,
+ READ_DIR_UPDATE_WAKEUP,
+ READ_MISS_QUEUE_WAKEUP,
+ READ_EVICT_QUEUE_WAKEUP,
+ READ_SLEEP_TIMER_WAKEUP,
+ WRITE_SLEEP,
+ WRITE_DIR_UPDATE_WAKEUP,
+ WRITE_MISS_QUEUE_WAKEUP,
+ WRITE_EVICT_QUEUE_WAKEUP,
+ WRITE_SLEEP_TIMER_WAKEUP,
+ PL2_PMU_MAX_EVENT5_IDX
+};
+
+static struct attribute *sifive_pl2_pmu_events[] = {
+ PL2_PMU_EVENT_ATTR(inner_put_full_data, INNER_PUT_FULL_DATA, 1),
+ PL2_PMU_EVENT_ATTR(inner_put_partial_data, INNER_PUT_PARTIAL_DATA, 1),
+ PL2_PMU_EVENT_ATTR(inner_atomic_data, INNER_ATOMIC_DATA, 1),
+ PL2_PMU_EVENT_ATTR(inner_get, INNER_GET, 1),
+ PL2_PMU_EVENT_ATTR(inner_prefetch_read, INNER_PREFETCH_READ, 1),
+ PL2_PMU_EVENT_ATTR(inner_prefetch_write, INNER_PREFETCH_WRITE, 1),
+ PL2_PMU_EVENT_ATTR(inner_acquire_block_ntob, INNER_ACQUIRE_BLOCK_NTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_acquire_block_ntot, INNER_ACQUIRE_BLOCK_NTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_acquire_block_btot, INNER_ACQUIRE_BLOCK_BTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_acquire_perm_ntot, INNER_ACQUIRE_PERM_NTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_acquire_perm_btot, INNER_ACQUIRE_PERM_BTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_ttob, INNER_RELEASE_TTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_tton, INNER_RELEASE_TTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_bton, INNER_RELEASE_BTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_data_ttob, INNER_RELEASE_DATA_TTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_data_tton, INNER_RELEASE_DATA_TTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_data_bton, INNER_RELEASE_DATA_BTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_release_data_ttot, INNER_RELEASE_DATA_TTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_block_tot, INNER_PROBE_BLOCK_TOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_block_tob, INNER_PROBE_BLOCK_TOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_block_ton, INNER_PROBE_BLOCK_TON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_perm_ton, INNER_PROBE_PERM_TON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_ttob, INNER_PROBE_ACK_TTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_tton, INNER_PROBE_ACK_TTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_bton, INNER_PROBE_ACK_BTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_ttot, INNER_PROBE_ACK_TTOT, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_btob, INNER_PROBE_ACK_BTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_nton, INNER_PROBE_ACK_NTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_data_ttob, INNER_PROBE_ACK_DATA_TTOB, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_data_tton, INNER_PROBE_ACK_DATA_TTON, 1),
+ PL2_PMU_EVENT_ATTR(inner_probe_ack_data_ttot, INNER_PROBE_ACK_DATA_TTOT, 1),
+
+ PL2_PMU_EVENT_ATTR(inner_put_full_data_hit, INNER_PUT_FULL_DATA_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_put_partial_data_hit, INNER_PUT_PARTIAL_DATA_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_atomic_data_hit, INNER_ATOMIC_DATA_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_get_hit, INNER_GET_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_prefetch_read_hit, INNER_PREFETCH_READ_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_acquire_block_ntob_hit, INNER_ACQUIRE_BLOCK_NTOB_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_acquire_perm_ntot_hit, INNER_ACQUIRE_PERM_NTOT_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_release_ttob_hit, INNER_RELEASE_TTOB_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_release_data_ttob_hit, INNER_RELEASE_DATA_TTOB_HIT, 2),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_tot_hit, OUTER_PROBE_BLOCK_TOT_HIT, 2),
+ PL2_PMU_EVENT_ATTR(inner_put_full_data_hit_shared, INNER_PUT_FULL_DATA_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_put_partial_data_hit_shared, INNER_PUT_PARTIAL_DATA_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_atomic_data_hit_shared, INNER_ATOMIC_DATA_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_get_hit_shared, INNER_GET_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_prefetch_read_hit_shared, INNER_PREFETCH_READ_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_acquire_block_ntob_hit_shared,
+ INNER_ACQUIRE_BLOCK_NTOB_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_acquire_perm_ntot_hit_shared,
+ INNER_ACQUIRE_PERM_NTOT_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_release_ttob_hit_shared, INNER_RELEASE_TTOB_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(inner_release_data_ttob_hit_shared,
+ INNER_RELEASE_DATA_TTOB_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_tot_hit_shared, OUTER_PROBE_BLOCK_TOT_HIT_SHARED, 2),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_tot_hit_dirty, OUTER_PROBE_BLOCK_TOT_HIT_DIRTY, 2),
+
+ PL2_PMU_EVENT_ATTR(outer_put_full_data, OUTER_PUT_FULL_DATA, 3),
+ PL2_PMU_EVENT_ATTR(outer_put_partial_data, OUTER_PUT_PARTIAL_DATA, 3),
+ PL2_PMU_EVENT_ATTR(outer_atomic_data, OUTER_ATOMIC_DATA, 3),
+ PL2_PMU_EVENT_ATTR(outer_get, OUTER_GET, 3),
+ PL2_PMU_EVENT_ATTR(outer_prefetch_read, OUTER_PREFETCH_READ, 3),
+ PL2_PMU_EVENT_ATTR(outer_prefetch_write, OUTER_PREFETCH_WRITE, 3),
+ PL2_PMU_EVENT_ATTR(outer_acquire_block_ntob, OUTER_ACQUIRE_BLOCK_NTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_acquire_block_ntot, OUTER_ACQUIRE_BLOCK_NTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_acquire_block_btot, OUTER_ACQUIRE_BLOCK_BTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_acquire_perm_ntot, OUTER_ACQUIRE_PERM_NTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_acquire_perm_btot, OUTER_ACQUIRE_PERM_BTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_ttob, OUTER_RELEARE_TTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_tton, OUTER_RELEARE_TTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_bton, OUTER_RELEARE_BTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_data_ttob, OUTER_RELEARE_DATA_TTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_data_tton, OUTER_RELEARE_DATA_TTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_data_bton, OUTER_RELEARE_DATA_BTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_release_data_ttot, OUTER_RELEARE_DATA_TTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_tot, OUTER_PROBE_BLOCK_TOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_tob, OUTER_PROBE_BLOCK_TOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_block_ton, OUTER_PROBE_BLOCK_TON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_perm_ton, OUTER_PROBE_PERM_TON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_ttob, OUTER_PROBE_ACK_TTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_tton, OUTER_PROBE_ACK_TTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_bton, OUTER_PROBE_ACK_BTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_ttot, OUTER_PROBE_ACK_TTOT, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_btob, OUTER_PROBE_ACK_BTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_nton, OUTER_PROBE_ACK_NTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_data_ttob, OUTER_PROBE_ACK_DATA_TTOB, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_data_tton, OUTER_PROBE_ACK_DATA_TTON, 3),
+ PL2_PMU_EVENT_ATTR(outer_probe_ack_data_ttot, OUTER_PROBE_ACK_DATA_TTOT, 3),
+
+ PL2_PMU_EVENT_ATTR(inner_hint_hits_mshr, INNER_HINT_HITS_MSHR, 4),
+ PL2_PMU_EVENT_ATTR(inner_read_hits_mshr, INNER_READ_HITS_MSHR, 4),
+ PL2_PMU_EVENT_ATTR(inner_write_hits_mshr, INNER_WRITE_HITS_MSHR, 4),
+ PL2_PMU_EVENT_ATTR(inner_read_replay, INNER_READ_REPLAY, 4),
+ PL2_PMU_EVENT_ATTR(inner_write_replay, INNER_WRITE_REPLAY, 4),
+ PL2_PMU_EVENT_ATTR(outer_probe_replay, OUTER_PROBE_REPLAY, 4),
+ PL2_PMU_EVENT_ATTR(replay, REPLAY, 4),
+ PL2_PMU_EVENT_ATTR(sleep_by_miss_queue, SLEEP_BY_MISS_QUEUE, 4),
+ PL2_PMU_EVENT_ATTR(sleep_by_evict_queue, SLEEP_BY_EVICT_QUEUE, 4),
+ PL2_PMU_EVENT_ATTR(sleep_for_back_probe, SLEEP_FOR_BACK_PROBE, 4),
+ PL2_PMU_EVENT_ATTR(sleep, SLEEP, 4),
+
+ PL2_PMU_EVENT_ATTR(read_sleep_timer_expire, READ_SLEEP_TIMER_EXPIRE, 5),
+ PL2_PMU_EVENT_ATTR(read_oldest_timer_expire, READ_OLDEST_TIMER_EXPIRE, 5),
+ PL2_PMU_EVENT_ATTR(write_sleep_timer_expire, WRITE_SLEEP_TIMER_EXPIRE, 5),
+ PL2_PMU_EVENT_ATTR(write_oldest_timer_expire, WRITE_OLDEST_TIMER_EXPIRE, 5),
+ PL2_PMU_EVENT_ATTR(read_sleep, READ_SLEEP, 5),
+ PL2_PMU_EVENT_ATTR(read_dir_update_wakeup, READ_DIR_UPDATE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(read_miss_queue_wakeup, READ_MISS_QUEUE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(read_evict_queue_wakeup, READ_EVICT_QUEUE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(read_sleep_timer_wakeup, READ_SLEEP_TIMER_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(write_sleep, WRITE_SLEEP, 5),
+ PL2_PMU_EVENT_ATTR(write_dir_update_wakeup, WRITE_DIR_UPDATE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(write_miss_queue_wakeup, WRITE_MISS_QUEUE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(write_evict_queue_wakeup, WRITE_EVICT_QUEUE_WAKEUP, 5),
+ PL2_PMU_EVENT_ATTR(write_sleep_timer_wakeup, WRITE_SLEEP_TIMER_WAKEUP, 5),
+ NULL
+};
+
+static struct attribute_group sifive_pl2_pmu_events_group = {
+ .name = "events",
+ .attrs = sifive_pl2_pmu_events,
+};
+
+/* formats */
+PMU_FORMAT_ATTR(event, "config:0-63");
+
+static struct attribute *sifive_pl2_pmu_formats[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static struct attribute_group sifive_pl2_pmu_format_group = {
+ .name = "format",
+ .attrs = sifive_pl2_pmu_formats,
+};
+
+/*
+ * Per PMU device attribute groups
+ */
+
+static const struct attribute_group *sifive_pl2_pmu_attr_grps[] = {
+ &sifive_pl2_pmu_events_group,
+ &sifive_pl2_pmu_format_group,
+ NULL,
+};
+
+/*
+ * Event Initialization
+ */
+
+static int sifive_pl2_pmu_event_init(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ u64 config = event->attr.config;
+ u64 ev_type = config >> 8;
+ u64 set = config & 0xff;
+
+ /* Check if this is a valid set and event */
+ switch (set) {
+ case 1:
+ if (ev_type >= BIT_ULL(PL2_PMU_MAX_EVENT1_IDX))
+ return -ENOENT;
+ break;
+ case 2:
+ if (ev_type >= BIT_ULL(PL2_PMU_MAX_EVENT2_IDX))
+ return -ENOENT;
+ break;
+ case 3:
+ if (ev_type >= BIT_ULL(PL2_PMU_MAX_EVENT3_IDX))
+ return -ENOENT;
+ break;
+ case 4:
+ if (ev_type >= BIT_ULL(PL2_PMU_MAX_EVENT4_IDX))
+ return -ENOENT;
+ break;
+ case 5:
+ if (ev_type >= BIT_ULL(PL2_PMU_MAX_EVENT5_IDX))
+ return -ENOENT;
+ break;
+ default:
+ return -ENOENT;
+ }
+
+ /* Do not allocate the hardware counter yet */
+ hwc->idx = -1;
+ hwc->config = config;
+
+ return 0;
+}
+
+/*
+ * pmu->read: read and update the counter
+ */
+static void sifive_pl2_pmu_read(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ u64 prev_raw_count, new_raw_count;
+ u64 oldval;
+
+ do {
+ prev_raw_count = local64_read(&hwc->prev_count);
+ new_raw_count = readq((void __iomem *)hwc->event_base);
+
+ oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count, new_raw_count);
+ } while (oldval != prev_raw_count);
+
+ local64_add(new_raw_count - prev_raw_count, &event->count);
+}
+
+/*
+ * State transition functions:
+ *
+ * start()/stop() & add()/del()
+ */
+
+/*
+ * pmu->start: start the event
+ */
+static void sifive_pl2_pmu_start(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
+ return;
+
+ hwc->state = 0;
+
+ /* Set initial value to 0 */
+ local64_set(&hwc->prev_count, 0);
+ writeq(0, (void __iomem *)hwc->event_base);
+
+ /* Enable this counter to count events */
+ writeq(hwc->config, (void __iomem *)hwc->config_base);
+}
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void sifive_pl2_pmu_stop(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->state & PERF_HES_STOPPED)
+ return;
+
+ /* Stop this counter from counting events */
+ writeq(0, (void __iomem *)hwc->config_base);
+ sifive_pl2_pmu_read(event);
+
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+/*
+ * pmu->add: add the event to the PMU
+ */
+static int sifive_pl2_pmu_add(struct perf_event *event, int flags)
+{
+ struct sifive_pl2_pmu *pl2_pmu = to_pl2_pmu(event->pmu);
+ struct sifive_pl2_pmu_event *ptr = *this_cpu_ptr(pl2_pmu->event);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx;
+
+ /* Find an available counter idx to use for this event */
+ do {
+ idx = find_first_zero_bit(ptr->used_mask, ptr->n_counters);
+ if (idx >= ptr->n_counters)
+ return -EAGAIN;
+ } while (test_and_set_bit(idx, ptr->used_mask));
+
+ hwc->config_base = (unsigned long)ptr->base + PL2_SELECT_OFFSET + 8 * idx;
+ hwc->event_base = (unsigned long)ptr->base + PL2_COUNTER_OFFSET + 8 * idx;
+ hwc->idx = idx;
+ hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+ ptr->events[idx] = event;
+
+ if (flags & PERF_EF_START)
+ sifive_pl2_pmu_start(event, PERF_EF_RELOAD);
+
+ perf_event_update_userpage(event);
+
+ return 0;
+}
+
+/*
+ * pmu->del: delete the event from the PMU
+ */
+static void sifive_pl2_pmu_del(struct perf_event *event, int flags)
+{
+ struct sifive_pl2_pmu *pl2_pmu = to_pl2_pmu(event->pmu);
+ struct sifive_pl2_pmu_event *ptr = *this_cpu_ptr(pl2_pmu->event);
+ struct hw_perf_event *hwc = &event->hw;
+ int idx = hwc->idx;
+
+ /* Stop and release this counter */
+ sifive_pl2_pmu_stop(event, PERF_EF_UPDATE);
+
+ ptr->events[idx] = NULL;
+ clear_bit(idx, ptr->used_mask);
+
+ perf_event_update_userpage(event);
+}
+
+/*
+ * pmu->filter: check if the PMU can be used with a CPU
+ */
+static bool sifive_pl2_pmu_filter(struct pmu *pmu, int cpu)
+{
+ struct sifive_pl2_pmu *pl2_pmu = to_pl2_pmu(pmu);
+ struct sifive_pl2_pmu_event *ptr = *per_cpu_ptr(pl2_pmu->event, cpu);
+
+ /* Filter out CPUs with no PL2 instance (no percpu data allocated) */
+ return !ptr;
+}
+
+/*
+ * Driver initialization
+ */
+
+static void sifive_pl2_pmu_hw_init(const struct sifive_pl2_pmu_event *ptr)
+{
+ /* Disable the client filter (not supported by this driver) */
+ writeq(0, ptr->base + PL2_CLIENT_FILTER_OFFSET);
+}
+
+static int sifive_pl2_pmu_pm_notify(struct notifier_block *nb, unsigned long cmd, void *v)
+{
+ struct sifive_pl2_pmu *pl2_pmu = container_of(nb, struct sifive_pl2_pmu, cpu_pm_nb);
+ struct sifive_pl2_pmu_event *ptr = *this_cpu_ptr(pl2_pmu->event);
+ struct perf_event *event;
+
+ if (!ptr || bitmap_empty(ptr->used_mask, PL2_PMU_MAX_COUNTERS))
+ return NOTIFY_OK;
+
+ for (int idx = 0; idx < ptr->n_counters; idx++) {
+ event = ptr->events[idx];
+ if (!event)
+ continue;
+
+ switch (cmd) {
+ case CPU_PM_ENTER:
+ /* Stop and update the counter */
+ sifive_pl2_pmu_stop(event, PERF_EF_UPDATE);
+ break;
+ case CPU_PM_ENTER_FAILED:
+ case CPU_PM_EXIT:
+ /* Restore and enable the counter */
+ sifive_pl2_pmu_start(event, PERF_EF_RELOAD);
+ break;
+ default:
+ break;
+ }
+ }
+
+ return NOTIFY_OK;
+}
+
+static int sifive_pl2_pmu_pm_register(struct sifive_pl2_pmu *pl2_pmu)
+{
+ if (!IS_ENABLED(CONFIG_CPU_PM))
+ return 0;
+
+ pl2_pmu->cpu_pm_nb.notifier_call = sifive_pl2_pmu_pm_notify;
+ return cpu_pm_register_notifier(&pl2_pmu->cpu_pm_nb);
+}
+
+static void sifive_pl2_pmu_pm_unregister(struct sifive_pl2_pmu *pl2_pmu)
+{
+ if (!IS_ENABLED(CONFIG_CPU_PM))
+ return;
+
+ cpu_pm_unregister_notifier(&pl2_pmu->cpu_pm_nb);
+}
+
+static struct sifive_pl2_pmu *sifive_pl2_pmu_get(void)
+{
+ struct sifive_pl2_pmu *pl2_pmu;
+ int ret;
+
+ guard(mutex)(&g_mutex);
+
+ pl2_pmu = g_pl2_pmu;
+ if (pl2_pmu) {
+ refcount_inc(&pl2_pmu->refcount);
+ return pl2_pmu;
+ }
+
+ pl2_pmu = kzalloc(sizeof(*pl2_pmu), GFP_KERNEL);
+ if (!pl2_pmu)
+ return ERR_PTR(-ENOMEM);
+
+ pl2_pmu->pmu = (struct pmu) {
+ .attr_groups = sifive_pl2_pmu_attr_grps,
+ .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+ .task_ctx_nr = perf_sw_context,
+ .event_init = sifive_pl2_pmu_event_init,
+ .add = sifive_pl2_pmu_add,
+ .del = sifive_pl2_pmu_del,
+ .start = sifive_pl2_pmu_start,
+ .stop = sifive_pl2_pmu_stop,
+ .read = sifive_pl2_pmu_read,
+ .filter = sifive_pl2_pmu_filter,
+ };
+
+ refcount_set(&pl2_pmu->refcount, 1);
+
+ pl2_pmu->event = alloc_percpu(typeof(*pl2_pmu->event));
+ if (!pl2_pmu->event) {
+ ret = -ENOMEM;
+ goto err_free;
+ }
+
+ ret = sifive_pl2_pmu_pm_register(pl2_pmu);
+ if (ret)
+ goto err_free_percpu;
+
+ ret = perf_pmu_register(&pl2_pmu->pmu, "sifive_pl2_pmu", -1);
+ if (ret) {
+ pr_err("%s: Failed to register PMU: %d\n", __func__, ret);
+ goto err_unregister_pm;
+ }
+
+ g_pl2_pmu = pl2_pmu;
+
+ return pl2_pmu;
+
+err_unregister_pm:
+ sifive_pl2_pmu_pm_unregister(pl2_pmu);
+err_free_percpu:
+ free_percpu(pl2_pmu->event);
+err_free:
+ kfree(pl2_pmu);
+
+ return ERR_PTR(ret);
+}
+
+static void sifive_pl2_pmu_put(void)
+{
+ struct sifive_pl2_pmu *pl2_pmu;
+
+ guard(mutex)(&g_mutex);
+
+ pl2_pmu = g_pl2_pmu;
+ if (!refcount_dec_and_test(&pl2_pmu->refcount))
+ return;
+
+ g_pl2_pmu = NULL;
+ perf_pmu_unregister(&pl2_pmu->pmu);
+ sifive_pl2_pmu_pm_unregister(pl2_pmu);
+ free_percpu(pl2_pmu->event);
+ kfree(pl2_pmu);
+}
+
+static int sifive_pl2_pmu_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct device_node *np = dev_of_node(dev);
+ struct sifive_pl2_pmu_event *ptr;
+ struct sifive_pl2_pmu *pl2_pmu;
+ unsigned int cpu;
+ u32 n_counters;
+ int ret;
+
+ /* Instances without a sifive,perfmon-counters property do not contain a PMU */
+ ret = of_property_read_u32(np, "sifive,perfmon-counters", &n_counters);
+ if (ret || !n_counters)
+ return -ENODEV;
+
+ /* Determine the CPU affinity of this PL2 instance */
+ for_each_possible_cpu(cpu) {
+ struct device_node *cache_node, *cpu_node;
+
+ cpu_node = of_cpu_device_node_get(cpu);
+ if (!cpu_node)
+ continue;
+
+ cache_node = of_parse_phandle(cpu_node, "next-level-cache", 0);
+ of_node_put(cpu_node);
+ if (!cache_node)
+ continue;
+
+ of_node_put(cache_node);
+ if (cache_node == np)
+ break;
+ }
+ if (cpu >= nr_cpu_ids)
+ return -ENODEV;
+
+ ptr = devm_kzalloc(dev, struct_size(ptr, events, n_counters), GFP_KERNEL);
+ if (!ptr)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, ptr);
+
+ ptr->cpu = cpu;
+ ptr->n_counters = n_counters;
+
+ ptr->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(ptr->base))
+ return PTR_ERR(ptr->base);
+
+ sifive_pl2_pmu_hw_init(ptr);
+
+ pl2_pmu = sifive_pl2_pmu_get();
+ if (IS_ERR(pl2_pmu))
+ return PTR_ERR(pl2_pmu);
+
+ *per_cpu_ptr(pl2_pmu->event, cpu) = ptr;
+
+ return 0;
+}
+
+static void sifive_pl2_pmu_remove(struct platform_device *pdev)
+{
+ struct sifive_pl2_pmu_event *ptr = platform_get_drvdata(pdev);
+
+ *per_cpu_ptr(g_pl2_pmu->event, ptr->cpu) = NULL;
+ sifive_pl2_pmu_put();
+}
+
+static const struct of_device_id sifive_pl2_pmu_of_match[] = {
+ { .compatible = "sifive,pl2cache1" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, sifive_pl2_pmu_of_match);
+
+static struct platform_driver sifive_pl2_pmu_driver = {
+ .probe = sifive_pl2_pmu_probe,
+ .remove_new = sifive_pl2_pmu_remove,
+ .driver = {
+ .name = "sifive_pl2_pmu",
+ .of_match_table = sifive_pl2_pmu_of_match,
+ },
+};
+module_platform_driver(sifive_pl2_pmu_driver);
+
+MODULE_LICENSE("GPL");
--
2.43.0
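
For readers following the thread, the selector layout that sifive_pl2_pmu_event_init() checks can be sketched in plain userspace C. This is only an illustration under stated assumptions: the helper names pl2_event_config() and pl2_config_valid() are made up here, the PL2_PMU_EVENT_ATTR macro is not shown in this part of the patch, and the encoding is inferred from the decode in .event_init (set number in bits 7:0, a one-bit-per-event mask above that). The per-set sizes are counted from the enums in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of events in each set, counted from the PL2_PMU_MAX_EVENT<n>_IDX
 * enum terminators in the patch. Index 0 is unused (there is no set 0).
 */
static const unsigned int pl2_set_sizes[] = { 0, 31, 21, 31, 11, 14 };

/* Hypothetical helper: build a selector for one event in one set. */
static uint64_t pl2_event_config(unsigned int set, unsigned int event_idx)
{
	return ((uint64_t)1 << (event_idx + 8)) | (set & 0xff);
}

/* Hypothetical helper: same validity rule as the switch in .event_init. */
static bool pl2_config_valid(uint64_t config)
{
	uint64_t ev_mask = config >> 8;
	uint64_t set = config & 0xff;

	if (set < 1 || set > 5)
		return false;

	/* Every selected event bit must lie below the set's event count. */
	return ev_mask < ((uint64_t)1 << pl2_set_sizes[set]);
}
```

Under these assumptions, outer_probe_replay (set 4, index 5) would correspond to event=0x2004, and OR-ing two selectors from the same set yields a counter that counts both events.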
On Thu, Feb 15, 2024 at 04:08:12PM -0800, Samuel Holland wrote:
> All three of these cache controllers (with PMUs) have been integrated in
> SoCs by our customers. However, as none of those SoCs have been publicly
> announced yet, I cannot include SoC-specific compatible strings in this
> version of the devicetree bindings.
And I don't want to apply any of those dt-binding patches until then.
Stuff like "sifive,perfmon-counters" seems like a property that would
go away with a device-specific compatible, at least for the ccache.
On 16/02/2024 01:08, Samuel Holland wrote:
> The SiFive Composable Cache controller contains an optional PMU with a
> configurable number of event counters. Document a property which
Configurable in what context? By chip designers or by the OS? Why can't
this be deduced from the compatible?
> describes the number of available counters.
>
> Signed-off-by: Samuel Holland <[email protected]>
> ---
>
> Documentation/devicetree/bindings/cache/sifive,ccache0.yaml | 5 +++++
> 1 file changed, 5 insertions(+)
>
Best regards,
Krzysztof
On 16/02/2024 01:08, Samuel Holland wrote:
> From: Eric Lin <[email protected]>
>
> Add YAML DT binding documentation for the SiFive Extensible Cache
> controller. The Extensible Cache controller interleaves cache blocks
> across a number of heterogeneous independently-programmed slices. Each
> slice contains an MMIO interface for configuration, cache maintenance,
> error reporting, and performance monitoring.
>
> +allOf:
> + - $ref: /schemas/cache-controller.yaml#
> +
> +select:
> + properties:
> + compatible:
> + contains:
> + enum:
> + - sifive,extensiblecache0
> +
> + required:
> + - compatible
> +
> +properties:
> + compatible:
> + items:
> + - const: sifive,extensiblecache0
> + - const: cache
> +
> + "#address-cells": true
const or enum: [1, 2], depending on the addressing you need here.
> + "#size-cells": true
ditto
> + ranges: true
> +
> + interrupts:
> + maxItems: 1
> +
> + cache-block-size:
> + const: 64
> +
> + cache-level: true
Is 5 acceptable? I would argue this should even be const.
> + cache-sets: true
> + cache-size: true
Some constraints on any of these?
> + cache-unified: true
> +
> +patternProperties:
> + "^cache-controller@[0-9a-f]+$":
> + type: object
> + additionalProperties: false
What is this object supposed to represent? Add description.
> + properties:
> + reg:
> + maxItems: 1
> +
> + cache-block-size:
> + const: 64
> +
> + cache-sets: true
> + cache-size: true
> + cache-unified: true
cache-level
> +
> + sifive,bm-event-counters:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 0
> + description: Number of bucket monitor registers in this slice
> +
> + sifive,cache-ways:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: Number of ways in this slice (independent of cache size)
> +
> + sifive,perfmon-counters:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 0
> + description: Number of PMU counter registers in this slice
> +
> + required:
> + - reg
> + - cache-block-size
> + - cache-sets
> + - cache-size
> + - cache-unified
> + - sifive,cache-ways
> +
> +required:
> + - compatible
> + - ranges
> + - interrupts
> + - cache-block-size
> + - cache-level
> + - cache-sets
> + - cache-size
> + - cache-unified
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + cache-controller@30040000 {
> + compatible = "sifive,extensiblecache0", "cache";
> + ranges = <0x30040000 0x30040000 0x10000>;
> + interrupts = <0x4>;
Do you use hex for interrupt numbers on your platforms?
> + cache-block-size = <0x40>;
> + cache-level = <3>;
> + cache-sets = <0x800>;
Best regards,
Krzysztof
On 16/02/2024 01:08, Samuel Holland wrote:
> From: Eric Lin <[email protected]>
>
> Add YAML DT binding documentation for the SiFive Private L2 Cache
> controller. Some functionality and the corresponding register bits were
> removed in the sifive,pl2cache1 version of the hardware, which creates
> the unusual situation where the newer hardware's compatible string is
> the fallback for the older one.
>
> Signed-off-by: Eric Lin <[email protected]>
> Co-developed-by: Samuel Holland <[email protected]>
> Signed-off-by: Samuel Holland <[email protected]>
> ---
>
> Changes in v1:
> - Add back select: clause to binding
> - Make sifive,pl2cache1 the fallback for sifive,pl2cache0
> - Fix the order of the reg property declaration
> - Document the sifive,perfmon-counters property
This is not v1. Please implement all the feedback from the previous v2, v3,
or whatever it was, and either reference the old posting or continue the
numbering.
Best regards,
Krzysztof
Hi Krzysztof,
On 2024-02-17 3:00 AM, Krzysztof Kozlowski wrote:
> On 16/02/2024 01:08, Samuel Holland wrote:
>> The SiFive Composable Cache controller contains an optional PMU with a
>> configurable number of event counters. Document a property which
>
> Configurable in what context? By chip designers or by OS? Why this
> cannot be deduced from the compatible?
This parameter is configurable by the chip designers.
The information certainly can be deduced from the SoC-specific compatible
string, but doing so makes the driver only work on that specific list of SoCs.
When provided via a property, the driver can work without changes on any SoC
that uses this IP block. (None of the SoCs currently listed in the binding
contain a PMU, so there is no backward compatibility concern with adding the new
property.)
My understanding of the purpose of the SoC-specific compatible string is to
handle eventualities (silicon bugs, integration quirks, etc.), not to
intentionally limit the driver to a narrow list of hardware.
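
To make the trade-off concrete, here is a rough userspace sketch of the two lookup strategies (not driver code). Everything in it is illustrative: the compatible strings and counter counts in soc_table are made up, and counters_from_compatible() and counters_from_property() are hypothetical helpers, not kernel APIs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-SoC match data; every new SoC needs a new entry here. */
struct soc_pmu_data {
	const char *compatible;
	unsigned int n_counters;
};

static const struct soc_pmu_data soc_table[] = {
	{ "vendor-a,soc1-ccache", 6 },	/* made-up SoC and counter count */
	{ "vendor-b,soc2-ccache", 8 },	/* made-up SoC and counter count */
};

/* Compatible-based lookup: returns 0 (no PMU) for any SoC not in the table. */
static unsigned int counters_from_compatible(const char *compatible)
{
	for (size_t i = 0; i < sizeof(soc_table) / sizeof(soc_table[0]); i++) {
		if (strcmp(soc_table[i].compatible, compatible) == 0)
			return soc_table[i].n_counters;
	}
	return 0;
}

/* Property-based lookup: prop is NULL when the property is absent. */
static unsigned int counters_from_property(const unsigned int *prop)
{
	return prop ? *prop : 0;
}
```

With the table, an SoC that is not listed gets no PMU until the driver is updated; with the property, the same driver binary picks up whatever count the devicetree provides.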
Regards,
Samuel
>> describes the number of available counters.
>>
>> Signed-off-by: Samuel Holland <[email protected]>
>> ---
>>
>> Documentation/devicetree/bindings/cache/sifive,ccache0.yaml | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
Hi Krzysztof,
On 2024-02-17 3:12 AM, Krzysztof Kozlowski wrote:
> On 16/02/2024 01:08, Samuel Holland wrote:
>> From: Eric Lin <[email protected]>
>>
>> Add YAML DT binding documentation for the SiFive Private L2 Cache
>> controller. Some functionality and the corresponding register bits were
>> removed in the sifive,pl2cache1 version of the hardware, which creates
>> the unusual situation where the newer hardware's compatible string is
>> the fallback for the older one.
>>
>> Signed-off-by: Eric Lin <[email protected]>
>> Co-developed-by: Samuel Holland <[email protected]>
>> Signed-off-by: Samuel Holland <[email protected]>
>> ---
>>
>> Changes in v1:
>> - Add back select: clause to binding
>> - Make sifive,pl2cache1 the fallback for sifive,pl2cache0
>> - Fix the order of the reg property declaration
>> - Document the sifive,perfmon-counters property
>
> This is no v1. Please implement entire feedback from previous v2, v3 or
> whatever it was and reference old posting or continue the numbering.
The old posting is referenced in the cover letter:
This series is a follow-up to Eric Lin's series "[PATCH v2 0/3] Add
SiFive Private L2 cache and PMU driver":
https://lore.kernel.org/linux-riscv/[email protected]/
So these changes include implementation of the feedback from that v2.
Regards,
Samuel
Hi Krzysztof,
On 2024-02-17 3:09 AM, Krzysztof Kozlowski wrote:
> On 16/02/2024 01:08, Samuel Holland wrote:
>> From: Eric Lin <[email protected]>
>>
>> Add YAML DT binding documentation for the SiFive Extensible Cache
>> controller. The Extensible Cache controller interleaves cache blocks
>> across a number of heterogeneous independently-programmed slices. Each
>> slice contains an MMIO interface for configuration, cache maintenance,
>> error reporting, and performance monitoring.
>>
>> +allOf:
>> + - $ref: /schemas/cache-controller.yaml#
>> +
>> +select:
>> + properties:
>> + compatible:
>> + contains:
>> + enum:
>> + - sifive,extensiblecache0
>> +
>> + required:
>> + - compatible
>> +
>> +properties:
>> + compatible:
>> + items:
>> + - const: sifive,extensiblecache0
>> + - const: cache
>> +
>> + "#address-cells": true
>
> const or enum: [1, 2], depending on the addressing you need here.
>
>> + "#size-cells": true
>
> ditto
>
>> + ranges: true
>> +
>> + interrupts:
>> + maxItems: 1
>> +
>> + cache-block-size:
>> + const: 64
>> +
>> + cache-level: true
>
> 5 is acceptable? I would argue this should be even const.
>
>> + cache-sets: true
>> + cache-size: true
>
> Some constraints on any of these?
Thanks for the feedback. I will add the various constraints in v2, though some
constraints will be somewhat loose as the topology is highly configurable.
>> + cache-unified: true
>> +
>> +patternProperties:
>> + "^cache-controller@[0-9a-f]+$":
>> + type: object
>> + additionalProperties: false
>
> What is this object supposed to represent? Add description.
I will add a description in v2.
This object represents a single slice of the cache. Requests from clients are
interleaved between cache slices depending on the client, the address, etc.
Since there is no strong relationship between client (i.e. CPU) and cache slice,
the next-level-cache property must point to the top-level EC node, not a slice.
>> + properties:
>> + reg:
>> + maxItems: 1
>> +
>> + cache-block-size:
>> + const: 64
>> +
>> + cache-sets: true
>> + cache-size: true
>> + cache-unified: true
>
> cache-level
I will add this in v2. It seemed redundant since the value cannot differ between
slices.
Regards,
Samuel
>> +
>> + sifive,bm-event-counters:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + default: 0
>> + description: Number of bucket monitor registers in this slice
>> +
>> + sifive,cache-ways:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: Number of ways in this slice (independent of cache size)
>> +
>> + sifive,perfmon-counters:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + default: 0
>> + description: Number of PMU counter registers in this slice
>> +
>> + required:
>> + - reg
>> + - cache-block-size
>> + - cache-sets
>> + - cache-size
>> + - cache-unified
>> + - sifive,cache-ways
>> +
>> +required:
>> + - compatible
>> + - ranges
>> + - interrupts
>> + - cache-block-size
>> + - cache-level
>> + - cache-sets
>> + - cache-size
>> + - cache-unified
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> + - |
>> + cache-controller@30040000 {
>> + compatible = "sifive,extensiblecache0", "cache";
>> + ranges = <0x30040000 0x30040000 0x10000>;
>> + interrupts = <0x4>;
>
> You use hex as interrupt numbers on your platforms?
>
>> + cache-block-size = <0x40>;
>> + cache-level = <3>;
>> + cache-sets = <0x800>;
>
> Best regards,
> Krzysztof
>
On 18/02/2024 16:29, Samuel Holland wrote:
> Hi Krzysztof,
>
> On 2024-02-17 3:00 AM, Krzysztof Kozlowski wrote:
>> On 16/02/2024 01:08, Samuel Holland wrote:
>>> The SiFive Composable Cache controller contains an optional PMU with a
>>> configurable number of event counters. Document a property which
>>
>> Configurable in what context? By chip designers or by OS? Why this
>> cannot be deduced from the compatible?
>
> This parameter is configurable by the chip designers.
>
> The information certainly can be deduced from the SoC-specific compatible
> string, but doing so makes the driver only work on that specific list of SoCs.
Usually that's exactly what's expected, so why is the usual approach wrong here?
> When provided via a property, the driver can work without changes on any SoC
> that uses this IP block. (None of the SoCs currently listed in the binding
Sorry, properties are not a work-around for missing compatibles.
> contain a PMU, so there is no backward compatibility concern with adding the new
> property.)
>
> My understanding of the purpose of the SoC-specific compatible string is to
> handle eventualities (silicon bugs, integration quirks, etc.), not to
> intentionally limit the driver to a narrow list of hardware.
That depends on what the hardware is. For most licensed blocks, the final
design is the hardware, so it corresponds to its compatible.
Best regards,
Krzysztof
On Thu, 15 Feb 2024 16:08:14 -0800
Samuel Holland <[email protected]> wrote:
> From: Eric Lin <[email protected]>
>
> Add a driver for the PMU found in the SiFive Composable Cache
> controller. This PMU provides a configurable number of counters and a
> variety of events. Events are grouped into sets. Each counter can count
> events from only one set at a time; however, it can count any number of
> events within that set simultaneously. The PMU hardware does not provide
> an overflow interrupt or a way to atomically control groups of counters.
>
> Some events can be filtered further by client ID (e.g. CPU or external
> DMA master). That functionality is not supported by this driver.
>
> This driver further assumes that a single Composable Cache instance is
> shared by all CPUs in the system.
>
> Example usage:
>
> $ perf stat -a -e sifive_ccache_pmu/inner_acquire_block_btot/,
> sifive_ccache_pmu/inner_acquire_block_hit/,
> sifive_ccache_pmu/inner_acquire_block_ntob/ ls
>
> Performance counter stats for 'system wide':
>
> 542 sifive_ccache_pmu/inner_acquire_block_btot/
> 22081 sifive_ccache_pmu/inner_acquire_block_hit/
> 22006 sifive_ccache_pmu/inner_acquire_block_ntob/
>
> 0.064672432 seconds time elapsed
>
> Example using numeric event selectors:
>
> $ perf stat -a -e sifive_ccache_pmu/event=0x10001/,
> sifive_ccache_pmu/event=0x2002/,
> sifive_ccache_pmu/event=0x4001/ ls
>
> Performance counter stats for 'system wide':
>
> 478 sifive_ccache_pmu/event=0x10001/
> 4717 sifive_ccache_pmu/event=0x2002/
> 44966 sifive_ccache_pmu/event=0x4001/
>
> 0.111027326 seconds time elapsed
>
> Signed-off-by: Eric Lin <[email protected]>
> Co-developed-by: Samuel Holland <[email protected]>
> Signed-off-by: Samuel Holland <[email protected]>
Hi Samuel,
A few comments inline.
> diff --git a/drivers/perf/sifive_ccache_pmu.c b/drivers/perf/sifive_ccache_pmu.c
> new file mode 100644
> index 000000000000..8c9ef0d09f48
> --- /dev/null
> +++ b/drivers/perf/sifive_ccache_pmu.c
> +
> +#define to_ccache_pmu(p) (container_of(p, struct sifive_ccache_pmu, pmu))
> +
> +#ifndef readq
> +static inline u64 readq(void __iomem *addr)
> +{
> + return readl(addr) | (((u64)readl(addr + 4)) << 32);
> +}
> +#endif
> +
> +#ifndef writeq
> +static inline void writeq(u64 v, void __iomem *addr)
> +{
> + writel(lower_32_bits(v), addr);
> + writel(upper_32_bits(v), addr + 4);
Include io-64-nonatomic-lo-hi.h
and you shouldn't need these.
> +}
> +#endif
> +
> +/*
> + * pmu->stop: stop the counter
> + */
> +static void sifive_ccache_pmu_stop(struct perf_event *event, int flags)
> +{
> + struct hw_perf_event *hwc = &event->hw;
> +
> + if (hwc->state & PERF_HES_STOPPED)
> + return;
> +
> + /* Disable this counter to count events */
> + writeq(0, (void *)hwc->config_base);
Isn't this going to give address space warnings, as writeq() expects an
__iomem pointer?
> + sifive_ccache_pmu_read(event);
> +
> + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
> +}
> +/*
> + * pmu->del: delete the event from the PMU
Why use multi line comments?
> + */
> +static void sifive_ccache_pmu_del(struct perf_event *event, int flags)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
> + struct hw_perf_event *hwc = &event->hw;
> + int idx = hwc->idx;
> +
> + /* Stop and release this counter */
> + sifive_ccache_pmu_stop(event, PERF_EF_UPDATE);
> +
> + ccache_pmu->events[idx] = NULL;
> + clear_bit(idx, ccache_pmu->used_mask);
> +
> + perf_event_update_userpage(event);
> +}
> +
> +/*
> + * Driver initialization
Probably drop generic code organization comments like this.
They just rot over time and provide little benefit.
> + */
..
> +
> +static int sifive_ccache_pmu_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct sifive_ccache_pmu *ccache_pmu;
> + u32 n_counters;
> + int ret;
> +
> + /* Instances without a sifive,perfmon-counters property do not contain a PMU */
> + ret = device_property_read_u32(dev, "sifive,perfmon-counters", &n_counters);
> + if (ret || !n_counters)
> + return -ENODEV;
if (ret)
return ret;
if (!n_counters)
return -ENODEV;
In general don't eat potentially useful return codes.
> +
> + ccache_pmu = devm_kzalloc(dev, struct_size(ccache_pmu, events, n_counters), GFP_KERNEL);
> + if (!ccache_pmu)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, ccache_pmu);
> +
> + ccache_pmu->pmu = (struct pmu) {
> + .parent = dev,
> + .attr_groups = sifive_ccache_pmu_attr_grps,
> + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> + .task_ctx_nr = perf_invalid_context,
> + .event_init = sifive_ccache_pmu_event_init,
> + .add = sifive_ccache_pmu_add,
> + .del = sifive_ccache_pmu_del,
> + .start = sifive_ccache_pmu_start,
> + .stop = sifive_ccache_pmu_stop,
> + .read = sifive_ccache_pmu_read,
> + };
> + ccache_pmu->cpu = nr_cpu_ids;
> + ccache_pmu->n_counters = n_counters;
> +
> + ccache_pmu->base = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(ccache_pmu->base))
> + return PTR_ERR(ccache_pmu->base);
> +
> + sifive_ccache_pmu_hw_init(ccache_pmu);
> +
> + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
> + if (ret)
> + return dev_err_probe(dev, ret, "Failed to add CPU hotplug instance\n");
You could use devm_add_action_or_reset() with a trivial callback to unwind this
automatically in the remove and error paths. That is a slight simplification of
the code, though it may end up a line or two longer. Do the same for
perf_pmu_unregister() and you can get rid of remove entirely.
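A userspace toy model of the managed-action idea suggested here, assuming
nothing about the real devres internals: cleanup callbacks are recorded as each
setup step succeeds, then run in reverse order on failure or unbind. All names
(toy_dev, toy_add_action_or_reset, toy_release_all, toy_log_action) are
hypothetical; in the driver the callbacks would presumably wrap
cpuhp_state_remove_instance() and perf_pmu_unregister().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

/* Toy stand-in for a device's list of managed cleanup actions (hypothetical). */
struct toy_dev {
	struct {
		action_fn fn;
		void *data;
	} actions[MAX_ACTIONS];
	size_t n_actions;
};

/*
 * Record a cleanup action. If it cannot be recorded, run it immediately so
 * the caller never has to unwind that step by hand.
 */
static int toy_add_action_or_reset(struct toy_dev *dev, action_fn fn, void *data)
{
	assert(dev && fn);

	if (dev->n_actions == MAX_ACTIONS) {
		fn(data);
		return -1;
	}

	dev->actions[dev->n_actions].fn = fn;
	dev->actions[dev->n_actions].data = data;
	dev->n_actions++;
	return 0;
}

/* Called on probe failure or unbind: undo every step, newest first. */
static void toy_release_all(struct toy_dev *dev)
{
	while (dev->n_actions) {
		dev->n_actions--;
		dev->actions[dev->n_actions].fn(dev->actions[dev->n_actions].data);
	}
}

/* Demonstration callback: appends one tag character to a log. */
static char toy_log[16];
static size_t toy_log_len;

static void toy_log_action(void *data)
{
	if (toy_log_len < sizeof(toy_log))
		toy_log[toy_log_len++] = *(const char *)data;
}
```

Registering the hotplug cleanup first and the PMU cleanup second gives the
same teardown order as the remove() callback in this patch: PMU first, then
the hotplug instance.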
> +
> + ret = perf_pmu_register(&ccache_pmu->pmu, "sifive_ccache_pmu", -1);
> + if (ret) {
> + dev_err_probe(dev, ret, "Failed to register PMU\n");
> + goto err_remove_instance;
> + }
> +
> + return 0;
> +
> +err_remove_instance:
> + cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
> +
> + return ret;
> +}
> +
> +static void sifive_ccache_pmu_remove(struct platform_device *pdev)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = platform_get_drvdata(pdev);
> +
> + perf_pmu_unregister(&ccache_pmu->pmu);
> + cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
> +}
> +
> +static const struct of_device_id sifive_ccache_pmu_of_match[] = {
> + { .compatible = "sifive,ccache0" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, sifive_ccache_pmu_of_match);
> +
> +static struct platform_driver sifive_ccache_pmu_driver = {
> + .probe = sifive_ccache_pmu_probe,
> + .remove_new = sifive_ccache_pmu_remove,
Is this actually aligning anything in a useful fashion?
I'd just use a single space instead and not bother. The alignment tends to
just end up broken and provides little readability advantage.
> + .driver = {
> + .name = "sifive_ccache_pmu",
> + .of_match_table = sifive_ccache_pmu_of_match,
> + },
> +};
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 172d0a743e5d..be6361fdc8ba 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -230,6 +230,7 @@ enum cpuhp_state {
> CPUHP_AP_PERF_POWERPC_TRACE_IMC_ONLINE,
> CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
> CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE,
> + CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
Not sure, but can you get away with CPUHP_AP_ONLINE_DYN ?
Nicer to avoid adding more entries to this list if that is suitable here.
> CPUHP_AP_PERF_CSKY_ONLINE,
> CPUHP_AP_WATCHDOG_ONLINE,
> CPUHP_AP_WORKQUEUE_ONLINE,
On Thu, 15 Feb 2024 16:08:16 -0800
Samuel Holland <[email protected]> wrote:
> From: Eric Lin <[email protected]>
>
> Add a driver for the PMU found in the SiFive Extensible Cache
> controller. This PMU provides a configurable number of counters and a
> variety of events. Events are grouped into sets. Each counter can count
> events from only one set at a time; however, it can count any number of
> events within that set simultaneously. The PMU hardware does not provide
> an overflow interrupt.
>
> The counter inhibit register is used to atomically start/stop/read a
> group of counters so their values can be usefully compared.
>
> Some events can be filtered further by client ID (e.g. CPU or external
> DMA master). That functionality is not supported by this driver.
>
> This driver further assumes that a single Extensible Cache instance is
> shared by all CPUs in the system.
>
> Example usage:
>
> $ perf stat -e sifive_ecache_pmu/inner_rd_request/,
> sifive_ecache_pmu/inner_wr_request/,
> sifive_ecache_pmu/inner_rd_request_hit/,
> sifive_ecache_pmu/inner_wr_request_hit/ ls
>
> Performance counter stats for 'system wide':
>
> 148001 sifive_ecache_pmu/inner_rd_request/
> 121064 sifive_ecache_pmu/inner_wr_request/
> 113124 sifive_ecache_pmu/inner_rd_request_hit/
> 120860 sifive_ecache_pmu/inner_wr_request_hit/
>
> 0.010643962 seconds time elapsed
>
> Example combining the read/write events together within each counter:
>
> $ perf stat -e sifive_ecache_pmu/event=0x601/,
> sifive_ecache_pmu/event=0xc001/ ls
>
> Performance counter stats for 'system wide':
>
> 262619 sifive_ecache_pmu/event=0x601/
> 224533 sifive_ecache_pmu/event=0xc001/
>
> 0.009794808 seconds time elapsed
>
> Signed-off-by: Eric Lin <[email protected]>
> Co-developed-by: Samuel Holland <[email protected]>
> Signed-off-by: Samuel Holland <[email protected]>
Hi Samuel,
Some comments inline. Note this is a drive-by review, so not very
thorough!
Jonathan
> +
> +static u64 read_counter(const struct sifive_ecache_pmu *ecache_pmu, const struct hw_perf_event *hwc)
> +{
> + u64 value = 0;
> +
> + for (int i = 0; i < ecache_pmu->n_slices; i++) {
> + void __iomem *base = ecache_pmu->slice[i].base;
> +
> + value += readq(base + hwc->event_base);
Feels like this summing should be a userspace problem.
Knowing about slice imbalance is often useful (depending on the
micro architecture obviously!)
> + }
> +
> + return value;
> +}
> +
> +static int sifive_ecache_pmu_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct device_node *ecache_node = dev_of_node(dev);
> + struct sifive_ecache_pmu *ecache_pmu;
> + struct device_node *slice_node;
> + u32 slice_counters;
> + int n_slices, ret;
> + int i = 0;
> +
> + n_slices = of_get_available_child_count(ecache_node);
fwnode_get_available_child_count(dev_fwnode(&pdev->dev));
Not sure why there isn't yet a device version of this (IIRC anyway).
> + if (!n_slices)
> + return -ENODEV;
> +
> + ecache_pmu = devm_kzalloc(dev, struct_size(ecache_pmu, slice, n_slices), GFP_KERNEL);
> + if (!ecache_pmu)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, ecache_pmu);
> +
> + ecache_pmu->pmu = (struct pmu) {
> + .parent = dev,
> + .attr_groups = sifive_ecache_pmu_attr_grps,
> + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> + .task_ctx_nr = perf_invalid_context,
> + .event_init = sifive_ecache_pmu_event_init,
> + .add = sifive_ecache_pmu_add,
> + .del = sifive_ecache_pmu_del,
> + .start = sifive_ecache_pmu_start,
> + .stop = sifive_ecache_pmu_stop,
> + .read = sifive_ecache_pmu_read,
> + .start_txn = sifive_ecache_pmu_start_txn,
> + .commit_txn = sifive_ecache_pmu_commit_txn,
> + .cancel_txn = sifive_ecache_pmu_cancel_txn,
> + };
> + ecache_pmu->cpu = nr_cpu_ids;
> + ecache_pmu->n_counters = ECACHE_PMU_MAX_COUNTERS;
> + ecache_pmu->n_slices = n_slices;
> +
> + for_each_available_child_of_node(ecache_node, slice_node) {
device_for_each_child_node() (generic handlers only provide the available version btw
which is non obvious from naming)
> + struct sifive_ecache_pmu_slice *slice = &ecache_pmu->slice[i++];
> +
> + slice->base = devm_of_iomap(dev, slice_node, 0, NULL);
> + if (IS_ERR(slice->base))
Leaked slice_node
FWIW https://lore.kernel.org/linux-iio/[email protected]/
adds device_for_each_child_node_scoped() which deals with this stuff using
cleanup.h magic.
> + return PTR_ERR(slice->base);
> +
> + /* Get number of counters from slice node */
> + ret = of_property_read_u32(slice_node, "sifive,perfmon-counters", &slice_counters);
Not sure on what perf maintainers want, but I'd go with
device_property_read etc as in the previous driver.
> + if (ret)
leaked slice_node
> + return dev_err_probe(dev, ret,
> + "Slice %pOF missing sifive,perfmon-counters property\n",
> + slice_node);
> +
> + ecache_pmu->n_counters = min_t(u32, slice_counters, ecache_pmu->n_counters);
> + }
> +
> + sifive_ecache_pmu_hw_init(ecache_pmu);
> +
> + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE, &ecache_pmu->node);
> + if (ret)
> + return dev_err_probe(dev, ret, "Failed to add CPU hotplug instance\n");
> +
> + ret = perf_pmu_register(&ecache_pmu->pmu, "sifive_ecache_pmu", -1);
> + if (ret) {
> + dev_err_probe(dev, ret, "Failed to register PMU\n");
> + goto err_remove_instance;
Comments from other review apply here as well so if you agree apply them in both drivers.
> + }
> +
> + return 0;
> +
> +err_remove_instance:
> + cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_ECACHE_ONLINE, &ecache_pmu->node);
> +
> + return ret;
> +}
On Sun, Feb 18, 2024 at 07:35:35PM +0100, Krzysztof Kozlowski wrote:
> On 18/02/2024 16:29, Samuel Holland wrote:
> > Hi Krzysztof,
> >
> > On 2024-02-17 3:00 AM, Krzysztof Kozlowski wrote:
> >> On 16/02/2024 01:08, Samuel Holland wrote:
> >>> The SiFive Composable Cache controller contains an optional PMU with a
> >>> configurable number of event counters. Document a property which
> >>
> >> Configurable in what context? By chip designers or by OS? Why this
> >> cannot be deduced from the compatible?
> >
> > This parameter is configurable by the chip designers.
> >
> > The information certainly can be deduced from the SoC-specific compatible
> > string, but doing so makes the driver only work on that specific list of SoCs.
>
> Usually that's exactly what's expected, so why is the usual approach wrong here?
>
> > When provided via a property, the driver can work without changes on any SoC
> > that uses this IP block. (None of the SoCs currently listed in the binding
>
> Sorry, properties are not a work-around for missing compatibles.
>
> > contain a PMU, so there is no backward compatibility concern with adding the new
> > property.)
> >
> > My understanding of the purpose of the SoC-specific compatible string is to
> > handle eventualities (silicon bugs, integration quirks, etc.), not to
> > intentionally limit the driver to a narrow list of hardware.
>
> That depends on what the hardware is. For most licensed blocks, the final
> design is the hardware, so it equals its compatible.
While I generally agree, I think a property is fine here for 2 reasons.
First, this is going to vary on just about every design. That's true for any
PMU. So maybe this shouldn't even be SiFive specific.
Second, the counters available to the OS may not equal the number in
h/w because counters could be reserved for different privilege levels
(secure, hypervisor, guest, etc.). No idea if RISC-V supports this, but
if not it is a matter of time. That's more likely for a core PMU than an
uncore PMU.
Rob
On Thu, Feb 15, 2024 at 04:08:13PM -0800, Samuel Holland wrote:
> The SiFive Composable Cache controller contains an optional PMU with a
> configurable number of event counters. Document a property which
> describes the number of available counters.
>
> Signed-off-by: Samuel Holland <[email protected]>
> ---
>
> Documentation/devicetree/bindings/cache/sifive,ccache0.yaml | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml b/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
> index 7e8cebe21584..100eda4345de 100644
> --- a/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
> +++ b/Documentation/devicetree/bindings/cache/sifive,ccache0.yaml
> @@ -81,6 +81,11 @@ properties:
> The reference to the reserved-memory for the L2 Loosely Integrated Memory region.
> The reserved memory node should be defined as per the bindings in reserved-memory.txt.
>
> + sifive,perfmon-counters:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 0
> + description: Number of PMU counter registers
I think this should be restricted to devices that actually have it,
given we've already gone pretty hard in this binding with specific
requirements.
> +
> allOf:
> - $ref: /schemas/cache-controller.yaml#
>
> --
> 2.43.0
>
On Fri, Feb 16, 2024 at 10:05:04AM +0000, Conor Dooley wrote:
> On Thu, Feb 15, 2024 at 04:08:12PM -0800, Samuel Holland wrote:
>
> > All three of these cache controllers (with PMUs) have been integrated in
> > SoCs by our customers. However, as none of those SoCs have been publicly
> > announced yet, I cannot include SoC-specific compatible strings in this
> > version of the devicetree bindings.
>
> And I don't want to apply any of those dt-binding patches until then.
> Stuff like "sifive,perfmon-counters" seems like a property that would
> go away with a device-specific compatible, at least for the ccache.
Reading the P550 stuff today reminded me that I had not got around to
looking at this series again. You should be able to use that to satisfy
my wish for some soc-specific compatibles, right?
And w.r.t. the perfmon-counters property, it looked to me like Rob was
suggesting that it need not even be vendor specific.
On 2024-02-16 12:08 am, Samuel Holland wrote:
> From: Eric Lin <[email protected]>
>
> Add a driver for the PMU found in the SiFive Composable Cache
> controller. This PMU provides a configurable number of counters and a
> variety of events. Events are grouped into sets. Each counter can count
> events from only one set at a time; however, it can count any number of
> events within that set simultaneously. The PMU hardware does not provide
> an overflow interrupt or a way to atomically control groups of counters.
>
> Some events can be filtered further by client ID (e.g. CPU or external
> DMA master). That functionality is not supported by this driver.
>
> This driver further assumes that a single Composable Cache instance is
> shared by all CPUs in the system.
>
> Example usage:
>
> $ perf stat -a -e sifive_ccache_pmu/inner_acquire_block_btot/,
> sifive_ccache_pmu/inner_acquire_block_hit/,
> sifive_ccache_pmu/inner_acquire_block_ntob/ ls
>
> Performance counter stats for 'system wide':
>
> 542 sifive_ccache_pmu/inner_acquire_block_btot/
> 22081 sifive_ccache_pmu/inner_acquire_block_hit/
> 22006 sifive_ccache_pmu/inner_acquire_block_ntob/
>
> 0.064672432 seconds time elapsed
>
> Example using numeric event selectors:
>
> $ perf stat -a -e sifive_ccache_pmu/event=0x10001/,
> sifive_ccache_pmu/event=0x2002/,
> sifive_ccache_pmu/event=0x4001/ ls
>
> Performance counter stats for 'system wide':
>
> 478 sifive_ccache_pmu/event=0x10001/
> 4717 sifive_ccache_pmu/event=0x2002/
> 44966 sifive_ccache_pmu/event=0x4001/
>
> 0.111027326 seconds time elapsed
>
> Signed-off-by: Eric Lin <[email protected]>
> Co-developed-by: Samuel Holland <[email protected]>
> Signed-off-by: Samuel Holland <[email protected]>
> ---
>
> drivers/perf/Kconfig | 9 +
> drivers/perf/Makefile | 1 +
> drivers/perf/sifive_ccache_pmu.c | 577 +++++++++++++++++++++++++++++++
> include/linux/cpuhotplug.h | 1 +
> 4 files changed, 588 insertions(+)
> create mode 100644 drivers/perf/sifive_ccache_pmu.c
>
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index ec6e0d9194a1..b4e4db7424b4 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -155,6 +155,15 @@ config QCOM_L3_PMU
> Adds the L3 cache PMU into the perf events subsystem for
> monitoring L3 cache events.
>
> +config SIFIVE_CCACHE_PMU
> + tristate "SiFive Composable Cache PMU"
> + depends on RISCV || COMPILE_TEST
> + help
> + Support for the Composable Cache performance monitoring unit (PMU) on
> + SiFive platforms. The Composable Cache PMU provides up to 64 counters
> + for measuring whole-system L2/L3 cache performance using the perf
> + events subsystem.
> +
> config THUNDERX2_PMU
> tristate "Cavium ThunderX2 SoC PMU UNCORE"
> depends on ARCH_THUNDER2 || COMPILE_TEST
> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> index a06338e3401c..51ef5f50ace4 100644
> --- a/drivers/perf/Makefile
> +++ b/drivers/perf/Makefile
> @@ -15,6 +15,7 @@ obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
> obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
> obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
> obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
> +obj-$(CONFIG_SIFIVE_CCACHE_PMU) += sifive_ccache_pmu.o
> obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
> obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
> diff --git a/drivers/perf/sifive_ccache_pmu.c b/drivers/perf/sifive_ccache_pmu.c
> new file mode 100644
> index 000000000000..8c9ef0d09f48
> --- /dev/null
> +++ b/drivers/perf/sifive_ccache_pmu.c
> @@ -0,0 +1,577 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SiFive Composable Cache PMU driver
> + *
> + * Copyright (C) 2022-2024 SiFive, Inc.
> + * Copyright (C) Eric Lin <[email protected]>
> + *
> + */
> +
> +#include <linux/cpuhotplug.h>
> +#include <linux/cpumask.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/perf_event.h>
> +#include <linux/platform_device.h>
> +#include <linux/property.h>
> +
> +#define CCACHE_SELECT_OFFSET 0x2000
> +#define CCACHE_CLIENT_FILTER_OFFSET 0x2800
> +#define CCACHE_COUNTER_OFFSET 0x3000
> +
> +#define CCACHE_PMU_MAX_COUNTERS 64
> +
> +struct sifive_ccache_pmu {
> + struct pmu pmu;
> + struct hlist_node node;
> + struct notifier_block cpu_pm_nb;
This seems unused.
> + void __iomem *base;
> + DECLARE_BITMAP(used_mask, CCACHE_PMU_MAX_COUNTERS);
> + unsigned int cpu;
> + int n_counters;
> + struct perf_event *events[] __counted_by(n_counters);
> +};
> +
> +#define to_ccache_pmu(p) (container_of(p, struct sifive_ccache_pmu, pmu))
> +
> +#ifndef readq
> +static inline u64 readq(void __iomem *addr)
> +{
> + return readl(addr) | (((u64)readl(addr + 4)) << 32);
> +}
> +#endif
> +
> +#ifndef writeq
> +static inline void writeq(u64 v, void __iomem *addr)
> +{
> + writel(lower_32_bits(v), addr);
> + writel(upper_32_bits(v), addr + 4);
> +}
> +#endif
As Jonathan says, please include the io-64-nonatomic header of
preference and don't reinvent these as-is. However, see later...
> +
> +/*
> + * sysfs attributes
> + *
> + * We export:
> + * - cpumask, used by perf user space and other tools to know on which CPUs to create events
> + * - events, used by perf user space and other tools to create events symbolically, e.g.:
> + * perf stat -a -e sifive_ccache_pmu/event=inner_put_partial_data_hit/ ls
> + * perf stat -a -e sifive_ccache_pmu/event=0x101/ ls
> + * - formats, used by perf user space and other tools to configure events
> + */
> +
> +/* cpumask */
> +static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = dev_get_drvdata(dev);
> +
> + if (ccache_pmu->cpu >= nr_cpu_ids)
> + return 0;
I'm not sure it's really correct to return no data, but then this is
impossible in the first place (if there are no online CPUs, who's
reading the file?)
> +
> + return sysfs_emit(buf, "%d\n", ccache_pmu->cpu);
> +};
> +
> +static DEVICE_ATTR_RO(cpumask);
> +
> +static struct attribute *sifive_ccache_pmu_cpumask_attrs[] = {
> + &dev_attr_cpumask.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group sifive_ccache_pmu_cpumask_group = {
> + .attrs = sifive_ccache_pmu_cpumask_attrs,
> +};
> +
> +/* events */
> +static ssize_t sifive_ccache_pmu_event_show(struct device *dev, struct device_attribute *attr,
> + char *page)
> +{
> + struct perf_pmu_events_attr *pmu_attr;
> +
> + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
> + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id);
> +}
> +
> +#define SET_EVENT_SELECT(_event, _set) (BIT_ULL((_event) + 8) | (_set))
This seems like you really want to have distinct "set" and "event"
fields in the config.
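One possible reading of this, as a userspace sketch: name the two fields that
event_init already decodes by hand (set in bits 7:0, event mask in bits 63:8).
The macro and helper names are hypothetical and not taken from the patch; only
the bit positions come from SET_EVENT_SELECT() and the event_init code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical field layout mirroring what the patch decodes:
 * bits 7:0 select the event set, bits 63:8 are a mask of events in that set.
 */
#define CCACHE_CFG_SET_MASK	0xffULL
#define CCACHE_CFG_EVENT_SHIFT	8

/* Extract the event set number from a config value. */
static inline uint64_t ccache_cfg_set(uint64_t config)
{
	return config & CCACHE_CFG_SET_MASK;
}

/* Extract the mask of selected events within the set. */
static inline uint64_t ccache_cfg_events(uint64_t config)
{
	return config >> CCACHE_CFG_EVENT_SHIFT;
}

/* Build a config value from a set number and one event index in that set. */
static inline uint64_t ccache_cfg_encode(unsigned int set, unsigned int event_idx)
{
	assert(set <= CCACHE_CFG_SET_MASK);
	assert(event_idx < 64 - CCACHE_CFG_EVENT_SHIFT);

	return (1ULL << (event_idx + CCACHE_CFG_EVENT_SHIFT)) | set;
}
```

With this split, set 1 / event index 8 encodes to 0x10001 and set 2 / event
index 5 encodes to 0x2002, matching the numeric selectors in the commit
message. In sysfs terms this would presumably become two format attributes
(config:0-7 and config:8-63) rather than the single config:0-63 field.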
> +#define CCACHE_PMU_EVENT_ATTR(_name, _event, _set) \
> + PMU_EVENT_ATTR_ID(_name, sifive_ccache_pmu_event_show, SET_EVENT_SELECT(_event, _set))
> +
> +enum ccache_pmu_event_set1 {
> + INNER_PUT_FULL_DATA = 0,
> + INNER_PUT_PARTIAL_DATA,
> + INNER_ATOMIC_DATA,
> + INNER_GET,
> + INNER_PREFETCH_READ,
> + INNER_PREFETCH_WRITE,
> + INNER_ACQUIRE_BLOCK_NTOB,
> + INNER_ACQUIRE_BLOCK_NTOT,
> + INNER_ACQUIRE_BLOCK_BTOT,
> + INNER_ACQUIRE_PERM_NTOT,
> + INNER_ACQUIRE_PERM_BTOT,
> + INNER_RELEASE_TTOB,
> + INNER_RELEASE_TTON,
> + INNER_RELEASE_BTON,
> + INNER_RELEASE_DATA_TTOB,
> + INNER_RELEASE_DATA_TTON,
> + INNER_RELEASE_DATA_BTON,
> + OUTER_PROBE_BLOCK_TOT,
> + OUTER_PROBE_BLOCK_TOB,
> + OUTER_PROBE_BLOCK_TON,
> + CCACHE_PMU_MAX_EVENT1_IDX
> +};
> +
> +enum ccache_pmu_event_set2 {
> + INNER_PUT_FULL_DATA_HIT = 0,
> + INNER_PUT_PARTIAL_DATA_HIT,
> + INNER_ATOMIC_DATA_HIT,
> + INNER_GET_HIT,
> + INNER_PREFETCH_HIT,
> + INNER_ACQUIRE_BLOCK_HIT,
> + INNER_ACQUIRE_PERM_HIT,
> + INNER_RELEASE_HIT,
> + INNER_RELEASE_DATA_HIT,
> + OUTER_PROBE_HIT,
> + INNER_PUT_FULL_DATA_HIT_SHARED,
> + INNER_PUT_PARTIAL_DATA_HIT_SHARED,
> + INNER_ATOMIC_DATA_HIT_SHARED,
> + INNER_GET_HIT_SHARED,
> + INNER_PREFETCH_HIT_SHARED,
> + INNER_ACQUIRE_BLOCK_HIT_SHARED,
> + INNER_ACQUIRE_PERM_HIT_SHARED,
> + OUTER_PROBE_HIT_SHARED,
> + OUTER_PROBE_HIT_DIRTY,
> + CCACHE_PMU_MAX_EVENT2_IDX
> +};
> +
> +enum ccache_pmu_event_set3 {
> + OUTER_ACQUIRE_BLOCK_NTOB_MISS = 0,
> + OUTER_ACQUIRE_BLOCK_NTOT_MISS,
> + OUTER_ACQUIRE_BLOCK_BTOT_MISS,
> + OUTER_ACQUIRE_PERM_NTOT_MISS,
> + OUTER_ACQUIRE_PERM_BTOT_MISS,
> + OUTER_RELEASE_TTOB_EVICTION,
> + OUTER_RELEASE_TTON_EVICTION,
> + OUTER_RELEASE_BTON_EVICTION,
> + OUTER_RELEASE_DATA_TTOB_NOT_APPLICABLE,
> + OUTER_RELEASE_DATA_TTON_DIRTY_EVICTION,
> + OUTER_RELEASE_DATA_BTON_NOT_APPLICABLE,
> + INNER_PROBE_BLOCK_TOT_CODE_MISS_HITS_OTHER_HARTS,
> + INNER_PROBE_BLOCK_TOB_LOAD_MISS_HITS_OTHER_HARTS,
> + INNER_PROBE_BLOCK_TON_STORE_MISS_HITS_OTHER_HARTS,
> + CCACHE_PMU_MAX_EVENT3_IDX
> +};
> +
> +enum ccache_pmu_event_set4 {
> + INNER_HINT_HITS_INFLIGHT_MISS = 0,
> + CCACHE_PMU_MAX_EVENT4_IDX
> +};
> +
> +static struct attribute *sifive_ccache_pmu_events[] = {
> + /* pmEventSelect1 */
> + CCACHE_PMU_EVENT_ATTR(inner_put_full_data, INNER_PUT_FULL_DATA, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_put_partial_data, INNER_PUT_PARTIAL_DATA, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_atomic_data, INNER_ATOMIC_DATA, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_get, INNER_GET, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_prefetch_read, INNER_PREFETCH_READ, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_prefetch_write, INNER_PREFETCH_WRITE, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_block_ntob, INNER_ACQUIRE_BLOCK_NTOB, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_block_ntot, INNER_ACQUIRE_BLOCK_NTOT, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_block_btot, INNER_ACQUIRE_BLOCK_BTOT, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_ntot, INNER_ACQUIRE_PERM_NTOT, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_btot, INNER_ACQUIRE_PERM_BTOT, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_ttob, INNER_RELEASE_TTOB, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_tton, INNER_RELEASE_TTON, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_bton, INNER_RELEASE_BTON, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_data_ttob, INNER_RELEASE_DATA_TTOB, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_data_tton, INNER_RELEASE_DATA_TTON, 1),
> + CCACHE_PMU_EVENT_ATTR(inner_release_data_bton, INNER_RELEASE_DATA_BTON, 1),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_block_tot, OUTER_PROBE_BLOCK_TOT, 1),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_block_tob, OUTER_PROBE_BLOCK_TOB, 1),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_block_ton, OUTER_PROBE_BLOCK_TON, 1),
Yuck, those enums not only make the code overly verbose and repetitive,
but they completely obfuscate the significant information here. I don't
personally care enough to go digging for documentation to review whether
the event numbers are correct for the event names they claim to be, but
anyone who did wish to do that is clearly in for an unnecessarily hard
time :(
> +
> + /* pmEventSelect2 */
> + CCACHE_PMU_EVENT_ATTR(inner_put_full_data_hit, INNER_PUT_FULL_DATA_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_put_partial_data_hit, INNER_PUT_PARTIAL_DATA_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_atomic_data_hit, INNER_ATOMIC_DATA_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_get_hit, INNER_GET_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_prefetch_hit, INNER_PREFETCH_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_block_hit, INNER_ACQUIRE_BLOCK_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_hit, INNER_ACQUIRE_PERM_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_release_hit, INNER_RELEASE_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_release_data_hit, INNER_RELEASE_DATA_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_hit, OUTER_PROBE_HIT, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_put_full_data_hit_shared, INNER_PUT_FULL_DATA_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_put_partial_data_hit_shared,
> + INNER_PUT_PARTIAL_DATA_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_atomic_data_hit_shared, INNER_ATOMIC_DATA_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_get_hit_shared, INNER_GET_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_prefetch_hit_shared, INNER_PREFETCH_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_block_hit_shared, INNER_ACQUIRE_BLOCK_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(inner_acquire_perm_hit_shared, INNER_ACQUIRE_PERM_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_hit_shared, OUTER_PROBE_HIT_SHARED, 2),
> + CCACHE_PMU_EVENT_ATTR(outer_probe_hit_dirty, OUTER_PROBE_HIT_DIRTY, 2),
> +
> + /* pmEventSelect3 */
> + CCACHE_PMU_EVENT_ATTR(outer_acquire_block_ntob_miss, OUTER_ACQUIRE_BLOCK_NTOB_MISS, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_acquire_block_ntot_miss, OUTER_ACQUIRE_BLOCK_NTOT_MISS, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_acquire_block_btot_miss, OUTER_ACQUIRE_BLOCK_BTOT_MISS, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_acquire_perm_ntot_miss, OUTER_ACQUIRE_PERM_NTOT_MISS, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_acquire_perm_btot_miss, OUTER_ACQUIRE_PERM_BTOT_MISS, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_ttob_eviction, OUTER_RELEASE_TTOB_EVICTION, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_tton_eviction, OUTER_RELEASE_TTON_EVICTION, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_bton_eviction, OUTER_RELEASE_BTON_EVICTION, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_data_ttob_not_applicable,
> + OUTER_RELEASE_DATA_TTOB_NOT_APPLICABLE, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_data_tton_dirty_eviction,
> + OUTER_RELEASE_DATA_TTON_DIRTY_EVICTION, 3),
> + CCACHE_PMU_EVENT_ATTR(outer_release_data_bton_not_applicable,
> + OUTER_RELEASE_DATA_BTON_NOT_APPLICABLE, 3),
> + CCACHE_PMU_EVENT_ATTR(inner_probe_block_tot_code_miss_hits_other_harts,
> + INNER_PROBE_BLOCK_TOT_CODE_MISS_HITS_OTHER_HARTS, 3),
> + CCACHE_PMU_EVENT_ATTR(inner_probe_block_tob_load_miss_hits_other_harts,
> + INNER_PROBE_BLOCK_TOB_LOAD_MISS_HITS_OTHER_HARTS, 3),
> + CCACHE_PMU_EVENT_ATTR(inner_probe_block_ton_store_miss_hits_other_harts,
> + INNER_PROBE_BLOCK_TON_STORE_MISS_HITS_OTHER_HARTS, 3),
> +
> + /* pm_event_select4 */
> + CCACHE_PMU_EVENT_ATTR(inner_hint_hits_inflight_miss, INNER_HINT_HITS_INFLIGHT_MISS, 4),
> + NULL
> +};
> +
> +static struct attribute_group sifive_ccache_pmu_events_group = {
> + .name = "events",
> + .attrs = sifive_ccache_pmu_events,
> +};
> +
> +/* formats */
> +PMU_FORMAT_ATTR(event, "config:0-63");
> +
> +static struct attribute *sifive_ccache_pmu_formats[] = {
> + &format_attr_event.attr,
> + NULL,
> +};
> +
> +static struct attribute_group sifive_ccache_pmu_format_group = {
> + .name = "format",
> + .attrs = sifive_ccache_pmu_formats,
> +};
> +
> +/*
> + * Per PMU device attribute groups
> + */
> +
> +static const struct attribute_group *sifive_ccache_pmu_attr_grps[] = {
> + &sifive_ccache_pmu_cpumask_group,
> + &sifive_ccache_pmu_events_group,
> + &sifive_ccache_pmu_format_group,
> + NULL,
> +};
> +
> +/*
> + * Event Initialization
> + */
> +
> +static int sifive_ccache_pmu_event_init(struct perf_event *event)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
> + struct hw_perf_event *hwc = &event->hw;
> + u64 config = event->attr.config;
> + u64 ev_type = config >> 8;
> + u64 set = config & 0xff;
> +
I have some patches in progress trying to improve the situation, but for
now you still must check that the event is actually for your PMU before
doing anything else, and return -ENOENT if not, otherwise you may be
offered raw or hardware events, misinterpret them, and confuse the user
with nonsense counts from the wrong PMU.
> + /* Check if this is a valid set and event */
> + switch (set) {
> + case 1:
> + if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT1_IDX))
> + return -ENOENT;
Conversely if the event *is* yours, then you must not return -ENOENT for
any other error - these look like they would all probably be -EINVAL
conditions (see the comments in the definition of struct pmu).
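To make the two cases concrete, here is a userspace model of the return-code
policy described in these comments. The struct and function names are made up
for illustration, and the per-set event counts are simply the
CCACHE_PMU_MAX_EVENT*_IDX values from the enums in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for the fields event_init looks at (hypothetical). */
struct fake_event {
	uint32_t attr_type;	/* models event->attr.type */
	uint64_t attr_config;	/* models event->attr.config */
};

/* Number of defined events in sets 1..4, per the enums in the patch. */
static const unsigned int events_in_set[] = { 0, 20, 19, 14, 1 };

static int model_event_init(const struct fake_event *ev, uint32_t my_pmu_type)
{
	uint64_t set, mask;

	assert(ev);
	set = ev->attr_config & 0xff;
	mask = ev->attr_config >> 8;

	/* Not our event: -ENOENT lets the core offer it to another PMU. */
	if (ev->attr_type != my_pmu_type)
		return -ENOENT;

	/* Our event, but malformed: report a real error instead. */
	if (set < 1 || set > 4)
		return -EINVAL;
	if (mask >= (1ULL << events_in_set[set]))
		return -EINVAL;

	return 0;
}
```

The ordering is the point of the sketch: the ownership test comes first and is
the only path that returns -ENOENT.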
> + break;
> + case 2:
> + if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT2_IDX))
> + return -ENOENT;
> + break;
> + case 3:
> + if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT3_IDX))
> + return -ENOENT;
> + break;
> + case 4:
> + if (ev_type >= BIT_ULL(CCACHE_PMU_MAX_EVENT4_IDX))
> + return -ENOENT;
> + break;
> + default:
> + return -ENOENT;
> + }
There's also the matter of validating event groups - if you can't
support them at all that should mean simply rejecting any event grouped
with any other non-software event.
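As a sketch of what "reject any group containing another non-software event" amounts to, the following is a userspace model only. struct model_event and its fields are hypothetical; in the kernel the equivalent walk would use event->group_leader, for_each_sibling_event() and is_software_event().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields that matter for group validation. */
struct model_event {
	bool is_software;				/* models is_software_event() */
	const struct model_event *group_leader;
	const struct model_event *const *siblings;	/* leader's existing siblings */
	size_t n_siblings;
};

static int model_validate_group(const struct model_event *event)
{
	const struct model_event *leader = event->group_leader;

	/* An event leading its own new group has nothing to clash with. */
	if (leader == event)
		return 0;

	/* Joining a group: every other member must be a software event. */
	if (!leader->is_software)
		return -EINVAL;

	for (size_t i = 0; i < leader->n_siblings; i++) {
		const struct model_event *sibling = leader->siblings[i];

		if (sibling != event && !sibling->is_software)
			return -EINVAL;
	}

	return 0;
}
```

A driver that can schedule several of its own events together would instead count them against the number of counters, but that is a separate exercise.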
> + /* Do not allocate the hardware counter yet */
> + hwc->idx = -1;
> + hwc->config = config;
> +
> + event->cpu = ccache_pmu->cpu;
> +
> + return 0;
> +}
> +
> +/*
> + * pmu->read: read and update the counter
> + */
> +static void sifive_ccache_pmu_read(struct perf_event *event)
> +{
> + struct hw_perf_event *hwc = &event->hw;
> + u64 prev_raw_count, new_raw_count;
> + u64 oldval;
> +
> + do {
> + prev_raw_count = local64_read(&hwc->prev_count);
> + new_raw_count = readq((void *)hwc->event_base);
The 32-bit non-atomic version could read a torn value here, and thus give
an apparent large jump backwards in the count. If the 32-bit support is
entirely nominal for build-testing then that's probably OK (possibly
worth a comment here), but if it's expected to actually be used then I
think you want an explicit hi-lo-hi sequence in the 32-bit case to
ensure the value is read correctly.
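For reference, the hi-lo-hi idea looks roughly like this. It is written as plain userspace C over two volatile 32-bit words so it can be compiled on its own; the function name is made up, and a real driver would use readl() on the two halves of the counter register instead of raw pointer dereferences.

```c
#include <stdint.h>

/*
 * Read a 64-bit counter exposed as two 32-bit halves without tearing:
 * sample the high word, then the low word, then the high word again,
 * and retry if the high word changed (the low word wrapped in between).
 */
static uint64_t model_read_counter_hi_lo_hi(const volatile uint32_t *lo,
					    const volatile uint32_t *hi)
{
	uint32_t h, l, h2;

	do {
		h = *hi;
		l = *lo;
		h2 = *hi;
	} while (h != h2);

	return ((uint64_t)h << 32) | l;
}
```

The loop only repeats when a carry into the high word lands between the two high reads, so in practice it almost always completes in one pass.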
> +
> + oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count, new_raw_count);
> + } while (oldval != prev_raw_count);
If you don't have an overflow interrupt and don't expect to ever have
one (not entirely unreasonable if counters are always a full 64 bits),
then strictly you shouldn't need the cmpxchg loop, however there is
certainly still a consistency argument for sticking to the familiar
pattern anyway if it's not getting in the way.
> +
> + local64_add(new_raw_count - prev_raw_count, &event->count);
> +}
> +
> +/*
> + * State transition functions:
> + *
> + * start()/stop() & add()/del()
> + */
> +
> +/*
> + * pmu->start: start the event
> + */
> +static void sifive_ccache_pmu_start(struct perf_event *event, int flags)
> +{
> + struct hw_perf_event *hwc = &event->hw;
> +
> + if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
> + return;
> +
> + hwc->state = 0;
> +
> + /* Set initial value to 0 */
> + local64_set(&hwc->prev_count, 0);
> + writeq(0, (void *)hwc->event_base);
> +
> + /* Enable this counter to count events */
> + writeq(hwc->config, (void *)hwc->config_base);
> +}
> +
> +/*
> + * pmu->stop: stop the counter
> + */
> +static void sifive_ccache_pmu_stop(struct perf_event *event, int flags)
> +{
> + struct hw_perf_event *hwc = &event->hw;
> +
> + if (hwc->state & PERF_HES_STOPPED)
> + return;
> +
> + /* Disable this counter to count events */
> + writeq(0, (void *)hwc->config_base);
> + sifive_ccache_pmu_read(event);
> +
> + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
> +}
> +
> +/*
> + * pmu->add: add the event to the PMU
> + */
> +static int sifive_ccache_pmu_add(struct perf_event *event, int flags)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
> + struct hw_perf_event *hwc = &event->hw;
> + int idx;
> +
> + /* Find an available counter idx to use for this event */
> + do {
> + idx = find_first_zero_bit(ccache_pmu->used_mask, ccache_pmu->n_counters);
> + if (idx >= ccache_pmu->n_counters)
> + return -EAGAIN;
> + } while (test_and_set_bit(idx, ccache_pmu->used_mask));
FWIW I continue to maintain the opinion that faffing around with bitmaps
is more costly than simply scanning for an empty slot in the events
array that you still have to touch anyway.
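What I have in mind is no more than the following, shown here as a standalone model with a hypothetical function name and a void * array standing in for ccache_pmu->events[]. It assumes .add is never running concurrently with itself for the same PMU instance, which is what makes the atomic bitmap dance unnecessary.

```c
#include <stddef.h>

/*
 * Return the index of the first unused counter, or -1 if all are busy.
 * A NULL entry in events[] means the counter is free.
 */
static int model_find_free_counter(void *const events[], int n_counters)
{
	for (int idx = 0; idx < n_counters; idx++) {
		if (!events[idx])
			return idx;
	}

	return -1;
}
```

The caller would return -EAGAIN on a negative result and otherwise store the event at events[idx], at which point used_mask can go away entirely.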
> +
> + hwc->config_base = (unsigned long)ccache_pmu->base + CCACHE_SELECT_OFFSET + 8 * idx;
> + hwc->event_base = (unsigned long)ccache_pmu->base + CCACHE_COUNTER_OFFSET + 8 * idx;
> + hwc->idx = idx;
> + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
> +
> + ccache_pmu->events[idx] = event;
> +
> + if (flags & PERF_EF_START)
> + sifive_ccache_pmu_start(event, PERF_EF_RELOAD);
> +
> + perf_event_update_userpage(event);
> +
> + return 0;
> +}
> +
> +/*
> + * pmu->del: delete the event from the PMU
> + */
> +static void sifive_ccache_pmu_del(struct perf_event *event, int flags)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = to_ccache_pmu(event->pmu);
> + struct hw_perf_event *hwc = &event->hw;
> + int idx = hwc->idx;
> +
> + /* Stop and release this counter */
> + sifive_ccache_pmu_stop(event, PERF_EF_UPDATE);
> +
> + ccache_pmu->events[idx] = NULL;
> + clear_bit(idx, ccache_pmu->used_mask);
> +
> + perf_event_update_userpage(event);
> +}
> +
> +/*
> + * Driver initialization
> + */
> +
> +static void sifive_ccache_pmu_hw_init(const struct sifive_ccache_pmu *ccache_pmu)
> +{
> + /* Disable the client filter (not supported by this driver) */
Note that if filtering is something you may want to add support for in
future, it's well worth planning ahead in terms of config fields to
avoid churning the user API too much later.
> + writeq(0, ccache_pmu->base + CCACHE_CLIENT_FILTER_OFFSET);
> +}
> +
> +static int sifive_ccache_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> + struct sifive_ccache_pmu *ccache_pmu =
> + hlist_entry_safe(node, struct sifive_ccache_pmu, node);
> +
> + if (ccache_pmu->cpu >= nr_cpu_ids)
> + ccache_pmu->cpu = cpu;
If you don't have any NUMA or other affinity-related shenanigans to
worry about, then it's really not worth bothering with an online
callback; just assign a CPU directly at the same point as all the other
initialisation. Plus it's then also that much clearer to reason about it
always being valid and not needing spurious nr_cpu_ids checks.
> +
> + return 0;
> +}
> +
> +static int sifive_ccache_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
> +{
> + struct sifive_ccache_pmu *ccache_pmu =
> + hlist_entry_safe(node, struct sifive_ccache_pmu, node);
> +
> + /* Do nothing if this CPU does not own the events */
> + if (cpu != ccache_pmu->cpu)
> + return 0;
> +
> + /* Pick a random online CPU */
> + ccache_pmu->cpu = cpumask_any_but(cpu_online_mask, cpu);
> + if (ccache_pmu->cpu >= nr_cpu_ids)
> + return 0;
> +
> + /* Migrate PMU events from this CPU to the target CPU */
> + perf_pmu_migrate_context(&ccache_pmu->pmu, cpu, ccache_pmu->cpu);
> +
> + return 0;
> +}
> +
> +static int sifive_ccache_pmu_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct sifive_ccache_pmu *ccache_pmu;
> + u32 n_counters;
> + int ret;
> +
> + /* Instances without a sifive,perfmon-counters property do not contain a PMU */
> + ret = device_property_read_u32(dev, "sifive,perfmon-counters", &n_counters);
> + if (ret || !n_counters)
Even simpler is to just initialise n_counters to 0 in the first place,
if the return value doesn't matter beyond "did it work or not?"
> + return -ENODEV;
> +
> + ccache_pmu = devm_kzalloc(dev, struct_size(ccache_pmu, events, n_counters), GFP_KERNEL);
Maybe worth a sanity check that n_counters isn't unrealistically large
either?
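Combined with the n_counters = 0 suggestion above, the property handling could look something like this model. model_read_u32() is a hypothetical stand-in for device_property_read_u32() that, like it, only writes the output on success, and MODEL_MAX_COUNTERS is an arbitrary placeholder since I don't know what the hardware's real upper limit is.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_COUNTERS 64	/* placeholder bound, not a hardware fact */

/* Stand-in property read: a NULL prop models a missing property. */
static int model_read_u32(const uint32_t *prop, uint32_t *val)
{
	if (!prop)
		return -EINVAL;

	*val = *prop;
	return 0;
}

static int model_get_n_counters(const uint32_t *prop, uint32_t *n_out)
{
	uint32_t n_counters = 0;

	/* On failure n_counters simply stays 0, so the result is not needed. */
	model_read_u32(prop, &n_counters);
	if (!n_counters || n_counters > MODEL_MAX_COUNTERS)
		return -ENODEV;

	*n_out = n_counters;
	return 0;
}
```

Whether an absurdly large value should be -ENODEV or -EINVAL is debatable; the point is only to reject it before it reaches struct_size().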
> + if (!ccache_pmu)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, ccache_pmu);
> +
> + ccache_pmu->pmu = (struct pmu) {
> + .parent = dev,
> + .attr_groups = sifive_ccache_pmu_attr_grps,
> + .capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
> + .task_ctx_nr = perf_invalid_context,
> + .event_init = sifive_ccache_pmu_event_init,
> + .add = sifive_ccache_pmu_add,
> + .del = sifive_ccache_pmu_del,
> + .start = sifive_ccache_pmu_start,
> + .stop = sifive_ccache_pmu_stop,
> + .read = sifive_ccache_pmu_read,
> + };
> + ccache_pmu->cpu = nr_cpu_ids;
> + ccache_pmu->n_counters = n_counters;
> +
> + ccache_pmu->base = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(ccache_pmu->base))
> + return PTR_ERR(ccache_pmu->base);
> +
> + sifive_ccache_pmu_hw_init(ccache_pmu);
> +
> + ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
> + if (ret)
> + return dev_err_probe(dev, ret, "Failed to add CPU hotplug instance\n");
> +
> + ret = perf_pmu_register(&ccache_pmu->pmu, "sifive_ccache_pmu", -1);
> + if (ret) {
> + dev_err_probe(dev, ret, "Failed to register PMU\n");
> + goto err_remove_instance;
> + }
> +
> + return 0;
> +
> +err_remove_instance:
> + cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
You need the _nocalls variant here, since attempting to migrate the
context of an unregistered and uninitialised PMU tends to end badly.
> +
> + return ret;
> +}
> +
> +static void sifive_ccache_pmu_remove(struct platform_device *pdev)
> +{
> + struct sifive_ccache_pmu *ccache_pmu = platform_get_drvdata(pdev);
> +
> + perf_pmu_unregister(&ccache_pmu->pmu);
> + cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE, &ccache_pmu->node);
Similarly here.
From a quick skim I think a lot of these comments will also apply to
the other drivers in this series, so I'll stop here for now, and please
consider them in general.
Thanks,
Robin.
> +}
> +
> +static const struct of_device_id sifive_ccache_pmu_of_match[] = {
> + { .compatible = "sifive,ccache0" },
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, sifive_ccache_pmu_of_match);
> +
> +static struct platform_driver sifive_ccache_pmu_driver = {
> + .probe = sifive_ccache_pmu_probe,
> + .remove_new = sifive_ccache_pmu_remove,
> + .driver = {
> + .name = "sifive_ccache_pmu",
> + .of_match_table = sifive_ccache_pmu_of_match,
> + },
> +};
> +
> +static void __exit sifive_ccache_pmu_exit(void)
> +{
> + platform_driver_unregister(&sifive_ccache_pmu_driver);
> + cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE);
> +}
> +module_exit(sifive_ccache_pmu_exit);
> +
> +static int __init sifive_ccache_pmu_init(void)
> +{
> + int ret;
> +
> + ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
> + "perf/sifive/ccache:online",
> + sifive_ccache_pmu_online_cpu,
> + sifive_ccache_pmu_offline_cpu);
> + if (ret)
> + return ret;
> +
> + ret = platform_driver_register(&sifive_ccache_pmu_driver);
> + if (ret)
> + goto err_remove_state;
> +
> + return 0;
> +
> +err_remove_state:
> + cpuhp_remove_multi_state(CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE);
> +
> + return ret;
> +}
> +module_init(sifive_ccache_pmu_init);
> +
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 172d0a743e5d..be6361fdc8ba 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -230,6 +230,7 @@ enum cpuhp_state {
> CPUHP_AP_PERF_POWERPC_TRACE_IMC_ONLINE,
> CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
> CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE,
> + CPUHP_AP_PERF_RISCV_SIFIVE_CCACHE_ONLINE,
> CPUHP_AP_PERF_CSKY_ONLINE,
> CPUHP_AP_WATCHDOG_ONLINE,
> CPUHP_AP_WORKQUEUE_ONLINE,