LinuxLists.cc - [PATCH v2 0/8] drivers/perf: CPU PMU driver for Apple M1

2021-12-01 13:50:02

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 0/8] drivers/perf: CPU PMU driver for Apple M1

The M1 SoC embeds a per-CPU PMU that has a very different programming
interface compared to the architected PMUv3 that is normally present
on standard implementations.

This small series adds a driver for this HW by leveraging the arm_pmu
infrastructure, resulting in a rather simple driver.

Of course, we know next to nothing about the actual events this PMU
counts, aside from CPU cycles and instructions. Everything else is
undocumented (though as Dougall pointed out, someone could extract the
relevant information from a macOS install if they wanted -- I don't).

My hope is that this driver will help people to explore the event
space and propose possible interpretations for these events using
reproducible test cases.

* From v1 [1]:
- Added a few comments clarifying the event mapping to counters
- Spelling fixes
- Collected Acks from Rob

[1] https://lore.kernel.org/r/[email protected]

Marc Zyngier (8):
dt-bindings: arm-pmu: Document Apple PMU compatible strings
dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts
irqchip/apple-aic: Add cpumasks for E and P cores
irqchip/apple-aic: Wire PMU interrupts
irqchip/apple-aic: Move PMU-specific registers to their own include
file
arm64: apple: t8301: Add PMU nodes
drivers/perf: arm_pmu: Handle 47 bit counters
drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

.../devicetree/bindings/arm/pmu.yaml | 2 +
.../interrupt-controller/apple,aic.yaml | 2 +
arch/arm64/boot/dts/apple/t8103.dtsi | 12 +
arch/arm64/include/asm/apple_m1_pmu.h | 64 ++
drivers/irqchip/irq-apple-aic.c | 59 +-
drivers/perf/Kconfig | 7 +
drivers/perf/Makefile | 1 +
drivers/perf/apple_m1_cpu_pmu.c | 637 ++++++++++++++++++
drivers/perf/arm_pmu.c | 2 +
.../interrupt-controller/apple-aic.h | 2 +
include/linux/perf/arm_pmu.h | 2 +
11 files changed, 768 insertions(+), 22 deletions(-)
create mode 100644 arch/arm64/include/asm/apple_m1_pmu.h
create mode 100644 drivers/perf/apple_m1_cpu_pmu.c

--
2.30.2

2021-12-01 13:50:13

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 2/8] dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts

Advertise the two pseudo-interrupts that tied to the two PMU
flavours present in the Apple M1 SoC.

We choose the expose two different pseudo-interrupts to the OS
as the e-core PMU is obviously different from the p-core one,
effectively presenting two different devices.

Acked-by: Rob Herring <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
.../devicetree/bindings/interrupt-controller/apple,aic.yaml | 2 ++
include/dt-bindings/interrupt-controller/apple-aic.h | 2 ++
2 files changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
index cf6c091a07b1..b95e41816953 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
+++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
@@ -56,6 +56,8 @@ properties:
- 1: virtual HV timer
- 2: physical guest timer
- 3: virtual guest timer
+ - 4: 'efficient' CPU PMU
+ - 5: 'performance' CPU PMU

The 3rd cell contains the interrupt flags. This is normally
IRQ_TYPE_LEVEL_HIGH (4).
diff --git a/include/dt-bindings/interrupt-controller/apple-aic.h b/include/dt-bindings/interrupt-controller/apple-aic.h
index 604f2bb30ac0..bf3aac0e5491 100644
--- a/include/dt-bindings/interrupt-controller/apple-aic.h
+++ b/include/dt-bindings/interrupt-controller/apple-aic.h
@@ -11,5 +11,7 @@
#define AIC_TMR_HV_VIRT 1
#define AIC_TMR_GUEST_PHYS 2
#define AIC_TMR_GUEST_VIRT 3
+#define AIC_CPU_PMU_E 4
+#define AIC_CPU_PMU_P 5

#endif
--
2.30.2

2021-12-01 13:50:18

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 6/8] arm64: apple: t8301: Add PMU nodes

Advertise the two PMU nodes for the t8103 SoC.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/boot/dts/apple/t8103.dtsi | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/apple/t8103.dtsi b/arch/arm64/boot/dts/apple/t8103.dtsi
index fc8b2bb06ffe..d2e9afde3729 100644
--- a/arch/arm64/boot/dts/apple/t8103.dtsi
+++ b/arch/arm64/boot/dts/apple/t8103.dtsi
@@ -96,6 +96,18 @@ timer {
<AIC_FIQ AIC_TMR_HV_VIRT IRQ_TYPE_LEVEL_HIGH>;
};

+ pmu-e {
+ compatible = "apple,icestorm-pmu";
+ interrupt-parent = <&aic>;
+ interrupts = <AIC_FIQ AIC_CPU_PMU_E IRQ_TYPE_LEVEL_HIGH>;
+ };
+
+ pmu-p {
+ compatible = "apple,firestorm-pmu";
+ interrupt-parent = <&aic>;
+ interrupts = <AIC_FIQ AIC_CPU_PMU_P IRQ_TYPE_LEVEL_HIGH>;
+ };
+
clk24: clock-24m {
compatible = "fixed-clock";
#clock-cells = <0>;
--
2.30.2

2021-12-01 13:50:22

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

Add a new, weird and wonderful driver for the equally weird Apple
PMU HW. Although the PMU itself is functional, we don't know much
about the events yet, so this can be considered as yet another
random number generator...

Nonetheless, it can reliably count at least cycles and instructions
in the usually wonky big-little way. For anything else, it of course
supports raw event numbers.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/apple_m1_pmu.h | 45 ++
drivers/perf/Kconfig | 7 +
drivers/perf/Makefile | 1 +
drivers/perf/apple_m1_cpu_pmu.c | 637 ++++++++++++++++++++++++++
4 files changed, 690 insertions(+)
create mode 100644 drivers/perf/apple_m1_cpu_pmu.c

diff --git a/arch/arm64/include/asm/apple_m1_pmu.h b/arch/arm64/include/asm/apple_m1_pmu.h
index b848af7faadc..99483b19b99f 100644
--- a/arch/arm64/include/asm/apple_m1_pmu.h
+++ b/arch/arm64/include/asm/apple_m1_pmu.h
@@ -6,8 +6,21 @@
#include <linux/bits.h>
#include <asm/sysreg.h>

+/* Counters */
+#define SYS_IMP_APL_PMC0_EL1 sys_reg(3, 2, 15, 0, 0)
+#define SYS_IMP_APL_PMC1_EL1 sys_reg(3, 2, 15, 1, 0)
+#define SYS_IMP_APL_PMC2_EL1 sys_reg(3, 2, 15, 2, 0)
+#define SYS_IMP_APL_PMC3_EL1 sys_reg(3, 2, 15, 3, 0)
+#define SYS_IMP_APL_PMC4_EL1 sys_reg(3, 2, 15, 4, 0)
+#define SYS_IMP_APL_PMC5_EL1 sys_reg(3, 2, 15, 5, 0)
+#define SYS_IMP_APL_PMC6_EL1 sys_reg(3, 2, 15, 6, 0)
+#define SYS_IMP_APL_PMC7_EL1 sys_reg(3, 2, 15, 7, 0)
+#define SYS_IMP_APL_PMC8_EL1 sys_reg(3, 2, 15, 9, 0)
+#define SYS_IMP_APL_PMC9_EL1 sys_reg(3, 2, 15, 10, 0)
+
/* Core PMC control register */
#define SYS_IMP_APL_PMCR0_EL1 sys_reg(3, 1, 15, 0, 0)
+#define PMCR0_CNT_ENABLE_0_7 GENMASK(7, 0)
#define PMCR0_IMODE GENMASK(10, 8)
#define PMCR0_IMODE_OFF 0
#define PMCR0_IMODE_PMI 1
@@ -15,5 +28,37 @@
#define PMCR0_IMODE_HALT 3
#define PMCR0_IMODE_FIQ 4
#define PMCR0_IACT BIT(11)
+#define PMCR0_PMI_ENABLE_0_7 GENMASK(19, 12)
+#define PMCR0_STOP_CNT_ON_PMI BIT(20)
+#define PMCR0_CNT_GLOB_L2C_EVT BIT(21)
+#define PMCR0_DEFER_PMI_TO_ERET BIT(22)
+#define PMCR0_ALLOW_CNT_EN_EL0 BIT(30)
+#define PMCR0_CNT_ENABLE_8_9 GENMASK(33, 32)
+#define PMCR0_PMI_ENABLE_8_9 GENMASK(45, 44)
+
+#define SYS_IMP_APL_PMCR1_EL1 sys_reg(3, 1, 15, 1, 0)
+#define PMCR1_COUNT_A64_EL0_0_7 GENMASK(15, 8)
+#define PMCR1_COUNT_A64_EL1_0_7 GENMASK(23, 16)
+#define PMCR1_COUNT_A64_EL0_8_9 GENMASK(41, 40)
+#define PMCR1_COUNT_A64_EL1_8_9 GENMASK(49, 48)
+
+#define SYS_IMP_APL_PMCR2_EL1 sys_reg(3, 1, 15, 2, 0)
+#define SYS_IMP_APL_PMCR3_EL1 sys_reg(3, 1, 15, 3, 0)
+#define SYS_IMP_APL_PMCR4_EL1 sys_reg(3, 1, 15, 4, 0)
+
+#define SYS_IMP_APL_PMESR0_EL1 sys_reg(3, 1, 15, 5, 0)
+#define PMESR0_EVT_CNT_2 GENMASK(7, 0)
+#define PMESR0_EVT_CNT_3 GENMASK(15, 8)
+#define PMESR0_EVT_CNT_4 GENMASK(23, 16)
+#define PMESR0_EVT_CNT_5 GENMASK(31, 24)
+
+#define SYS_IMP_APL_PMESR1_EL1 sys_reg(3, 1, 15, 6, 0)
+#define PMESR1_EVT_CNT_6 GENMASK(7, 0)
+#define PMESR1_EVT_CNT_7 GENMASK(15, 8)
+#define PMESR1_EVT_CNT_8 GENMASK(23, 16)
+#define PMESR1_EVT_CNT_9 GENMASK(31, 24)
+
+#define SYS_IMP_APL_PMSR_EL1 sys_reg(3, 1, 15, 13, 0)
+#define PMSR_OVERFLOW GENMASK(9, 0)

#endif /* __ASM_APPLE_M1_PMU_h */
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4374af292e6d..a6af7bcb82ef 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -139,6 +139,13 @@ config ARM_DMC620_PMU
Support for PMU events monitoring on the ARM DMC-620 memory
controller.

+config APPLE_M1_CPU_PMU
+ bool "Apple M1 CPU PMU support"
+ depends on ARM_PMU && ARCH_APPLE
+ help
+ Provides support for the non-architectural CPU PMUs present on
+ the Apple M1 SoCs and derivatives.
+
source "drivers/perf/hisilicon/Kconfig"

endmenu
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 5260b116c7da..1c8cffc8c326 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -14,3 +14,4 @@ obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
+obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
new file mode 100644
index 000000000000..560efa9731e7
--- /dev/null
+++ b/drivers/perf/apple_m1_cpu_pmu.c
@@ -0,0 +1,637 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * CPU PMU driver for the Apple M1 and derivatives
+ *
+ * Copyright (C) 2021 Google LLC
+ *
+ * Author: Marc Zyngier <[email protected]>
+ *
+ * Most of the information used in this driver was provided by the
+ * Asahi Linux project. The rest was experimentally discovered.
+ */
+
+#include <linux/of.h>
+#include <linux/perf/arm_pmu.h>
+#include <linux/platform_device.h>
+
+#include <asm/apple_m1_pmu.h>
+#include <asm/irq_regs.h>
+#include <asm/perf_event.h>
+
+#define M1_PMU_NR_COUNTERS 10
+
+#define M1_PMU_CFG_EVENT GENMASK(7, 0)
+
+#define ANY_BUT_0_1 GENMASK(9, 2)
+#define ONLY_2_TO_7 GENMASK(7, 2)
+#define ONLY_2_4_6 (BIT(2) | BIT(4) | BIT(6))
+#define ONLY_5_6_7 GENMASK(7, 5)
+
+/*
+ * Description of the events we actually know about, as well as those with
+ * a specific counter affinity. Yes, this is a grand total of two known
+ * counters, and the rest is anybody's guess.
+ *
+ * Not all counters can count all events. Counters #0 and #1 are wired to
+ * count cycles and instructions respectively, and some events have
+ * bizarre mappings (every other counter, or even *one* counter). These
+ * restrictions equally apply to both P and E cores.
+ *
+ * It is worth noting that the PMUs attached to P and E cores are likely
+ * to be different because the underlying uarches are different. At the
+ * moment, we don't really need to distinguish between the two because we
+ * know next to nothing about the events themselves, and we already have
+ * per cpu-type PMU abstractions.
+ *
+ * If we eventually find out that the events are different across
+ * implementations, we'll have to introduce per cpu-type tables.
+ */
+enum m1_pmu_events {
+ M1_PMU_PERFCTR_UNKNOWN_01 = 0x01,
+ M1_PMU_PERFCTR_CPU_CYCLES = 0x02,
+ M1_PMU_PERFCTR_INSTRUCTIONS = 0x8c,
+ M1_PMU_PERFCTR_UNKNOWN_8d = 0x8d,
+ M1_PMU_PERFCTR_UNKNOWN_8e = 0x8e,
+ M1_PMU_PERFCTR_UNKNOWN_8f = 0x8f,
+ M1_PMU_PERFCTR_UNKNOWN_90 = 0x90,
+ M1_PMU_PERFCTR_UNKNOWN_93 = 0x93,
+ M1_PMU_PERFCTR_UNKNOWN_94 = 0x94,
+ M1_PMU_PERFCTR_UNKNOWN_95 = 0x95,
+ M1_PMU_PERFCTR_UNKNOWN_96 = 0x96,
+ M1_PMU_PERFCTR_UNKNOWN_97 = 0x97,
+ M1_PMU_PERFCTR_UNKNOWN_98 = 0x98,
+ M1_PMU_PERFCTR_UNKNOWN_99 = 0x99,
+ M1_PMU_PERFCTR_UNKNOWN_9a = 0x9a,
+ M1_PMU_PERFCTR_UNKNOWN_9b = 0x9b,
+ M1_PMU_PERFCTR_UNKNOWN_9c = 0x9c,
+ M1_PMU_PERFCTR_UNKNOWN_9f = 0x9f,
+ M1_PMU_PERFCTR_UNKNOWN_bf = 0xbf,
+ M1_PMU_PERFCTR_UNKNOWN_c0 = 0xc0,
+ M1_PMU_PERFCTR_UNKNOWN_c1 = 0xc1,
+ M1_PMU_PERFCTR_UNKNOWN_c4 = 0xc4,
+ M1_PMU_PERFCTR_UNKNOWN_c5 = 0xc5,
+ M1_PMU_PERFCTR_UNKNOWN_c6 = 0xc6,
+ M1_PMU_PERFCTR_UNKNOWN_c8 = 0xc8,
+ M1_PMU_PERFCTR_UNKNOWN_ca = 0xca,
+ M1_PMU_PERFCTR_UNKNOWN_cb = 0xcb,
+ M1_PMU_PERFCTR_UNKNOWN_f5 = 0xf5,
+ M1_PMU_PERFCTR_UNKNOWN_f6 = 0xf6,
+ M1_PMU_PERFCTR_UNKNOWN_f7 = 0xf7,
+ M1_PMU_PERFCTR_UNKNOWN_f8 = 0xf8,
+ M1_PMU_PERFCTR_UNKNOWN_fd = 0xfd,
+ M1_PMU_PERFCTR_LAST = M1_PMU_CFG_EVENT,
+
+ /*
+ * From this point onwards, these are not actual HW events,
+ * but attributes that get stored in hw->config_base.
+ */
+ M1_PMU_CFG_COUNT_USER = BIT(8),
+ M1_PMU_CFG_COUNT_KERNEL = BIT(9),
+};
+
+/*
+ * Per-event affinity table. Most events can be installed on counter
+ * 2-9, but there are a numbre of exceptions. Note that this table
+ * has been created experimentally, and I wouldn't be surprised if more
+ * counters had strange affinities.
+ */
+static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
+ [0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
+ [M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
+ [M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
+ [M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
+ [M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
+ [M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
+ [M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
+ [M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
+ [M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
+ [M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
+ [M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
+ [M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
+ [M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
+};
+
+static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
+ PERF_MAP_ALL_UNSUPPORTED,
+ [PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CPU_CYCLES,
+ [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
+ /* No idea about the rest yet */
+};
+
+/* sysfs definitions */
+static ssize_t m1_pmu_events_sysfs_show(struct device *dev,
+ struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_attr *pmu_attr;
+
+ pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
+
+ return sprintf(page, "event=0x%04llx\n", pmu_attr->id);
+}
+
+#define M1_PMU_EVENT_ATTR(name, config) \
+ PMU_EVENT_ATTR_ID(name, m1_pmu_events_sysfs_show, config)
+
+static struct attribute *m1_pmu_event_attrs[] = {
+ M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
+ M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
+ NULL,
+};
+
+static const struct attribute_group m1_pmu_events_attr_group = {
+ .name = "events",
+ .attrs = m1_pmu_event_attrs,
+};
+
+PMU_FORMAT_ATTR(event, "config:0-7");
+
+static struct attribute *m1_pmu_format_attrs[] = {
+ &format_attr_event.attr,
+ NULL,
+};
+
+static const struct attribute_group m1_pmu_format_attr_group = {
+ .name = "format",
+ .attrs = m1_pmu_format_attrs,
+};
+
+/* Low level accessors. No synchronisation. */
+#define PMU_READ_COUNTER(_idx) \
+ case _idx: return read_sysreg_s(SYS_IMP_APL_PMC## _idx ##_EL1)
+
+#define PMU_WRITE_COUNTER(_val, _idx) \
+ case _idx: \
+ write_sysreg_s(_val, SYS_IMP_APL_PMC## _idx ##_EL1); \
+ return
+
+static u64 m1_pmu_read_hw_counter(unsigned int index)
+{
+ switch (index) {
+ PMU_READ_COUNTER(0);
+ PMU_READ_COUNTER(1);
+ PMU_READ_COUNTER(2);
+ PMU_READ_COUNTER(3);
+ PMU_READ_COUNTER(4);
+ PMU_READ_COUNTER(5);
+ PMU_READ_COUNTER(6);
+ PMU_READ_COUNTER(7);
+ PMU_READ_COUNTER(8);
+ PMU_READ_COUNTER(9);
+ }
+
+ BUG();
+}
+
+static void m1_pmu_write_hw_counter(u64 val, unsigned int index)
+{
+ switch (index) {
+ PMU_WRITE_COUNTER(val, 0);
+ PMU_WRITE_COUNTER(val, 1);
+ PMU_WRITE_COUNTER(val, 2);
+ PMU_WRITE_COUNTER(val, 3);
+ PMU_WRITE_COUNTER(val, 4);
+ PMU_WRITE_COUNTER(val, 5);
+ PMU_WRITE_COUNTER(val, 6);
+ PMU_WRITE_COUNTER(val, 7);
+ PMU_WRITE_COUNTER(val, 8);
+ PMU_WRITE_COUNTER(val, 9);
+ }
+
+ BUG();
+}
+
+#define get_bit_offset(index, mask) (__ffs(mask) + (index))
+
+static void __m1_pmu_enable_counter(unsigned int index, bool en)
+{
+ u64 val, bit;
+
+ switch (index) {
+ case 0 ... 7:
+ bit = BIT(get_bit_offset(index, PMCR0_CNT_ENABLE_0_7));
+ break;
+ case 8 ... 9:
+ bit = BIT(get_bit_offset(index - 8, PMCR0_CNT_ENABLE_8_9));
+ break;
+ default:
+ BUG();
+ }
+
+ val = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
+
+ if (en)
+ val |= bit;
+ else
+ val &= ~bit;
+
+ write_sysreg_s(val, SYS_IMP_APL_PMCR0_EL1);
+}
+
+static void m1_pmu_enable_counter(unsigned int index)
+{
+ __m1_pmu_enable_counter(index, true);
+}
+
+static void m1_pmu_disable_counter(unsigned int index)
+{
+ __m1_pmu_enable_counter(index, false);
+}
+
+static void __m1_pmu_enable_counter_interrupt(unsigned int index, bool en)
+{
+ u64 val, bit;
+
+ switch (index) {
+ case 0 ... 7:
+ bit = BIT(get_bit_offset(index, PMCR0_PMI_ENABLE_0_7));
+ break;
+ case 8 ... 9:
+ bit = BIT(get_bit_offset(index - 8, PMCR0_PMI_ENABLE_8_9));
+ break;
+ default:
+ BUG();
+ }
+
+ val = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
+
+ if (en)
+ val |= bit;
+ else
+ val &= ~bit;
+
+ write_sysreg_s(val, SYS_IMP_APL_PMCR0_EL1);
+}
+
+static void m1_pmu_enable_counter_interrupt(unsigned int index)
+{
+ __m1_pmu_enable_counter_interrupt(index, true);
+}
+
+static void m1_pmu_disable_counter_interrupt(unsigned int index)
+{
+ __m1_pmu_enable_counter_interrupt(index, false);
+}
+
+static void m1_pmu_configure_counter(unsigned int index, u8 event,
+ bool user, bool kernel)
+{
+ u64 val, user_bit, kernel_bit;
+ int shift;
+
+ switch (index) {
+ case 0 ... 7:
+ user_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL0_0_7));
+ kernel_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL1_0_7));
+ break;
+ case 8 ... 9:
+ user_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL0_8_9));
+ kernel_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL1_8_9));
+ break;
+ default:
+ BUG();
+ }
+
+ val = read_sysreg_s(SYS_IMP_APL_PMCR1_EL1);
+
+ if (user)
+ val |= user_bit;
+ else
+ val &= ~user_bit;
+
+ if (kernel)
+ val |= kernel_bit;
+ else
+ val &= ~kernel_bit;
+
+ write_sysreg_s(val, SYS_IMP_APL_PMCR1_EL1);
+
+ /*
+ * Counters 0 and 1 have fixed events. For anything else,
+ * place the event at the expected location in the relevant
+ * register (PMESR0 holds the event configuration for counters
+ * 2-5, resp. PMESR1 for counters 6-9).
+ */
+ switch (index) {
+ case 0 ... 1:
+ break;
+ case 2 ... 5:
+ shift = (index - 2) * 8;
+ val = read_sysreg_s(SYS_IMP_APL_PMESR0_EL1);
+ val &= ~((u64)0xff << shift);
+ val |= (u64)event << shift;
+ write_sysreg_s(val, SYS_IMP_APL_PMESR0_EL1);
+ break;
+ case 6 ... 9:
+ shift = (index - 6) * 8;
+ val = read_sysreg_s(SYS_IMP_APL_PMESR1_EL1);
+ val &= ~((u64)0xff << shift);
+ val |= (u64)event << shift;
+ write_sysreg_s(val, SYS_IMP_APL_PMESR1_EL1);
+ break;
+ }
+}
+
+/* arm_pmu backend */
+static void m1_pmu_enable_event(struct perf_event *event)
+{
+ struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
+ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+ unsigned long flags;
+ bool user, kernel;
+ u8 evt;
+
+ evt = event->hw.config_base & M1_PMU_CFG_EVENT;
+ user = event->hw.config_base & M1_PMU_CFG_COUNT_USER;
+ kernel = event->hw.config_base & M1_PMU_CFG_COUNT_KERNEL;
+
+ raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
+
+ m1_pmu_disable_counter_interrupt(event->hw.idx);
+ m1_pmu_disable_counter(event->hw.idx);
+ isb();
+
+ m1_pmu_configure_counter(event->hw.idx, evt, user, kernel);
+ m1_pmu_enable_counter(event->hw.idx);
+ m1_pmu_enable_counter_interrupt(event->hw.idx);
+ isb();
+
+ raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
+}
+
+static void __m1_pmu_disable_event(struct perf_event *event)
+{
+ m1_pmu_disable_counter_interrupt(event->hw.idx);
+ m1_pmu_disable_counter(event->hw.idx);
+ isb();
+}
+
+static void m1_pmu_disable_event(struct perf_event *event)
+{
+ struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
+ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
+
+ __m1_pmu_disable_event(event);
+
+ raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
+}
+
+static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
+{
+ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+ irqreturn_t ret = IRQ_HANDLED;
+ struct pt_regs *regs;
+ u64 overflow, state;
+ unsigned long flags;
+ int idx;
+
+ raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
+ state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
+ overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
+ if (!overflow) {
+ ret = IRQ_NONE;
+ goto out;
+ }
+
+ regs = get_irq_regs();
+
+ for (idx = 0; idx < cpu_pmu->num_events; idx++) {
+ struct perf_event *event = cpuc->events[idx];
+ struct perf_sample_data data;
+
+ if (!event)
+ continue;
+
+ armpmu_event_update(event);
+ perf_sample_data_init(&data, 0, event->hw.last_period);
+ if (!armpmu_event_set_period(event))
+ continue;
+
+ if (perf_event_overflow(event, &data, regs))
+ __m1_pmu_disable_event(event);
+ }
+
+out:
+ state &= ~PMCR0_IACT;
+ write_sysreg_s(state, SYS_IMP_APL_PMCR0_EL1);
+ isb();
+
+ raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
+
+ return ret;
+}
+
+static u64 m1_pmu_read_counter(struct perf_event *event)
+{
+ return m1_pmu_read_hw_counter(event->hw.idx);
+}
+
+static void m1_pmu_write_counter(struct perf_event *event, u64 value)
+{
+ m1_pmu_write_hw_counter(value, event->hw.idx);
+ isb();
+}
+
+static int m1_pmu_get_event_idx(struct pmu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ unsigned long evtype = event->hw.config_base & M1_PMU_CFG_EVENT;
+ unsigned long affinity = m1_pmu_event_affinity[evtype];
+ int idx;
+
+ /*
+ * Place the event on the first free counter that can count
+ * this event.
+ *
+ * We could do a better job if we had a view of all the events
+ * counting on the PMU at any given time, and by placing the
+ * most constraining events first.
+ */
+ for_each_set_bit(idx, &affinity, M1_PMU_NR_COUNTERS) {
+ if (!test_and_set_bit(idx, cpuc->used_mask))
+ return idx;
+ }
+
+ return -EAGAIN;
+}
+
+static void m1_pmu_clear_event_idx(struct pmu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ clear_bit(event->hw.idx, cpuc->used_mask);
+}
+
+static void m1_pmu_start(struct arm_pmu *cpu_pmu)
+{
+ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+ unsigned long flags;
+ u64 val;
+
+ raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
+
+ val = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
+ val &= ~PMCR0_IMODE;
+ val |= FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ);
+ val |= PMCR0_STOP_CNT_ON_PMI;
+
+ write_sysreg_s(val, SYS_IMP_APL_PMCR0_EL1);
+ isb();
+
+ raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
+}
+
+static void __m1_pmu_stop(void)
+{
+ u64 val;
+
+ val = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
+ val &= ~PMCR0_IMODE;
+ val |= FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_OFF);
+ write_sysreg_s(val, SYS_IMP_APL_PMCR0_EL1);
+ isb();
+}
+
+static void m1_pmu_stop(struct arm_pmu *cpu_pmu)
+{
+ struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
+
+ __m1_pmu_stop();
+
+ raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
+}
+
+static int m1_pmu_map_event(struct perf_event *event)
+{
+ /*
+ * Although the counters are 48bit wide, bit 47 is what
+ * triggers the overflow interrupt. Advertise the counters
+ * being 47bit wide to mimick the behaviour of the ARM PMU.
+ */
+ event->hw.flags |= ARMPMU_EVT_47BIT;
+ return armpmu_map_event(event, &m1_pmu_perf_map, NULL, M1_PMU_CFG_EVENT);
+}
+
+static void m1_pmu_reset(void *info)
+{
+ int i;
+
+ __m1_pmu_stop();
+
+ for (i = 0; i < M1_PMU_NR_COUNTERS; i++) {
+ m1_pmu_disable_counter(i);
+ m1_pmu_disable_counter_interrupt(i);
+ m1_pmu_write_hw_counter(0, i);
+ }
+
+ isb();
+}
+
+static int m1_pmu_set_event_filter(struct hw_perf_event *event,
+ struct perf_event_attr *attr)
+{
+ unsigned long config_base = 0;
+
+ if (!attr->exclude_kernel)
+ config_base |= M1_PMU_CFG_COUNT_KERNEL;
+ if (!attr->exclude_user)
+ config_base |= M1_PMU_CFG_COUNT_USER;
+
+ event->config_base = config_base;
+
+ return 0;
+}
+
+static int m1_pmu_init(struct arm_pmu *cpu_pmu)
+{
+ cpu_pmu->handle_irq = m1_pmu_handle_irq;
+ cpu_pmu->enable = m1_pmu_enable_event;
+ cpu_pmu->disable = m1_pmu_disable_event;
+ cpu_pmu->read_counter = m1_pmu_read_counter;
+ cpu_pmu->write_counter = m1_pmu_write_counter;
+ cpu_pmu->get_event_idx = m1_pmu_get_event_idx;
+ cpu_pmu->clear_event_idx = m1_pmu_clear_event_idx;
+ cpu_pmu->start = m1_pmu_start;
+ cpu_pmu->stop = m1_pmu_stop;
+ cpu_pmu->map_event = m1_pmu_map_event;
+ cpu_pmu->reset = m1_pmu_reset;
+ cpu_pmu->set_event_filter = m1_pmu_set_event_filter;
+
+ cpu_pmu->num_events = M1_PMU_NR_COUNTERS;
+ cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = &m1_pmu_events_attr_group;
+ cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = &m1_pmu_format_attr_group;
+ return 0;
+}
+
+/* Device driver gunk */
+static int m1_pmu_ice_init(struct arm_pmu *cpu_pmu)
+{
+ cpu_pmu->name = "apple_icestorm_pmu";
+ return m1_pmu_init(cpu_pmu);
+}
+
+static int m1_pmu_fire_init(struct arm_pmu *cpu_pmu)
+{
+ cpu_pmu->name = "apple_firestorm_pmu";
+ return m1_pmu_init(cpu_pmu);
+}
+
+static const struct of_device_id m1_pmu_of_device_ids[] = {
+ { .compatible = "apple,icestorm-pmu", .data = m1_pmu_ice_init, },
+ { .compatible = "apple,firestorm-pmu", .data = m1_pmu_fire_init, },
+ { },
+};
+MODULE_DEVICE_TABLE(of, m1_pmu_of_device_ids);
+
+static int m1_pmu_device_probe(struct platform_device *pdev)
+{
+ int ret;
+
+ ret = arm_pmu_device_probe(pdev, m1_pmu_of_device_ids, NULL);
+ if (!ret) {
+ /*
+ * If probe succeeds, taint the kernel as this is all
+ * undocumented, implementation defined black magic.
+ */
+ add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
+ }
+
+ return ret;
+}
+
+static struct platform_driver m1_pmu_driver = {
+ .driver = {
+ .name = "apple-m1-cpu-pmu",
+ .of_match_table = m1_pmu_of_device_ids,
+ .suppress_bind_attrs = true,
+ },
+ .probe = m1_pmu_device_probe,
+};
+
+module_platform_driver(m1_pmu_driver);
+MODULE_LICENSE("GPL v2");
--
2.30.2

2021-12-01 13:50:25

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 4/8] irqchip/apple-aic: Wire PMU interrupts

Add the necessary code to configure and P and E-core PMU interrupts
with their respective affinities. When such an interrupt fires, map
it onto the right pseudo-interrupt.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-apple-aic.c | 34 +++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 30ca80ccda8b..23f5f10e974e 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -155,7 +155,7 @@
#define SYS_IMP_APL_UPMSR_EL1 sys_reg(3, 7, 15, 6, 4)
#define UPMSR_IACT BIT(0)

-#define AIC_NR_FIQ 4
+#define AIC_NR_FIQ 6
#define AIC_NR_SWIPI 32

/*
@@ -207,6 +207,11 @@ static bool __is_pcore(u64 mpidr)
return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
}

+static bool is_pcore(void)
+{
+ return __is_pcore(read_cpuid_mpidr());
+}
+
/*
* IRQ irqchip
*/
@@ -420,16 +425,10 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
aic_irqc->nr_hw + AIC_TMR_EL02_VIRT);
}

- if ((read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & (PMCR0_IMODE | PMCR0_IACT)) ==
- (FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ) | PMCR0_IACT)) {
- /*
- * Not supported yet, let's figure out how to handle this when
- * we implement these proprietary performance counters. For now,
- * just mask it and move on.
- */
- pr_err_ratelimited("PMC FIQ fired. Masking.\n");
- sysreg_clear_set_s(SYS_IMP_APL_PMCR0_EL1, PMCR0_IMODE | PMCR0_IACT,
- FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_OFF));
+ if (read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & PMCR0_IACT) {
+ int irq = is_pcore() ? AIC_CPU_PMU_P : AIC_CPU_PMU_E;
+ generic_handle_domain_irq(aic_irqc->hw_domain,
+ aic_irqc->nr_hw + irq);
}

if (FIELD_GET(UPMCR0_IMODE, read_sysreg_s(SYS_IMP_APL_UPMCR0_EL1)) == UPMCR0_IMODE_FIQ &&
@@ -469,7 +468,18 @@ static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
handle_fasteoi_irq, NULL, NULL);
irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));
} else {
- irq_set_percpu_devid(irq);
+ switch (hw - ic->nr_hw) {
+ case AIC_CPU_PMU_P:
+ irq_set_percpu_devid_partition(irq, &ic->pcore_mask);
+ break;
+ case AIC_CPU_PMU_E:
+ irq_set_percpu_devid_partition(irq, &ic->ecore_mask);
+ break;
+ default:
+ irq_set_percpu_devid(irq);
+ break;
+ }
+
irq_domain_set_info(id, irq, hw, &fiq_chip, id->host_data,
handle_percpu_devid_irq, NULL, NULL);
}
--
2.30.2

2021-12-01 13:50:28

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

In order to be able to tell the core IRQ code about the affinity
of the PMU interrupt in later patches, compute the cpumasks of the
P and E cores at boot time.

This relies on the affinity scheme used by the vendor, which seems
to work for the couple of SoCs that are out in the wild.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-apple-aic.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 3759dc36cc8f..30ca80ccda8b 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -177,6 +177,8 @@ struct aic_irq_chip {
void __iomem *base;
struct irq_domain *hw_domain;
struct irq_domain *ipi_domain;
+ struct cpumask ecore_mask;
+ struct cpumask pcore_mask;
int nr_hw;
int ipi_hwirq;
};
@@ -200,6 +202,11 @@ static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
writel_relaxed(val, ic->base + reg);
}

+static bool __is_pcore(u64 mpidr)
+{
+ return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
+}
+
/*
* IRQ irqchip
*/
@@ -833,6 +840,13 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
return -ENODEV;
}

+ for_each_possible_cpu(i) {
+ if (__is_pcore(cpu_logical_map(i)))
+ cpumask_set_cpu(i, &irqc->pcore_mask);
+ else
+ cpumask_set_cpu(i, &irqc->ecore_mask);
+ }
+
set_handle_irq(aic_handle_irq);
set_handle_fiq(aic_handle_fiq);

--
2.30.2

2021-12-01 13:50:33

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 1/8] dt-bindings: arm-pmu: Document Apple PMU compatible strings

As we are about to add support fur the Apple PMUs, document the compatible
strings associated with the two micro-architectures present in the Apple M1.

Acked-by: Rob Herring <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
---
Documentation/devicetree/bindings/arm/pmu.yaml | 2 ++
1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/pmu.yaml b/Documentation/devicetree/bindings/arm/pmu.yaml
index e17ac049e890..1a6986b4e552 100644
--- a/Documentation/devicetree/bindings/arm/pmu.yaml
+++ b/Documentation/devicetree/bindings/arm/pmu.yaml
@@ -20,6 +20,8 @@ properties:
items:
- enum:
- apm,potenza-pmu
+ - apple,firestorm-pmu
+ - apple,icestorm-pmu
- arm,armv8-pmuv3 # Only for s/w models
- arm,arm1136-pmu
- arm,arm1176-pmu
--
2.30.2

2021-12-01 13:50:40

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 7/8] drivers/perf: arm_pmu: Handle 47 bit counters

The current ARM PMU framework can only deal with 32 or 64bit counters.
Teach it about a 47bit flavour.

Yes, this is odd.

Signed-off-by: Marc Zyngier <[email protected]>
---
drivers/perf/arm_pmu.c | 2 ++
include/linux/perf/arm_pmu.h | 2 ++
2 files changed, 4 insertions(+)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..0a9ed1a061ac 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -109,6 +109,8 @@ static inline u64 arm_pmu_event_max_period(struct perf_event *event)
{
if (event->hw.flags & ARMPMU_EVT_64BIT)
return GENMASK_ULL(63, 0);
+ else if (event->hw.flags & ARMPMU_EVT_47BIT)
+ return GENMASK_ULL(46, 0);
else
return GENMASK_ULL(31, 0);
}
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2512e2f9cd4e..0407a38b470a 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -26,6 +26,8 @@
*/
/* Event uses a 64bit counter */
#define ARMPMU_EVT_64BIT 1
+/* Event uses a 47bit counter */
+#define ARMPMU_EVT_47BIT 2

#define HW_OP_UNSUPPORTED 0xFFFF
#define C(_x) PERF_COUNT_HW_CACHE_##_x
--
2.30.2

2021-12-01 13:50:47

by Marc Zyngier

[permalink] [raw]

Subject: [PATCH v2 5/8] irqchip/apple-aic: Move PMU-specific registers to their own include file

As we are about to have a PMU driver, move the PMU bits from the AIC
driver into a common include file.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/apple_m1_pmu.h | 19 +++++++++++++++++++
drivers/irqchip/irq-apple-aic.c | 11 +----------
2 files changed, 20 insertions(+), 10 deletions(-)
create mode 100644 arch/arm64/include/asm/apple_m1_pmu.h

diff --git a/arch/arm64/include/asm/apple_m1_pmu.h b/arch/arm64/include/asm/apple_m1_pmu.h
new file mode 100644
index 000000000000..b848af7faadc
--- /dev/null
+++ b/arch/arm64/include/asm/apple_m1_pmu.h
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#ifndef __ASM_APPLE_M1_PMU_h
+#define __ASM_APPLE_M1_PMU_h
+
+#include <linux/bits.h>
+#include <asm/sysreg.h>
+
+/* Core PMC control register */
+#define SYS_IMP_APL_PMCR0_EL1 sys_reg(3, 1, 15, 0, 0)
+#define PMCR0_IMODE GENMASK(10, 8)
+#define PMCR0_IMODE_OFF 0
+#define PMCR0_IMODE_PMI 1
+#define PMCR0_IMODE_AIC 2
+#define PMCR0_IMODE_HALT 3
+#define PMCR0_IMODE_FIQ 4
+#define PMCR0_IACT BIT(11)
+
+#endif /* __ASM_APPLE_M1_PMU_h */
diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
index 23f5f10e974e..9663166fd97f 100644
--- a/drivers/irqchip/irq-apple-aic.c
+++ b/drivers/irqchip/irq-apple-aic.c
@@ -55,6 +55,7 @@
#include <linux/limits.h>
#include <linux/of_address.h>
#include <linux/slab.h>
+#include <asm/apple_m1_pmu.h>
#include <asm/exception.h>
#include <asm/sysreg.h>
#include <asm/virt.h>
@@ -109,16 +110,6 @@
* Note: sysreg-based IPIs are not supported yet.
*/

-/* Core PMC control register */
-#define SYS_IMP_APL_PMCR0_EL1 sys_reg(3, 1, 15, 0, 0)
-#define PMCR0_IMODE GENMASK(10, 8)
-#define PMCR0_IMODE_OFF 0
-#define PMCR0_IMODE_PMI 1
-#define PMCR0_IMODE_AIC 2
-#define PMCR0_IMODE_HALT 3
-#define PMCR0_IMODE_FIQ 4
-#define PMCR0_IACT BIT(11)
-
/* IPI request registers */
#define SYS_IMP_APL_IPI_RR_LOCAL_EL1 sys_reg(3, 5, 15, 0, 0)
#define SYS_IMP_APL_IPI_RR_GLOBAL_EL1 sys_reg(3, 5, 15, 0, 1)
--
2.30.2

2021-12-01 16:08:22

by Mark Rutland

[permalink] [raw]

Subject: Re: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

On Wed, Dec 01, 2021 at 01:49:04PM +0000, Marc Zyngier wrote:
> In order to be able to tell the core IRQ code about the affinity
> of the PMU interrupt in later patches, compute the cpumasks of the
> P and E cores at boot time.
>
> This relies on the affinity scheme used by the vendor, which seems
> to work for the couple of SoCs that are out in the wild.

... but may change at any arbitrary point in future?

Can we please put the affinity in the DT, like we do for other SoCs and
devices?

I don't think we should treat this HW specially in this regard; we certaintly
don't want other folk hard-coding system topology in their irqchip driver, and
it should be possible to do something like the ppi-partitions binding, no?

Thanks,
Mark.

> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-apple-aic.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 3759dc36cc8f..30ca80ccda8b 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -177,6 +177,8 @@ struct aic_irq_chip {
> void __iomem *base;
> struct irq_domain *hw_domain;
> struct irq_domain *ipi_domain;
> + struct cpumask ecore_mask;
> + struct cpumask pcore_mask;
> int nr_hw;
> int ipi_hwirq;
> };
> @@ -200,6 +202,11 @@ static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
> writel_relaxed(val, ic->base + reg);
> }
>
> +static bool __is_pcore(u64 mpidr)
> +{
> + return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
> +}
> +
> /*
> * IRQ irqchip
> */
> @@ -833,6 +840,13 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
> return -ENODEV;
> }
>
> + for_each_possible_cpu(i) {
> + if (__is_pcore(cpu_logical_map(i)))
> + cpumask_set_cpu(i, &irqc->pcore_mask);
> + else
> + cpumask_set_cpu(i, &irqc->ecore_mask);
> + }
> +
> set_handle_irq(aic_handle_irq);
> set_handle_fiq(aic_handle_fiq);
>
> --
> 2.30.2
>

2021-12-01 16:58:19

by Mark Rutland

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> Add a new, weird and wonderful driver for the equally weird Apple
> PMU HW. Although the PMU itself is functional, we don't know much
> about the events yet, so this can be considered as yet another
> random number generator...

It's really frustrating that Apple built this rather than the architected PMU,
because we've generally pushed back on IMPLEMENTATION DEFINED junk in this
area, and supporting this makes it harder to push back on other vendors going
the same route, which I'm not keen on. That, and the usual state of IMP-DEF
stuff making this stupidly painful to reason about.

I can see that we can get this working bare-metal with DT, but I really don't
want to try to support this in other cases (e.g. in a VM, potentially with
ACPI), or this IMP-DEFness is going to spread more throughout the arm_pmu code.

How does this interact with PMU emulation for a KVM guest?

I have a bunch of comments and questions below.

[...]

> +#define ANY_BUT_0_1 GENMASK(9, 2)
> +#define ONLY_2_TO_7 GENMASK(7, 2)
> +#define ONLY_2_4_6 (BIT(2) | BIT(4) | BIT(6))
> +#define ONLY_5_6_7 GENMASK(7, 5)

For clarity/consistency it might be better to use separate BIT()s for
ONLY_5_6_7 too.

> +/*
> + * Description of the events we actually know about, as well as those with
> + * a specific counter affinity. Yes, this is a grand total of two known
> + * counters, and the rest is anybody's guess.
> + *
> + * Not all counters can count all events. Counters #0 and #1 are wired to
> + * count cycles and instructions respectively, and some events have
> + * bizarre mappings (every other counter, or even *one* counter). These
> + * restrictions equally apply to both P and E cores.
> + *
> + * It is worth noting that the PMUs attached to P and E cores are likely
> + * to be different because the underlying uarches are different. At the
> + * moment, we don't really need to distinguish between the two because we
> + * know next to nothing about the events themselves, and we already have
> + * per cpu-type PMU abstractions.
> + *
> + * If we eventually find out that the events are different across
> + * implementations, we'll have to introduce per cpu-type tables.
> + */
> +enum m1_pmu_events {
> + M1_PMU_PERFCTR_UNKNOWN_01 = 0x01,
> + M1_PMU_PERFCTR_CPU_CYCLES = 0x02,
> + M1_PMU_PERFCTR_INSTRUCTIONS = 0x8c,
> + M1_PMU_PERFCTR_UNKNOWN_8d = 0x8d,
> + M1_PMU_PERFCTR_UNKNOWN_8e = 0x8e,
> + M1_PMU_PERFCTR_UNKNOWN_8f = 0x8f,
> + M1_PMU_PERFCTR_UNKNOWN_90 = 0x90,
> + M1_PMU_PERFCTR_UNKNOWN_93 = 0x93,
> + M1_PMU_PERFCTR_UNKNOWN_94 = 0x94,
> + M1_PMU_PERFCTR_UNKNOWN_95 = 0x95,
> + M1_PMU_PERFCTR_UNKNOWN_96 = 0x96,
> + M1_PMU_PERFCTR_UNKNOWN_97 = 0x97,
> + M1_PMU_PERFCTR_UNKNOWN_98 = 0x98,
> + M1_PMU_PERFCTR_UNKNOWN_99 = 0x99,
> + M1_PMU_PERFCTR_UNKNOWN_9a = 0x9a,
> + M1_PMU_PERFCTR_UNKNOWN_9b = 0x9b,
> + M1_PMU_PERFCTR_UNKNOWN_9c = 0x9c,
> + M1_PMU_PERFCTR_UNKNOWN_9f = 0x9f,
> + M1_PMU_PERFCTR_UNKNOWN_bf = 0xbf,
> + M1_PMU_PERFCTR_UNKNOWN_c0 = 0xc0,
> + M1_PMU_PERFCTR_UNKNOWN_c1 = 0xc1,
> + M1_PMU_PERFCTR_UNKNOWN_c4 = 0xc4,
> + M1_PMU_PERFCTR_UNKNOWN_c5 = 0xc5,
> + M1_PMU_PERFCTR_UNKNOWN_c6 = 0xc6,
> + M1_PMU_PERFCTR_UNKNOWN_c8 = 0xc8,
> + M1_PMU_PERFCTR_UNKNOWN_ca = 0xca,
> + M1_PMU_PERFCTR_UNKNOWN_cb = 0xcb,
> + M1_PMU_PERFCTR_UNKNOWN_f5 = 0xf5,
> + M1_PMU_PERFCTR_UNKNOWN_f6 = 0xf6,
> + M1_PMU_PERFCTR_UNKNOWN_f7 = 0xf7,
> + M1_PMU_PERFCTR_UNKNOWN_f8 = 0xf8,
> + M1_PMU_PERFCTR_UNKNOWN_fd = 0xfd,
> + M1_PMU_PERFCTR_LAST = M1_PMU_CFG_EVENT,
> +
> + /*
> + * From this point onwards, these are not actual HW events,
> + * but attributes that get stored in hw->config_base.
> + */
> + M1_PMU_CFG_COUNT_USER = BIT(8),
> + M1_PMU_CFG_COUNT_KERNEL = BIT(9),
> +};
> +
> +/*
> + * Per-event affinity table. Most events can be installed on counter
> + * 2-9, but there are a numbre of exceptions. Note that this table
> + * has been created experimentally, and I wouldn't be surprised if more
> + * counters had strange affinities.
> + */
> +static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
> + [0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
> + [M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
> + [M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
> + [M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
> + [M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
> + [M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
> + [M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
> + [M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
> + [M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
> + [M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
> + [M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
> + [M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
> + [M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
> +};

I don't entirely follow what's going on here. Is this a matrix scheme like what
QC had in their IMP-DEF Krait PMUs? See:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/kernel/perf_event_v7.c?h=v5.16-rc3#n1286

I'm a bit worried about this, since is this is of that shape, there are
potential constraints on which counters and/or events you can use concurrently,
and if you violate those they can conflict. If so, we need to be *very* careful
about the abstraction we provide to userspace.

[...]

> +/* Low level accessors. No synchronisation. */
> +#define PMU_READ_COUNTER(_idx) \
> + case _idx: return read_sysreg_s(SYS_IMP_APL_PMC## _idx ##_EL1)
> +
> +#define PMU_WRITE_COUNTER(_val, _idx) \
> + case _idx: \
> + write_sysreg_s(_val, SYS_IMP_APL_PMC## _idx ##_EL1); \
> + return
> +
> +static u64 m1_pmu_read_hw_counter(unsigned int index)
> +{
> + switch (index) {
> + PMU_READ_COUNTER(0);
> + PMU_READ_COUNTER(1);
> + PMU_READ_COUNTER(2);
> + PMU_READ_COUNTER(3);
> + PMU_READ_COUNTER(4);
> + PMU_READ_COUNTER(5);
> + PMU_READ_COUNTER(6);
> + PMU_READ_COUNTER(7);
> + PMU_READ_COUNTER(8);
> + PMU_READ_COUNTER(9);
> + }
> +
> + BUG();
> +}
> +
> +static void m1_pmu_write_hw_counter(u64 val, unsigned int index)
> +{
> + switch (index) {
> + PMU_WRITE_COUNTER(val, 0);
> + PMU_WRITE_COUNTER(val, 1);
> + PMU_WRITE_COUNTER(val, 2);
> + PMU_WRITE_COUNTER(val, 3);
> + PMU_WRITE_COUNTER(val, 4);
> + PMU_WRITE_COUNTER(val, 5);
> + PMU_WRITE_COUNTER(val, 6);
> + PMU_WRITE_COUNTER(val, 7);
> + PMU_WRITE_COUNTER(val, 8);
> + PMU_WRITE_COUNTER(val, 9);
> + }
> +
> + BUG();
> +}

As an aside, since this pattern has cropped up in a few places, maybe we want
to look into scripting the generation of sysreg banki accessors like this.

[...]

> +static void m1_pmu_configure_counter(unsigned int index, u8 event,
> + bool user, bool kernel)
> +{
> + u64 val, user_bit, kernel_bit;
> + int shift;
> +
> + switch (index) {
> + case 0 ... 7:
> + user_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL0_0_7));
> + kernel_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL1_0_7));
> + break;
> + case 8 ... 9:
> + user_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL0_8_9));
> + kernel_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL1_8_9));
> + break;

When this says 'EL1', presuambly that's counting at EL2 in VHE?

Are there separate EL1 / EL2 controls, or anythign of that sort we need to be
aware of?

[...]

> +/* arm_pmu backend */
> +static void m1_pmu_enable_event(struct perf_event *event)
> +{
> + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
> + struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
> + unsigned long flags;
> + bool user, kernel;
> + u8 evt;
> +
> + evt = event->hw.config_base & M1_PMU_CFG_EVENT;
> + user = event->hw.config_base & M1_PMU_CFG_COUNT_USER;
> + kernel = event->hw.config_base & M1_PMU_CFG_COUNT_KERNEL;
> +
> + raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);

You shouldn't need this locking. The perf core always calls into the HW access
functions with IRQs disabled and we don't do cross-cpu state modification.
Likewise elsewhere in this file.

We pulled similar out of the architectural PMU driver in commit:

2a0e2a02e4b71917 ("arm64: perf: Remove PMU locking")

... though that says non-preemptible when it should say non-interruptible.

I should clean up the 32-bit drivers and remove pmu_hw_events::pmu_lock
entirely.

> +
> + m1_pmu_disable_counter_interrupt(event->hw.idx);
> + m1_pmu_disable_counter(event->hw.idx);
> + isb();
> +
> + m1_pmu_configure_counter(event->hw.idx, evt, user, kernel);
> + m1_pmu_enable_counter(event->hw.idx);
> + m1_pmu_enable_counter_interrupt(event->hw.idx);
> + isb();
> +
> + raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
> +}
> +
> +static void __m1_pmu_disable_event(struct perf_event *event)
> +{
> + m1_pmu_disable_counter_interrupt(event->hw.idx);
> + m1_pmu_disable_counter(event->hw.idx);
> + isb();
> +}
> +
> +static void m1_pmu_disable_event(struct perf_event *event)
> +{
> + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
> + struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
> +
> + __m1_pmu_disable_event(event);
> +
> + raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
> +}

As with m1_pmu_enable_event(), I don't believe the locking is necessary here.

> +static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
> +{
> + struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
> + irqreturn_t ret = IRQ_HANDLED;
> + struct pt_regs *regs;
> + u64 overflow, state;
> + unsigned long flags;
> + int idx;
> +
> + raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);

Likewise, no need for a lock here, and certainly not an irqsave!

Same comment for subsequent usage in this file.

> + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);

I assume the overflow behaviour is free-running rather than stopping?

> + if (!overflow) {
> + ret = IRQ_NONE;
> + goto out;
> + }
> +
> + regs = get_irq_regs();
> +
> + for (idx = 0; idx < cpu_pmu->num_events; idx++) {
> + struct perf_event *event = cpuc->events[idx];
> + struct perf_sample_data data;
> +
> + if (!event)
> + continue;
> +
> + armpmu_event_update(event);
> + perf_sample_data_init(&data, 0, event->hw.last_period);
> + if (!armpmu_event_set_period(event))
> + continue;
> +
> + if (perf_event_overflow(event, &data, regs))
> + __m1_pmu_disable_event(event);
> + }
> +
> +out:
> + state &= ~PMCR0_IACT;
> + write_sysreg_s(state, SYS_IMP_APL_PMCR0_EL1);
> + isb();
> +
> + raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
> +
> + return ret;
> +}

[...]

> +static int m1_pmu_device_probe(struct platform_device *pdev)
> +{
> + int ret;
> +
> + ret = arm_pmu_device_probe(pdev, m1_pmu_of_device_ids, NULL);
> + if (!ret) {
> + /*
> + * If probe succeeds, taint the kernel as this is all
> + * undocumented, implementation defined black magic.
> + */
> + add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> + }
> +
> + return ret;
> +}

Hmmm... that means we're always going to TAINT on this HW with an appropriate
DT, which could mask other reasons TAINT_CPU_OUT_OF_SPEC would be set, even
where the user isn't using the PMU.

Maybe we should have a cmdline option to opt-in to using the IMP-DEF PMU (and
only tainting in that case)?

Thanks,
Mark.

2021-12-01 17:57:03

by Alyssa Rosenzweig

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

> > Add a new, weird and wonderful driver for the equally weird Apple
> > PMU HW. Although the PMU itself is functional, we don't know much
> > about the events yet, so this can be considered as yet another
> > random number generator...
>
> It's really frustrating that Apple built this rather than the architected PMU,
> because we've generally pushed back on IMPLEMENTATION DEFINED junk in this
> area, and supporting this makes it harder to push back on other vendors going
> the same route, which I'm not keen on. That, and the usual state of IMP-DEF
> stuff making this stupidly painful to reason about.

Rules can be a bit stricter for vendors than for ragtag
reverse-engineers. The kernel community can push back on vendor's
choices because vendors have the power to choose otherwise.
But reverse engineers' hands are sometimes forced by bad vendor
decisions; rejecting the driver means mainline can never support the
hardware. I believe there's precedent for distinguishing these cases,
at least in the graphics subsystem.

I don't know if this applies to this driver. I only wish to offer a
rebuttal to a future vendor trying to mainline something questionable
with the defence "Asahi Linux / Nouveau / ... did it, so we can too".

(This will be relevant to the Apple M1 display controller driver, which
would be a hard NAK if submitted by Apple...)

2021-12-02 15:39:58

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Wed, 01 Dec 2021 16:58:10 +0000,
Mark Rutland <[email protected]> wrote:
>
> On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> > Add a new, weird and wonderful driver for the equally weird Apple
> > PMU HW. Although the PMU itself is functional, we don't know much
> > about the events yet, so this can be considered as yet another
> > random number generator...
>
> It's really frustrating that Apple built this rather than the
> architected PMU, because we've generally pushed back on
> IMPLEMENTATION DEFINED junk in this area, and supporting this makes
> it harder to push back on other vendors going the same route, which
> I'm not keen on. That, and the usual state of IMP-DEF stuff making
> this stupidly painful to reason about.

As much as I agree with you on the stinking aspect of an IMPDEF PMU,
this doesn't contradicts the architecture. To avoid the spread of this
madness, forbidding an IMPDEF implementation in the architecture would
be the right thing to do.

>
> I can see that we can get this working bare-metal with DT, but I
> really don't want to try to support this in other cases (e.g. in a
> VM, potentially with ACPI), or this IMP-DEFness is going to spread
> more throughout the arm_pmu code.

Well, an alternative would be to sidestep the arm_pmu framework
altogether. Which would probably suck even more.

> How does this interact with PMU emulation for a KVM guest?

It doesn't. No non-architected PMU will get exposed to a KVM guest,
and the usual "inject an UNDEF exception on IMPDEF access" applies. As
far as I am concerned, KVM is purely architectural and doesn't need to
be encumbered with this.

>
> I have a bunch of comments and questions below.
>
> [...]
>
> > +#define ANY_BUT_0_1 GENMASK(9, 2)
> > +#define ONLY_2_TO_7 GENMASK(7, 2)
> > +#define ONLY_2_4_6 (BIT(2) | BIT(4) | BIT(6))
> > +#define ONLY_5_6_7 GENMASK(7, 5)
>
> For clarity/consistency it might be better to use separate BIT()s for
> ONLY_5_6_7 too.

Sure.

[...]

> > +static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
> > + [0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
> > + [M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
> > + [M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
> > + [M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
> > + [M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
> > + [M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
> > + [M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
> > + [M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
> > + [M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
> > + [M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
> > + [M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
> > + [M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
> > +};
>
> I don't entirely follow what's going on here. Is this a matrix
> scheme like what QC had in their IMP-DEF Krait PMUs? See:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/kernel/perf_event_v7.c?h=v5.16-rc3#n1286

It is nowhere as complicated as that.

> I'm a bit worried about this, since is this is of that shape, there
> are potential constraints on which counters and/or events you can
> use concurrently, and if you violate those they can conflict. If so,
> we need to be *very* careful about the abstraction we provide to
> userspace.

The HW does have placement constraints (this is what this per-event
bitmap is expressing), but the counting seems completely independent
as long as you find an ad-hoc counter to place the event. Which means
that if you try and count (for example) 4 events that would only fit
in {5,6,7}, we'll say NO to the fourth one.

As I say somewhere in a comment, we could do a better job if we had a
global view of the events to be counted, and split them in batches
that the core perf would then schedule.

If you think any of this somehow breaks the userspace ABI, please let
me know (my understand of perf is pretty limited...).

>
> [...]
>
> > +/* Low level accessors. No synchronisation. */
> > +#define PMU_READ_COUNTER(_idx) \
> > + case _idx: return read_sysreg_s(SYS_IMP_APL_PMC## _idx ##_EL1)
> > +
> > +#define PMU_WRITE_COUNTER(_val, _idx) \
> > + case _idx: \
> > + write_sysreg_s(_val, SYS_IMP_APL_PMC## _idx ##_EL1); \
> > + return
> > +
> > +static u64 m1_pmu_read_hw_counter(unsigned int index)
> > +{
> > + switch (index) {
> > + PMU_READ_COUNTER(0);
> > + PMU_READ_COUNTER(1);
> > + PMU_READ_COUNTER(2);
> > + PMU_READ_COUNTER(3);
> > + PMU_READ_COUNTER(4);
> > + PMU_READ_COUNTER(5);
> > + PMU_READ_COUNTER(6);
> > + PMU_READ_COUNTER(7);
> > + PMU_READ_COUNTER(8);
> > + PMU_READ_COUNTER(9);
> > + }
> > +
> > + BUG();
> > +}
> > +
> > +static void m1_pmu_write_hw_counter(u64 val, unsigned int index)
> > +{
> > + switch (index) {
> > + PMU_WRITE_COUNTER(val, 0);
> > + PMU_WRITE_COUNTER(val, 1);
> > + PMU_WRITE_COUNTER(val, 2);
> > + PMU_WRITE_COUNTER(val, 3);
> > + PMU_WRITE_COUNTER(val, 4);
> > + PMU_WRITE_COUNTER(val, 5);
> > + PMU_WRITE_COUNTER(val, 6);
> > + PMU_WRITE_COUNTER(val, 7);
> > + PMU_WRITE_COUNTER(val, 8);
> > + PMU_WRITE_COUNTER(val, 9);
> > + }
> > +
> > + BUG();
> > +}
>
> As an aside, since this pattern has cropped up in a few places, maybe we want
> to look into scripting the generation of sysreg banki accessors like this.
>
> [...]
>
> > +static void m1_pmu_configure_counter(unsigned int index, u8 event,
> > + bool user, bool kernel)
> > +{
> > + u64 val, user_bit, kernel_bit;
> > + int shift;
> > +
> > + switch (index) {
> > + case 0 ... 7:
> > + user_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL0_0_7));
> > + kernel_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL1_0_7));
> > + break;
> > + case 8 ... 9:
> > + user_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL0_8_9));
> > + kernel_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL1_8_9));
> > + break;
>
> When this says 'EL1', presuambly that's counting at EL2 in VHE?

It does.

> Are there separate EL1 / EL2 controls, or anythign of that sort we need to be
> aware of?

No, there is a single, per-counter control for EL0 and EL2. I couldn't
get the counters to report anything useful while a guest was running,
but that doesn't mean such control doesn't exist.

>
> [...]
>
> > +/* arm_pmu backend */
> > +static void m1_pmu_enable_event(struct perf_event *event)
> > +{
> > + struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
> > + struct pmu_hw_events *cpuc = this_cpu_ptr(cpu_pmu->hw_events);
> > + unsigned long flags;
> > + bool user, kernel;
> > + u8 evt;
> > +
> > + evt = event->hw.config_base & M1_PMU_CFG_EVENT;
> > + user = event->hw.config_base & M1_PMU_CFG_COUNT_USER;
> > + kernel = event->hw.config_base & M1_PMU_CFG_COUNT_KERNEL;
> > +
> > + raw_spin_lock_irqsave(&cpuc->pmu_lock, flags);
>
> You shouldn't need this locking. The perf core always calls into the HW access
> functions with IRQs disabled and we don't do cross-cpu state modification.
> Likewise elsewhere in this file.
>
> We pulled similar out of the architectural PMU driver in commit:
>
> 2a0e2a02e4b71917 ("arm64: perf: Remove PMU locking")
>
> ... though that says non-preemptible when it should say non-interruptible.

Ah, nice. I'll get rid of it.

[...]

> > + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> > + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
>
> I assume the overflow behaviour is free-running rather than stopping?

Configurable, apparently. At the moment, I set it to stop on overflow.
Happy to change the behaviour though.

> > + if (!overflow) {
> > + ret = IRQ_NONE;
> > + goto out;
> > + }
> > +
> > + regs = get_irq_regs();
> > +
> > + for (idx = 0; idx < cpu_pmu->num_events; idx++) {
> > + struct perf_event *event = cpuc->events[idx];
> > + struct perf_sample_data data;
> > +
> > + if (!event)
> > + continue;
> > +
> > + armpmu_event_update(event);
> > + perf_sample_data_init(&data, 0, event->hw.last_period);
> > + if (!armpmu_event_set_period(event))
> > + continue;
> > +
> > + if (perf_event_overflow(event, &data, regs))
> > + __m1_pmu_disable_event(event);
> > + }
> > +
> > +out:
> > + state &= ~PMCR0_IACT;
> > + write_sysreg_s(state, SYS_IMP_APL_PMCR0_EL1);
> > + isb();
> > +
> > + raw_spin_unlock_irqrestore(&cpuc->pmu_lock, flags);
> > +
> > + return ret;
> > +}
>
> [...]
>
> > +static int m1_pmu_device_probe(struct platform_device *pdev)
> > +{
> > + int ret;
> > +
> > + ret = arm_pmu_device_probe(pdev, m1_pmu_of_device_ids, NULL);
> > + if (!ret) {
> > + /*
> > + * If probe succeeds, taint the kernel as this is all
> > + * undocumented, implementation defined black magic.
> > + */
> > + add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> > + }
> > +
> > + return ret;
> > +}
>
> Hmmm... that means we're always going to TAINT on this HW with an appropriate
> DT, which could mask other reasons TAINT_CPU_OUT_OF_SPEC would be set, even
> where the user isn't using the PMU.
>
> Maybe we should have a cmdline option to opt-in to using the IMP-DEF PMU (and
> only tainting in that case)?

I'd rather taint on first use. Requiring a command-line argument for
this seems a bit over the top...

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-12-02 16:14:09

by Mark Rutland

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Thu, Dec 02, 2021 at 03:39:46PM +0000, Marc Zyngier wrote:
> On Wed, 01 Dec 2021 16:58:10 +0000,
> Mark Rutland <[email protected]> wrote:
> >
> > On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> > > Add a new, weird and wonderful driver for the equally weird Apple
> > > PMU HW. Although the PMU itself is functional, we don't know much
> > > about the events yet, so this can be considered as yet another
> > > random number generator...
> >
> > It's really frustrating that Apple built this rather than the
> > architected PMU, because we've generally pushed back on
> > IMPLEMENTATION DEFINED junk in this area, and supporting this makes
> > it harder to push back on other vendors going the same route, which
> > I'm not keen on. That, and the usual state of IMP-DEF stuff making
> > this stupidly painful to reason about.
>
> As much as I agree with you on the stinking aspect of an IMPDEF PMU,
> this doesn't contradicts the architecture. To avoid the spread of this
> madness, forbidding an IMPDEF implementation in the architecture would
> be the right thing to do.

Yeah; I'll see what I can do. ;)

> > I can see that we can get this working bare-metal with DT, but I
> > really don't want to try to support this in other cases (e.g. in a
> > VM, potentially with ACPI), or this IMP-DEFness is going to spread
> > more throughout the arm_pmu code.
>
> Well, an alternative would be to sidestep the arm_pmu framework
> altogether. Which would probably suck even more.
>
> > How does this interact with PMU emulation for a KVM guest?
>
> It doesn't. No non-architected PMU will get exposed to a KVM guest,
> and the usual "inject an UNDEF exception on IMPDEF access" applies. As
> far as I am concerned, KVM is purely architectural and doesn't need to
> be encumbered with this.

Cool; I think not exposing this into a VM rules out the other issues I
was concerned with, so as long as we're ruling that out I think we're
agreed (and I see no reason for us to try to force this platform to work
with ACPI on bare-metal).

> > > +static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
> > > + [0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
> > > + [M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
> > > + [M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
> > > + [M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
> > > + [M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
> > > + [M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
> > > + [M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
> > > + [M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
> > > + [M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
> > > + [M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
> > > + [M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
> > > + [M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
> > > +};
> >
> > I don't entirely follow what's going on here. Is this a matrix
> > scheme like what QC had in their IMP-DEF Krait PMUs? See:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/kernel/perf_event_v7.c?h=v5.16-rc3#n1286
>
> It is nowhere as complicated as that.

Good to hear!

> > I'm a bit worried about this, since is this is of that shape, there
> > are potential constraints on which counters and/or events you can
> > use concurrently, and if you violate those they can conflict. If so,
> > we need to be *very* careful about the abstraction we provide to
> > userspace.
>
> The HW does have placement constraints (this is what this per-event
> bitmap is expressing), but the counting seems completely independent
> as long as you find an ad-hoc counter to place the event. Which means
> that if you try and count (for example) 4 events that would only fit
> in {5,6,7}, we'll say NO to the fourth one.
>
> As I say somewhere in a comment, we could do a better job if we had a
> global view of the events to be counted, and split them in batches
> that the core perf would then schedule.

For better or worse I don't think there's a good way to do that due to
the way the core perf event lists are managed. You basically have a
choice between either stopping prematurely or iterating over *all* the
events redundantly (which is potentially very expensive and
deliberately avoided today).

If (as I understand from the above) the constraints are independent then
I don't think there's anything we can do in the PMU driver.

> If you think any of this somehow breaks the userspace ABI, please let
> me know (my understand of perf is pretty limited...).

As long as there are no cross-event or cross-counter constraints, then I
don't think there's a userspace ABI problem here.

[...]

> > > +static void m1_pmu_configure_counter(unsigned int index, u8 event,
> > > + bool user, bool kernel)
> > > +{
> > > + u64 val, user_bit, kernel_bit;
> > > + int shift;
> > > +
> > > + switch (index) {
> > > + case 0 ... 7:
> > > + user_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL0_0_7));
> > > + kernel_bit = BIT(get_bit_offset(index, PMCR1_COUNT_A64_EL1_0_7));
> > > + break;
> > > + case 8 ... 9:
> > > + user_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL0_8_9));
> > > + kernel_bit = BIT(get_bit_offset(index - 8, PMCR1_COUNT_A64_EL1_8_9));
> > > + break;
> >
> > When this says 'EL1', presuambly that's counting at EL2 in VHE?
>
> It does.
>
> > Are there separate EL1 / EL2 controls, or anythign of that sort we need to be
> > aware of?
>
> No, there is a single, per-counter control for EL0 and EL2. I couldn't
> get the counters to report anything useful while a guest was running,
> but that doesn't mean such control doesn't exist.

Ok. We might need to require the exclude_guest flag for now, assuming
the perf tool automatically sets that.

[...]

> > > + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> > > + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
> >
> > I assume the overflow behaviour is free-running rather than stopping?
>
> Configurable, apparently. At the moment, I set it to stop on overflow.
> Happy to change the behaviour though.

The architected PMU continues counting upon overflow (which prevents
losing counts around the overlflow occurring), so I'd prefer that.

Is that behaviour per-counter, or for the PMU as a whole?

[...]

> > > +static int m1_pmu_device_probe(struct platform_device *pdev)
> > > +{
> > > + int ret;
> > > +
> > > + ret = arm_pmu_device_probe(pdev, m1_pmu_of_device_ids, NULL);
> > > + if (!ret) {
> > > + /*
> > > + * If probe succeeds, taint the kernel as this is all
> > > + * undocumented, implementation defined black magic.
> > > + */
> > > + add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> > > + }
> > > +
> > > + return ret;
> > > +}
> >
> > Hmmm... that means we're always going to TAINT on this HW with an appropriate
> > DT, which could mask other reasons TAINT_CPU_OUT_OF_SPEC would be set, even
> > where the user isn't using the PMU.
> >
> > Maybe we should have a cmdline option to opt-in to using the IMP-DEF PMU (and
> > only tainting in that case)?
>
> I'd rather taint on first use. Requiring a command-line argument for
> this seems a bit over the top...

That does sound nicer.

That said, if we've probed the thing, we're going to be poking it to
reset it (including out of idle), even if the user hasn't tried to use
it, so I'm not sure what's best after all...

Thanks,
Mark.

2021-12-03 11:22:59

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Thu, 02 Dec 2021 16:14:01 +0000,
Mark Rutland <[email protected]> wrote:
>
> On Thu, Dec 02, 2021 at 03:39:46PM +0000, Marc Zyngier wrote:
> > On Wed, 01 Dec 2021 16:58:10 +0000,
> > Mark Rutland <[email protected]> wrote:
> > >
> > > On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> > > > Add a new, weird and wonderful driver for the equally weird Apple
> > > > PMU HW. Although the PMU itself is functional, we don't know much
> > > > about the events yet, so this can be considered as yet another
> > > > random number generator...
> > >
> > > It's really frustrating that Apple built this rather than the
> > > architected PMU, because we've generally pushed back on
> > > IMPLEMENTATION DEFINED junk in this area, and supporting this makes
> > > it harder to push back on other vendors going the same route, which
> > > I'm not keen on. That, and the usual state of IMP-DEF stuff making
> > > this stupidly painful to reason about.
> >
> > As much as I agree with you on the stinking aspect of an IMPDEF PMU,
> > this doesn't contradicts the architecture. To avoid the spread of this
> > madness, forbidding an IMPDEF implementation in the architecture would
> > be the right thing to do.
>
> Yeah; I'll see what I can do. ;)
>
> > > I can see that we can get this working bare-metal with DT, but I
> > > really don't want to try to support this in other cases (e.g. in a
> > > VM, potentially with ACPI), or this IMP-DEFness is going to spread
> > > more throughout the arm_pmu code.
> >
> > Well, an alternative would be to sidestep the arm_pmu framework
> > altogether. Which would probably suck even more.
> >
> > > How does this interact with PMU emulation for a KVM guest?
> >
> > It doesn't. No non-architected PMU will get exposed to a KVM guest,
> > and the usual "inject an UNDEF exception on IMPDEF access" applies. As
> > far as I am concerned, KVM is purely architectural and doesn't need to
> > be encumbered with this.
>
> Cool; I think not exposing this into a VM rules out the other issues I
> was concerned with, so as long as we're ruling that out I think we're
> agreed (and I see no reason for us to try to force this platform to work
> with ACPI on bare-metal).

Nah. This is a tortuous enough system.

> > No, there is a single, per-counter control for EL0 and EL2. I couldn't
> > get the counters to report anything useful while a guest was running,
> > but that doesn't mean such control doesn't exist.
>
> Ok. We might need to require the exclude_guest flag for now, assuming
> the perf tool automatically sets that.

OK.

>
> [...]
>
> > > > + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> > > > + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
> > >
> > > I assume the overflow behaviour is free-running rather than stopping?
> >
> > Configurable, apparently. At the moment, I set it to stop on overflow.
> > Happy to change the behaviour though.
>
> The architected PMU continues counting upon overflow (which prevents
> losing counts around the overlflow occurring), so I'd prefer that.
>
> Is that behaviour per-counter, or for the PMU as a whole?

It is global. This will probably require some additional rework to
clear bit 47 in overflowing counters, which we can't do atomically.

>
> [...]
>
> > > > +static int m1_pmu_device_probe(struct platform_device *pdev)
> > > > +{
> > > > + int ret;
> > > > +
> > > > + ret = arm_pmu_device_probe(pdev, m1_pmu_of_device_ids, NULL);
> > > > + if (!ret) {
> > > > + /*
> > > > + * If probe succeeds, taint the kernel as this is all
> > > > + * undocumented, implementation defined black magic.
> > > > + */
> > > > + add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
> > > > + }
> > > > +
> > > > + return ret;
> > > > +}
> > >
> > > Hmmm... that means we're always going to TAINT on this HW with an appropriate
> > > DT, which could mask other reasons TAINT_CPU_OUT_OF_SPEC would be set, even
> > > where the user isn't using the PMU.
> > >
> > > Maybe we should have a cmdline option to opt-in to using the IMP-DEF PMU (and
> > > only tainting in that case)?
> >
> > I'd rather taint on first use. Requiring a command-line argument for
> > this seems a bit over the top...
>
> That does sound nicer.
>
> That said, if we've probed the thing, we're going to be poking it to
> reset it (including out of idle), even if the user hasn't tried to use
> it, so I'm not sure what's best after all...

Yup, there is that. I'll have another look.

M.

--
Without deviation from the norm, progress is not possible.

2021-12-03 12:04:48

by Mark Rutland

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Fri, Dec 03, 2021 at 11:22:53AM +0000, Marc Zyngier wrote:
> On Thu, 02 Dec 2021 16:14:01 +0000, Mark Rutland <[email protected]> wrote:
> > On Thu, Dec 02, 2021 at 03:39:46PM +0000, Marc Zyngier wrote:
> > > On Wed, 01 Dec 2021 16:58:10 +0000, Mark Rutland <[email protected]> wrote:
> > > > On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> > > > > + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> > > > > + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
> > > >
> > > > I assume the overflow behaviour is free-running rather than stopping?
> > >
> > > Configurable, apparently. At the moment, I set it to stop on overflow.
> > > Happy to change the behaviour though.
> >
> > The architected PMU continues counting upon overflow (which prevents
> > losing counts around the overlflow occurring), so I'd prefer that.
> >
> > Is that behaviour per-counter, or for the PMU as a whole?
>
> It is global. This will probably require some additional rework to
> clear bit 47 in overflowing counters, which we can't do atomically.

Ah; I see.

To calrify my comment above, the reason for wanting the counter to keep
counting is to count during the window between the IRQ being asserted and the
PMU IRQ handler being invoked, and it's fine for there to be a blackout period
*within* the PMU IRQ handler.

So for example it would be fine to have:

irq_handler()
{
if (!any_counter_overflowed())
return IRQ_NONE;

stop_all_counters();

for_each_counter(c) {
handle_counter(c);
}

start_all_counters();

return IRQ_HANDLED;

}

... and I think with that the regular per-counter period reprogramming would do
the right thing?

Really, all the PMU drivers should do that so that repgoramming is consistent
and we don't get skewed groups.

Thanks,
Mark.

2021-12-03 16:22:56

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v2 8/8] drivers/perf: Add Apple icestorm/firestorm CPU PMU driver

On Fri, 03 Dec 2021 12:04:35 +0000,
Mark Rutland <[email protected]> wrote:
>
> On Fri, Dec 03, 2021 at 11:22:53AM +0000, Marc Zyngier wrote:
> > On Thu, 02 Dec 2021 16:14:01 +0000, Mark Rutland <[email protected]> wrote:
> > > On Thu, Dec 02, 2021 at 03:39:46PM +0000, Marc Zyngier wrote:
> > > > On Wed, 01 Dec 2021 16:58:10 +0000, Mark Rutland <[email protected]> wrote:
> > > > > On Wed, Dec 01, 2021 at 01:49:09PM +0000, Marc Zyngier wrote:
> > > > > > + state = read_sysreg_s(SYS_IMP_APL_PMCR0_EL1);
> > > > > > + overflow = read_sysreg_s(SYS_IMP_APL_PMSR_EL1);
> > > > >
> > > > > I assume the overflow behaviour is free-running rather than stopping?
> > > >
> > > > Configurable, apparently. At the moment, I set it to stop on overflow.
> > > > Happy to change the behaviour though.
> > >
> > > The architected PMU continues counting upon overflow (which prevents
> > > losing counts around the overlflow occurring), so I'd prefer that.
> > >
> > > Is that behaviour per-counter, or for the PMU as a whole?
> >
> > It is global. This will probably require some additional rework to
> > clear bit 47 in overflowing counters, which we can't do atomically.
>
> Ah; I see.
>
> To calrify my comment above, the reason for wanting the counter to keep
> counting is to count during the window between the IRQ being asserted and the
> PMU IRQ handler being invoked, and it's fine for there to be a blackout period
> *within* the PMU IRQ handler.
>
> So for example it would be fine to have:
>
> irq_handler()
> {
> if (!any_counter_overflowed())
> return IRQ_NONE;
>
> stop_all_counters();
>
> for_each_counter(c) {
> handle_counter(c);
> }
>
> start_all_counters();
>
> return IRQ_HANDLED;
>
> }
>
> ... and I think with that the regular per-counter period
> reprogramming would do the right thing?

Yup. It looks like this works just fine.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-12-03 16:32:15

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

On Wed, 01 Dec 2021 16:08:13 +0000,
Mark Rutland <[email protected]> wrote:
>
> On Wed, Dec 01, 2021 at 01:49:04PM +0000, Marc Zyngier wrote:
> > In order to be able to tell the core IRQ code about the affinity
> > of the PMU interrupt in later patches, compute the cpumasks of the
> > P and E cores at boot time.
> >
> > This relies on the affinity scheme used by the vendor, which seems
> > to work for the couple of SoCs that are out in the wild.
>
> ... but may change at any arbitrary point in future?

Crystal balls are in short supply, sorry! ;-)

> Can we please put the affinity in the DT, like we do for other SoCs and
> devices?
>
> I don't think we should treat this HW specially in this regard; we certaintly
> don't want other folk hard-coding system topology in their irqchip driver, and
> it should be possible to do something like the ppi-partitions binding, no?

The PPI partition is totally overkill here. What it deals with is
multiple devices sharing a single PPI across the system.

Here, we can invent our own interrupt number, so the sharing is
avoided by construction (the joy of not having an interrupt controller
the first place!).

I'm happy to stick the affinity in the DT (after all, it is likely
that other devices on these systems have the same requirements) and
have it consumed by the irqchip driver. I only need to find a way that
doesn't completely invalidate the existing binding...

M.

--
Without deviation from the norm, progress is not possible.

2021-12-12 07:22:09

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

On 04/12/2021 01.32, Marc Zyngier wrote:
> On Wed, 01 Dec 2021 16:08:13 +0000,
> Mark Rutland <[email protected]> wrote:
>>
>> On Wed, Dec 01, 2021 at 01:49:04PM +0000, Marc Zyngier wrote:
>>> In order to be able to tell the core IRQ code about the affinity
>>> of the PMU interrupt in later patches, compute the cpumasks of the
>>> P and E cores at boot time.
>>>
>>> This relies on the affinity scheme used by the vendor, which seems
>>> to work for the couple of SoCs that are out in the wild.
>>
>> ... but may change at any arbitrary point in future?
>
> Crystal balls are in short supply, sorry! ;-)

Considering Apple seem to rely on this all over the place, I think they
probably won't be changing the meaning of Aff2 at least until they
decide to come up with SoCs that have 3 types of cores, or something
like that :)

But yeah, ultimately this is a guess.

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:23:53

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 5/8] irqchip/apple-aic: Move PMU-specific registers to their own include file

On 01/12/2021 22.49, Marc Zyngier wrote:
> As we are about to have a PMU driver, move the PMU bits from the AIC
> driver into a common include file.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/include/asm/apple_m1_pmu.h | 19 +++++++++++++++++++
> drivers/irqchip/irq-apple-aic.c | 11 +----------
> 2 files changed, 20 insertions(+), 10 deletions(-)
> create mode 100644 arch/arm64/include/asm/apple_m1_pmu.h
>
> diff --git a/arch/arm64/include/asm/apple_m1_pmu.h b/arch/arm64/include/asm/apple_m1_pmu.h
> new file mode 100644
> index 000000000000..b848af7faadc
> --- /dev/null
> +++ b/arch/arm64/include/asm/apple_m1_pmu.h
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#ifndef __ASM_APPLE_M1_PMU_h
> +#define __ASM_APPLE_M1_PMU_h
> +
> +#include <linux/bits.h>
> +#include <asm/sysreg.h>
> +
> +/* Core PMC control register */
> +#define SYS_IMP_APL_PMCR0_EL1 sys_reg(3, 1, 15, 0, 0)
> +#define PMCR0_IMODE GENMASK(10, 8)
> +#define PMCR0_IMODE_OFF 0
> +#define PMCR0_IMODE_PMI 1
> +#define PMCR0_IMODE_AIC 2
> +#define PMCR0_IMODE_HALT 3
> +#define PMCR0_IMODE_FIQ 4
> +#define PMCR0_IACT BIT(11)
> +
> +#endif /* __ASM_APPLE_M1_PMU_h */
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 23f5f10e974e..9663166fd97f 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -55,6 +55,7 @@
> #include <linux/limits.h>
> #include <linux/of_address.h>
> #include <linux/slab.h>
> +#include <asm/apple_m1_pmu.h>
> #include <asm/exception.h>
> #include <asm/sysreg.h>
> #include <asm/virt.h>
> @@ -109,16 +110,6 @@
> * Note: sysreg-based IPIs are not supported yet.
> */
>
> -/* Core PMC control register */
> -#define SYS_IMP_APL_PMCR0_EL1 sys_reg(3, 1, 15, 0, 0)
> -#define PMCR0_IMODE GENMASK(10, 8)
> -#define PMCR0_IMODE_OFF 0
> -#define PMCR0_IMODE_PMI 1
> -#define PMCR0_IMODE_AIC 2
> -#define PMCR0_IMODE_HALT 3
> -#define PMCR0_IMODE_FIQ 4
> -#define PMCR0_IACT BIT(11)
> -
> /* IPI request registers */
> #define SYS_IMP_APL_IPI_RR_LOCAL_EL1 sys_reg(3, 5, 15, 0, 0)
> #define SYS_IMP_APL_IPI_RR_GLOBAL_EL1 sys_reg(3, 5, 15, 0, 1)
>

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:25:59

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 4/8] irqchip/apple-aic: Wire PMU interrupts

On 01/12/2021 22.49, Marc Zyngier wrote:
> Add the necessary code to configure and P and E-core PMU interrupts
> with their respective affinities. When such an interrupt fires, map
> it onto the right pseudo-interrupt.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-apple-aic.c | 34 +++++++++++++++++++++------------
> 1 file changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 30ca80ccda8b..23f5f10e974e 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -155,7 +155,7 @@
> #define SYS_IMP_APL_UPMSR_EL1 sys_reg(3, 7, 15, 6, 4)
> #define UPMSR_IACT BIT(0)
>
> -#define AIC_NR_FIQ 4
> +#define AIC_NR_FIQ 6
> #define AIC_NR_SWIPI 32
>
> /*
> @@ -207,6 +207,11 @@ static bool __is_pcore(u64 mpidr)
> return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
> }
>
> +static bool is_pcore(void)
> +{
> + return __is_pcore(read_cpuid_mpidr());
> +}
> +
> /*
> * IRQ irqchip
> */
> @@ -420,16 +425,10 @@ static void __exception_irq_entry aic_handle_fiq(struct pt_regs *regs)
> aic_irqc->nr_hw + AIC_TMR_EL02_VIRT);
> }
>
> - if ((read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & (PMCR0_IMODE | PMCR0_IACT)) ==
> - (FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ) | PMCR0_IACT)) {
> - /*
> - * Not supported yet, let's figure out how to handle this when
> - * we implement these proprietary performance counters. For now,
> - * just mask it and move on.
> - */
> - pr_err_ratelimited("PMC FIQ fired. Masking.\n");
> - sysreg_clear_set_s(SYS_IMP_APL_PMCR0_EL1, PMCR0_IMODE | PMCR0_IACT,
> - FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_OFF));
> + if (read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & PMCR0_IACT) {
> + int irq = is_pcore() ? AIC_CPU_PMU_P : AIC_CPU_PMU_E;
> + generic_handle_domain_irq(aic_irqc->hw_domain,
> + aic_irqc->nr_hw + irq);
> }
>
> if (FIELD_GET(UPMCR0_IMODE, read_sysreg_s(SYS_IMP_APL_UPMCR0_EL1)) == UPMCR0_IMODE_FIQ &&
> @@ -469,7 +468,18 @@ static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
> handle_fasteoi_irq, NULL, NULL);
> irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));
> } else {
> - irq_set_percpu_devid(irq);
> + switch (hw - ic->nr_hw) {
> + case AIC_CPU_PMU_P:
> + irq_set_percpu_devid_partition(irq, &ic->pcore_mask);
> + break;
> + case AIC_CPU_PMU_E:
> + irq_set_percpu_devid_partition(irq, &ic->ecore_mask);
> + break;
> + default:
> + irq_set_percpu_devid(irq);
> + break;
> + }
> +
> irq_domain_set_info(id, irq, hw, &fiq_chip, id->host_data,
> handle_percpu_devid_irq, NULL, NULL);
> }
>

I still find this idea that we need to split the IRQs virtually quite
bizarre, but if that's how bit.LITTLE systems do it...

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:26:27

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 7/8] drivers/perf: arm_pmu: Handle 47 bit counters

On 01/12/2021 22.49, Marc Zyngier wrote:
> The current ARM PMU framework can only deal with 32 or 64bit counters.
> Teach it about a 47bit flavour.
>
> Yes, this is odd.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/perf/arm_pmu.c | 2 ++
> include/linux/perf/arm_pmu.h | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 295cc7952d0e..0a9ed1a061ac 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -109,6 +109,8 @@ static inline u64 arm_pmu_event_max_period(struct perf_event *event)
> {
> if (event->hw.flags & ARMPMU_EVT_64BIT)
> return GENMASK_ULL(63, 0);
> + else if (event->hw.flags & ARMPMU_EVT_47BIT)
> + return GENMASK_ULL(46, 0);
> else
> return GENMASK_ULL(31, 0);
> }
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 2512e2f9cd4e..0407a38b470a 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -26,6 +26,8 @@
> */
> /* Event uses a 64bit counter */
> #define ARMPMU_EVT_64BIT 1
> +/* Event uses a 47bit counter */
> +#define ARMPMU_EVT_47BIT 2
>
> #define HW_OP_UNSUPPORTED 0xFFFF
> #define C(_x) PERF_COUNT_HW_CACHE_##_x
>

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:26:35

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 6/8] arm64: apple: t8301: Add PMU nodes

On 01/12/2021 22.49, Marc Zyngier wrote:
> Advertise the two PMU nodes for the t8103 SoC.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> arch/arm64/boot/dts/apple/t8103.dtsi | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/apple/t8103.dtsi b/arch/arm64/boot/dts/apple/t8103.dtsi
> index fc8b2bb06ffe..d2e9afde3729 100644
> --- a/arch/arm64/boot/dts/apple/t8103.dtsi
> +++ b/arch/arm64/boot/dts/apple/t8103.dtsi
> @@ -96,6 +96,18 @@ timer {
> <AIC_FIQ AIC_TMR_HV_VIRT IRQ_TYPE_LEVEL_HIGH>;
> };
>
> + pmu-e {
> + compatible = "apple,icestorm-pmu";
> + interrupt-parent = <&aic>;
> + interrupts = <AIC_FIQ AIC_CPU_PMU_E IRQ_TYPE_LEVEL_HIGH>;
> + };
> +
> + pmu-p {
> + compatible = "apple,firestorm-pmu";
> + interrupt-parent = <&aic>;
> + interrupts = <AIC_FIQ AIC_CPU_PMU_P IRQ_TYPE_LEVEL_HIGH>;
> + };
> +
> clk24: clock-24m {
> compatible = "fixed-clock";
> #clock-cells = <0>;
>

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:27:02

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 2/8] dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts

On 01/12/2021 22.49, Marc Zyngier wrote:
> Advertise the two pseudo-interrupts that tied to the two PMU
> flavours present in the Apple M1 SoC.
>
> We choose the expose two different pseudo-interrupts to the OS
> as the e-core PMU is obviously different from the p-core one,
> effectively presenting two different devices.
>
> Acked-by: Rob Herring <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> .../devicetree/bindings/interrupt-controller/apple,aic.yaml | 2 ++
> include/dt-bindings/interrupt-controller/apple-aic.h | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
> index cf6c091a07b1..b95e41816953 100644
> --- a/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
> +++ b/Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
> @@ -56,6 +56,8 @@ properties:
> - 1: virtual HV timer
> - 2: physical guest timer
> - 3: virtual guest timer
> + - 4: 'efficient' CPU PMU
> + - 5: 'performance' CPU PMU
>
> The 3rd cell contains the interrupt flags. This is normally
> IRQ_TYPE_LEVEL_HIGH (4).
> diff --git a/include/dt-bindings/interrupt-controller/apple-aic.h b/include/dt-bindings/interrupt-controller/apple-aic.h
> index 604f2bb30ac0..bf3aac0e5491 100644
> --- a/include/dt-bindings/interrupt-controller/apple-aic.h
> +++ b/include/dt-bindings/interrupt-controller/apple-aic.h
> @@ -11,5 +11,7 @@
> #define AIC_TMR_HV_VIRT 1
> #define AIC_TMR_GUEST_PHYS 2
> #define AIC_TMR_GUEST_VIRT 3
> +#define AIC_CPU_PMU_E 4
> +#define AIC_CPU_PMU_P 5
>
> #endif
>

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:27:27

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 1/8] dt-bindings: arm-pmu: Document Apple PMU compatible strings

On 01/12/2021 22.49, Marc Zyngier wrote:
> As we are about to add support fur the Apple PMUs, document the compatible
> strings associated with the two micro-architectures present in the Apple M1.
>
> Acked-by: Rob Herring <[email protected]>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> Documentation/devicetree/bindings/arm/pmu.yaml | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/arm/pmu.yaml b/Documentation/devicetree/bindings/arm/pmu.yaml
> index e17ac049e890..1a6986b4e552 100644
> --- a/Documentation/devicetree/bindings/arm/pmu.yaml
> +++ b/Documentation/devicetree/bindings/arm/pmu.yaml
> @@ -20,6 +20,8 @@ properties:
> items:
> - enum:
> - apm,potenza-pmu
> + - apple,firestorm-pmu
> + - apple,icestorm-pmu
> - arm,armv8-pmuv3 # Only for s/w models
> - arm,arm1136-pmu
> - arm,arm1176-pmu
>

Reviewed-by: Hector Martin <[email protected]>

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-12 07:30:29

by Hector Martin

[permalink] [raw]

Subject: Re: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

On 01/12/2021 22.49, Marc Zyngier wrote:
> In order to be able to tell the core IRQ code about the affinity
> of the PMU interrupt in later patches, compute the cpumasks of the
> P and E cores at boot time.
>
> This relies on the affinity scheme used by the vendor, which seems
> to work for the couple of SoCs that are out in the wild.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-apple-aic.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> index 3759dc36cc8f..30ca80ccda8b 100644
> --- a/drivers/irqchip/irq-apple-aic.c
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -177,6 +177,8 @@ struct aic_irq_chip {
> void __iomem *base;
> struct irq_domain *hw_domain;
> struct irq_domain *ipi_domain;
> + struct cpumask ecore_mask;
> + struct cpumask pcore_mask;
> int nr_hw;
> int ipi_hwirq;
> };
> @@ -200,6 +202,11 @@ static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
> writel_relaxed(val, ic->base + reg);
> }
>
> +static bool __is_pcore(u64 mpidr)
> +{
> + return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
> +}
> +
> /*
> * IRQ irqchip
> */
> @@ -833,6 +840,13 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
> return -ENODEV;
> }
>
> + for_each_possible_cpu(i) {
> + if (__is_pcore(cpu_logical_map(i)))
> + cpumask_set_cpu(i, &irqc->pcore_mask);
> + else
> + cpumask_set_cpu(i, &irqc->ecore_mask);
> + }
> +
> set_handle_irq(aic_handle_irq);
> set_handle_fiq(aic_handle_fiq);
>
>

I'm okay with this approach, but if we want to be more explicit about
the affinities, maybe something like apple,pmu-irq-index in the CPU
nodes? Then we can either start at a higher FIQ offset for these (in
case we need to add more FIQs in the future), or just make up a new
AIC_PMU top level interrupt type and start at 0.

--
Hector Martin ([email protected])
Public Key: https://mrcn.st/pub

2021-12-13 14:43:27

by Marc Zyngier

[permalink] [raw]

Subject: Re: [PATCH v2 3/8] irqchip/apple-aic: Add cpumasks for E and P cores

On Sun, 12 Dec 2021 07:30:20 +0000,
Hector Martin <[email protected]> wrote:
>
> On 01/12/2021 22.49, Marc Zyngier wrote:
> > In order to be able to tell the core IRQ code about the affinity
> > of the PMU interrupt in later patches, compute the cpumasks of the
> > P and E cores at boot time.
> >
> > This relies on the affinity scheme used by the vendor, which seems
> > to work for the couple of SoCs that are out in the wild.
> >
> > Signed-off-by: Marc Zyngier <[email protected]>
> > ---
> > drivers/irqchip/irq-apple-aic.c | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> > index 3759dc36cc8f..30ca80ccda8b 100644
> > --- a/drivers/irqchip/irq-apple-aic.c
> > +++ b/drivers/irqchip/irq-apple-aic.c
> > @@ -177,6 +177,8 @@ struct aic_irq_chip {
> > void __iomem *base;
> > struct irq_domain *hw_domain;
> > struct irq_domain *ipi_domain;
> > + struct cpumask ecore_mask;
> > + struct cpumask pcore_mask;
> > int nr_hw;
> > int ipi_hwirq;
> > };
> > @@ -200,6 +202,11 @@ static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
> > writel_relaxed(val, ic->base + reg);
> > }
> > +static bool __is_pcore(u64 mpidr)
> > +{
> > + return MPIDR_AFFINITY_LEVEL(mpidr, 2) == 1;
> > +}
> > +
> > /*
> > * IRQ irqchip
> > */
> > @@ -833,6 +840,13 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
> > return -ENODEV;
> > }
> > + for_each_possible_cpu(i) {
> > + if (__is_pcore(cpu_logical_map(i)))
> > + cpumask_set_cpu(i, &irqc->pcore_mask);
> > + else
> > + cpumask_set_cpu(i, &irqc->ecore_mask);
> > + }
> > +
> > set_handle_irq(aic_handle_irq);
> > set_handle_fiq(aic_handle_fiq);
> >
>
> I'm okay with this approach, but if we want to be more explicit about
> the affinities, maybe something like apple,pmu-irq-index in the CPU
> nodes? Then we can either start at a higher FIQ offset for these (in
> case we need to add more FIQs in the future), or just make up a new
> AIC_PMU top level interrupt type and start at 0.

I'm actually worried that we'll need more of these "asymmetric FIQs"
in the future, and that the PMU-specific hack won't scale.

Do you know of any other per-CPU device that could differ between
small and big cores? This would certainly help guiding my
implementation between a device specific hack (the PMU irq index) or
something more generic (interrupt specifier containing the affinity
and following the AICv2 scheme).

Thanks,

M.

--
Without deviation from the norm, progress is not possible.