This patchset adds support for the Krait CPU PMUs. I've split the main patch
into two parts: first, basic support that provides just the architected
events, and second, full support that tackles the PMRESR interface.
Please note, this patchset relies on the per-cpu irq patch from Vinayak Kale,
7f4a8e7b1943 (genirq: Add an accessor for IRQ_PER_CPU flag, 2013-12-04),
that's sitting in linux-next. Patches are based on commit 21dea6695134
(ARM: msm_defconfig: Enable restart driver, 2013-12-20) in linux-next.
Changes since v1:
* Dropped sparse warning patch
* Reworked percpu irq support patch to hide double pointers in dispatch func
* Expanded on comments explaining Krait raw event syntax
* Expanded on DT binding
* Added qcom,no-pc-write property instead of using cpuid scheme
Stephen Boyd (7):
ARM: perf_event: Support percpu irqs for the CPU PMU
ARM: perf_event: Assign pdev pointer earlier for CPU PMUs
ARM: perf_event: Add basic support for Krait CPU PMUs
ARM: perf_event: Add hook for event index clearing
ARM: perf_event: Fully support Krait CPU PMU events
devicetree: bindings: Document Krait performance monitor units (PMU)
ARM: dts: msm: Add krait-pmu to platforms with Krait CPUs
Documentation/devicetree/bindings/arm/pmu.txt | 9 +-
arch/arm/boot/dts/qcom-msm8960-cdp.dts | 6 +
arch/arm/boot/dts/qcom-msm8974.dtsi | 5 +
arch/arm/include/asm/pmu.h | 1 +
arch/arm/kernel/perf_event.c | 16 +-
arch/arm/kernel/perf_event_cpu.c | 103 +++--
arch/arm/kernel/perf_event_v7.c | 550 ++++++++++++++++++++++++++
7 files changed, 656 insertions(+), 34 deletions(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Add basic support for the Krait CPU PMU. This allows us to use
the architected functionality of the PMU.
This is based on code originally written by Ashwin Chaugule and
Neil Leeder [1].
[1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/kernel/perf_event_msm_krait.c?h=msm-3.4
Cc: Neil Leeder <[email protected]>
Cc: Ashwin Chaugule <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/kernel/perf_event_cpu.c | 1 +
arch/arm/kernel/perf_event_v7.c | 164 +++++++++++++++++++++++++++++++++++++++
2 files changed, 165 insertions(+)
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c
index 68d02ca0ca1b..ed571d386c0b 100644
--- a/arch/arm/kernel/perf_event_cpu.c
+++ b/arch/arm/kernel/perf_event_cpu.c
@@ -229,6 +229,7 @@ static struct of_device_id cpu_pmu_of_device_ids[] = {
{.compatible = "arm,arm11mpcore-pmu", .data = armv6mpcore_pmu_init},
{.compatible = "arm,arm1176-pmu", .data = armv6pmu_init},
{.compatible = "arm,arm1136-pmu", .data = armv6pmu_init},
+ {.compatible = "qcom,krait-pmu", .data = krait_pmu_init},
{},
};
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 039cffb053a7..16386b1d27a8 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -732,6 +732,138 @@ static const unsigned armv7_a7_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
};
/*
+ * Krait HW events mapping
+ */
+static const unsigned krait_perf_map[PERF_COUNT_HW_MAX] = {
+ [PERF_COUNT_HW_CPU_CYCLES] = ARMV7_PERFCTR_CPU_CYCLES,
+ [PERF_COUNT_HW_INSTRUCTIONS] = ARMV7_PERFCTR_INSTR_EXECUTED,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = HW_OP_UNSUPPORTED,
+ [PERF_COUNT_HW_CACHE_MISSES] = HW_OP_UNSUPPORTED,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = ARMV7_PERFCTR_PC_WRITE,
+ [PERF_COUNT_HW_BRANCH_MISSES] = ARMV7_PERFCTR_PC_BRANCH_MIS_PRED,
+ [PERF_COUNT_HW_BUS_CYCLES] = ARMV7_PERFCTR_CLOCK_CYCLES,
+};
+
+static const unsigned krait_perf_map_no_branch[PERF_COUNT_HW_MAX] = {
+ [PERF_COUNT_HW_CPU_CYCLES] = ARMV7_PERFCTR_CPU_CYCLES,
+ [PERF_COUNT_HW_INSTRUCTIONS] = ARMV7_PERFCTR_INSTR_EXECUTED,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = HW_OP_UNSUPPORTED,
+ [PERF_COUNT_HW_CACHE_MISSES] = HW_OP_UNSUPPORTED,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = HW_OP_UNSUPPORTED,
+ [PERF_COUNT_HW_BRANCH_MISSES] = ARMV7_PERFCTR_PC_BRANCH_MIS_PRED,
+ [PERF_COUNT_HW_BUS_CYCLES] = ARMV7_PERFCTR_CLOCK_CYCLES,
+};
+
+static const unsigned krait_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+ [C(L1D)] = {
+ /*
+ * The performance counters don't differentiate between read
+ * and write accesses/misses so this isn't strictly correct,
+ * but it's the best we can do. Writes and reads get
+ * combined.
+ */
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = ARMV7_PERFCTR_L1_DCACHE_ACCESS,
+ [C(RESULT_MISS)] = ARMV7_PERFCTR_L1_DCACHE_REFILL,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = ARMV7_PERFCTR_L1_DCACHE_ACCESS,
+ [C(RESULT_MISS)] = ARMV7_PERFCTR_L1_DCACHE_REFILL,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(L1I)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(LL)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(DTLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(ITLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(BPU)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = ARMV7_PERFCTR_PC_BRANCH_PRED,
+ [C(RESULT_MISS)] = ARMV7_PERFCTR_PC_BRANCH_MIS_PRED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = ARMV7_PERFCTR_PC_BRANCH_PRED,
+ [C(RESULT_MISS)] = ARMV7_PERFCTR_PC_BRANCH_MIS_PRED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+ [C(NODE)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ },
+ },
+};
+
+/*
* Perf Events' indices
*/
#define ARMV7_IDX_CYCLE_COUNTER 0
@@ -1212,6 +1344,18 @@ static int armv7_a7_map_event(struct perf_event *event)
&armv7_a7_perf_cache_map, 0xFF);
}
+static int krait_map_event(struct perf_event *event)
+{
+ return armpmu_map_event(event, &krait_perf_map,
+ &krait_perf_cache_map, 0xFFFFF);
+}
+
+static int krait_map_event_no_branch(struct perf_event *event)
+{
+ return armpmu_map_event(event, &krait_perf_map_no_branch,
+ &krait_perf_cache_map, 0xFFFFF);
+}
+
static void armv7pmu_init(struct arm_pmu *cpu_pmu)
{
cpu_pmu->handle_irq = armv7pmu_handle_irq;
@@ -1283,6 +1427,21 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
return 0;
}
+
+static int krait_pmu_init(struct arm_pmu *cpu_pmu)
+{
+ armv7pmu_init(cpu_pmu);
+ cpu_pmu->name = "ARMv7 Krait";
+ /* Some early versions of Krait don't support PC write events */
+ if (of_property_read_bool(cpu_pmu->plat_device->dev.of_node,
+ "qcom,no-pc-write"))
+ cpu_pmu->map_event = krait_map_event_no_branch;
+ else
+ cpu_pmu->map_event = krait_map_event;
+ cpu_pmu->num_events = armv7_read_num_pmnc_events();
+ cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
+ return 0;
+}
#else
static inline int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu)
{
@@ -1308,4 +1467,9 @@ static inline int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
{
return -ENODEV;
}
+
+static inline int krait_pmu_init(struct arm_pmu *cpu_pmu)
+{
+ return -ENODEV;
+}
#endif /* CONFIG_CPU_V7 */
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Document the Krait PMU compatible string.
Cc: <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
Documentation/devicetree/bindings/arm/pmu.txt | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/arm/pmu.txt b/Documentation/devicetree/bindings/arm/pmu.txt
index 3e1e498fea96..ce731441e64f 100644
--- a/Documentation/devicetree/bindings/arm/pmu.txt
+++ b/Documentation/devicetree/bindings/arm/pmu.txt
@@ -16,7 +16,14 @@ Required properties:
"arm,arm11mpcore-pmu"
"arm,arm1176-pmu"
"arm,arm1136-pmu"
-- interrupts : 1 combined interrupt or 1 per core.
+ "qcom,krait-pmu"
+- interrupts : 1 combined interrupt or 1 per core. If the interrupt is a per-cpu
+ interrupt (PPI) then 1 interrupt should be specified.
+
+Optional properties:
+
+- qcom,no-pc-write : Indicates that this PMU doesn't support the 0xc and 0xd
+ events.
Example:
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Krait supports a set of performance monitor region event
selection registers (PMRESR) sitting behind a cp15 based
interface that extend the architected PMU events to include Krait
CPU and Venum VFP specific events. To use these events the user
is expected to program the region register (PMRESRn) with the
event code shifted into the group they care about and then point
the PMNx event at that region+group combo by writing a
PMRESRn_GROUPx event. Add support for this hardware.
Note: the raw event number is a pure software construct that
allows us to map the multi-dimensional number space of regions,
groups, and event codes into a flat event number space suitable
for use by the perf framework.
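For illustration, here is a minimal user-space sketch of how one of these raw
events could be requested once this patch is applied. It reuses the 0x12021
example from the comment added below (PMRESR2, group 1, code 2); the number is
only meant to illustrate the encoding, not to name a specific hardware event.

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;
	long long count;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_RAW;
	attr.config = 0x12021;	/* N=1 (Krait CPU), R=2, CC=0x02, G=1 */
	attr.disabled = 1;

	/* count for the calling thread, on whatever CPU it runs */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
	/* ... run the workload of interest here ... */
	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

	read(fd, &count, sizeof(count));
	printf("raw 0x12021: %lld\n", count);
	close(fd);
	return 0;
}

The perf tool's raw event syntax requests the same thing, e.g.
perf stat -e r12021.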
This is based on code originally written by Ashwin Chaugule and
Neil Leeder [1].
[1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/kernel/perf_event_msm_krait.c?h=msm-3.4
Cc: Neil Leeder <[email protected]>
Cc: Ashwin Chaugule <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/kernel/perf_event_v7.c | 398 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 392 insertions(+), 6 deletions(-)
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 16386b1d27a8..daa675529c12 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -18,6 +18,10 @@
#ifdef CONFIG_CPU_V7
+#include <asm/cp15.h>
+#include <asm/vfp.h>
+#include "../vfp/vfpinstr.h"
+
/*
* Common ARMv7 event types
*
@@ -109,6 +113,20 @@ enum armv7_a15_perf_types {
ARMV7_A15_PERFCTR_PC_WRITE_SPEC = 0x76,
};
+/* ARMv7 Krait specific event types */
+enum krait_perf_types {
+ KRAIT_PMRESR0_GROUP0 = 0xcc,
+ KRAIT_PMRESR1_GROUP0 = 0xd0,
+ KRAIT_PMRESR2_GROUP0 = 0xd4,
+ KRAIT_VPMRESR0_GROUP0 = 0xd8,
+
+ KRAIT_PERFCTR_L1_ICACHE_ACCESS = 0x10011,
+ KRAIT_PERFCTR_L1_ICACHE_MISS = 0x10010,
+
+ KRAIT_PERFCTR_L1_ITLB_ACCESS = 0x12222,
+ KRAIT_PERFCTR_L1_DTLB_ACCESS = 0x12210,
+};
+
/*
* Cortex-A8 HW events mapping
*
@@ -779,8 +797,8 @@ static const unsigned krait_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
},
[C(L1I)] = {
[C(OP_READ)] = {
- [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
- [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_ACCESS)] = KRAIT_PERFCTR_L1_ICACHE_ACCESS,
+ [C(RESULT_MISS)] = KRAIT_PERFCTR_L1_ICACHE_MISS,
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
@@ -807,11 +825,11 @@ static const unsigned krait_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
},
[C(DTLB)] = {
[C(OP_READ)] = {
- [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_ACCESS)] = KRAIT_PERFCTR_L1_DTLB_ACCESS,
[C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
},
[C(OP_WRITE)] = {
- [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_ACCESS)] = KRAIT_PERFCTR_L1_DTLB_ACCESS,
[C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
},
[C(OP_PREFETCH)] = {
@@ -821,11 +839,11 @@ static const unsigned krait_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
},
[C(ITLB)] = {
[C(OP_READ)] = {
- [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_ACCESS)] = KRAIT_PERFCTR_L1_ITLB_ACCESS,
[C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
},
[C(OP_WRITE)] = {
- [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED,
+ [C(RESULT_ACCESS)] = KRAIT_PERFCTR_L1_ITLB_ACCESS,
[C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED,
},
[C(OP_PREFETCH)] = {
@@ -1428,6 +1446,369 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
return 0;
}
+/*
+ * Krait Performance Monitor Region Event Selection Register (PMRESRn)
+ *
+ * 31 30 24 16 8 0
+ * +--------------------------------+
+ * PMRESR0 | EN | CC | CC | CC | CC | N = 1, R = 0
+ * +--------------------------------+
+ * PMRESR1 | EN | CC | CC | CC | CC | N = 1, R = 1
+ * +--------------------------------+
+ * PMRESR2 | EN | CC | CC | CC | CC | N = 1, R = 2
+ * +--------------------------------+
+ * VPMRESR0 | EN | CC | CC | CC | CC | N = 2, R = ?
+ * +--------------------------------+
+ * EN | G=3 | G=2 | G=1 | G=0
+ *
+ * Event Encoding:
+ *
+ * hwc->config_base = 0xNRCCG
+ *
+ * N = prefix, 1 for Krait CPU (PMRESRn), 2 for Venum VFP (VPMRESR)
+ * R = region register
+ * CC = class of events the group G is choosing from
+ * G = group or particular event
+ *
+ * Example: 0x12021 is a Krait CPU event in PMRESR2's group 1 with code 2
+ *
+ * A region (R) corresponds to a piece of the CPU (execution unit, instruction
+ * unit, etc.) while the event code (CC) corresponds to a particular class of
+ * events (interrupts for example). An event code is broken down into
+ * groups (G) that can be mapped into the PMU (irq, fiqs, and irq+fiqs for
+ * example).
+ */
+
+#define KRAIT_EVENT (1 << 16)
+#define VENUM_EVENT (2 << 16)
+#define KRAIT_EVENT_MASK (KRAIT_EVENT | VENUM_EVENT)
+#define PMRESRn_EN BIT(31)
+#define NUM_PMRESR (KRAIT_VPMRESR0_GROUP0 + 4 - KRAIT_PMRESR0_GROUP0)
+
+static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(NUM_PMRESR)], pmresrn_used);
+
+static u32 krait_read_pmresrn(int n)
+{
+ u32 val;
+
+ switch (n) {
+ case 0:
+ asm volatile("mrc p15, 1, %0, c9, c15, 0" : "=r" (val));
+ break;
+ case 1:
+ asm volatile("mrc p15, 1, %0, c9, c15, 1" : "=r" (val));
+ break;
+ case 2:
+ asm volatile("mrc p15, 1, %0, c9, c15, 2" : "=r" (val));
+ break;
+ default:
+ BUG(); /* Should be validated in krait_pmu_get_event_idx() */
+ }
+
+ return val;
+}
+
+static void krait_write_pmresrn(int n, u32 val)
+{
+ switch (n) {
+ case 0:
+ asm volatile("mcr p15, 1, %0, c9, c15, 0" : : "r" (val));
+ break;
+ case 1:
+ asm volatile("mcr p15, 1, %0, c9, c15, 1" : : "r" (val));
+ break;
+ case 2:
+ asm volatile("mcr p15, 1, %0, c9, c15, 2" : : "r" (val));
+ break;
+ default:
+ BUG(); /* Should be validated in krait_pmu_get_event_idx() */
+ }
+}
+
+static u32 krait_read_vpmresr0(void)
+{
+ u32 val;
+ asm volatile("mrc p10, 7, %0, c11, c0, 0" : "=r" (val));
+ return val;
+}
+
+static void krait_write_vpmresr0(u32 val)
+{
+ asm volatile("mcr p10, 7, %0, c11, c0, 0" : : "r" (val));
+}
+
+static void krait_pre_vpmresr0(u32 *venum_orig_val, u32 *fp_orig_val)
+{
+ u32 venum_new_val;
+ u32 fp_new_val;
+
+ /* CPACR Enable CP10 and CP11 access */
+ *venum_orig_val = get_copro_access();
+ venum_new_val = *venum_orig_val | CPACC_SVC(10) | CPACC_SVC(11);
+ set_copro_access(venum_new_val);
+
+ /* Enable FPEXC */
+ *fp_orig_val = fmrx(FPEXC);
+ fp_new_val = *fp_orig_val | FPEXC_EN;
+ fmxr(FPEXC, fp_new_val);
+}
+
+static void krait_post_vpmresr0(u32 venum_orig_val, u32 fp_orig_val)
+{
+ /* Restore FPEXC */
+ fmxr(FPEXC, fp_orig_val);
+ isb();
+ /* Restore CPACR */
+ set_copro_access(venum_orig_val);
+}
+
+static u32 krait_get_pmresrn_event(int region)
+{
+ static const u32 pmresrn_table[] = { KRAIT_PMRESR0_GROUP0,
+ KRAIT_PMRESR1_GROUP0,
+ KRAIT_PMRESR2_GROUP0 };
+ return pmresrn_table[region];
+}
+
+static void krait_evt_setup(int idx, u32 config_base)
+{
+ u32 val;
+ u32 mask;
+ u32 vval, fval;
+ unsigned int region;
+ unsigned int group;
+ unsigned int code;
+ unsigned int group_shift;
+ bool venum_event;
+
+ venum_event = !!(config_base & VENUM_EVENT);
+ region = (config_base >> 12) & 0xf;
+ code = (config_base >> 4) & 0xff;
+ group = (config_base >> 0) & 0xf;
+
+ group_shift = group * 8;
+ mask = 0xff << group_shift;
+
+ /* Configure evtsel for the region and group */
+ if (venum_event)
+ val = KRAIT_VPMRESR0_GROUP0;
+ else
+ val = krait_get_pmresrn_event(region);
+ val += group;
+ /* Mix in mode-exclusion bits */
+ val |= config_base & (ARMV7_EXCLUDE_USER | ARMV7_EXCLUDE_PL1);
+ armv7_pmnc_write_evtsel(idx, val);
+
+ asm volatile("mcr p15, 0, %0, c9, c15, 0" : : "r" (0));
+
+ if (venum_event) {
+ krait_pre_vpmresr0(&vval, &fval);
+ val = krait_read_vpmresr0();
+ val &= ~mask;
+ val |= code << group_shift;
+ val |= PMRESRn_EN;
+ krait_write_vpmresr0(val);
+ krait_post_vpmresr0(vval, fval);
+ } else {
+ val = krait_read_pmresrn(region);
+ val &= ~mask;
+ val |= code << group_shift;
+ val |= PMRESRn_EN;
+ krait_write_pmresrn(region, val);
+ }
+}
+
+static u32 krait_clear_pmresrn_group(u32 val, int group)
+{
+ u32 mask;
+ int group_shift;
+
+ group_shift = group * 8;
+ mask = 0xff << group_shift;
+ val &= ~mask;
+
+ /* Don't clear enable bit if entire region isn't disabled */
+ if (val & ~PMRESRn_EN)
+ return val |= PMRESRn_EN;
+
+ return 0;
+}
+
+static void krait_clearpmu(u32 config_base)
+{
+ u32 val;
+ u32 vval, fval;
+ unsigned int region;
+ unsigned int group;
+ bool venum_event;
+
+ venum_event = !!(config_base & VENUM_EVENT);
+ region = (config_base >> 12) & 0xf;
+ group = (config_base >> 0) & 0xf;
+
+ if (venum_event) {
+ krait_pre_vpmresr0(&vval, &fval);
+ val = krait_read_vpmresr0();
+ val = krait_clear_pmresrn_group(val, group);
+ krait_write_vpmresr0(val);
+ krait_post_vpmresr0(vval, fval);
+ } else {
+ val = krait_read_pmresrn(region);
+ val = krait_clear_pmresrn_group(val, group);
+ krait_write_pmresrn(region, val);
+ }
+}
+
+static void krait_pmu_disable_event(struct perf_event *event)
+{
+ unsigned long flags;
+ struct hw_perf_event *hwc = &event->hw;
+ int idx = hwc->idx;
+ struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+ /* Disable counter and interrupt */
+ raw_spin_lock_irqsave(&events->pmu_lock, flags);
+
+ /* Disable counter */
+ armv7_pmnc_disable_counter(idx);
+
+ /*
+ * Clear pmresr code (if destined for PMNx counters)
+ */
+ if (hwc->config_base & KRAIT_EVENT_MASK)
+ krait_clearpmu(hwc->config_base);
+
+ /* Disable interrupt for this counter */
+ armv7_pmnc_disable_intens(idx);
+
+ raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static void krait_pmu_enable_event(struct perf_event *event)
+{
+ unsigned long flags;
+ struct hw_perf_event *hwc = &event->hw;
+ int idx = hwc->idx;
+ struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+ /*
+ * Enable counter and interrupt, and set the counter to count
+ * the event that we're interested in.
+ */
+ raw_spin_lock_irqsave(&events->pmu_lock, flags);
+
+ /* Disable counter */
+ armv7_pmnc_disable_counter(idx);
+
+ /*
+ * Set event (if destined for PMNx counters)
+ * We set the event for the cycle counter because we
+ * have the ability to perform event filtering.
+ */
+ if (hwc->config_base & KRAIT_EVENT_MASK)
+ krait_evt_setup(idx, hwc->config_base);
+ else
+ armv7_pmnc_write_evtsel(idx, hwc->config_base);
+
+ /* Enable interrupt for this counter */
+ armv7_pmnc_enable_intens(idx);
+
+ /* Enable counter */
+ armv7_pmnc_enable_counter(idx);
+
+ raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static void krait_pmu_reset(void *info)
+{
+ u32 vval, fval;
+
+ armv7pmu_reset(info);
+
+ /* Clear all pmresrs */
+ krait_write_pmresrn(0, 0);
+ krait_write_pmresrn(1, 0);
+ krait_write_pmresrn(2, 0);
+
+ krait_pre_vpmresr0(&vval, &fval);
+ krait_write_vpmresr0(0);
+ krait_post_vpmresr0(vval, fval);
+}
+
+/*
+ * We check for column exclusion constraints here.
+ * Two events can't use the same group within a pmresr register.
+ */
+static int krait_pmu_get_event_idx(struct pmu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ int idx;
+ int bit;
+ unsigned int prefix;
+ unsigned int region;
+ unsigned int code;
+ unsigned int group;
+ bool krait_event;
+ struct hw_perf_event *hwc = &event->hw;
+ unsigned long *bitmap = this_cpu_ptr(pmresrn_used);
+
+ region = (hwc->config_base >> 12) & 0xf;
+ code = (hwc->config_base >> 4) & 0xff;
+ group = (hwc->config_base >> 0) & 0xf;
+ krait_event = !!(hwc->config_base & KRAIT_EVENT_MASK);
+
+ if (krait_event) {
+ /* Ignore invalid events */
+ if (group > 3 || region > 2)
+ return -EINVAL;
+ prefix = hwc->config_base & KRAIT_EVENT_MASK;
+ if (prefix != KRAIT_EVENT && prefix != VENUM_EVENT)
+ return -EINVAL;
+ if (prefix == VENUM_EVENT && (code & 0xe0))
+ return -EINVAL;
+
+ if (prefix == VENUM_EVENT)
+ bit = KRAIT_VPMRESR0_GROUP0;
+ else
+ bit = krait_get_pmresrn_event(region);
+ bit -= krait_get_pmresrn_event(0);
+ bit += group;
+
+ if (test_and_set_bit(bit, bitmap))
+ return -EAGAIN;
+ }
+
+ idx = armv7pmu_get_event_idx(cpuc, event);
+ if (idx < 0 && krait_event)
+ clear_bit(bit, bitmap);
+
+ return idx;
+}
+
+static void krait_pmu_clear_event_idx(struct perf_event *event)
+{
+ int bit;
+ struct hw_perf_event *hwc = &event->hw;
+ unsigned int region;
+ unsigned int group;
+ bool krait_event;
+ unsigned long *bitmap = this_cpu_ptr(pmresrn_used);
+
+ region = (hwc->config_base >> 12) & 0xf;
+ group = (hwc->config_base >> 0) & 0xf;
+ krait_event = !!(hwc->config_base & KRAIT_EVENT_MASK);
+
+ if (krait_event) {
+ if (hwc->config_base & VENUM_EVENT)
+ bit = KRAIT_VPMRESR0_GROUP0;
+ else
+ bit = krait_get_pmresrn_event(region);
+ bit -= krait_get_pmresrn_event(0);
+ bit += group;
+ clear_bit(bit, bitmap);
+ }
+}
+
static int krait_pmu_init(struct arm_pmu *cpu_pmu)
{
armv7pmu_init(cpu_pmu);
@@ -1440,6 +1821,11 @@ static int krait_pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->map_event = krait_map_event;
cpu_pmu->num_events = armv7_read_num_pmnc_events();
cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
+ cpu_pmu->reset = krait_pmu_reset;
+ cpu_pmu->enable = krait_pmu_enable_event;
+ cpu_pmu->disable = krait_pmu_disable_event;
+ cpu_pmu->get_event_idx = krait_pmu_get_event_idx;
+ cpu_pmu->clear_event_idx = krait_pmu_clear_event_idx;
return 0;
}
#else
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Allows us to probe the performance counters on Krait CPUs.
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/boot/dts/qcom-msm8960-cdp.dts | 6 ++++++
arch/arm/boot/dts/qcom-msm8974.dtsi | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/arch/arm/boot/dts/qcom-msm8960-cdp.dts b/arch/arm/boot/dts/qcom-msm8960-cdp.dts
index ea24fdfbdd06..20aca33edfa2 100644
--- a/arch/arm/boot/dts/qcom-msm8960-cdp.dts
+++ b/arch/arm/boot/dts/qcom-msm8960-cdp.dts
@@ -39,6 +39,12 @@
};
};
+ cpu-pmu {
+ compatible = "qcom,krait-pmu";
+ interrupts = <1 10 0x304>;
+ qcom,no-pc-write;
+ };
+
intc: interrupt-controller@2000000 {
compatible = "qcom,msm-qgic2";
interrupt-controller;
diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index f607cf4a11d9..24d63c82a15a 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -52,6 +52,11 @@
};
};
+ cpu-pmu {
+ compatible = "qcom,krait-pmu";
+ interrupts = <1 7 0xf04>;
+ };
+
soc: soc {
#address-cells = <1>;
#size-cells = <1>;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
Some CPU PMUs are wired up with one PPI for all the CPUs instead
of with a different SPI for each CPU. Add support for these
devices.
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/kernel/perf_event.c | 14 ++++--
arch/arm/kernel/perf_event_cpu.c | 97 ++++++++++++++++++++++++++++------------
2 files changed, 80 insertions(+), 31 deletions(-)
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 789d846a9184..e76750980b38 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -16,6 +16,8 @@
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include <linux/uaccess.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
#include <asm/irq_regs.h>
#include <asm/pmu.h>
@@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
{
- struct arm_pmu *armpmu = (struct arm_pmu *) dev;
- struct platform_device *plat_device = armpmu->plat_device;
- struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
+ struct arm_pmu *armpmu;
+ struct platform_device *plat_device;
+ struct arm_pmu_platdata *plat;
+
+ if (irq_is_percpu(irq))
+ dev = *(struct arm_pmu_cpu **)dev;
+ armpmu = dev;
+ plat_device = armpmu->plat_device;
+ plat = dev_get_platdata(&plat_device->dev);
if (plat && plat->handle_irq)
return plat->handle_irq(irq, dev, armpmu->handle_irq);
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c
index 20d553c9f5e2..6efd8aab15df 100644
--- a/arch/arm/kernel/perf_event_cpu.c
+++ b/arch/arm/kernel/perf_event_cpu.c
@@ -25,6 +25,8 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
#include <asm/cputype.h>
#include <asm/irq_regs.h>
@@ -33,6 +35,7 @@
/* Set at runtime when we know what CPU type we are. */
static struct arm_pmu *cpu_pmu;
+static DEFINE_PER_CPU(struct arm_pmu *, percpu_pmu);
static DEFINE_PER_CPU(struct perf_event * [ARMPMU_MAX_HWEVENTS], hw_events);
static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)], used_mask);
static DEFINE_PER_CPU(struct pmu_hw_events, cpu_hw_events);
@@ -71,6 +74,26 @@ static struct pmu_hw_events *cpu_pmu_get_cpu_events(void)
return this_cpu_ptr(&cpu_hw_events);
}
+static void cpu_pmu_enable_percpu_irq(void *data)
+{
+ struct arm_pmu *cpu_pmu = data;
+ struct platform_device *pmu_device = cpu_pmu->plat_device;
+ int irq = platform_get_irq(pmu_device, 0);
+
+ enable_percpu_irq(irq, IRQ_TYPE_NONE);
+ cpumask_set_cpu(smp_processor_id(), &cpu_pmu->active_irqs);
+}
+
+static void cpu_pmu_disable_percpu_irq(void *data)
+{
+ struct arm_pmu *cpu_pmu = data;
+ struct platform_device *pmu_device = cpu_pmu->plat_device;
+ int irq = platform_get_irq(pmu_device, 0);
+
+ cpumask_clear_cpu(smp_processor_id(), &cpu_pmu->active_irqs);
+ disable_percpu_irq(irq);
+}
+
static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
{
int i, irq, irqs;
@@ -78,12 +101,18 @@ static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
irqs = min(pmu_device->num_resources, num_possible_cpus());
- for (i = 0; i < irqs; ++i) {
- if (!cpumask_test_and_clear_cpu(i, &cpu_pmu->active_irqs))
- continue;
- irq = platform_get_irq(pmu_device, i);
- if (irq >= 0)
- free_irq(irq, cpu_pmu);
+ irq = platform_get_irq(pmu_device, 0);
+ if (irq >= 0 && irq_is_percpu(irq)) {
+ on_each_cpu(cpu_pmu_disable_percpu_irq, cpu_pmu, 1);
+ free_percpu_irq(irq, &percpu_pmu);
+ } else {
+ for (i = 0; i < irqs; ++i) {
+ if (!cpumask_test_and_clear_cpu(i, &cpu_pmu->active_irqs))
+ continue;
+ irq = platform_get_irq(pmu_device, i);
+ if (irq >= 0)
+ free_irq(irq, cpu_pmu);
+ }
}
}
@@ -101,33 +130,44 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)
return -ENODEV;
}
- for (i = 0; i < irqs; ++i) {
- err = 0;
- irq = platform_get_irq(pmu_device, i);
- if (irq < 0)
- continue;
-
- /*
- * If we have a single PMU interrupt that we can't shift,
- * assume that we're running on a uniprocessor machine and
- * continue. Otherwise, continue without this interrupt.
- */
- if (irq_set_affinity(irq, cpumask_of(i)) && irqs > 1) {
- pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n",
- irq, i);
- continue;
- }
-
- err = request_irq(irq, handler,
- IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
- cpu_pmu);
+ irq = platform_get_irq(pmu_device, 0);
+ if (irq >= 0 && irq_is_percpu(irq)) {
+ err = request_percpu_irq(irq, handler, "arm-pmu", &percpu_pmu);
if (err) {
pr_err("unable to request IRQ%d for ARM PMU counters\n",
irq);
return err;
}
-
- cpumask_set_cpu(i, &cpu_pmu->active_irqs);
+ on_each_cpu(cpu_pmu_enable_percpu_irq, cpu_pmu, 1);
+ } else {
+ for (i = 0; i < irqs; ++i) {
+ err = 0;
+ irq = platform_get_irq(pmu_device, i);
+ if (irq < 0)
+ continue;
+
+ /*
+ * If we have a single PMU interrupt that we can't shift,
+ * assume that we're running on a uniprocessor machine and
+ * continue. Otherwise, continue without this interrupt.
+ */
+ if (irq_set_affinity(irq, cpumask_of(i)) && irqs > 1) {
+ pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n",
+ irq, i);
+ continue;
+ }
+
+ err = request_irq(irq, handler,
+ IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
+ cpu_pmu);
+ if (err) {
+ pr_err("unable to request IRQ%d for ARM PMU counters\n",
+ irq);
+ return err;
+ }
+
+ cpumask_set_cpu(i, &cpu_pmu->active_irqs);
+ }
}
return 0;
@@ -141,6 +181,7 @@ static void cpu_pmu_init(struct arm_pmu *cpu_pmu)
events->events = per_cpu(hw_events, cpu);
events->used_mask = per_cpu(used_mask, cpu);
raw_spin_lock_init(&events->pmu_lock);
+ per_cpu(percpu_pmu, cpu) = cpu_pmu;
}
cpu_pmu->get_hw_events = cpu_pmu_get_cpu_events;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
We want to inspect the of_node that the pdev points to in the
Krait-specific CPU PMU initialization function. Assign it
earlier so that we don't crash with a NULL pointer dereference.
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/kernel/perf_event_cpu.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c
index 6efd8aab15df..68d02ca0ca1b 100644
--- a/arch/arm/kernel/perf_event_cpu.c
+++ b/arch/arm/kernel/perf_event_cpu.c
@@ -311,6 +311,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev)
return -ENOMEM;
}
+ cpu_pmu = pmu;
+ cpu_pmu->plat_device = pdev;
+
if (node && (of_id = of_match_node(cpu_pmu_of_device_ids, pdev->dev.of_node))) {
init_fn = of_id->data;
ret = init_fn(pmu);
@@ -323,8 +326,6 @@ static int cpu_pmu_device_probe(struct platform_device *pdev)
goto out_free;
}
- cpu_pmu = pmu;
- cpu_pmu->plat_device = pdev;
cpu_pmu_init(cpu_pmu);
ret = armpmu_register(cpu_pmu, PERF_TYPE_RAW);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
On Krait processors we have a many-to-one relationship between
raw CPU events and the event programmed into the PMNx counter.
Two raw CPU events could map to the same value programmed in the
PMNx counter. To avoid this problem, we check for collisions
during the get_event_idx() callback by setting a bit in a bitmap
whenever a certain event is used in a PMNx counter (see the next
patch). Unfortunately, we don't have a hook to clear this bit in
the bitmap when the event is deleted, so let's add an optional
clear_event_idx() callback for this purpose.
Signed-off-by: Stephen Boyd <[email protected]>
---
arch/arm/include/asm/pmu.h | 1 +
arch/arm/kernel/perf_event.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index f24edad26c70..994ac445e792 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -71,6 +71,7 @@ struct arm_pmu {
void (*disable)(struct perf_event *event);
int (*get_event_idx)(struct pmu_hw_events *hw_events,
struct perf_event *event);
+ void (*clear_event_idx)(struct perf_event *event);
int (*set_event_filter)(struct hw_perf_event *evt,
struct perf_event_attr *attr);
u32 (*read_counter)(struct perf_event *event);
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index e76750980b38..06c6b191c8db 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -207,6 +207,8 @@ armpmu_del(struct perf_event *event, int flags)
armpmu_stop(event, PERF_EF_UPDATE);
hw_events->events[idx] = NULL;
clear_bit(idx, hw_events->used_mask);
+ if (armpmu->clear_event_idx)
+ armpmu->clear_event_idx(event);
perf_event_update_userpage(event);
}
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
On 01/15, Stephen Boyd wrote:
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 789d846a9184..e76750980b38 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
>
> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> {
> - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
> - struct platform_device *plat_device = armpmu->plat_device;
> - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
> + struct arm_pmu *armpmu;
> + struct platform_device *plat_device;
> + struct arm_pmu_platdata *plat;
> +
> + if (irq_is_percpu(irq))
> + dev = *(struct arm_pmu_cpu **)dev;
Oh. I just realized that struct arm_pmu_cpu doesn't even exist. This
still compiles though because we're dealing with a void pointer.
Perhaps it's better to just do
dev = *(void **)dev;
here. Can you fix that up when applying? Otherwise I'll do it on
the next send if there are more comments.
> + armpmu = dev;
> + plat_device = armpmu->plat_device;
> + plat = dev_get_platdata(&plat_device->dev);
>
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Hi Stephen,
On Wed, Jan 15, 2014 at 08:54:27PM +0000, Stephen Boyd wrote:
> On 01/15, Stephen Boyd wrote:
> > diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> > index 789d846a9184..e76750980b38 100644
> > --- a/arch/arm/kernel/perf_event.c
> > +++ b/arch/arm/kernel/perf_event.c
> > @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
> >
> > static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> > {
> > - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
> > - struct platform_device *plat_device = armpmu->plat_device;
> > - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
> > + struct arm_pmu *armpmu;
> > + struct platform_device *plat_device;
> > + struct arm_pmu_platdata *plat;
> > +
> > + if (irq_is_percpu(irq))
> > + dev = *(struct arm_pmu_cpu **)dev;
>
> Oh. I just realized that struct arm_pmu_cpu doesn't even exist. This
> still compiles though because we're dealing with a void pointer.
>
> Perhaps its better to just do
>
> dev = *(void **)dev;
>
> here. Can you fix that up when applying? Otherwise I'll do it on
> the next send if there are more comments.
Shouldn't that actually be some per_cpu accessor like this_cpu_ptr?
Will
On 01/17/14 07:04, Will Deacon wrote:
> Hi Stephen,
>
> On Wed, Jan 15, 2014 at 08:54:27PM +0000, Stephen Boyd wrote:
>> On 01/15, Stephen Boyd wrote:
>>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
>>> index 789d846a9184..e76750980b38 100644
>>> --- a/arch/arm/kernel/perf_event.c
>>> +++ b/arch/arm/kernel/perf_event.c
>>> @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
>>>
>>> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
>>> {
>>> - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
>>> - struct platform_device *plat_device = armpmu->plat_device;
>>> - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
>>> + struct arm_pmu *armpmu;
>>> + struct platform_device *plat_device;
>>> + struct arm_pmu_platdata *plat;
>>> +
>>> + if (irq_is_percpu(irq))
>>> + dev = *(struct arm_pmu_cpu **)dev;
>> Oh. I just realized that struct arm_pmu_cpu doesn't even exist. This
>> still compiles though because we're dealing with a void pointer.
>>
>> Perhaps its better to just do
>>
>> dev = *(void **)dev;
>>
>> here. Can you fix that up when applying? Otherwise I'll do it on
>> the next send if there are more comments.
> Shouldn't that actually be some per_cpu accessor like this_cpu_ptr?
>
Nope. The genirq layer unwraps the per_cpu pointer and passes it to the
handler.
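For illustration, a minimal stand-alone sketch of that pointer layering (the
names are placeholders; only the double dereference matters):

#include <stdio.h>

struct arm_pmu {
	const char *name;
};

/* Stand-in for armpmu_dispatch_irq(): for a percpu IRQ, genirq passes the
 * address of this CPU's slot of the per-cpu variable, so "dev" is really
 * a struct arm_pmu ** and must be dereferenced once. */
static void dispatch(void *dev)
{
	struct arm_pmu *pmu = *(struct arm_pmu **)dev;

	printf("dispatching for %s\n", pmu->name);
}

int main(void)
{
	struct arm_pmu krait = { .name = "krait-pmu" };
	/* stands in for this CPU's instance of percpu_pmu */
	struct arm_pmu *this_cpu_slot = &krait;

	dispatch(&this_cpu_slot);
	return 0;
}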
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
On Fri, Jan 17, 2014 at 05:54:36PM +0000, Stephen Boyd wrote:
> On 01/17/14 07:04, Will Deacon wrote:
> > Hi Stephen,
> >
> > On Wed, Jan 15, 2014 at 08:54:27PM +0000, Stephen Boyd wrote:
> >> On 01/15, Stephen Boyd wrote:
> >>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> >>> index 789d846a9184..e76750980b38 100644
> >>> --- a/arch/arm/kernel/perf_event.c
> >>> +++ b/arch/arm/kernel/perf_event.c
> >>> @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
> >>>
> >>> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> >>> {
> >>> - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
> >>> - struct platform_device *plat_device = armpmu->plat_device;
> >>> - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
> >>> + struct arm_pmu *armpmu;
> >>> + struct platform_device *plat_device;
> >>> + struct arm_pmu_platdata *plat;
> >>> +
> >>> + if (irq_is_percpu(irq))
> >>> + dev = *(struct arm_pmu_cpu **)dev;
> >> Oh. I just realized that struct arm_pmu_cpu doesn't even exist. This
> >> still compiles though because we're dealing with a void pointer.
> >>
> >> Perhaps its better to just do
> >>
> >> dev = *(void **)dev;
> >>
> >> here. Can you fix that up when applying? Otherwise I'll do it on
> >> the next send if there are more comments.
> > Shouldn't that actually be some per_cpu accessor like this_cpu_ptr?
> >
>
> Nope. The genirq layer unwraps the per_cpu pointer and passes it to the
> handler.
Ah yeah, I forget the dispatcher is what genirq sees as the handler. In
which case your idea looks right.
Sorry for the noise.
Will
Hi Stephen,
Thanks for the updates. A few more comments inline.
On Wed, Jan 15, 2014 at 05:55:33PM +0000, Stephen Boyd wrote:
> Krait supports a set of performance monitor region event
> selection registers (PMRESR) sitting behind a cp15 based
> interface that extend the architected PMU events to include Krait
> CPU and Venum VFP specific events. To use these events the user
> is expected to program the region register (PMRESRn) with the
> event code shifted into the group they care about and then point
> the PMNx event at that region+group combo by writing a
> PMRESRn_GROUPx event. Add support for this hardware.
>
> Note: the raw event number is a pure software construct that
> allows us to map the multi-dimensional number space of regions,
> groups, and event codes into a flat event number space suitable
> for use by the perf framework.
[...]
> +static u32 krait_read_pmresrn(int n)
> +{
> + u32 val;
> +
> + switch (n) {
> + case 0:
> + asm volatile("mrc p15, 1, %0, c9, c15, 0" : "=r" (val));
> + break;
> + case 1:
> + asm volatile("mrc p15, 1, %0, c9, c15, 1" : "=r" (val));
> + break;
> + case 2:
> + asm volatile("mrc p15, 1, %0, c9, c15, 2" : "=r" (val));
> + break;
> + default:
> + BUG(); /* Should be validated in krait_pmu_get_event_idx() */
> + }
> +
> + return val;
> +}
> +
> +static void krait_write_pmresrn(int n, u32 val)
> +{
> + switch (n) {
> + case 0:
> + asm volatile("mcr p15, 1, %0, c9, c15, 0" : : "r" (val));
> + break;
> + case 1:
> + asm volatile("mcr p15, 1, %0, c9, c15, 1" : : "r" (val));
> + break;
> + case 2:
> + asm volatile("mcr p15, 1, %0, c9, c15, 2" : : "r" (val));
> + break;
> + default:
> + BUG(); /* Should be validated in krait_pmu_get_event_idx() */
> + }
> +}
Do you need isbs to ensure the pmresrn side-effects have happened, or are
the registers self-synchronising? Similarly for your other IMP DEF
registers.
> +static void krait_pre_vpmresr0(u32 *venum_orig_val, u32 *fp_orig_val)
> +{
> + u32 venum_new_val;
> + u32 fp_new_val;
> +
> + /* CPACR Enable CP10 and CP11 access */
> + *venum_orig_val = get_copro_access();
> + venum_new_val = *venum_orig_val | CPACC_SVC(10) | CPACC_SVC(11);
> + set_copro_access(venum_new_val);
> +
> + /* Enable FPEXC */
> + *fp_orig_val = fmrx(FPEXC);
> + fp_new_val = *fp_orig_val | FPEXC_EN;
> + fmxr(FPEXC, fp_new_val);
Messing around with the lot (especially with kernel-mode neon now in
mainline) does scare me. I'd like some BUG_ON(preemptible()) and you could
consider using kernel_neon_{begin,end} but they're a lot heavier than you
need (due to non-lazy switching)
Finally, I'd really like to see this get some test coverage, but I don't
want to try running mainline on my phone :) Could you give your patches a
spin with Vince's perf fuzzer please?
https://github.com/deater/perf_event_tests.git
(then build the contents of the fuzzer directory and run it for as long as
you can).
Cheers,
Will
On 01/21/14 10:07, Will Deacon wrote:
> Hi Stephen,
>
> Thanks for the updates. A few more comments inline.
>
> On Wed, Jan 15, 2014 at 05:55:33PM +0000, Stephen Boyd wrote:
>> Krait supports a set of performance monitor region event
>> selection registers (PMRESR) sitting behind a cp15 based
>> interface that extend the architected PMU events to include Krait
>> CPU and Venum VFP specific events. To use these events the user
>> is expected to program the region register (PMRESRn) with the
>> event code shifted into the group they care about and then point
>> the PMNx event at that region+group combo by writing a
>> PMRESRn_GROUPx event. Add support for this hardware.
>>
>> Note: the raw event number is a pure software construct that
>> allows us to map the multi-dimensional number space of regions,
>> groups, and event codes into a flat event number space suitable
>> for use by the perf framework.
> [...]
>
>> +static u32 krait_read_pmresrn(int n)
>> +{
>> + u32 val;
>> +
>> + switch (n) {
>> + case 0:
>> + asm volatile("mrc p15, 1, %0, c9, c15, 0" : "=r" (val));
>> + break;
>> + case 1:
>> + asm volatile("mrc p15, 1, %0, c9, c15, 1" : "=r" (val));
>> + break;
>> + case 2:
>> + asm volatile("mrc p15, 1, %0, c9, c15, 2" : "=r" (val));
>> + break;
>> + default:
>> + BUG(); /* Should be validated in krait_pmu_get_event_idx() */
>> + }
>> +
>> + return val;
>> +}
>> +
>> +static void krait_write_pmresrn(int n, u32 val)
>> +{
>> + switch (n) {
>> + case 0:
>> + asm volatile("mcr p15, 1, %0, c9, c15, 0" : : "r" (val));
>> + break;
>> + case 1:
>> + asm volatile("mcr p15, 1, %0, c9, c15, 1" : : "r" (val));
>> + break;
>> + case 2:
>> + asm volatile("mcr p15, 1, %0, c9, c15, 2" : : "r" (val));
>> + break;
>> + default:
>> + BUG(); /* Should be validated in krait_pmu_get_event_idx() */
>> + }
>> +}
> Do you need isbs to ensure the pmresrn side-effects have happened, or are
> the registers self-synchronising? Similarly for your other IMP DEF
> registers.
There aren't any isbs in the downstream android sources so I assume
they're self synchronizing. I'll confirm with the CPU designers to make
sure.
>
>> +static void krait_pre_vpmresr0(u32 *venum_orig_val, u32 *fp_orig_val)
>> +{
>> + u32 venum_new_val;
>> + u32 fp_new_val;
>> +
>> + /* CPACR Enable CP10 and CP11 access */
>> + *venum_orig_val = get_copro_access();
>> + venum_new_val = *venum_orig_val | CPACC_SVC(10) | CPACC_SVC(11);
>> + set_copro_access(venum_new_val);
>> +
>> + /* Enable FPEXC */
>> + *fp_orig_val = fmrx(FPEXC);
>> + fp_new_val = *fp_orig_val | FPEXC_EN;
>> + fmxr(FPEXC, fp_new_val);
> Messing around with the lot (especially with kernel-mode neon now in
> mainline) does scare me. I'd like some BUG_ON(preemptible()) and you could
> consider using kernel_neon_{begin,end} but they're a lot heavier than you
> need (due to non-lazy switching)
>
> Finally, I'd really like to see this get some test coverage, but I don't
> want to try running mainline on my phone :) Could you give your patches a
> spin with Vince's perf fuzzer please?
>
> https://github.com/deater/perf_event_tests.git
>
> (then build the contents of the fuzzer directory and run it for as long as
> you can).
>
Ok. I'll see what I can do.
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
On 01/21/14 10:37, Stephen Boyd wrote:
> On 01/21/14 10:07, Will Deacon wrote:
>> Finally, I'd really like to see this get some test coverage, but I don't
>> want to try running mainline on my phone :) Could you give your patches a
>> spin with Vince's perf fuzzer please?
>>
>> https://github.com/deater/perf_event_tests.git
>>
>> (then build the contents of the fuzzer directory and run it for as long as
>> you can).
>>
> Ok. I'll see what I can do.
Neat. This quickly discovered a problem.
BUG: using smp_processor_id() in preemptible [00000000] code: perf_fuzzer/70
caller is krait_pmu_get_event_idx+0x58/0xd8
CPU: 2 PID: 70 Comm: perf_fuzzer Not tainted 3.13.0-rc7-next-20140108-00041-gb038353c8516-dirty #58
[<c02165dc>] (unwind_backtrace) from [<c0212ffc>] (show_stack+0x10/0x14)
[<c0212ffc>] (show_stack) from [<c06ad5fc>] (dump_stack+0x6c/0xb8)
[<c06ad5fc>] (dump_stack) from [<c04b0928>] (debug_smp_processor_id+0xc4/0xe8)
[<c04b0928>] (debug_smp_processor_id) from [<c021a19c>] (krait_pmu_get_event_idx+0x58/0xd8)
[<c021a19c>] (krait_pmu_get_event_idx) from [<c021833c>] (validate_event+0x2c/0x54)
[<c021833c>] (validate_event) from [<c02186b4>] (armpmu_event_init+0x264/0x2dc)
[<c02186b4>] (armpmu_event_init) from [<c02b28e4>] (perf_init_event+0x138/0x194)
[<c02b28e4>] (perf_init_event) from [<c02b2c04>] (perf_event_alloc+0x2c4/0x348)
[<c02b2c04>] (perf_event_alloc) from [<c02b37e4>] (SyS_perf_event_open+0x7b8/0x9cc)
[<c02b37e4>] (SyS_perf_event_open) from [<c020ef80>] (ret_fast_syscall+0x0/0x48)
This is happening because the pmresrn_used mask is per-cpu but
validate_group() is called in preemptible context. It looks like we'll
need to add another unsigned long * field to struct pmu_hw_events just
for Krait purposes? Or we could use the upper 16 bits of the used_mask
bitmap to hold the pmresr_used values? Sounds sort of hacky but it
avoids adding another bitmap for every PMU user and there aren't any
Krait CPUs with more than 5 counters anyway.
This is the diff for the latter version. I'll let the fuzzer run overnight.
diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index 994ac445e792..ae1919be8f98 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -71,7 +71,8 @@ struct arm_pmu {
void (*disable)(struct perf_event *event);
int (*get_event_idx)(struct pmu_hw_events *hw_events,
struct perf_event *event);
- void (*clear_event_idx)(struct perf_event *event);
+ void (*clear_event_idx)(struct pmu_hw_events *hw_events,
+ struct perf_event *event);
int (*set_event_filter)(struct hw_perf_event *evt,
struct perf_event_attr *attr);
u32 (*read_counter)(struct perf_event *event);
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 70191f68c26d..361a1aaee7c8 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -208,7 +208,7 @@ armpmu_del(struct perf_event *event, int flags)
hw_events->events[idx] = NULL;
clear_bit(idx, hw_events->used_mask);
if (armpmu->clear_event_idx)
- armpmu->clear_event_idx(event);
+ armpmu->clear_event_idx(hw_events, event);
perf_event_update_userpage(event);
}
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index daa675529c12..1a905395b74c 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -1483,9 +1483,6 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
#define VENUM_EVENT (2 << 16)
#define KRAIT_EVENT_MASK (KRAIT_EVENT | VENUM_EVENT)
#define PMRESRn_EN BIT(31)
-#define NUM_PMRESR (KRAIT_VPMRESR0_GROUP0 + 4 - KRAIT_PMRESR0_GROUP0)
-
-static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(NUM_PMRESR)], pmresrn_used);
static u32 krait_read_pmresrn(int n)
{
@@ -1542,6 +1539,7 @@ static void krait_pre_vpmresr0(u32 *venum_orig_val, u32 *fp_orig_val)
u32 venum_new_val;
u32 fp_new_val;
+ BUG_ON(preemptible());
/* CPACR Enable CP10 and CP11 access */
*venum_orig_val = get_copro_access();
venum_new_val = *venum_orig_val | CPACC_SVC(10) | CPACC_SVC(11);
@@ -1555,6 +1553,7 @@ static void krait_pre_vpmresr0(u32 *venum_orig_val, u32 *fp_orig_val)
static void krait_post_vpmresr0(u32 venum_orig_val, u32 fp_orig_val)
{
+ BUG_ON(preemptible());
/* Restore FPEXC */
fmxr(FPEXC, fp_orig_val);
isb();
@@ -1750,7 +1749,6 @@ static int krait_pmu_get_event_idx(struct pmu_hw_events *cpuc,
unsigned int group;
bool krait_event;
struct hw_perf_event *hwc = &event->hw;
- unsigned long *bitmap = this_cpu_ptr(pmresrn_used);
region = (hwc->config_base >> 12) & 0xf;
code = (hwc->config_base >> 4) & 0xff;
@@ -1773,26 +1771,27 @@ static int krait_pmu_get_event_idx(struct pmu_hw_events *cpuc,
bit = krait_get_pmresrn_event(region);
bit -= krait_get_pmresrn_event(0);
bit += group;
+ bit <<= 16; /* Use the upper 16 bits of the used_mask */
- if (test_and_set_bit(bit, bitmap))
+ if (test_and_set_bit(bit, cpuc->used_mask))
return -EAGAIN;
}
idx = armv7pmu_get_event_idx(cpuc, event);
if (idx < 0 && krait_event)
- clear_bit(bit, bitmap);
+ clear_bit(bit, cpuc->used_mask);
return idx;
}
-static void krait_pmu_clear_event_idx(struct perf_event *event)
+static void krait_pmu_clear_event_idx(struct pmu_hw_events *cpuc,
+ struct perf_event *event)
{
int bit;
struct hw_perf_event *hwc = &event->hw;
unsigned int region;
unsigned int group;
bool krait_event;
- unsigned long *bitmap = this_cpu_ptr(pmresrn_used);
region = (hwc->config_base >> 12) & 0xf;
group = (hwc->config_base >> 0) & 0xf;
@@ -1805,7 +1804,8 @@ static void krait_pmu_clear_event_idx(struct perf_event *event)
bit = krait_get_pmresrn_event(region);
bit -= krait_get_pmresrn_event(0);
bit += group;
- clear_bit(bit, bitmap);
+ bit <<= 16;
+ clear_bit(bit, cpuc->used_mask);
}
}
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
On Tue, Jan 21, 2014 at 09:59:54PM +0000, Stephen Boyd wrote:
> On 01/21/14 10:37, Stephen Boyd wrote:
> > On 01/21/14 10:07, Will Deacon wrote:
> >> Finally, I'd really like to see this get some test coverage, but I don't
> >> want to try running mainline on my phone :) Could you give your patches a
> >> spin with Vince's perf fuzzer please?
> >>
> >> https://github.com/deater/perf_event_tests.git
> >>
> >> (then build the contents of the fuzzer directory and run it for as long as
> >> you can).
> >>
> > Ok. I'll see what I can do.
>
> Neat. This quickly discovered a problem.
Yeah, the fuzzer is good at that!
> BUG: using smp_processor_id() in preemptible [00000000] code: perf_fuzzer/70
> caller is krait_pmu_get_event_idx+0x58/0xd8
> CPU: 2 PID: 70 Comm: perf_fuzzer Not tainted 3.13.0-rc7-next-20140108-00041-gb038353c8516-dirty #58
> [<c02165dc>] (unwind_backtrace) from [<c0212ffc>] (show_stack+0x10/0x14)
> [<c0212ffc>] (show_stack) from [<c06ad5fc>] (dump_stack+0x6c/0xb8)
> [<c06ad5fc>] (dump_stack) from [<c04b0928>] (debug_smp_processor_id+0xc4/0xe8)
> [<c04b0928>] (debug_smp_processor_id) from [<c021a19c>] (krait_pmu_get_event_idx+0x58/0xd8)
> [<c021a19c>] (krait_pmu_get_event_idx) from [<c021833c>] (validate_event+0x2c/0x54)
> [<c021833c>] (validate_event) from [<c02186b4>] (armpmu_event_init+0x264/0x2dc)
> [<c02186b4>] (armpmu_event_init) from [<c02b28e4>] (perf_init_event+0x138/0x194)
> [<c02b28e4>] (perf_init_event) from [<c02b2c04>] (perf_event_alloc+0x2c4/0x348)
> [<c02b2c04>] (perf_event_alloc) from [<c02b37e4>] (SyS_perf_event_open+0x7b8/0x9cc)
> [<c02b37e4>] (SyS_perf_event_open) from [<c020ef80>] (ret_fast_syscall+0x0/0x48)
>
> This is happening because the pmresrn_used mask is per-cpu but
> validate_group() is called in preemptible context. It looks like we'll
> need to add another unsigned long * field to struct pmu_hw_events just
> for Krait purposes? Or we could use the upper 16 bits of the used_mask
> bitmap to hold the pmresr_used values? Sounds sort of hacky but it
> avoids adding another bitmap for every PMU user and there aren't any
> Krait CPUs with more than 5 counters anyway.
I'm fine with that approach, but a comment saying what we're doing with
those upper bits wouldn't go amiss.
> This is the diff for the latter version. I'll let the fuzzer run overnight.
Cheers,
Will
On 01/21/14 10:37, Stephen Boyd wrote:
> On 01/21/14 10:07, Will Deacon wrote:
>> Do you need isbs to ensure the pmresrn side-effects have happened, or are
>> the registers self-synchronising? Similarly for your other IMP DEF
>> registers.
> There aren't any isbs in the downstream android sources so I assume
> they're self synchronizing. I'll confirm with the CPU designers to make
> sure.
>
CPU folks say no need for isb. They mentioned that the lack of an isb
after the armv7_pmnc_enable_counter() call will leave the action of
enabling the counter "in-flight". The window is probably pretty short on
an SMP kernel because of the spin_unlock right after with the barriers
in it, but the same can't be said for a UP kernel.
Also, the fuzzer didn't find anything else, but I found a bug in the
bitmap logic, updated and reran the fuzzer this morning. Everything
looks good.
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
On Wed, Jan 22, 2014 at 08:47:58PM +0000, Stephen Boyd wrote:
> On 01/21/14 10:37, Stephen Boyd wrote:
> > On 01/21/14 10:07, Will Deacon wrote:
> >> Do you need isbs to ensure the pmresrn side-effects have happened, or are
> >> the registers self-synchronising? Similarly for your other IMP DEF
> >> registers.
> > There aren't any isbs in the downstream android sources so I assume
> > they're self synchronizing. I'll confirm with the CPU designers to make
> > sure.
> >
>
> CPU folks say no need for isb.
Good, good!
> They mentioned that the lack of an isb after the
> armv7_pmnc_enable_counter() call will leave the action of enabling the
> counter "in-flight". The window is probably pretty short on an SMP kernel
> because of the spin_unlock right after with the barriers in it, but the
> same can't be said for a UP kernel.
Yep, we rely on the exception return for that.
> Also, the fuzzer didn't find anything else, but I found a bug in the
> bitmap logic, updated and reran the fuzzer this morning. Everything
> looks good.
Okey doke, I guess if you can repost at -rc1 then I can look at pulling
this into my tree.
Cheers,
Will
On 01/15/2014 12:55 PM, Stephen Boyd wrote:
> Some CPU PMUs are wired up with one PPI for all the CPUs instead
> of with a different SPI for each CPU. Add support for these
> devices.
>
> Signed-off-by: Stephen Boyd <[email protected]>
> ---
> arch/arm/kernel/perf_event.c | 14 ++++--
> arch/arm/kernel/perf_event_cpu.c | 97 ++++++++++++++++++++++++++++------------
> 2 files changed, 80 insertions(+), 31 deletions(-)
>
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 789d846a9184..e76750980b38 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -16,6 +16,8 @@
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> #include <linux/uaccess.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
>
> #include <asm/irq_regs.h>
> #include <asm/pmu.h>
> @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
>
> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> {
> - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
> - struct platform_device *plat_device = armpmu->plat_device;
> - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
> + struct arm_pmu *armpmu;
> + struct platform_device *plat_device;
> + struct arm_pmu_platdata *plat;
> +
> + if (irq_is_percpu(irq))
In case anyone else is trying to follow along, this requires:
http://lkml.org/lkml/2013/12/4/316
Regards,
Christopher
--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.
Hi Stephen,
I just remembered about this series.
On Fri, Jan 17, 2014 at 05:54:36PM +0000, Stephen Boyd wrote:
> On 01/17/14 07:04, Will Deacon wrote:
> > Hi Stephen,
> >
> > On Wed, Jan 15, 2014 at 08:54:27PM +0000, Stephen Boyd wrote:
> >> On 01/15, Stephen Boyd wrote:
> >>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> >>> index 789d846a9184..e76750980b38 100644
> >>> --- a/arch/arm/kernel/perf_event.c
> >>> +++ b/arch/arm/kernel/perf_event.c
> >>> @@ -295,9 +297,15 @@ validate_group(struct perf_event *event)
> >>>
> >>> static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
> >>> {
> >>> - struct arm_pmu *armpmu = (struct arm_pmu *) dev;
> >>> - struct platform_device *plat_device = armpmu->plat_device;
> >>> - struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev);
> >>> + struct arm_pmu *armpmu;
> >>> + struct platform_device *plat_device;
> >>> + struct arm_pmu_platdata *plat;
> >>> +
> >>> + if (irq_is_percpu(irq))
> >>> + dev = *(struct arm_pmu_cpu **)dev;
> >> Oh. I just realized that struct arm_pmu_cpu doesn't even exist. This
> >> still compiles though because we're dealing with a void pointer.
> >>
> >> Perhaps its better to just do
> >>
> >> dev = *(void **)dev;
> >>
> >> here. Can you fix that up when applying? Otherwise I'll do it on
> >> the next send if there are more comments.
> > Shouldn't that actually be some per_cpu accessor like this_cpu_ptr?
> >
>
> Nope. The genirq layer unwraps the per_cpu pointer and passes it to the
> handler.
I think we resolved all the questions/issues that came up during review.
Please can you send a new version that I can take into my tree for 3.15?
Cheers,
Will