2018-05-18 14:42:44

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 0/5]

PMUv3 has been introduced with ARMv8 and, while it has only been used
on 64bit systems so far, it would definitely be useful for 32bit
guests running under KVM/arm64, for example.

There is also the case of people natively running 32bit kernels on
64bit HW and trying to upstream unspeakable hacks, hoping that the
stars will align and that they'll win the lottery (see [1]).

So let's try again, and make the PMUv3 driver usable for everyone.

This is done in three steps:
(1) Move the driver from arch/arm64 to drivers/perf
(2) Add a handful of system register accessors so that we can reuse
the driver on 32bit
(3) Provide the same accessors on 32bit, enable compilation, and
make it the default selection for mach-virt.

Tested on a Seattle box with 32bit guests.

* From v1:
- Fixed encodings for some CP15 accessors
- Added a terse note saying that CPU_V7 also covers ARMv8
- Rebased on v4.12-rc5

* From v2:
- SPDX tags on new and moved files. Yeah!
- Annual rebase on 4.17-rc5

[1] https://patchwork.kernel.org/patch/10406793/

Marc Zyngier (5):
arm64: perf: Move PMUv3 driver to drivers/perf
arm64: perf: Abstract system register accesses away
ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations
ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM
ARM: mach-virt: Select PMUv3 driver by default

arch/arm/Kconfig | 1 +
arch/arm/include/asm/arm_pmuv3.h | 125 +++++++++++++++++++++
arch/arm/mm/Kconfig | 2 +-
arch/arm64/include/asm/arm_pmuv3.h | 111 ++++++++++++++++++
arch/arm64/include/asm/perf_event.h | 55 ---------
arch/arm64/kernel/Makefile | 1 -
drivers/perf/Kconfig | 8 ++
drivers/perf/Makefile | 1 +
.../perf_event.c => drivers/perf/arm_pmuv3.c | 42 ++++---
include/kvm/arm_pmu.h | 2 +-
include/linux/perf/arm_pmuv3.h | 78 +++++++++++++
11 files changed, 346 insertions(+), 80 deletions(-)
create mode 100644 arch/arm/include/asm/arm_pmuv3.h
create mode 100644 arch/arm64/include/asm/arm_pmuv3.h
rename arch/arm64/kernel/perf_event.c => drivers/perf/arm_pmuv3.c (97%)
create mode 100644 include/linux/perf/arm_pmuv3.h

--
2.14.2



2018-05-18 14:40:35

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 4/5] ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM

The only thing stopping the PMUv3 driver from compiling on 32bit
is the lack of defined system registers names. This is easily
solved by providing the sysreg accessors and updating the Kconfig entry.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/arm_pmuv3.h | 125 +++++++++++++++++++++++++++++++++++++++
drivers/perf/Kconfig | 4 +-
2 files changed, 127 insertions(+), 2 deletions(-)
create mode 100644 arch/arm/include/asm/arm_pmuv3.h

diff --git a/arch/arm/include/asm/arm_pmuv3.h b/arch/arm/include/asm/arm_pmuv3.h
new file mode 100644
index 000000000000..e0f66c1d42b4
--- /dev/null
+++ b/arch/arm/include/asm/arm_pmuv3.h
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_PMUV3_H
+#define __ASM_PMUV3_H
+
+#include <asm/cp15.h>
+#include <asm/cputype.h>
+
+#define PMCCNTR __ACCESS_CP15_64(0, c9)
+
+#define PMCR __ACCESS_CP15(c9, 0, c12, 0)
+#define PMCNTENSET __ACCESS_CP15(c9, 0, c12, 1)
+#define PMCNTENCLR __ACCESS_CP15(c9, 0, c12, 2)
+#define PMOVSR __ACCESS_CP15(c9, 0, c12, 3)
+#define PMSELR __ACCESS_CP15(c9, 0, c12, 5)
+#define PMCEID0 __ACCESS_CP15(c9, 0, c12, 6)
+#define PMCEID1 __ACCESS_CP15(c9, 0, c12, 7)
+#define PMXEVTYPER __ACCESS_CP15(c9, 0, c13, 1)
+#define PMXEVCNTR __ACCESS_CP15(c9, 0, c13, 2)
+#define PMINTENSET __ACCESS_CP15(c9, 0, c14, 1)
+#define PMINTENCLR __ACCESS_CP15(c9, 0, c14, 2)
+
+static inline int read_pmuver(void)
+{
+ /* PMUVers is not a signed field */
+ u32 dfr0 = read_cpuid_ext(CPUID_EXT_DFR0);
+ return (dfr0 >> 24) & 0xf;
+}
+
+static inline void write_pmcr(u32 val)
+{
+ write_sysreg(val, PMCR);
+}
+
+static inline u32 read_pmcr(void)
+{
+ return read_sysreg(PMCR);
+}
+
+static inline void write_pmselr(u32 val)
+{
+ write_sysreg(val, PMSELR);
+}
+
+static inline void write_pmccntr(u64 val)
+{
+ write_sysreg(val, PMCCNTR);
+}
+
+static inline u64 read_pmccntr(void)
+{
+ return read_sysreg(PMCCNTR);
+}
+
+static inline void write_pmxevcntr(u32 val)
+{
+ write_sysreg(val, PMXEVCNTR);
+}
+
+static inline u32 read_pmxevcntr(void)
+{
+ return read_sysreg(PMXEVCNTR);
+}
+
+static inline void write_pmxevtyper(u32 val)
+{
+ write_sysreg(val, PMXEVTYPER);
+}
+
+static inline void write_pmcntenset(u32 val)
+{
+ write_sysreg(val, PMCNTENSET);
+}
+
+static inline void write_pmcntenclr(u32 val)
+{
+ write_sysreg(val, PMCNTENCLR);
+}
+
+static inline void write_pmintenset(u32 val)
+{
+ write_sysreg(val, PMINTENSET);
+}
+
+static inline void write_pmintenclr(u32 val)
+{
+ write_sysreg(val, PMINTENCLR);
+}
+
+static inline void write_pmovsclr(u32 val)
+{
+ write_sysreg(val, PMOVSR);
+}
+
+static inline u32 read_pmovsclr(void)
+{
+ return read_sysreg(PMOVSR);
+}
+
+static inline u32 read_pmceid0(void)
+{
+ return read_sysreg(PMCEID0);
+}
+
+static inline u32 read_pmceid1(void)
+{
+ return read_sysreg(PMCEID1);
+}
+
+#endif
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 39808b86b346..bf01dc1414c8 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -51,9 +51,9 @@ config ARM_PMU_ACPI
def_bool y

config ARM_PMUV3
- depends on HW_PERF_EVENTS && ARM64
+ depends on HW_PERF_EVENTS && ((ARM && CPU_V7) || ARM64)
bool "ARM PMUv3 support" if !ARM64
- default y
+ default ARM64
help
Say y if you want to use CPU performance monitors on ARMv8
systems that implement the PMUv3 architecture.
--
2.14.2


2018-05-18 14:40:38

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 1/5] arm64: perf: Move PMUv3 driver to drivers/perf

Having the ARM PMUv3 driver sitting in arch/arm64/kernel is getting
in the way of being able to use perf on ARMv8 cores running a 32bit
kernel, such as 32bit KVM guests.

This patch moves it into drivers/perf/arm_pmuv3.c, with an include
file in include/linux/perf/arm_pmuv3.h. The only thing left in
arch/arm64 is some mundane perf stuff.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/perf_event.h | 55 ----------------
arch/arm64/kernel/Makefile | 1 -
drivers/perf/Kconfig | 8 +++
drivers/perf/Makefile | 1 +
.../perf_event.c => drivers/perf/arm_pmuv3.c | 2 +
include/kvm/arm_pmu.h | 2 +-
include/linux/perf/arm_pmuv3.h | 76 ++++++++++++++++++++++
7 files changed, 88 insertions(+), 57 deletions(-)
rename arch/arm64/kernel/perf_event.c => drivers/perf/arm_pmuv3.c (99%)
create mode 100644 include/linux/perf/arm_pmuv3.h

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index f9ccc36d3dc3..5b33efeebabf 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -20,61 +20,6 @@
#include <asm/stack_pointer.h>
#include <asm/ptrace.h>

-#define ARMV8_PMU_MAX_COUNTERS 32
-#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)
-
-/*
- * Per-CPU PMCR: config reg
- */
-#define ARMV8_PMU_PMCR_E (1 << 0) /* Enable all counters */
-#define ARMV8_PMU_PMCR_P (1 << 1) /* Reset all counters */
-#define ARMV8_PMU_PMCR_C (1 << 2) /* Cycle counter reset */
-#define ARMV8_PMU_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */
-#define ARMV8_PMU_PMCR_X (1 << 4) /* Export to ETM */
-#define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/
-#define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */
-#define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */
-#define ARMV8_PMU_PMCR_N_MASK 0x1f
-#define ARMV8_PMU_PMCR_MASK 0x7f /* Mask for writable bits */
-
-/*
- * PMOVSR: counters overflow flag status reg
- */
-#define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */
-#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK
-
-/*
- * PMXEVTYPER: Event selection reg
- */
-#define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */
-#define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */
-
-/*
- * PMUv3 event types: required events
- */
-#define ARMV8_PMUV3_PERFCTR_SW_INCR 0x00
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL 0x03
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE 0x04
-#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED 0x10
-#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES 0x11
-#define ARMV8_PMUV3_PERFCTR_BR_PRED 0x12
-
-/*
- * Event filters for PMUv3
- */
-#define ARMV8_PMU_EXCLUDE_EL1 (1 << 31)
-#define ARMV8_PMU_EXCLUDE_EL0 (1 << 30)
-#define ARMV8_PMU_INCLUDE_EL2 (1 << 27)
-
-/*
- * PMUSERENR: user enable reg
- */
-#define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */
-#define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */
-#define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */
-#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
-#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
-
#ifdef CONFIG_PERF_EVENTS
struct pt_regs;
extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index bf825f38d206..aba9344a72ca 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -32,7 +32,6 @@ arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_ARM64_MODULE_PLTS) += module-plts.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
-arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o
arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
arm64-obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 28bb5a029558..39808b86b346 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -50,6 +50,14 @@ config ARM_PMU_ACPI
depends on ARM_PMU && ACPI
def_bool y

+config ARM_PMUV3
+ depends on HW_PERF_EVENTS && ARM64
+ bool "ARM PMUv3 support" if !ARM64
+ default y
+ help
+ Say y if you want to use CPU performance monitors on ARMv8
+ systems that implement the PMUv3 architecture.
+
config ARM_DSU_PMU
tristate "ARM DynamIQ Shared Unit (DSU) PMU"
depends on ARM64
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b3902bd37d53..a1a2f64e0c8f 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_ARM_CCN) += arm-ccn.o
obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_ARM_PMUV3) += arm_pmuv3.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
diff --git a/arch/arm64/kernel/perf_event.c b/drivers/perf/arm_pmuv3.c
similarity index 99%
rename from arch/arm64/kernel/perf_event.c
rename to drivers/perf/arm_pmuv3.c
index 85a251b6dfa8..bd19b16c44eb 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
/*
* PMU support
*
@@ -27,6 +28,7 @@
#include <linux/acpi.h>
#include <linux/of.h>
#include <linux/perf/arm_pmu.h>
+#include <linux/perf/arm_pmuv3.h>
#include <linux/platform_device.h>

/*
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index f87fe20fcb05..d16ce92cb2c0 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -19,7 +19,7 @@
#define __ASM_ARM_KVM_PMU_H

#include <linux/perf_event.h>
-#include <asm/perf_event.h>
+#include <linux/perf/arm_pmuv3.h>

#define ARMV8_PMU_CYCLE_IDX (ARMV8_PMU_MAX_COUNTERS - 1)

diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
new file mode 100644
index 000000000000..131f486643bc
--- /dev/null
+++ b/include/linux/perf/arm_pmuv3.h
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __PERF_ARM_PMUV3_H
+#define __PERF_ARM_PMUV3_H
+
+#define ARMV8_PMU_MAX_COUNTERS 32
+#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)
+
+/*
+ * Per-CPU PMCR: config reg
+ */
+#define ARMV8_PMU_PMCR_E (1 << 0) /* Enable all counters */
+#define ARMV8_PMU_PMCR_P (1 << 1) /* Reset all counters */
+#define ARMV8_PMU_PMCR_C (1 << 2) /* Cycle counter reset */
+#define ARMV8_PMU_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */
+#define ARMV8_PMU_PMCR_X (1 << 4) /* Export to ETM */
+#define ARMV8_PMU_PMCR_DP (1 << 5) /* Disable CCNT if non-invasive debug*/
+#define ARMV8_PMU_PMCR_LC (1 << 6) /* Overflow on 64 bit cycle counter */
+#define ARMV8_PMU_PMCR_N_SHIFT 11 /* Number of counters supported */
+#define ARMV8_PMU_PMCR_N_MASK 0x1f
+#define ARMV8_PMU_PMCR_MASK 0x7f /* Mask for writable bits */
+
+/*
+ * PMOVSR: counters overflow flag status reg
+ */
+#define ARMV8_PMU_OVSR_MASK 0xffffffff /* Mask for writable bits */
+#define ARMV8_PMU_OVERFLOWED_MASK ARMV8_PMU_OVSR_MASK
+
+/*
+ * PMXEVTYPER: Event selection reg
+ */
+#define ARMV8_PMU_EVTYPE_MASK 0xc800ffff /* Mask for writable bits */
+#define ARMV8_PMU_EVTYPE_EVENT 0xffff /* Mask for EVENT bits */
+
+/*
+ * PMUv3 event types: required events
+ */
+#define ARMV8_PMUV3_PERFCTR_SW_INCR 0x00
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL 0x03
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE 0x04
+#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED 0x10
+#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES 0x11
+#define ARMV8_PMUV3_PERFCTR_BR_PRED 0x12
+
+/*
+ * Event filters for PMUv3
+ */
+#define ARMV8_PMU_EXCLUDE_EL1 (1 << 31)
+#define ARMV8_PMU_EXCLUDE_EL0 (1 << 30)
+#define ARMV8_PMU_INCLUDE_EL2 (1 << 27)
+
+/*
+ * PMUSERENR: user enable reg
+ */
+#define ARMV8_PMU_USERENR_MASK 0xf /* Mask for writable bits */
+#define ARMV8_PMU_USERENR_EN (1 << 0) /* PMU regs can be accessed at EL0 */
+#define ARMV8_PMU_USERENR_SW (1 << 1) /* PMSWINC can be written at EL0 */
+#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
+#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
+
+#endif
--
2.14.2


2018-05-18 14:40:50

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 5/5] ARM: mach-virt: Select PMUv3 driver by default

Since 32bit guests are not unlikely to run on an ARMv8 host,
let's select the PMUv3 driver, which allows the PMU to be used
on such systems.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a7f8e7f4b88f..5dc5d5f15560 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -703,6 +703,7 @@ config ARCH_VIRT
select ARM_GIC_V3
select ARM_GIC_V3_ITS if PCI
select ARM_PSCI
+ select ARM_PMUV3 if PERF_EVENTS
select HAVE_ARM_ARCH_TIMER

#
--
2.14.2


2018-05-18 14:42:51

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 2/5] arm64: perf: Abstract system register accesses away

As we want to enable 32bit support, we need to distanciate the
PMUv3 driver from the AArch64 system register names.

This patch moves all system register accesses to an architecture
specific include file, allowing the 32bit counterpart to be
slotted in at a later time.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/arm_pmuv3.h | 111 +++++++++++++++++++++++++++++++++++++
drivers/perf/arm_pmuv3.c | 40 ++++++-------
include/linux/perf/arm_pmuv3.h | 2 +
3 files changed, 131 insertions(+), 22 deletions(-)
create mode 100644 arch/arm64/include/asm/arm_pmuv3.h

diff --git a/arch/arm64/include/asm/arm_pmuv3.h b/arch/arm64/include/asm/arm_pmuv3.h
new file mode 100644
index 000000000000..558ed16b0f70
--- /dev/null
+++ b/arch/arm64/include/asm/arm_pmuv3.h
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_PMUV3_H
+#define __ASM_PMUV3_H
+
+#include <asm/cpufeature.h>
+#include <asm/sysreg.h>
+
+static inline int read_pmuver(void)
+{
+ u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
+ return cpuid_feature_extract_unsigned_field(dfr0,
+ ID_AA64DFR0_PMUVER_SHIFT);
+}
+
+static inline void write_pmcr(u32 val)
+{
+ write_sysreg(val, pmcr_el0);
+}
+
+static inline u32 read_pmcr(void)
+{
+ return read_sysreg(pmcr_el0);
+}
+
+static inline void write_pmselr(u32 val)
+{
+ write_sysreg(val, pmselr_el0);
+}
+
+static inline void write_pmccntr(u64 val)
+{
+ write_sysreg(val, pmccntr_el0);
+}
+
+static inline u64 read_pmccntr(void)
+{
+ return read_sysreg(pmccntr_el0);
+}
+
+static inline void write_pmxevcntr(u32 val)
+{
+ write_sysreg(val, pmxevcntr_el0);
+}
+
+static inline u32 read_pmxevcntr(void)
+{
+ return read_sysreg(pmxevcntr_el0);
+}
+
+static inline void write_pmxevtyper(u32 val)
+{
+ write_sysreg(val, pmxevtyper_el0);
+}
+
+static inline void write_pmcntenset(u32 val)
+{
+ write_sysreg(val, pmcntenset_el0);
+}
+
+static inline void write_pmcntenclr(u32 val)
+{
+ write_sysreg(val, pmcntenclr_el0);
+}
+
+static inline void write_pmintenset(u32 val)
+{
+ write_sysreg(val, pmintenset_el1);
+}
+
+static inline void write_pmintenclr(u32 val)
+{
+ write_sysreg(val, pmintenclr_el1);
+}
+
+static inline void write_pmovsclr(u32 val)
+{
+ write_sysreg(val, pmovsclr_el0);
+}
+
+static inline u32 read_pmovsclr(void)
+{
+ return read_sysreg(pmovsclr_el0);
+}
+
+static inline u32 read_pmceid0(void)
+{
+ return read_sysreg(pmceid0_el0);
+}
+
+static inline u32 read_pmceid1(void)
+{
+ return read_sysreg(pmceid1_el0);
+}
+
+#endif
diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
index bd19b16c44eb..9c7b2c10b52e 100644
--- a/drivers/perf/arm_pmuv3.c
+++ b/drivers/perf/arm_pmuv3.c
@@ -22,7 +22,6 @@

#include <asm/irq_regs.h>
#include <asm/perf_event.h>
-#include <asm/sysreg.h>
#include <asm/virt.h>

#include <linux/acpi.h>
@@ -479,14 +478,14 @@ static struct attribute_group armv8_pmuv3_format_attr_group = {

static inline u32 armv8pmu_pmcr_read(void)
{
- return read_sysreg(pmcr_el0);
+ return read_pmcr();
}

static inline void armv8pmu_pmcr_write(u32 val)
{
val &= ARMV8_PMU_PMCR_MASK;
isb();
- write_sysreg(val, pmcr_el0);
+ write_pmcr(val);
}

static inline int armv8pmu_has_overflowed(u32 pmovsr)
@@ -508,7 +507,7 @@ static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
static inline int armv8pmu_select_counter(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
- write_sysreg(counter, pmselr_el0);
+ write_pmselr(counter);
isb();

return idx;
@@ -525,9 +524,9 @@ static inline u32 armv8pmu_read_counter(struct perf_event *event)
pr_err("CPU%u reading wrong counter %d\n",
smp_processor_id(), idx);
else if (idx == ARMV8_IDX_CYCLE_COUNTER)
- value = read_sysreg(pmccntr_el0);
+ value = read_pmccntr();
else if (armv8pmu_select_counter(idx) == idx)
- value = read_sysreg(pmxevcntr_el0);
+ value = read_pmxevcntr();

return value;
}
@@ -549,47 +548,47 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u32 value)
*/
u64 value64 = 0xffffffff00000000ULL | value;

- write_sysreg(value64, pmccntr_el0);
+ write_pmccntr(value64);
} else if (armv8pmu_select_counter(idx) == idx)
- write_sysreg(value, pmxevcntr_el0);
+ write_pmxevcntr(value);
}

static inline void armv8pmu_write_evtype(int idx, u32 val)
{
if (armv8pmu_select_counter(idx) == idx) {
val &= ARMV8_PMU_EVTYPE_MASK;
- write_sysreg(val, pmxevtyper_el0);
+ write_pmxevtyper(val);
}
}

static inline int armv8pmu_enable_counter(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
- write_sysreg(BIT(counter), pmcntenset_el0);
+ write_pmcntenset(BIT(counter));
return idx;
}

static inline int armv8pmu_disable_counter(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
- write_sysreg(BIT(counter), pmcntenclr_el0);
+ write_pmcntenclr(BIT(counter));
return idx;
}

static inline int armv8pmu_enable_intens(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
- write_sysreg(BIT(counter), pmintenset_el1);
+ write_pmintenset(BIT(counter));
return idx;
}

static inline int armv8pmu_disable_intens(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
- write_sysreg(BIT(counter), pmintenclr_el1);
+ write_pmintenclr(BIT(counter));
isb();
/* Clear the overflow flag in case an interrupt is pending. */
- write_sysreg(BIT(counter), pmovsclr_el0);
+ write_pmovsclr(BIT(counter));
isb();

return idx;
@@ -600,11 +599,11 @@ static inline u32 armv8pmu_getreset_flags(void)
u32 value;

/* Read */
- value = read_sysreg(pmovsclr_el0);
+ value = read_pmovsclr();

/* Write to clear flags */
value &= ARMV8_PMU_OVSR_MASK;
- write_sysreg(value, pmovsclr_el0);
+ write_pmovsclr(value);

return value;
}
@@ -905,13 +904,10 @@ static void __armv8pmu_probe_pmu(void *info)
{
struct armv8pmu_probe_info *probe = info;
struct arm_pmu *cpu_pmu = probe->pmu;
- u64 dfr0;
u32 pmceid[2];
int pmuver;

- dfr0 = read_sysreg(id_aa64dfr0_el1);
- pmuver = cpuid_feature_extract_unsigned_field(dfr0,
- ID_AA64DFR0_PMUVER_SHIFT);
+ pmuver = read_pmuver();
if (pmuver == 0xf || pmuver == 0)
return;

@@ -924,8 +920,8 @@ static void __armv8pmu_probe_pmu(void *info)
/* Add the CPU cycles counter */
cpu_pmu->num_events += 1;

- pmceid[0] = read_sysreg(pmceid0_el0);
- pmceid[1] = read_sysreg(pmceid1_el0);
+ pmceid[0] = read_pmceid0();
+ pmceid[1] = read_pmceid1();

bitmap_from_arr32(cpu_pmu->pmceid_bitmap,
pmceid, ARMV8_PMUV3_MAX_COMMON_EVENTS);
diff --git a/include/linux/perf/arm_pmuv3.h b/include/linux/perf/arm_pmuv3.h
index 131f486643bc..909188bd7db0 100644
--- a/include/linux/perf/arm_pmuv3.h
+++ b/include/linux/perf/arm_pmuv3.h
@@ -18,6 +18,8 @@
#ifndef __PERF_ARM_PMUV3_H
#define __PERF_ARM_PMUV3_H

+#include <asm/arm_pmuv3.h>
+
#define ARMV8_PMU_MAX_COUNTERS 32
#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)

--
2.14.2


2018-05-18 14:42:53

by Marc Zyngier

[permalink] [raw]
Subject: [PATCH v3 3/5] ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations

ARMv8 is a superset of ARMv7, and all the ARMv8 features are
discoverable with a set of ID registers. It means that we can
use CPU_V7 to guard ARMv8 features at compile time.

This commit simply amends the CPU_V7 configuration symbol comment
to reflect that CPU_V7 also covers ARMv8.

Signed-off-by: Marc Zyngier <[email protected]>
---
arch/arm/mm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 7f14acf67caf..86550040de19 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -402,7 +402,7 @@ config CPU_V6K
select CPU_THUMB_CAPABLE
select CPU_TLB_V6 if MMU

-# ARMv7
+# ARMv7 and ARMv8 architectures
config CPU_V7
bool
select CPU_32v6K
--
2.14.2


2018-05-18 16:31:09

by Vince Weaver

[permalink] [raw]
Subject: Re: [PATCH v3 0/5]

On Fri, 18 May 2018, Marc Zyngier wrote:

> There is also the case of people natively running 32bit kernels on
> 64bit HW and trying to upstream unspeakable hacks, hoping that the
> stars will align and that they'll win the lottery (see [1]).

I've tested these patches on a Raspberry Pi 3B running a 32-bit upstream
(4.17-rc5-git) kernel and they work.

[ 0.472906] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available

I only needed to add this to the devicetree

arm-pmu {
compatible = "arm,cortex-a53-pmu";
interrupt-parent = <&local_intc>;
interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
};


Tested-by: Vince Weaver <[email protected]>

Vince

2018-05-18 16:43:09

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 0/5]

[/me beats himself for not writing a subject line...]

On 18/05/18 17:29, Vince Weaver wrote:
> On Fri, 18 May 2018, Marc Zyngier wrote:
>
>> There is also the case of people natively running 32bit kernels on
>> 64bit HW and trying to upstream unspeakable hacks, hoping that the
>> stars will align and that they'll win the lottery (see [1]).
>
> I've tested these patches on a Raspberry Pi 3B running a 32-bit upstream
> (4.17-rc5-git) kernel and they work.
>
> [ 0.472906] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
>
> I only needed to add this to the devicetree
>
> arm-pmu {
> compatible = "arm,cortex-a53-pmu";
> interrupt-parent = <&local_intc>;
> interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
> };

That's definitely the sensible thing to have on such hardware. Why isn't
it in the upstream DT already, irrespective of the state of the kernel
support?

> Tested-by: Vince Weaver <[email protected]>

Thanks a lot for testing.

M.
--
Jazz is not dead. It just smells funny...

2018-05-18 17:40:55

by Stefan Wahren

[permalink] [raw]
Subject: Re: [PATCH v3 0/5]

> Marc Zyngier <[email protected]> hat am 18. Mai 2018 um 18:41 geschrieben:
>
>
> [/me beats himself for not writing a subject line...]
>
> On 18/05/18 17:29, Vince Weaver wrote:
> > On Fri, 18 May 2018, Marc Zyngier wrote:
> >
> >> There is also the case of people natively running 32bit kernels on
> >> 64bit HW and trying to upstream unspeakable hacks, hoping that the
> >> stars will align and that they'll win the lottery (see [1]).
> >
> > I've tested these patches on a Raspberry Pi 3B running a 32-bit upstream
> > (4.17-rc5-git) kernel and they work.
> >
> > [ 0.472906] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
> >
> > I only needed to add this to the devicetree
> >
> > arm-pmu {
> > compatible = "arm,cortex-a53-pmu";
> > interrupt-parent = <&local_intc>;
> > interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
> > };
>
> That's definitely the sensible thing to have on such hardware. Why isn't
> it in the upstream DT already, irrespective of the state of the kernel
> support?

I remember that Vince point out the absence. He asked about how to implement it and i wasn't sure about it. At this time we hadn't IRQ polarity support. So we wanted to get this puzzle piece before. In march i put it on my TODO list, but then RPI 3 B+ support had higher prio to get into 4.18.

In general we have the problem that most of the users take the downstream kernel and don't know about the differences. Luckily more distributions switch to the upstream kernel, which increases the feedback.

>
> > Tested-by: Vince Weaver <[email protected]>

Thanks again
Stefan

>
> Thanks a lot for testing.
>
> M.
> --
> Jazz is not dead. It just smells funny...

2018-05-21 09:36:07

by Vladimir Murzin

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM

On 18/05/18 15:39, Marc Zyngier wrote:
> +static inline int read_pmuver(void)
> +{
> + /* PMUVers is not a signed field */
> + u32 dfr0 = read_cpuid_ext(CPUID_EXT_DFR0);
> + return (dfr0 >> 24) & 0xf;
> +}

Should we rule out versions prior v3 here or in __armv8pmu_probe_pmu()?

Thanks
Vladimir

2018-05-21 09:53:04

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 4/5] ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM

On 21/05/18 10:34, Vladimir Murzin wrote:
> On 18/05/18 15:39, Marc Zyngier wrote:
>> +static inline int read_pmuver(void)
>> +{
>> + /* PMUVers is not a signed field */
>> + u32 dfr0 = read_cpuid_ext(CPUID_EXT_DFR0);
>> + return (dfr0 >> 24) & 0xf;
>> +}
>
> Should we rule out versions prior v3 here or in __armv8pmu_probe_pmu()?

I'm in two minds about it: The ARM ARM is quite clear about the fact
that this is not legal ("In any ARMv8 implementation the values 0001 and
0010 are not permitted."), and DT clearly lied to us in that case.

If we want to consistently handle that case, it should probably be done
in __armv8pmu_probe_pmu, bailing out if the version is our of scope for
the driver.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-05-21 18:19:47

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 0/5]

Hi Marc,

Thanks for this.

On Fri, May 18, 2018 at 03:39:08PM +0100, Marc Zyngier wrote:
> PMUv3 has been introduced with ARMv8 and, while it has only been used
> on 64bit systems so far, it would definitely be useful for 32bit
> guests running under KVM/arm64, for example.
>
> There is also the case of people natively running 32bit kernels on
> 64bit HW and trying to upstream unspeakable hacks, hoping that the
> stars will align and that they'll win the lottery (see [1]).
>
> So let's try again, and make the PMUv3 driver usable for everyone.
>
> This is done in three steps:
> (1) Move the driver from arch/arm64 to drivers/perf
> (2) Add a handful of system register accessors so that we can reuse
> the driver on 32bit
> (3) Provide the same accessors on 32bit, enable compilation, and
> make it the default selection for mach-virt.
>
> Tested on a Seattle box with 32bit guests.

I think we should go ahead with something like this, but I don't think
we're quite there with these patches. If we're going to move the arch code
out into drivers, let's do that for the perf_event* files under arch/arm/
as well. Then we could have a structure along the lines of:


drivers/perf/arm_pmu.c - As it is today
drivers/perf/arm_cpu/xscale_pmu.c - Only builds for 32-bit
drivers/perf/arm_cpu/armv6_pmu.c - Only builds for 32-bit
drivers/perf/arm_cpu/arch_pmu.c - Works for v7/v8 on
both 32-bit and 64-bit

The latter can then pull in whatever accessors it needs from the arch
code headers.

I know it's more of an invasive change, but this way we always end up
running the same code on the two architectures and it will be much easier
to maintain.

Will