2018-06-07 14:00:40

by Agustin Vega-Frias

Subject: [RFC V2 0/3] arm_pmu: acpi: variant support and QCOM Falkor extensions

This series is a complete re-design of V1 of the QCOM Falkor extensions [1].
It introduces a probe table based on the HID of a device nested under the CPU
device to allow variant detection and arm_pmu customization.

The first patch adds an additional section at the end of each ACPI probe table.
This allows probe tables to be sentinel-delimited and better accommodate some
APIs that require such tables.

The second patch adds the PMUv3 ACPI probe table and plumbing to allow drivers
to plug into the ACPI PMUv3 probe sequence.

The third patch adds the QCOM Falkor extensions using the new probe table.

If this is found to be a reasonable extension approach, further patches will
be added to the series to build on the base QCOM extensions.

[1] https://lkml.org/lkml/2017/3/1/540

Changes since V1:
- Redesign as a separate module by adding variant detection support.

Agustin Vega-Frias (3):
ACPI: add support for sentinel-delimited probe tables
arm_pmu: acpi: add support for CPU PMU variant detection
perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

drivers/perf/Makefile | 2 +-
drivers/perf/arm_pmu_acpi.c | 27 ++++
drivers/perf/qcom_arm_pmu.c | 310 ++++++++++++++++++++++++++++++++++++++
include/asm-generic/vmlinux.lds.h | 4 +-
include/linux/acpi.h | 11 ++
include/linux/perf/arm_pmu.h | 1 +
6 files changed, 353 insertions(+), 2 deletions(-)
create mode 100644 drivers/perf/qcom_arm_pmu.c

--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



2018-06-07 13:58:35

by Agustin Vega-Frias

Subject: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

Selection of these events can be envisioned as indexing them from
a 3D matrix:
- the first index selects a Region Event Selection Register (PMRESRx_EL0)
- the second index selects a group from which only one event at a time
can be selected
- the third index selects the event

The event is encoded into perf_event_attr.config as 0xPRCCG, where:
P [config:16 ] = prefix (flag that indicates a matrix-based event)
R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
G [config:0-3 ] = group (specifies the event group)
CC [config:4-11 ] = code (specifies the event)

Events with the P flag set to zero are treated as common PMUv3 events
and are directly programmed into PMXEVTYPERx_EL0.

The first two indexes are set combining the RESR and group number with
a base number and writing it into the architected PMXEVTYPER_EL0 register.
The third index is set by writing the code into the bits corresponding
with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
register.
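
As a purely illustrative example (the event code below is made up, not a
real Falkor event), a config value of 0x112a3 decomposes as P=1, R=1,
CC=0x2a, G=3: PMXEVTYPER_EL0 is programmed with 0xd8 + 1 * 8 + 3 = 0xe3
(0xd8 being the base number used by the driver), and the code 0x2a is
written into bits [31:24] of PMRESR1_EL0, i.e. the 8-bit field for group 3.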

Support for this extension is signaled by the presence of the Falkor PMU
device node under each Falkor CPU device node in the DSDT ACPI table. E.g.:

Device (CPU0)
{
Name (_HID, "ACPI0007" /* Processor Device */)
...
Device (PMU0)
{
Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */
...
}
}

Signed-off-by: Agustin Vega-Frias <[email protected]>
---
drivers/perf/Makefile | 2 +-
drivers/perf/qcom_arm_pmu.c | 310 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 311 insertions(+), 1 deletion(-)
create mode 100644 drivers/perf/qcom_arm_pmu.c

diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b3902bd..a61afd9 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -3,7 +3,7 @@ obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
obj-$(CONFIG_ARM_CCN) += arm-ccn.o
obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
-obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
+obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o qcom_arm_pmu.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
diff --git a/drivers/perf/qcom_arm_pmu.c b/drivers/perf/qcom_arm_pmu.c
new file mode 100644
index 0000000..5cec756
--- /dev/null
+++ b/drivers/perf/qcom_arm_pmu.c
@@ -0,0 +1,310 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions support
+ *
+ * Current extensions supported:
+ *
+ * - Matrix-based microarchitectural events support
+ *
+ * Selection of these events can be envisioned as indexing them from
+ * a 3D matrix:
+ * - the first index selects a Region Event Selection Register (PMRESRx_EL0)
+ * - the second index selects a group from which only one event at a time
+ * can be selected
+ * - the third index selects the event
+ *
+ * The event is encoded into perf_event_attr.config as 0xPRCCG, where:
+ * P [config:16 ] = prefix (flag that indicates a matrix-based event)
+ * R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
+ * G [config:0-3 ] = group (specifies the event group)
+ * CC [config:4-11 ] = code (specifies the event)
+ *
+ * Events with the P flag set to zero are treated as common PMUv3 events
+ * and are directly programmed into PMXEVTYPERx_EL0.
+ *
+ * The first two indexes are set combining the RESR and group number with
+ * a base number and writing it into the architected PMXEVTYPER_EL0 register.
+ * The third index is set by writing the code into the bits corresponding
+ * with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
+ * register.
+ */
+
+#include <linux/acpi.h>
+#include <linux/perf/arm_pmu.h>
+
+#define pmresr0_el0 sys_reg(3, 5, 11, 3, 0)
+#define pmresr1_el0 sys_reg(3, 5, 11, 3, 2)
+#define pmresr2_el0 sys_reg(3, 5, 11, 3, 4)
+#define pmxevcntcr_el0 sys_reg(3, 5, 11, 0, 3)
+
+#define QC_EVT_PFX_SHIFT 16
+#define QC_EVT_REG_SHIFT 12
+#define QC_EVT_CODE_SHIFT 4
+#define QC_EVT_GRP_SHIFT 0
+#define QC_EVT_PFX_MASK GENMASK(QC_EVT_PFX_SHIFT, QC_EVT_PFX_SHIFT)
+#define QC_EVT_REG_MASK GENMASK(QC_EVT_REG_SHIFT + 3, QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE_MASK GENMASK(QC_EVT_CODE_SHIFT + 7, QC_EVT_CODE_SHIFT)
+#define QC_EVT_GRP_MASK GENMASK(QC_EVT_GRP_SHIFT + 3, QC_EVT_GRP_SHIFT)
+#define QC_EVT_PRG_MASK (QC_EVT_PFX_MASK | QC_EVT_REG_MASK | QC_EVT_GRP_MASK)
+#define QC_EVT_PRG(event) ((event) & QC_EVT_PRG_MASK)
+#define QC_EVT_REG(event) (((event) & QC_EVT_REG_MASK) >> QC_EVT_REG_SHIFT)
+#define QC_EVT_CODE(event) (((event) & QC_EVT_CODE_MASK) >> QC_EVT_CODE_SHIFT)
+#define QC_EVT_GROUP(event) (((event) & QC_EVT_GRP_MASK) >> QC_EVT_GRP_SHIFT)
+
+#define QC_MAX_GROUP 7
+#define QC_MAX_RESR 2
+#define QC_BITS_PER_GROUP 8
+#define QC_RESR_ENABLE BIT_ULL(63)
+#define QC_RESR_EVT_BASE 0xd8
+
+static struct arm_pmu *def_ops;
+
+static inline void falkor_write_pmresr(u64 reg, u64 val)
+{
+ if (reg == 0)
+ write_sysreg_s(val, pmresr0_el0);
+ else if (reg == 1)
+ write_sysreg_s(val, pmresr1_el0);
+ else
+ write_sysreg_s(val, pmresr2_el0);
+}
+
+static inline u64 falkor_read_pmresr(u64 reg)
+{
+ return (reg == 0 ? read_sysreg_s(pmresr0_el0) :
+ reg == 1 ? read_sysreg_s(pmresr1_el0) :
+ read_sysreg_s(pmresr2_el0));
+}
+
+static void falkor_set_resr(u64 reg, u64 group, u64 code)
+{
+ u64 shift = group * QC_BITS_PER_GROUP;
+ u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+ u64 val;
+
+ val = falkor_read_pmresr(reg) & ~mask;
+ val |= (code << shift);
+ val |= QC_RESR_ENABLE;
+ falkor_write_pmresr(reg, val);
+}
+
+static void falkor_clear_resr(u64 reg, u64 group)
+{
+ u32 shift = group * QC_BITS_PER_GROUP;
+ u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
+ u64 val = falkor_read_pmresr(reg) & ~mask;
+
+ falkor_write_pmresr(reg, val == QC_RESR_ENABLE ? 0 : val);
+}
+
+/*
+ * Check if e1 and e2 conflict with each other
+ *
+ * e1 is a matrix-based microarchitectural event we are checking against e2.
+ * A conflict exists if the events use the same reg, group, and a different
+ * code. Events with the same code are allowed because they could be using
+ * different filters (e.g. one to count user space and the other to count
+ * kernel space events).
+ */
+static inline int events_conflict(struct perf_event *e1, struct perf_event *e2)
+{
+ if ((e1 != e2) &&
+ (e1->pmu == e2->pmu) &&
+ (QC_EVT_PRG(e1->attr.config) == QC_EVT_PRG(e2->attr.config)) &&
+ (QC_EVT_CODE(e1->attr.config) != QC_EVT_CODE(e2->attr.config))) {
+ pr_debug_ratelimited(
+ "Group exclusion: conflicting events %llx %llx\n",
+ e1->attr.config,
+ e2->attr.config);
+ return 1;
+ }
+ return 0;
+}
+
+/*
+ * Check if the given event is valid for the PMU and if so return the value
+ * that can be used in PMXEVTYPER_EL0 to select the event
+ */
+static int falkor_map_event(struct perf_event *event)
+{
+ u64 reg = QC_EVT_REG(event->attr.config);
+ u64 group = QC_EVT_GROUP(event->attr.config);
+ struct perf_event *leader;
+ struct perf_event *sibling;
+
+ if (!(event->attr.config & QC_EVT_PFX_MASK))
+ /* Common PMUv3 event, forward to the original op */
+ return def_ops->map_event(event);
+
+ /* Is it a valid matrix event? */
+ if ((group > QC_MAX_GROUP) || (reg > QC_MAX_RESR))
+ return -ENOENT;
+
+ /* If part of an event group, check if the event can be put in it */
+
+ leader = event->group_leader;
+ if (events_conflict(event, leader))
+ return -ENOENT;
+
+ for_each_sibling_event(sibling, leader)
+ if (events_conflict(event, sibling))
+ return -ENOENT;
+
+ return QC_RESR_EVT_BASE + reg*8 + group;
+}
+
+/*
+ * Find a slot for the event on the current CPU
+ */
+static int falkor_get_event_idx(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+ int idx;
+
+ if (!!(event->attr.config & QC_EVT_PFX_MASK))
+ /* Matrix event, check for conflicts with existing events */
+ for_each_set_bit(idx, cpuc->used_mask, ARMPMU_MAX_HWEVENTS)
+ if (cpuc->events[idx] &&
+ events_conflict(event, cpuc->events[idx]))
+ return -ENOENT;
+
+ /* Let the original op handle the rest */
+ idx = def_ops->get_event_idx(cpuc, event);
+
+ /*
+ * This is called for actually allocating the events, but also with
+ * a dummy pmu_hw_events when validating groups; for that case we
+ * need to ensure that cpuc->events[idx] is NULL so we don't use
+ * an uninitialized pointer. Conflicts for matrix events in groups
+ * are checked during event mapping anyway (see falkor_map_event).
+ */
+ cpuc->events[idx] = NULL;
+
+ return idx;
+}
+
+/*
+ * Reset the PMU
+ */
+static void falkor_reset(void *info)
+{
+ struct arm_pmu *pmu = (struct arm_pmu *)info;
+ u32 i, ctrs = pmu->num_events;
+
+ /* PMRESRx_EL0 regs are unknown at reset, except for the EN field */
+ for (i = 0; i <= QC_MAX_RESR; i++)
+ falkor_write_pmresr(i, 0);
+
+ /* PMXEVCNTCRx_EL0 regs are unknown at reset */
+ for (i = 0; i <= ctrs; i++) {
+ write_sysreg(i, pmselr_el0);
+ isb();
+ write_sysreg_s(0, pmxevcntcr_el0);
+ }
+
+ /* Let the original op handle the rest */
+ def_ops->reset(info);
+}
+
+/*
+ * Enable the given event
+ */
+static void falkor_enable(struct perf_event *event)
+{
+ if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
+ /* Matrix event, program the appropriate PMRESRx_EL0 */
+ struct arm_pmu *pmu = to_arm_pmu(event->pmu);
+ struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
+ u64 reg = QC_EVT_REG(event->attr.config);
+ u64 code = QC_EVT_CODE(event->attr.config);
+ u64 group = QC_EVT_GROUP(event->attr.config);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&events->pmu_lock, flags);
+ falkor_set_resr(reg, group, code);
+ raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+ }
+
+ /* Let the original op handle the rest */
+ def_ops->enable(event);
+}
+
+/*
+ * Disable the given event
+ */
+static void falkor_disable(struct perf_event *event)
+{
+ /* Use the original op to disable the counter and interrupt */
+ def_ops->disable(event);
+
+ if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
+ /* Matrix event, de-program the appropriate PMRESRx_EL0 */
+ struct arm_pmu *pmu = to_arm_pmu(event->pmu);
+ struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
+ u64 reg = QC_EVT_REG(event->attr.config);
+ u64 group = QC_EVT_GROUP(event->attr.config);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&events->pmu_lock, flags);
+ falkor_clear_resr(reg, group);
+ raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+ }
+}
+
+PMU_FORMAT_ATTR(event, "config:0-15");
+PMU_FORMAT_ATTR(prefix, "config:16");
+PMU_FORMAT_ATTR(reg, "config:12-15");
+PMU_FORMAT_ATTR(code, "config:4-11");
+PMU_FORMAT_ATTR(group, "config:0-3");
+
+static struct attribute *falkor_pmu_formats[] = {
+ &format_attr_event.attr,
+ &format_attr_prefix.attr,
+ &format_attr_reg.attr,
+ &format_attr_code.attr,
+ &format_attr_group.attr,
+ NULL,
+};
+
+static struct attribute_group falkor_pmu_format_attr_group = {
+ .name = "format",
+ .attrs = falkor_pmu_formats,
+};
+
+static int qcom_falkor_pmu_init(struct arm_pmu *pmu, struct device *dev)
+{
+ /* Save base arm_pmu so we can invoke its ops when appropriate */
+ def_ops = devm_kmemdup(dev, pmu, sizeof(*def_ops), GFP_KERNEL);
+ if (!def_ops) {
+ pr_warn("Failed to allocate arm_pmu for QCOM extensions");
+ return -ENODEV;
+ }
+
+ pmu->name = "qcom_pmuv3";
+
+ /* Override the necessary ops */
+ pmu->map_event = falkor_map_event;
+ pmu->get_event_idx = falkor_get_event_idx;
+ pmu->reset = falkor_reset;
+ pmu->enable = falkor_enable;
+ pmu->disable = falkor_disable;
+
+ /* Override the necessary attributes */
+ pmu->pmu.attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+ &falkor_pmu_format_attr_group;
+
+ return 1;
+}
+
+ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", qcom_falkor_pmu_init);
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


2018-06-07 13:59:19

by Agustin Vega-Frias

Subject: [RFC V2 1/3] ACPI: add support for sentinel-delimited probe tables

Tables declared with the ACPI_PROBE_TABLE linker macro are typically
traversed by using the start and end symbols created by the linker
script. However, there are some APIs that use sentinel-delimited
tables (e.g. acpi_match_device). To better support these APIs an
additional section is added at the end of the probe table. This
section can be used to add a sentinel for tables that require it.
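
For illustration (this is roughly how the following patches in this series
use it): an empty acpi_device_id entry placed in the new
__<name>_acpi_probe_table_end section terminates the table, so the table
start symbol can be passed directly to sentinel-based APIs, e.g.:

	id = acpi_match_device(&ACPI_PROBE_TABLE(pmu), dev);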

Signed-off-by: Agustin Vega-Frias <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index af24057..5894049 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -219,7 +219,8 @@
. = ALIGN(8); \
VMLINUX_SYMBOL(__##name##_acpi_probe_table) = .; \
KEEP(*(__##name##_acpi_probe_table)) \
- VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .;
+ VMLINUX_SYMBOL(__##name##_acpi_probe_table_end) = .; \
+ KEEP(*(__##name##_acpi_probe_table_end))
#else
#define ACPI_PROBE_TABLE(name)
#endif
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


2018-06-07 16:29:23

by Agustin Vega-Frias

Subject: [RFC V2 2/3] arm_pmu: acpi: add support for CPU PMU variant detection

DT allows CPU PMU variant detection via the PMU device compatible
property. ACPI does not have an equivalent mechanism so we introduce
a probe table to allow this via a device nested inside the CPU device
in the DSDT:

Device (CPU0)
{
Name (_HID, "ACPI0007" /* Processor Device */)
...
Device (PMU0)
{
Name (_HID, "QCOM8150") /* Qualcomm Falkor PMU device */

/*
* The device might also contain _DSD properties to indicate other
* IMPLEMENTATION DEFINED PMU features.
*/
Name (_DSD, Package ()
{
ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
Package ()
{
...
}
})
}
}

With this in place we can declare the variant:

ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", falkor_pmu_init);

The init function is called after the default PMU initialization and is
passed a pointer to the arm_pmu structure and a pointer to the PMU device.
The init function can then override arm_pmu callbacks and attributes and
query more properties from the PMU device.
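
For illustration, a minimal init hook matching that signature could look
roughly like the sketch below (names and overrides illustrative; patch 3/3
has the real Falkor version):

static int falkor_pmu_init(struct arm_pmu *pmu, struct device *dev)
{
	/*
	 * Override selected ops/attributes and query extra properties
	 * (e.g. _DSD via device_property_read_*()) as needed.
	 */
	pmu->name = "qcom_pmuv3";
	pmu->map_event = falkor_map_event;
	return 1; /* non-zero stops iteration over the CPU's child devices */
}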

Signed-off-by: Agustin Vega-Frias <[email protected]>
---
drivers/perf/arm_pmu_acpi.c | 27 +++++++++++++++++++++++++++
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/acpi.h | 11 +++++++++++
include/linux/perf/arm_pmu.h | 1 +
4 files changed, 40 insertions(+)

diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f19751..6b0ca71 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -220,6 +220,26 @@ static int arm_pmu_acpi_cpu_starting(unsigned int cpu)
return 0;
}

+/*
+ * Check if the given child device of the CPU device matches a PMU variant
+ * device declared with ACPI_DECLARE_PMU_VARIANT, if so, pass the arm_pmu
+ * structure and the matching device for further initialization.
+ */
+static int arm_pmu_variant_init(struct device *dev, void *data)
+{
+ extern struct acpi_device_id ACPI_PROBE_TABLE(pmu);
+ unsigned int cpu = *((unsigned int *)data);
+ const struct acpi_device_id *id;
+
+ id = acpi_match_device(&ACPI_PROBE_TABLE(pmu), dev);
+ if (id) {
+ armpmu_acpi_init_fn fn = (armpmu_acpi_init_fn)id->driver_data;
+
+ return fn(per_cpu(probed_pmus, cpu), dev);
+ }
+ return 0;
+}
+
int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
{
int pmu_idx = 0;
@@ -240,6 +260,7 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
*/
for_each_possible_cpu(cpu) {
struct arm_pmu *pmu = per_cpu(probed_pmus, cpu);
+ struct device *dev = get_cpu_device(cpu);
char *base_name;

if (!pmu || pmu->name)
@@ -254,6 +275,10 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn)
return ret;
}

+ ret = device_for_each_child(dev, &cpu, arm_pmu_variant_init);
+ if (ret == -ENODEV)
+ pr_warn("Failed PMU re-init, fallback to plain PMUv3");
+
base_name = pmu->name;
pmu->name = kasprintf(GFP_KERNEL, "%s_%d", base_name, pmu_idx++);
if (!pmu->name) {
@@ -290,3 +315,5 @@ static int arm_pmu_acpi_init(void)
return ret;
}
subsys_initcall(arm_pmu_acpi_init)
+
+ACPI_DECLARE_PMU_SENTINEL();
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5894049..f1be62a 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -600,6 +600,7 @@
IRQCHIP_OF_MATCH_TABLE() \
ACPI_PROBE_TABLE(irqchip) \
ACPI_PROBE_TABLE(timer) \
+ ACPI_PROBE_TABLE(pmu) \
EARLYCON_TABLE()

#define INIT_TEXT \
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 15bfb15..9c410cf 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1153,6 +1153,17 @@ struct acpi_probe_entry {
(&ACPI_PROBE_TABLE_END(t) - \
&ACPI_PROBE_TABLE(t))); \
})
+
+#define ACPI_DECLARE_PMU_VARIANT(name, hid, init_fn) \
+ static const struct acpi_device_id __acpi_probe_##name \
+ __used __section(__pmu_acpi_probe_table) \
+ = { .id = hid, .driver_data = (kernel_ulong_t)init_fn }
+
+#define ACPI_DECLARE_PMU_SENTINEL() \
+ static const struct acpi_device_id __acpi_probe_sentinel \
+ __used __section(__pmu_acpi_probe_table_end) \
+ = { .id = "", .driver_data = 0 }
+
#else
static inline int acpi_dev_get_property(struct acpi_device *adev,
const char *name, acpi_object_type type,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 40036a5..ff43d65 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -123,6 +123,7 @@ int armpmu_map_event(struct perf_event *event,
u32 raw_event_mask);

typedef int (*armpmu_init_fn)(struct arm_pmu *);
+typedef int (*armpmu_acpi_init_fn)(struct arm_pmu *, struct device *);

struct pmu_probe_info {
unsigned int cpuid;
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


2018-06-12 14:42:33

by Mark Rutland

Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

Hi,

On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
> Selection of these events can be envisioned as indexing them from
> a 3D matrix:
> - the first index selects a Region Event Selection Register (PMRESRx_EL0)
> - the second index selects a group from which only one event at a time
> can be selected
> - the third index selects the event
>
> The event is encoded into perf_event_attr.config as 0xPRCCG, where:
> P [config:16 ] = prefix (flag that indicates a matrix-based event)
> R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
> G [config:0-3 ] = group (specifies the event group)
> CC [config:4-11 ] = code (specifies the event)
>
> Events with the P flag set to zero are treated as common PMUv3 events
> and are directly programmed into PMXEVTYPERx_EL0.
>
> The first two indexes are set combining the RESR and group number with
> a base number and writing it into the architected PMXEVTYPER_EL0 register.
> The third index is set by writing the code into the bits corresponding
> with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
> register.

When are the IMP DEF registers accessible at EL0? Are those governed by
the same controls as the architected registers?

[...]

> +/*
> + * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions support
> + *
> + * Current extensions supported:
> + *
> + * - Matrix-based microarchitectural events support
> + *
> + * Selection of these events can be envisioned as indexing them from
> + * a 3D matrix:
> + * - the first index selects a Region Event Selection Register (PMRESRx_EL0)
> + * - the second index selects a group from which only one event at a time
> + * can be selected
> + * - the third index selects the event
> + *
> + * The event is encoded into perf_event_attr.config as 0xPRCCG, where:
> + * P [config:16 ] = prefix (flag that indicates a matrix-based event)
> + * R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
> + * G [config:0-3 ] = group (specifies the event group)
> + * CC [config:4-11 ] = code (specifies the event)
> + *
> + * Events with the P flag set to zero are treated as common PMUv3 events
> + * and are directly programmed into PMXEVTYPERx_EL0.

When PMUv3 is given a raw event code, the config fields should be the
PMU event number, and this conflicts with RESERVED encodings.

I'd rather we used a separate field for the QC extension events. e.g.
turn config1 into a flags field, and move the P flag there.

We *should* add code to sanity check those fields are zero in the PMUv3
driver, even though it's a potential ABI break to start now.
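
e.g. something along these lines (untested sketch):

	PMU_FORMAT_ATTR(prefix, "config1:0");

	/* and in the mapping path: */
	bool qc_matrix = event->attr.config1 & BIT(0);

... keying the matrix handling off config1 and leaving config to carry
only the PMUv3 event number.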

> + *
> + * The first two indexes are set combining the RESR and group number with
> + * a base number and writing it into the architected PMXEVTYPER_EL0 register.
> + * The third index is set by writing the code into the bits corresponding
> + * with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
> + * register.
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/perf/arm_pmu.h>

You'll also need:

#include <linux/bitops.h>
#include <linux/device.h>
#include <linux/perf_event.h>
#include <linux/printk.h>
#include <linux/types.h>

#include <asm/barrier.h>
#include <asm/sysreg.h>

> +
> +#define pmresr0_el0 sys_reg(3, 5, 11, 3, 0)
> +#define pmresr1_el0 sys_reg(3, 5, 11, 3, 2)
> +#define pmresr2_el0 sys_reg(3, 5, 11, 3, 4)
> +#define pmxevcntcr_el0 sys_reg(3, 5, 11, 0, 3)
> +
> +#define QC_EVT_PFX_SHIFT 16
> +#define QC_EVT_REG_SHIFT 12
> +#define QC_EVT_CODE_SHIFT 4
> +#define QC_EVT_GRP_SHIFT 0
> +#define QC_EVT_PFX_MASK GENMASK(QC_EVT_PFX_SHIFT, QC_EVT_PFX_SHIFT)
> +#define QC_EVT_REG_MASK GENMASK(QC_EVT_REG_SHIFT + 3, QC_EVT_REG_SHIFT)
> +#define QC_EVT_CODE_MASK GENMASK(QC_EVT_CODE_SHIFT + 7, QC_EVT_CODE_SHIFT)
> +#define QC_EVT_GRP_MASK GENMASK(QC_EVT_GRP_SHIFT + 3, QC_EVT_GRP_SHIFT)
> +#define QC_EVT_PRG_MASK (QC_EVT_PFX_MASK | QC_EVT_REG_MASK | QC_EVT_GRP_MASK)
> +#define QC_EVT_PRG(event) ((event) & QC_EVT_PRG_MASK)
> +#define QC_EVT_REG(event) (((event) & QC_EVT_REG_MASK) >> QC_EVT_REG_SHIFT)
> +#define QC_EVT_CODE(event) (((event) & QC_EVT_CODE_MASK) >> QC_EVT_CODE_SHIFT)
> +#define QC_EVT_GROUP(event) (((event) & QC_EVT_GRP_MASK) >> QC_EVT_GRP_SHIFT)
> +
> +#define QC_MAX_GROUP 7
> +#define QC_MAX_RESR 2
> +#define QC_BITS_PER_GROUP 8
> +#define QC_RESR_ENABLE BIT_ULL(63)
> +#define QC_RESR_EVT_BASE 0xd8
> +
> +static struct arm_pmu *def_ops;
> +
> +static inline void falkor_write_pmresr(u64 reg, u64 val)
> +{
> + if (reg == 0)
> + write_sysreg_s(val, pmresr0_el0);
> + else if (reg == 1)
> + write_sysreg_s(val, pmresr1_el0);
> + else
> + write_sysreg_s(val, pmresr2_el0);
> +}
> +
> +static inline u64 falkor_read_pmresr(u64 reg)
> +{
> + return (reg == 0 ? read_sysreg_s(pmresr0_el0) :
> + reg == 1 ? read_sysreg_s(pmresr1_el0) :
> + read_sysreg_s(pmresr2_el0));
> +}

Please use switch statements for both of these.
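
e.g. for the write side (untested sketch, the read side is analogous):

	static inline void falkor_write_pmresr(u64 reg, u64 val)
	{
		switch (reg) {
		case 0:
			write_sysreg_s(val, pmresr0_el0);
			break;
		case 1:
			write_sysreg_s(val, pmresr1_el0);
			break;
		default:
			write_sysreg_s(val, pmresr2_el0);
			break;
		}
	}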

> +
> +static void falkor_set_resr(u64 reg, u64 group, u64 code)
> +{
> + u64 shift = group * QC_BITS_PER_GROUP;
> + u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
> + u64 val;
> +
> + val = falkor_read_pmresr(reg) & ~mask;
> + val |= (code << shift);
> + val |= QC_RESR_ENABLE;
> + falkor_write_pmresr(reg, val);
> +}
> +
> +static void falkor_clear_resr(u64 reg, u64 group)
> +{
> + u32 shift = group * QC_BITS_PER_GROUP;
> + u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
> + u64 val = falkor_read_pmresr(reg) & ~mask;
> +
> + falkor_write_pmresr(reg, val == QC_RESR_ENABLE ? 0 : val);
> +}
> +
> +/*
> + * Check if e1 and e2 conflict with each other
> + *
> + * e1 is a matrix-based microarchitectural event we are checking against e2.
> + * A conflict exists if the events use the same reg, group, and a different
> + * code. Events with the same code are allowed because they could be using
> + * different filters (e.g. one to count user space and the other to count
> + * kernel space events).
> + */

What problem occurs when there's a conflict?

Does the filter matter at all? When happens if I open two identical
events, both counting the same reg, group, and code, with the same
filter?

> +static inline int events_conflict(struct perf_event *e1, struct perf_event *e2)
> +{
> + if ((e1 != e2) &&
> + (e1->pmu == e2->pmu) &&
> + (QC_EVT_PRG(e1->attr.config) == QC_EVT_PRG(e2->attr.config)) &&
> + (QC_EVT_CODE(e1->attr.config) != QC_EVT_CODE(e2->attr.config))) {
> + pr_debug_ratelimited(
> + "Group exclusion: conflicting events %llx %llx\n",
> + e1->attr.config,
> + e2->attr.config);
> + return 1;
> + }
> + return 0;
> +}

This would be easier to read as a series of tests:

static inline bool events_conflict(struct perf_event *new,
struct perf_event *other)
{
/* own group leader */
if (new == other)
return false;

/* software events can't conflict */
if (is_software_event(other))
return false;

/* No conflict if using different reg or group */
if (QC_EVT_PRG(new->attr.config) != QC_EVT_PRG(other->attr.config))
return false;

/* Same reg and group is fine so long as code matches */
if (QC_EVT_CODE(new->attr.config) == QC_EVT_CODE(other->attr.config))
return false;


pr_debug_ratelimited("Group exclusion: conflicting events %llx %llx\n",
new->attr.config, other->attr.config);

return true;

}

> +
> +/*
> + * Check if the given event is valid for the PMU and if so return the value
> + * that can be used in PMXEVTYPER_EL0 to select the event
> + */
> +static int falkor_map_event(struct perf_event *event)
> +{
> + u64 reg = QC_EVT_REG(event->attr.config);
> + u64 group = QC_EVT_GROUP(event->attr.config);
> + struct perf_event *leader;
> + struct perf_event *sibling;
> +
> + if (!(event->attr.config & QC_EVT_PFX_MASK))
> + /* Common PMUv3 event, forward to the original op */
> + return def_ops->map_event(event);

The QC event codes should only be used when either:

* event->attr.type is PERF_TYPE_RAW, or:

* event->pmu.type is this PMU's dynamic type

... otherwise this will accept events meant for other PMUs, and/or
override conflicting events in other type namespaces (e.g.
PERF_EVENT_TYPE_HW, PERF_EVENT_TYPE_CACHE).
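
i.e. before interpreting the QC bits, something like (sketch only):

	if (event->attr.type != PERF_TYPE_RAW &&
	    event->attr.type != event->pmu->type)
		return def_ops->map_event(event);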

> +
> + /* Is it a valid matrix event? */
> + if ((group > QC_MAX_GROUP) || (reg > QC_MAX_RESR))
> + return -ENOENT;
> +
> + /* If part of an event group, check if the event can be put in it */
> +
> + leader = event->group_leader;
> + if (events_conflict(event, leader))
> + return -ENOENT;
> +
> + for_each_sibling_event(sibling, leader)
> + if (events_conflict(event, sibling))
> + return -ENOENT;
> +
> + return QC_RESR_EVT_BASE + reg*8 + group;

Nit: spacing around binary operators please.

> +}
> +
> +/*
> + * Find a slot for the event on the current CPU
> + */
> +static int falkor_get_event_idx(struct pmu_hw_events *cpuc, struct perf_event *event)
> +{
> + int idx;
> +
> + if (!!(event->attr.config & QC_EVT_PFX_MASK))

The '!!' isn't required.

> + /* Matrix event, check for conflicts with existing events */
> + for_each_set_bit(idx, cpuc->used_mask, ARMPMU_MAX_HWEVENTS)
> + if (cpuc->events[idx] &&
> + events_conflict(event, cpuc->events[idx]))
> + return -ENOENT;
> +
> + /* Let the original op handle the rest */
> + idx = def_ops->get_event_idx(cpuc, event);

Same comments as for falkor_map_event().

> +
> + /*
> + * This is called for actually allocating the events, but also with
> + * a dummy pmu_hw_events when validating groups, for that case we
> + * need to ensure that cpuc->events[idx] is NULL so we don't use
> + * an uninitialized pointer. Conflicts for matrix events in groups
> + * are checked during event mapping anyway (see falkor_event_map).
> + */
> + cpuc->events[idx] = NULL;
> +
> + return idx;
> +}
> +
> +/*
> + * Reset the PMU
> + */
> +static void falkor_reset(void *info)
> +{
> + struct arm_pmu *pmu = (struct arm_pmu *)info;
> + u32 i, ctrs = pmu->num_events;
> +
> + /* PMRESRx_EL0 regs are unknown at reset, except for the EN field */
> + for (i = 0; i <= QC_MAX_RESR; i++)
> + falkor_write_pmresr(i, 0);
> +
> + /* PMXEVCNTCRx_EL0 regs are unknown at reset */
> + for (i = 0; i <= ctrs; i++) {
> + write_sysreg(i, pmselr_el0);
> + isb();
> + write_sysreg_s(0, pmxevcntcr_el0);
> + }
> +
> + /* Let the original op handle the rest */
> + def_ops->reset(info);
> +}
> +
> +/*
> + * Enable the given event
> + */
> +static void falkor_enable(struct perf_event *event)
> +{
> + if (!!(event->attr.config & QC_EVT_PFX_MASK)) {

The '!!' isn't required.

> + /* Matrix event, program the appropriate PMRESRx_EL0 */
> + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
> + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
> + u64 reg = QC_EVT_REG(event->attr.config);
> + u64 code = QC_EVT_CODE(event->attr.config);
> + u64 group = QC_EVT_GROUP(event->attr.config);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&events->pmu_lock, flags);
> + falkor_set_resr(reg, group, code);
> + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);

Why is the spinlock required?

AFACIT this should only ever be called in contexts where IRQs are
disabled already.

> + }
> +
> + /* Let the original op handle the rest */
> + def_ops->enable(event);
> +}
> +
> +/*
> + * Disable the given event
> + */
> +static void falkor_disable(struct perf_event *event)
> +{
> + /* Use the original op to disable the counter and interrupt */
> + def_ops->enable(event);
> +
> + if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
> + /* Matrix event, de-program the appropriate PMRESRx_EL0 */
> + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
> + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
> + u64 reg = QC_EVT_REG(event->attr.config);
> + u64 group = QC_EVT_GROUP(event->attr.config);
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&events->pmu_lock, flags);
> + falkor_clear_resr(reg, group);
> + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> + }
> +}

Same comments as with falkor_enable().

> +
> +PMU_FORMAT_ATTR(event, "config:0-15");
> +PMU_FORMAT_ATTR(prefix, "config:16");
> +PMU_FORMAT_ATTR(reg, "config:12-15");
> +PMU_FORMAT_ATTR(code, "config:4-11");
> +PMU_FORMAT_ATTR(group, "config:0-3");

What sort of events are available? Do you plan to add anything to the
userspace event database in tools/perf/pmu-events/ ?

> +
> +static struct attribute *falkor_pmu_formats[] = {
> + &format_attr_event.attr,
> + &format_attr_prefix.attr,
> + &format_attr_reg.attr,
> + &format_attr_code.attr,
> + &format_attr_group.attr,
> + NULL,
> +};
> +
> +static struct attribute_group falkor_pmu_format_attr_group = {
> + .name = "format",
> + .attrs = falkor_pmu_formats,
> +};
> +
> +static int qcom_falkor_pmu_init(struct arm_pmu *pmu, struct device *dev)
> +{
> + /* Save base arm_pmu so we can invoke its ops when appropriate */
> + def_ops = devm_kmemdup(dev, pmu, sizeof(*def_ops), GFP_KERNEL);
> + if (!def_ops) {
> + pr_warn("Failed to allocate arm_pmu for QCOM extensions");
> + return -ENODEV;
> + }
> +
> + pmu->name = "qcom_pmuv3";

All the other CPU PMUs on an ARM ACPI system will have an index suffix,
e.g. "armv8_pmuv3_0". I can see why we might want to change the name to
indicate the QC extensions, but I think we should keep the existing
pattern, with a '_0' suffix here.

> +
> + /* Override the necessary ops */
> + pmu->map_event = falkor_map_event;
> + pmu->get_event_idx = falkor_get_event_idx;
> + pmu->reset = falkor_reset;
> + pmu->enable = falkor_enable;
> + pmu->disable = falkor_disable;

I'm somewhat concerned by hooking into the existing PMU code at this
level, but I don't currently have a better suggestion.

Thanks,
Mark.

> +
> + /* Override the necessary attributes */
> + pmu->pmu.attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
> + &falkor_pmu_format_attr_group;
> +
> + return 1;
> +}
> +
> +ACPI_DECLARE_PMU_VARIANT(qcom_falkor, "QCOM8150", qcom_falkor_pmu_init);
> --
> Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>

2018-06-12 20:42:44

by Agustin Vega-Frias

Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

Hi Mark,

On 2018-06-12 10:40, Mark Rutland wrote:
> Hi,
>
> On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
>> Selection of these events can be envisioned as indexing them from
>> a 3D matrix:
>> - the first index selects a Region Event Selection Register
>> (PMRESRx_EL0)
>> - the second index selects a group from which only one event at a time
>> can be selected
>> - the third index selects the event
>>
>> The event is encoded into perf_event_attr.config as 0xPRCCG, where:
>> P [config:16 ] = prefix (flag that indicates a matrix-based
>> event)
>> R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
>> G [config:0-3 ] = group (specifies the event group)
>> CC [config:4-11 ] = code (specifies the event)
>>
>> Events with the P flag set to zero are treated as common PMUv3 events
>> and are directly programmed into PMXEVTYPERx_EL0.
>>
>> The first two indexes are set combining the RESR and group number with
>> a base number and writing it into the architected PMXEVTYPER_EL0
>> register.
>> The third index is set by writing the code into the bits corresponding
>> with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
>> register.
>
> When are the IMP DEF registers accessible at EL0? Are those governed by
> the same controls as the architected registers?

No, there is a separate IMP DEF register to control access.

>
> [...]
>
>> +/*
>> + * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions
>> support
>> + *
>> + * Current extensions supported:
>> + *
>> + * - Matrix-based microarchitectural events support
>> + *
>> + * Selection of these events can be envisioned as indexing them
>> from
>> + * a 3D matrix:
>> + * - the first index selects a Region Event Selection Register
>> (PMRESRx_EL0)
>> + * - the second index selects a group from which only one event at
>> a time
>> + * can be selected
>> + * - the third index selects the event
>> + *
>> + * The event is encoded into perf_event_attr.config as 0xPRCCG,
>> where:
>> + * P [config:16 ] = prefix (flag that indicates a
>> matrix-based event)
>> + * R [config:12-15] = register (specifies the PMRESRx_EL0
>> instance)
>> + * G [config:0-3 ] = group (specifies the event group)
>> + * CC [config:4-11 ] = code (specifies the event)
>> + *
>> + * Events with the P flag set to zero are treated as common PMUv3
>> events
>> + * and are directly programmed into PMXEVTYPERx_EL0.
>
> When PMUv3 is given a raw event code, the config fields should be the
> PMU event number, and this conflicts with RESERVED encodings.
>
> I'd rather we used a separate field for the QC extension events. e.g.
> turn config1 into a flags field, and move the P flag there.
>
> We *should* add code to sanity check those fields are zero in the PMUv3
> driver, even though it's a potential ABI break to start now.

I should have stated clearly that in this case the event code is directly
programmed into PMXEVTYPERx_EL0.evtCount, not by this code, but by the
PMUv3 code, which will do the masking and ensure reserved bits are not
touched. IOW, that case is no different from the raw event or a common
event case.

I would prefer to keep the flag in config because it allows the use of
raw code encodings to access these events more easily, and given that
the flag is never propagated to any register I believe it is safe.
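
For instance (event values purely illustrative, not real Falkor codes),
keeping the flag in config means a matrix event can be requested with a
plain raw encoding such as

	perf stat -e r112a3 ...

as well as with the named format terms from this patch, e.g.

	perf stat -e qcom_pmuv3_0/prefix=1,reg=1,code=0x2a,group=3/ ...

whereas with the flag moved to config1 the raw form alone would no longer
select a matrix event.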

>
>> + *
>> + * The first two indexes are set combining the RESR and group
>> number with
>> + * a base number and writing it into the architected PMXEVTYPER_EL0
>> register.
>> + * The third index is set by writing the code into the bits
>> corresponding
>> + * with the group into the appropriate IMPLEMENTATION DEFINED
>> PMRESRx_EL0
>> + * register.
>> + */
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/perf/arm_pmu.h>
>
> You'll also need:
>
> #include <linux/bitops.h>
> #include <linux/device.h>
> #include <linux/perf_event.h>
> #include <linux/printk.h>
> #include <linux/types.h>
>
> #include <asm/barrier.h>
> #include <asm/sysreg.h>
>

Will do.

>> +
>> +#define pmresr0_el0 sys_reg(3, 5, 11, 3, 0)
>> +#define pmresr1_el0 sys_reg(3, 5, 11, 3, 2)
>> +#define pmresr2_el0 sys_reg(3, 5, 11, 3, 4)
>> +#define pmxevcntcr_el0 sys_reg(3, 5, 11, 0, 3)
>> +
>> +#define QC_EVT_PFX_SHIFT 16
>> +#define QC_EVT_REG_SHIFT 12
>> +#define QC_EVT_CODE_SHIFT 4
>> +#define QC_EVT_GRP_SHIFT 0
>> +#define QC_EVT_PFX_MASK GENMASK(QC_EVT_PFX_SHIFT,
>> QC_EVT_PFX_SHIFT)
>> +#define QC_EVT_REG_MASK GENMASK(QC_EVT_REG_SHIFT + 3,
>> QC_EVT_REG_SHIFT)
>> +#define QC_EVT_CODE_MASK GENMASK(QC_EVT_CODE_SHIFT + 7,
>> QC_EVT_CODE_SHIFT)
>> +#define QC_EVT_GRP_MASK GENMASK(QC_EVT_GRP_SHIFT + 3,
>> QC_EVT_GRP_SHIFT)
>> +#define QC_EVT_PRG_MASK (QC_EVT_PFX_MASK | QC_EVT_REG_MASK |
>> QC_EVT_GRP_MASK)
>> +#define QC_EVT_PRG(event) ((event) & QC_EVT_PRG_MASK)
>> +#define QC_EVT_REG(event) (((event) & QC_EVT_REG_MASK) >>
>> QC_EVT_REG_SHIFT)
>> +#define QC_EVT_CODE(event) (((event) & QC_EVT_CODE_MASK) >>
>> QC_EVT_CODE_SHIFT)
>> +#define QC_EVT_GROUP(event) (((event) & QC_EVT_GRP_MASK) >>
>> QC_EVT_GRP_SHIFT)
>> +
>> +#define QC_MAX_GROUP 7
>> +#define QC_MAX_RESR 2
>> +#define QC_BITS_PER_GROUP 8
>> +#define QC_RESR_ENABLE BIT_ULL(63)
>> +#define QC_RESR_EVT_BASE 0xd8
>> +
>> +static struct arm_pmu *def_ops;
>> +
>> +static inline void falkor_write_pmresr(u64 reg, u64 val)
>> +{
>> + if (reg == 0)
>> + write_sysreg_s(val, pmresr0_el0);
>> + else if (reg == 1)
>> + write_sysreg_s(val, pmresr1_el0);
>> + else
>> + write_sysreg_s(val, pmresr2_el0);
>> +}
>> +
>> +static inline u64 falkor_read_pmresr(u64 reg)
>> +{
>> + return (reg == 0 ? read_sysreg_s(pmresr0_el0) :
>> + reg == 1 ? read_sysreg_s(pmresr1_el0) :
>> + read_sysreg_s(pmresr2_el0));
>> +}
>
> Please use switch statements for both of these.

Will do.

>
>> +
>> +static void falkor_set_resr(u64 reg, u64 group, u64 code)
>> +{
>> + u64 shift = group * QC_BITS_PER_GROUP;
>> + u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
>> + u64 val;
>> +
>> + val = falkor_read_pmresr(reg) & ~mask;
>> + val |= (code << shift);
>> + val |= QC_RESR_ENABLE;
>> + falkor_write_pmresr(reg, val);
>> +}
>> +
>> +static void falkor_clear_resr(u64 reg, u64 group)
>> +{
>> + u32 shift = group * QC_BITS_PER_GROUP;
>> + u64 mask = GENMASK(shift + QC_BITS_PER_GROUP - 1, shift);
>> + u64 val = falkor_read_pmresr(reg) & ~mask;
>> +
>> + falkor_write_pmresr(reg, val == QC_RESR_ENABLE ? 0 : val);
>> +}
>> +
>> +/*
>> + * Check if e1 and e2 conflict with each other
>> + *
>> + * e1 is a matrix-based microarchitectural event we are checking
>> against e2.
>> + * A conflict exists if the events use the same reg, group, and a
>> different
>> + * code. Events with the same code are allowed because they could be
>> using
>> + * different filters (e.g. one to count user space and the other to
>> count
>> + * kernel space events).
>> + */
>
> What problem occurs when there's a conflict?

No real problem at the hardware level, since only one event at a time can
be selected from a group, but if this is not detected and dealt with the
user will receive incorrect information.

Selection is done through the PMRESRx_EL0 registers; these are 64-bit
registers with an enable bit (63), one 7-bit field for group 7 (62-56)
event selection, and seven 8-bit fields for group 6 to group 0 event
selection. So it is physically impossible to select two different events
from the same group at the same time.
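
Roughly, the layout is:

	PMRESRx_EL0: [63] EN | [62:56] group 7 | [55:48] group 6 | ... | [7:0] group 0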

>
> Does the filter matter at all? When happens if I open two identical
> events, both counting the same reg, group, and code, with the same
> filter?

That is possible and allowed, similar to counting the same common event
in two configurable counters. Only problem is wasting a counter
resource.

>
>> +static inline int events_conflict(struct perf_event *e1, struct
>> perf_event *e2)
>> +{
>> + if ((e1 != e2) &&
>> + (e1->pmu == e2->pmu) &&
>> + (QC_EVT_PRG(e1->attr.config) == QC_EVT_PRG(e2->attr.config)) &&
>> + (QC_EVT_CODE(e1->attr.config) != QC_EVT_CODE(e2->attr.config)))
>> {
>> + pr_debug_ratelimited(
>> + "Group exclusion: conflicting events %llx %llx\n",
>> + e1->attr.config,
>> + e2->attr.config);
>> + return 1;
>> + }
>> + return 0;
>> +}
>
> This would be easier to read as a series of tests:
>
> static inline bool events_conflict(struct perf_event *new,
> struct perf_event *other)
> {
> /* own group leader */
> if (new == other)
> return false;
>
> /* software events can't conflict */
> if (is_sw_event(other))
> return false;
>
> /* No conflict if using different reg or group */
> if (QC_EVT_PRG(new->attr.config) != QC_EVT_PRG(other->attr.config))
> return false;
>
> /* Same reg and group is fine so long as code matches */
> if (QC_EVT_CODE(new->attr.config) == QC_EVT_PRG(other->attr.config)
> return false;
>
>
> pr_debug_ratelimited("Group exclusion: conflicting events %llx
> %llx\n",
> new->attr.config, other->attr.config);
>
> return true;
>
> }

Will rework.

>
>> +
>> +/*
>> + * Check if the given event is valid for the PMU and if so return the
>> value
>> + * that can be used in PMXEVTYPER_EL0 to select the event
>> + */
>> +static int falkor_map_event(struct perf_event *event)
>> +{
>> + u64 reg = QC_EVT_REG(event->attr.config);
>> + u64 group = QC_EVT_GROUP(event->attr.config);
>> + struct perf_event *leader;
>> + struct perf_event *sibling;
>> +
>> + if (!(event->attr.config & QC_EVT_PFX_MASK))
>> + /* Common PMUv3 event, forward to the original op */
>> + return def_ops->map_event(event);
>
> The QC event codes should only be used when either:
>
> * event->attr.type is PERF_TYPE_RAW, or:
>
> * event->pmu.type is this PMU's dynamic type
>
> ... otherwise this will accept events meant for other PMUs, and/or
> override conflicting events in other type namespaces (e.g.
> PERF_EVENT_TYPE_HW, PERF_EVENT_TYPE_CACHE).
>

Will add a check.

>> +
>> + /* Is it a valid matrix event? */
>> + if ((group > QC_MAX_GROUP) || (reg > QC_MAX_RESR))
>> + return -ENOENT;
>> +
>> + /* If part of an event group, check if the event can be put in it */
>> +
>> + leader = event->group_leader;
>> + if (events_conflict(event, leader))
>> + return -ENOENT;
>> +
>> + for_each_sibling_event(sibling, leader)
>> + if (events_conflict(event, sibling))
>> + return -ENOENT;
>> +
>> + return QC_RESR_EVT_BASE + reg*8 + group;
>
> Nit: spacing around binary operators please.
>
>> +}
>> +
>> +/*
>> + * Find a slot for the event on the current CPU
>> + */
>> +static int falkor_get_event_idx(struct pmu_hw_events *cpuc, struct
>> perf_event *event)
>> +{
>> + int idx;
>> +
>> + if (!!(event->attr.config & QC_EVT_PFX_MASK))
>
> The '!!' isn't required.
>
>> + /* Matrix event, check for conflicts with existing events */
>> + for_each_set_bit(idx, cpuc->used_mask, ARMPMU_MAX_HWEVENTS)
>> + if (cpuc->events[idx] &&
>> + events_conflict(event, cpuc->events[idx]))
>> + return -ENOENT;
>> +
>> + /* Let the original op handle the rest */
>> + idx = def_ops->get_event_idx(cpuc, event);
>
> Same comments as for falkor_map_event().
>
>> +
>> + /*
>> + * This is called for actually allocating the events, but also with
>> + * a dummy pmu_hw_events when validating groups, for that case we
>> + * need to ensure that cpuc->events[idx] is NULL so we don't use
>> + * an uninitialized pointer. Conflicts for matrix events in groups
>> + * are checked during event mapping anyway (see falkor_event_map).
>> + */
>> + cpuc->events[idx] = NULL;
>> +
>> + return idx;
>> +}
>> +
>> +/*
>> + * Reset the PMU
>> + */
>> +static void falkor_reset(void *info)
>> +{
>> + struct arm_pmu *pmu = (struct arm_pmu *)info;
>> + u32 i, ctrs = pmu->num_events;
>> +
>> + /* PMRESRx_EL0 regs are unknown at reset, except for the EN field */
>> + for (i = 0; i <= QC_MAX_RESR; i++)
>> + falkor_write_pmresr(i, 0);
>> +
>> + /* PMXEVCNTCRx_EL0 regs are unknown at reset */
>> + for (i = 0; i <= ctrs; i++) {
>> + write_sysreg(i, pmselr_el0);
>> + isb();
>> + write_sysreg_s(0, pmxevcntcr_el0);
>> + }
>> +
>> + /* Let the original op handle the rest */
>> + def_ops->reset(info);
>> +}
>> +
>> +/*
>> + * Enable the given event
>> + */
>> +static void falkor_enable(struct perf_event *event)
>> +{
>> + if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
>
> The '!!' isn't required.
>
>> + /* Matrix event, program the appropriate PMRESRx_EL0 */
>> + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
>> + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
>> + u64 reg = QC_EVT_REG(event->attr.config);
>> + u64 code = QC_EVT_CODE(event->attr.config);
>> + u64 group = QC_EVT_GROUP(event->attr.config);
>> + unsigned long flags;
>> +
>> + raw_spin_lock_irqsave(&events->pmu_lock, flags);
>> + falkor_set_resr(reg, group, code);
>> + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
>
> Why is the spinlock required?
>
> AFACIT this should only ever be called in contexts where IRQs are
> disabled already.
>

falkor_set_resr is a read-modify-write operation. The PMUv3 code uses
the spinlock to protect the counter selection too (armv8pmu_enable_event).
I believe this is to deal with event rotation, which can potentially
be active when we are creating new events.

>> + }
>> +
>> + /* Let the original op handle the rest */
>> + def_ops->enable(event);
>> +}
>> +
>> +/*
>> + * Disable the given event
>> + */
>> +static void falkor_disable(struct perf_event *event)
>> +{
>> + /* Use the original op to disable the counter and interrupt */
>> + def_ops->enable(event);
>> +
>> + if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
>> + /* Matrix event, de-program the appropriate PMRESRx_EL0 */
>> + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
>> + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
>> + u64 reg = QC_EVT_REG(event->attr.config);
>> + u64 group = QC_EVT_GROUP(event->attr.config);
>> + unsigned long flags;
>> +
>> + raw_spin_lock_irqsave(&events->pmu_lock, flags);
>> + falkor_clear_resr(reg, group);
>> + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
>> + }
>> +}
>
> Same comments as with falkor_enable().
>
>> +
>> +PMU_FORMAT_ATTR(event, "config:0-15");
>> +PMU_FORMAT_ATTR(prefix, "config:16");
>> +PMU_FORMAT_ATTR(reg, "config:12-15");
>> +PMU_FORMAT_ATTR(code, "config:4-11");
>> +PMU_FORMAT_ATTR(group, "config:0-3");
>
> What sort of events are available? Do you plan to add anything to the
> userspace event database in tools/perf/pmu-events/ ?
>

Yes, we are still doing some internal work to see what we can put in
the driver or as JSON events.

>> +
>> +static struct attribute *falkor_pmu_formats[] = {
>> + &format_attr_event.attr,
>> + &format_attr_prefix.attr,
>> + &format_attr_reg.attr,
>> + &format_attr_code.attr,
>> + &format_attr_group.attr,
>> + NULL,
>> +};
>> +
>> +static struct attribute_group falkor_pmu_format_attr_group = {
>> + .name = "format",
>> + .attrs = falkor_pmu_formats,
>> +};
>> +
>> +static int qcom_falkor_pmu_init(struct arm_pmu *pmu, struct device
>> *dev)
>> +{
>> + /* Save base arm_pmu so we can invoke its ops when appropriate */
>> + def_ops = devm_kmemdup(dev, pmu, sizeof(*def_ops), GFP_KERNEL);
>> + if (!def_ops) {
>> + pr_warn("Failed to allocate arm_pmu for QCOM extensions");
>> + return -ENODEV;
>> + }
>> +
>> + pmu->name = "qcom_pmuv3";
>
> All the other CPU PMUs on an ARM ACPI system will have an index suffix,
> e.g. "armv8_pmuv3_0". I can see why we might want to change the name to
> indicate the QC extensions, but I think we should keep the existing
> pattern, with a '_0' suffix here.

This overrides the name before the suffix is added, so the PMU name will
be qcom_pmuv3_0 for Centriq 2400, which has only Falkor CPUs.

>
>> +
>> + /* Override the necessary ops */
>> + pmu->map_event = falkor_map_event;
>> + pmu->get_event_idx = falkor_get_event_idx;
>> + pmu->reset = falkor_reset;
>> + pmu->enable = falkor_enable;
>> + pmu->disable = falkor_disable;
>
> I'm somewhat concerned by hooking into the existing PMU code at this
> level, but I don't currently have a better suggestion.
>

IMO this is no different from other PMUs implemented on top of the
arm_pmu framework. The difference is of course that I'm calling back
into the base PMUv3 ops, but the alternative would be to duplicate,
which is what we want to avoid.

Thanks for the detailed feedback; I'll try to submit V3 before week's
end. Let me know if you have any concerns about my replies above.

Thanks,
Agustín

--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

2018-06-13 10:36:54

by Will Deacon

Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

On Tue, Jun 12, 2018 at 04:41:32PM -0400, Agustin Vega-Frias wrote:
> Hi Mark,
>
> On 2018-06-12 10:40, Mark Rutland wrote:
> >Hi,
> >
> >On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
> >>Selection of these events can be envisioned as indexing them from
> >>a 3D matrix:
> >>- the first index selects a Region Event Selection Register
> >>(PMRESRx_EL0)
> >>- the second index selects a group from which only one event at a time
> >> can be selected
> >>- the third index selects the event
> >>
> >>The event is encoded into perf_event_attr.config as 0xPRCCG, where:
> >> P [config:16 ] = prefix (flag that indicates a matrix-based
> >>event)
> >> R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
> >> G [config:0-3 ] = group (specifies the event group)
> >> CC [config:4-11 ] = code (specifies the event)
> >>
> >>Events with the P flag set to zero are treated as common PMUv3 events
> >>and are directly programmed into PMXEVTYPERx_EL0.
> >>
> >>The first two indexes are set combining the RESR and group number with
> >>a base number and writing it into the architected PMXEVTYPER_EL0
> >>register.
> >>The third index is set by writing the code into the bits corresponding
> >>with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
> >>register.
> >
> >When are the IMP DEF registers accessible at EL0? Are those governed by
> >the same controls as the architected registers?
>
> No, there is a separate IMP DEF register to control access.

Great :( We need to make sure we disable EL0 access during boot then, but
that means we need to probe for the existence of this thing in head.S
(since the PMU driver might not get loaded).

Also, what's the kvm story here so that we don't accidentally open up a
VM-VM side-channel via these registers? How do the EL1 trapping controls
work?

Will

2018-06-13 13:00:51

by Marc Zyngier

Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

On 13/06/18 11:35, Will Deacon wrote:
> On Tue, Jun 12, 2018 at 04:41:32PM -0400, Agustin Vega-Frias wrote:
>> Hi Mark,
>>
>> On 2018-06-12 10:40, Mark Rutland wrote:
>>> Hi,
>>>
>>> On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
>>>> Selection of these events can be envisioned as indexing them from
>>>> a 3D matrix:
>>>> - the first index selects a Region Event Selection Register
>>>> (PMRESRx_EL0)
>>>> - the second index selects a group from which only one event at a time
>>>> can be selected
>>>> - the third index selects the event
>>>>
>>>> The event is encoded into perf_event_attr.config as 0xPRCCG, where:
>>>> P [config:16 ] = prefix (flag that indicates a matrix-based
>>>> event)
>>>> R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
>>>> G [config:0-3 ] = group (specifies the event group)
>>>> CC [config:4-11 ] = code (specifies the event)
>>>>
>>>> Events with the P flag set to zero are treated as common PMUv3 events
>>>> and are directly programmed into PMXEVTYPERx_EL0.
>>>>
>>>> The first two indexes are set combining the RESR and group number with
>>>> a base number and writing it into the architected PMXEVTYPER_EL0
>>>> register.
>>>> The third index is set by writing the code into the bits corresponding
>>>> with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
>>>> register.
>>>
>>> When are the IMP DEF registers accessible at EL0? Are those governed by
>>> the same controls as the architected registers?
>>
>> No, there is a separate IMP DEF register to control access.
>
> Great :( We need to make sure we disable EL0 access during boot then, but
> that means we need to probe for the existence of this thing in head.S
> (since the PMU driver might not get loaded).
>
> Also, what's the kvm story here so that we don't accidentally open up a
> VM-VM side-channel via these registers? How do the EL1 trapping controls
> work?

We'd trap the IMPDEF register access and inject an UNDEF (assuming that
the IMPDEF trapping works correctly). I have strictly no plan to support
this in a guest.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-06-13 13:02:59

by Will Deacon

Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

On Wed, Jun 13, 2018 at 01:59:58PM +0100, Marc Zyngier wrote:
> On 13/06/18 11:35, Will Deacon wrote:
> > On Tue, Jun 12, 2018 at 04:41:32PM -0400, Agustin Vega-Frias wrote:
> >> On 2018-06-12 10:40, Mark Rutland wrote:
> >>> On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
> >>>> Selection of these events can be envisioned as indexing them from
> >>>> a 3D matrix:
> >>>> - the first index selects a Region Event Selection Register
> >>>> (PMRESRx_EL0)
> >>>> - the second index selects a group from which only one event at a time
> >>>> can be selected
> >>>> - the third index selects the event
> >>>>
> >>>> The event is encoded into perf_event_attr.config as 0xPRCCG, where:
> >>>> P [config:16 ] = prefix (flag that indicates a matrix-based
> >>>> event)
> >>>> R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
> >>>> G [config:0-3 ] = group (specifies the event group)
> >>>> CC [config:4-11 ] = code (specifies the event)
> >>>>
> >>>> Events with the P flag set to zero are treated as common PMUv3 events
> >>>> and are directly programmed into PMXEVTYPERx_EL0.
> >>>>
> >>>> The first two indexes are set combining the RESR and group number with
> >>>> a base number and writing it into the architected PMXEVTYPER_EL0
> >>>> register.
> >>>> The third index is set by writing the code into the bits corresponding
> >>>> with the group into the appropriate IMPLEMENTATION DEFINED PMRESRx_EL0
> >>>> register.
> >>>
> >>> When are the IMP DEF registers accessible at EL0? Are those governed by
> >>> the same controls as the architected registers?
> >>
> >> No, there is a separate IMP DEF register to control access.
> >
> > Great :( We need to make sure we disable EL0 access during boot then, but
> > that means we need to probe for the existence of this thing in head.S
> > (since the PMU driver might not get loaded).
> >
> > Also, what's the kvm story here so that we don't accidentally open up a
> > VM-VM side-channel via these registers? How do the EL1 trapping controls
> > work?
>
> We'd trap the IMPDEF register access and inject an UNDEF (assuming that
> the IMPDEF trapping works correctly). I have strictly no plan to support
> this in a guest.

Ah, so we could actually configure that in el2_setup and solve the host
problem if we're entered at EL2. Agustin -- does that work, and what do we
need to do if the host is entered at EL1?

Will

2018-06-13 13:08:33

by Mark Rutland

[permalink] [raw]
Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

On Tue, Jun 12, 2018 at 04:41:32PM -0400, Agustin Vega-Frias wrote:
> On 2018-06-12 10:40, Mark Rutland wrote:
> > On Thu, Jun 07, 2018 at 09:56:48AM -0400, Agustin Vega-Frias wrote:
> > > +/*
> > > + * Qualcomm Technologies CPU PMU IMPLEMENTATION DEFINED extensions support
> > > + *
> > > + * Current extensions supported:
> > > + *
> > > + * - Matrix-based microarchitectural events support
> > > + *
> > > + * Selection of these events can be envisioned as indexing them from
> > > + * a 3D matrix:
> > > + * - the first index selects a Region Event Selection Register (PMRESRx_EL0)
> > > + * - the second index selects a group from which only one event at a time
> > > + *   can be selected
> > > + * - the third index selects the event
> > > + *
> > > + * The event is encoded into perf_event_attr.config as 0xPRCCG, where:
> > > + * P [config:16 ] = prefix (flag that indicates a matrix-based event)
> > > + * R [config:12-15] = register (specifies the PMRESRx_EL0 instance)
> > > + * G [config:0-3 ] = group (specifies the event group)
> > > + * CC [config:4-11 ] = code (specifies the event)
> > > + *
> > > + * Events with the P flag set to zero are treated as common PMUv3 events
> > > + * and are directly programmed into PMXEVTYPERx_EL0.
> >
> > When PMUv3 is given a raw event code, the config fields should be the
> > PMU event number, and this conflicts with RESERVED encodings.
> >
> > I'd rather we used a separate field for the QC extension events. e.g.
> > turn config1 into a flags field, and move the P flag there.
> >
> > We *should* add code to sanity check those fields are zero in the PMUv3
> > driver, even though it's a potential ABI break to start now.
>
> I should have stated clearly that in this case the event code is directly
> programmed into PMXEVTYPERx_EL0.evtCount, not by this code, but by the PMUv3
> code, which will do the masking and ensure reserved bits are not touched.

I understand this may be fine on the HW side; it's more to do with the
userspace ABI expected with PMUv3. The config encoding should be the
(PMUv3) type field, and I don't want a potential clash if/when PMUv3
gets extended in future beyond the current 16 bits.

I'd very much like to keep the QC extension bits separate from that.

> IOW, that case is no different from the raw event or a common event case.
>
> I would prefer to keep the flag in config because it allows the use of
> raw code encodings to access these events more easily, and given that
> the flag is never propagated to any register I believe it is safe.

You can add sysfs format fields to make this easy, e.g. have reg map to
config1:x-y, and the user can specify the event based on that, e.g.

perf ${cmd} -e qcom_pmuv3/reg=0xf,.../

Which means they're *explicitly* asking for the QC extension bits, and
we can be sure they're not asking for something from baseline PMUv3.

We *might* want to namespace the qc fields, in case we want to add
anything similar to PMUv3 in future.
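
For illustration, a minimal sketch of what that could look like if the QC
extension bits were moved to config1 (the qc_* field names and the exact
config1 bit positions below are hypothetical, not something from the posted
patch):

    /*
     * Hypothetical sketch: namespaced format fields for the QC extension,
     * kept in config1 so they cannot collide with the PMUv3 event number
     * carried in config.
     */
    PMU_FORMAT_ATTR(event,    "config:0-15");   /* baseline PMUv3 event number */
    PMU_FORMAT_ATTR(qc_group, "config1:0-3");   /* event group                 */
    PMU_FORMAT_ATTR(qc_code,  "config1:4-11");  /* event code                  */
    PMU_FORMAT_ATTR(qc_reg,   "config1:12-15"); /* PMRESRx_EL0 instance        */
    PMU_FORMAT_ATTR(qc_flag,  "config1:16");    /* matrix-based event flag     */

    static struct attribute *qc_ext_format_attrs[] = {
            &format_attr_event.attr,
            &format_attr_qc_group.attr,
            &format_attr_qc_code.attr,
            &format_attr_qc_reg.attr,
            &format_attr_qc_flag.attr,
            NULL,
    };

With fields like these the request is explicit on the command line, e.g.:

    perf ${cmd} -e qcom_pmuv3_0/qc_flag=1,qc_reg=0xf,.../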

[...]

> > > +/*
> > > + * Check if e1 and e2 conflict with each other
> > > + *
> > > + * e1 is a matrix-based microarchitectural event we are checking against e2.
> > > + * A conflict exists if the events use the same reg, group, and a different
> > > + * code. Events with the same code are allowed because they could be using
> > > + * different filters (e.g. one to count user space and the other to count
> > > + * kernel space events).
> > > + */

> > Does the filter matter at all? What happens if I open two identical
> > events, both counting the same reg, group, and code, with the same
> > filter?
>
> That is possible and allowed, similar to counting the same common event
> in two configurable counters. Only problem is wasting a counter resource.

Ok. Please drop the mention of filtering from the comment -- it makes it
sound like there's a potential problem, and it isn't relevant.
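
As a sketch of the rule that matters here (accessor macros assumed to follow
the 0xPRCCG layout quoted above, and both configs assumed to have the
matrix-event flag set), the check boils down to:

    /*
     * Sketch only: two matrix events conflict when they target the same
     * PMRESRx_EL0 register and group but program different codes; filters
     * play no part in the decision.
     */
    static bool qc_events_conflict(u64 cfg1, u64 cfg2)
    {
            return QC_EVT_REG(cfg1)   == QC_EVT_REG(cfg2) &&
                   QC_EVT_GROUP(cfg1) == QC_EVT_GROUP(cfg2) &&
                   QC_EVT_CODE(cfg1)  != QC_EVT_CODE(cfg2);
    }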

[...]

> > > + /* Matrix event, program the appropriate PMRESRx_EL0 */
> > > + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
> > > + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
> > > + u64 reg = QC_EVT_REG(event->attr.config);
> > > + u64 code = QC_EVT_CODE(event->attr.config);
> > > + u64 group = QC_EVT_GROUP(event->attr.config);
> > > + unsigned long flags;
> > > +
> > > + raw_spin_lock_irqsave(&events->pmu_lock, flags);
> > > + falkor_set_resr(reg, group, code);
> > > + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> >
> > Why is the spinlock required?
> >
> > AFACIT this should only ever be called in contexts where IRQs are
> > disabled already.
> >
>
> falkor_set_resr is a read-modify-write operation. The PMUv3 code uses
> the spinlock to protect the counter selection too (armv8pmu_enable_event).

As I mentioned, that only happens in contexts where IRQs are disabled, so
that shouldn't be a problem.

> I believe this is to deal with event rotation which can potentially
> be active when we are creating new events.

Event rotation happens off the back of a hrtimer callback, with IRQs
disabled. There's a lockdep assert to that effect in
perf_mux_hrtimer_handler(), before it calls perf_rotate_context().
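
If that holds, a minimal sketch of the enable path without the lock
(illustrative only, not the posted code; the QC_EVT_* accessors, def_ops and
falkor_set_resr() are the ones from the patch) would be:

    static void falkor_enable(struct perf_event *event)
    {
            if (event->attr.config & QC_EVT_PFX_MASK) {
                    /* Matrix event: program the appropriate PMRESRx_EL0 */
                    u64 reg   = QC_EVT_REG(event->attr.config);
                    u64 code  = QC_EVT_CODE(event->attr.config);
                    u64 group = QC_EVT_GROUP(event->attr.config);

                    /* Callers are expected to run with IRQs disabled */
                    lockdep_assert_irqs_disabled();
                    falkor_set_resr(reg, group, code);
            }

            /* Let the original op handle the counter and interrupt */
            def_ops->enable(event);
    }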

> > > + }
> > > +
> > > + /* Let the original op handle the rest */
> > > + def_ops->enable(event);
> > > +}
> > > +
> > > +/*
> > > + * Disable the given event
> > > + */
> > > +static void falkor_disable(struct perf_event *event)
> > > +{
> > > + /* Use the original op to disable the counter and interrupt */
> > > + def_ops->disable(event);
> > > +
> > > + if (!!(event->attr.config & QC_EVT_PFX_MASK)) {
> > > + /* Matrix event, de-program the appropriate PMRESRx_EL0 */
> > > + struct arm_pmu *pmu = to_arm_pmu(event->pmu);
> > > + struct pmu_hw_events *events = this_cpu_ptr(pmu->hw_events);
> > > + u64 reg = QC_EVT_REG(event->attr.config);
> > > + u64 group = QC_EVT_GROUP(event->attr.config);
> > > + unsigned long flags;
> > > +
> > > + raw_spin_lock_irqsave(&events->pmu_lock, flags);
> > > + falkor_clear_resr(reg, group);
> > > + raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
> > > + }
> > > +}
> >
> > Same comments as with falkor_enable().
> >
> > > +
> > > +PMU_FORMAT_ATTR(event, "config:0-15");
> > > +PMU_FORMAT_ATTR(prefix, "config:16");
> > > +PMU_FORMAT_ATTR(reg, "config:12-15");
> > > +PMU_FORMAT_ATTR(code, "config:4-11");
> > > +PMU_FORMAT_ATTR(group, "config:0-3");
> >
> > What sort of events are available? Do you plan to add anything to the
> > userspace event database in tools/perf/pmu-events/ ?
> >
>
> Yes, we are still doing some internal work to see what we can put in
> the driver or as JSON events.

Please put them in userspace. Having events live in the kernel is really a
legacy thing, and we've placed all other IMP-DEF events in userspace for
ACPI systems.

[...]

> > > + pmu->name = "qcom_pmuv3";
> >
> > All the other CPU PMUs on an ARM ACPI system will have an index suffix,
> > e.g. "armv8_pmuv3_0". I can see why we might want to change the name to
> > indicate the QC extensions, but I think we should keep the existing
> > pattern, with a '_0' suffix here.
>
> This overrides the name before the suffix is added, so the PMU name will be
> qcom_pmuv3_0 for Centriq 2400 which has only Falkor CPUs.

Ok.

[...]

> > > + /* Override the necessary ops */
> > > + pmu->map_event = falkor_map_event;
> > > + pmu->get_event_idx = falkor_get_event_idx;
> > > + pmu->reset = falkor_reset;
> > > + pmu->enable = falkor_enable;
> > > + pmu->disable = falkor_disable;
> >
> > I'm somewhat concerned by hooking into the existing PMU code at this
> > level, but I don't currently have a better suggestion.
> >
>
> IMO this is no different from other PMUs implemented on top of the arm_pmu
> framework. The difference is of course that I'm calling back into the base
> PMUv3 ops,

Yup, the latter is what I'm concerned about. e.g. when you take the
pmu_lock, if PMUv3 code has already taken this, there'll be a deadlock,
and it means that we can't consider the PMUv3 code in isolation.

As long as we can keep this *simple* I can live with it, though it'll still
leave me uneasy.

Thanks,
Mark.

2018-06-14 15:31:46

by Agustin Vega-Frias

[permalink] [raw]
Subject: Re: [RFC V2 3/3] perf: qcom: Add Falkor CPU PMU IMPLEMENTATION DEFINED event support

Hi,

On 2018-06-13 09:02, Will Deacon wrote:
> On Wed, Jun 13, 2018 at 01:59:58PM +0100, Marc Zyngier wrote:
>> On 13/06/18 11:35, Will Deacon wrote:

[...]

>> > Great :( We need to make sure we disable EL0 access during boot then, but
>> > that means we need to probe for the existence of this thing in head.S
>> > (since the PMU driver might not get loaded).

Unfortunately it appears the only check we can do for this is through the
MIDR or PMPIDRx. I'm investigating whether we can detect this through some
other mechanism, but if that doesn't exist, would the MIDR/PMPIDRx check be
acceptable?
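
For reference, a C-level sketch of such an MIDR check (head.S would need the
assembly equivalent; the MIDR_QCOM_FALKOR* model values are the existing
arm64 cputype definitions):

    /*
     * Sketch only: detect a Falkor part from MIDR_EL1.
     */
    static bool falkor_pmu_present(void)
    {
            u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;

            return model == MIDR_QCOM_FALKOR || model == MIDR_QCOM_FALKOR_V1;
    }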

>> >
>> > Also, what's the kvm story here so that we don't accidentally open up a
>> > VM-VM side-channel via these registers? How do the EL1 trapping controls
>> > work?

Traps for these registers are as follows:

Architecture traps:
- HCR_EL2.TIDCP enables a trap when accessing IMP DEF registers. kvm will
  set this bit for all guests and access from EL1/EL0 will cause a trap
  to EL2.
- MDCR_EL2.TPM enables a trap when accessing the PM registers from EL1/EL0,
  causing accesses to trap to EL2.

IMP DEF traps:
- IMP DEF register PMACTLR_EL0.UEN controls EL0 access to all IMP DEF
  registers related to performance monitoring:
  0 -> disables EL0 access (default), 1 -> enables EL0 access
- IMP DEF HACR_EL2.TCPUPMRBB enables a trap when accessing the RBB
  registers from EL1/EL0, causing accesses to trap to EL2.
- IMP DEF HACR_EL2.TCPUPMPCCPT enables a trap when accessing the PCC
  registers from EL1/EL0, causing accesses to trap to EL2.
- IMP DEF HACR_EL2.TCPUPMRESRRLDR enables a trap when accessing the
  PMRESRx_EL0 registers from EL1/EL0, causing accesses to trap to EL2.
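
For the architectural side, a rough sketch of that trap configuration in C
(the IMP DEF HACR_EL2 bits are not shown since their encodings are
implementation specific; HCR_TIDCP and MDCR_EL2_TPM are the existing arm64
definitions, and this would have to run at EL2):

    /*
     * Sketch only: with these architectural bits set for a guest, EL1/EL0
     * accesses to IMP DEF system registers and to the PM registers trap
     * to EL2.
     */
    static void qc_pmu_trap_guest_accesses(void)
    {
            write_sysreg(read_sysreg(hcr_el2) | HCR_TIDCP, hcr_el2);
            write_sysreg(read_sysreg(mdcr_el2) | MDCR_EL2_TPM, mdcr_el2);
    }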

>>
>> We'd trap the IMPDEF register access and inject an UNDEF (assuming that
>> the IMPDEF trapping works correctly). I have strictly no plan to support
>> this in a guest.
>
> Ah, so we could actually configure that in el2_setup and solve the host
> problem if we're entered at EL2. Agustin -- does that work, and what do we
> need to do if the host is entered at EL1?
>

Yes, that works. I understand there is no desire to support this under
virtualization.

As for the question about the EL1 host, all of our firmware configurations
boot the host OS at EL2. Is that a sufficient guarantee, or can you suggest
an alternative mechanism to ensure security?

Thanks,
Agustín

--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a
Linux Foundation Collaborative Project.