2022-10-19 19:14:44

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 0/7] perf: Arm SPEv1.2 support

This series adds support for Arm SPEv1.2 which is part of the
Armv8.7/Armv9.2 architecture. There's 2 new features that affect the
kernel: a new event filter bit, branch 'not taken', and an inverted
event filter register.

Since this support adds new registers and fields, first the SPE register
defines are converted to automatic generation.

Note that the 'config3' addition in sysfs format files causes SPE to
break. A stable fix e552b7be12ed ("perf: Skip and warn on unknown format
'configN' attrs") landed in v6.1-rc1.

The perf tool side changes are available here[1].

Tested on FVP.

[1] https://lore.kernel.org/all/[email protected]/

Signed-off-by: Rob Herring <[email protected]>
---
Changes in v2:
- Convert the SPE register defines to automatic generation
- Fixed access to SYS_PMSNEVFR_EL1 when not present
- Rebase on v6.1-rc1
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Rob Herring (7):
perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines
arm64: Drop SYS_ from SPE register defines
arm64/sysreg: Convert SPE registers to automatic generation
perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors
perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event
perf: Add perf_event_attr::config3
perf: arm_spe: Add support for SPEv1.2 inverted event filtering

arch/arm64/include/asm/el2_setup.h | 6 +-
arch/arm64/include/asm/sysreg.h | 99 +++------------------------
arch/arm64/kvm/debug.c | 2 +-
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 2 +-
arch/arm64/tools/sysreg | 116 +++++++++++++++++++++++++++++++
drivers/perf/arm_spe_pmu.c | 136 ++++++++++++++++++++++++-------------
include/uapi/linux/perf_event.h | 3 +
7 files changed, 224 insertions(+), 140 deletions(-)
---
base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
change-id: 20220825-arm-spe-v8-7-fedf04e16f23

Best regards,
--
Rob Herring <[email protected]>


2022-10-19 19:14:53

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 2/7] arm64: Drop SYS_ from SPE register defines

We currently have a non-standard SYS_ prefix in the constants generated
for the SPE register bitfields. Drop this in preparation for automatic
register definition generation.

The SPE mask defines were unshifted, and the SPE register field
enumerations were shifted. The autogenerated defines are the opposite,
so make the necessary adjustments.

No functional changes.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- New patch
---
arch/arm64/include/asm/el2_setup.h | 6 +-
arch/arm64/include/asm/sysreg.h | 112 ++++++++++++++++++-------------------
arch/arm64/kvm/debug.c | 2 +-
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 2 +-
drivers/perf/arm_spe_pmu.c | 85 ++++++++++++++--------------
5 files changed, 103 insertions(+), 104 deletions(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 668569adf4d3..f9da43e53cdb 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -53,10 +53,10 @@
cbz x0, .Lskip_spe_\@ // Skip if SPE not present

mrs_s x0, SYS_PMBIDR_EL1 // If SPE available at EL2,
- and x0, x0, #(1 << SYS_PMBIDR_EL1_P_SHIFT)
+ and x0, x0, #(1 << PMBIDR_EL1_P_SHIFT)
cbnz x0, .Lskip_spe_el2_\@ // then permit sampling of physical
- mov x0, #(1 << SYS_PMSCR_EL2_PCT_SHIFT | \
- 1 << SYS_PMSCR_EL2_PA_SHIFT)
+ mov x0, #(1 << PMSCR_EL2_PCT_SHIFT | \
+ 1 << PMSCR_EL2_PA_SHIFT)
msr_s SYS_PMSCR_EL2, x0 // addresses and physical counter
.Lskip_spe_el2_\@:
mov x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 9a4cf12e3e16..8df8a0a51273 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -239,59 +239,59 @@
/*** Statistical Profiling Extension ***/
/* ID registers */
#define SYS_PMSIDR_EL1 sys_reg(3, 0, 9, 9, 7)
-#define SYS_PMSIDR_EL1_FE_SHIFT 0
-#define SYS_PMSIDR_EL1_FT_SHIFT 1
-#define SYS_PMSIDR_EL1_FL_SHIFT 2
-#define SYS_PMSIDR_EL1_ARCHINST_SHIFT 3
-#define SYS_PMSIDR_EL1_LDS_SHIFT 4
-#define SYS_PMSIDR_EL1_ERND_SHIFT 5
-#define SYS_PMSIDR_EL1_INTERVAL_SHIFT 8
-#define SYS_PMSIDR_EL1_INTERVAL_MASK 0xfUL
-#define SYS_PMSIDR_EL1_MAXSIZE_SHIFT 12
-#define SYS_PMSIDR_EL1_MAXSIZE_MASK 0xfUL
-#define SYS_PMSIDR_EL1_COUNTSIZE_SHIFT 16
-#define SYS_PMSIDR_EL1_COUNTSIZE_MASK 0xfUL
+#define PMSIDR_EL1_FE_SHIFT 0
+#define PMSIDR_EL1_FT_SHIFT 1
+#define PMSIDR_EL1_FL_SHIFT 2
+#define PMSIDR_EL1_ARCHINST_SHIFT 3
+#define PMSIDR_EL1_LDS_SHIFT 4
+#define PMSIDR_EL1_ERND_SHIFT 5
+#define PMSIDR_EL1_INTERVAL_SHIFT 8
+#define PMSIDR_EL1_INTERVAL_MASK GENMASK_ULL(11, 8)
+#define PMSIDR_EL1_MAXSIZE_SHIFT 12
+#define PMSIDR_EL1_MAXSIZE_MASK GENMASK_ULL(15, 12)
+#define PMSIDR_EL1_COUNTSIZE_SHIFT 16
+#define PMSIDR_EL1_COUNTSIZE_MASK GENMASK_ULL(19, 16)

#define SYS_PMBIDR_EL1 sys_reg(3, 0, 9, 10, 7)
-#define SYS_PMBIDR_EL1_ALIGN_SHIFT 0
-#define SYS_PMBIDR_EL1_ALIGN_MASK 0xfU
-#define SYS_PMBIDR_EL1_P_SHIFT 4
-#define SYS_PMBIDR_EL1_F_SHIFT 5
+#define PMBIDR_EL1_ALIGN_SHIFT 0
+#define PMBIDR_EL1_ALIGN_MASK 0xfU
+#define PMBIDR_EL1_P_SHIFT 4
+#define PMBIDR_EL1_F_SHIFT 5

/* Sampling controls */
#define SYS_PMSCR_EL1 sys_reg(3, 0, 9, 9, 0)
-#define SYS_PMSCR_EL1_E0SPE_SHIFT 0
-#define SYS_PMSCR_EL1_E1SPE_SHIFT 1
-#define SYS_PMSCR_EL1_CX_SHIFT 3
-#define SYS_PMSCR_EL1_PA_SHIFT 4
-#define SYS_PMSCR_EL1_TS_SHIFT 5
-#define SYS_PMSCR_EL1_PCT_SHIFT 6
+#define PMSCR_EL1_E0SPE_SHIFT 0
+#define PMSCR_EL1_E1SPE_SHIFT 1
+#define PMSCR_EL1_CX_SHIFT 3
+#define PMSCR_EL1_PA_SHIFT 4
+#define PMSCR_EL1_TS_SHIFT 5
+#define PMSCR_EL1_PCT_SHIFT 6

#define SYS_PMSCR_EL2 sys_reg(3, 4, 9, 9, 0)
-#define SYS_PMSCR_EL2_E0HSPE_SHIFT 0
-#define SYS_PMSCR_EL2_E2SPE_SHIFT 1
-#define SYS_PMSCR_EL2_CX_SHIFT 3
-#define SYS_PMSCR_EL2_PA_SHIFT 4
-#define SYS_PMSCR_EL2_TS_SHIFT 5
-#define SYS_PMSCR_EL2_PCT_SHIFT 6
+#define PMSCR_EL2_E0HSPE_SHIFT 0
+#define PMSCR_EL2_E2SPE_SHIFT 1
+#define PMSCR_EL2_CX_SHIFT 3
+#define PMSCR_EL2_PA_SHIFT 4
+#define PMSCR_EL2_TS_SHIFT 5
+#define PMSCR_EL2_PCT_SHIFT 6

#define SYS_PMSICR_EL1 sys_reg(3, 0, 9, 9, 2)

#define SYS_PMSIRR_EL1 sys_reg(3, 0, 9, 9, 3)
-#define SYS_PMSIRR_EL1_RND_SHIFT 0
-#define SYS_PMSIRR_EL1_INTERVAL_SHIFT 8
-#define SYS_PMSIRR_EL1_INTERVAL_MASK 0xffffffUL
+#define PMSIRR_EL1_RND_SHIFT 0
+#define PMSIRR_EL1_INTERVAL_SHIFT 8
+#define PMSIRR_EL1_INTERVAL_MASK GENMASK_ULL(31, 8)

/* Filtering controls */
#define SYS_PMSNEVFR_EL1 sys_reg(3, 0, 9, 9, 1)

#define SYS_PMSFCR_EL1 sys_reg(3, 0, 9, 9, 4)
-#define SYS_PMSFCR_EL1_FE_SHIFT 0
-#define SYS_PMSFCR_EL1_FT_SHIFT 1
-#define SYS_PMSFCR_EL1_FL_SHIFT 2
-#define SYS_PMSFCR_EL1_B_SHIFT 16
-#define SYS_PMSFCR_EL1_LD_SHIFT 17
-#define SYS_PMSFCR_EL1_ST_SHIFT 18
+#define PMSFCR_EL1_FE_SHIFT 0
+#define PMSFCR_EL1_FT_SHIFT 1
+#define PMSFCR_EL1_FL_SHIFT 2
+#define PMSFCR_EL1_B_SHIFT 16
+#define PMSFCR_EL1_LD_SHIFT 17
+#define PMSFCR_EL1_ST_SHIFT 18

#define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
#define PMSEVFR_EL1_RES0_IMP \
@@ -301,37 +301,37 @@
(PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))

#define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
-#define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
+#define PMSLATFR_EL1_MINLAT_SHIFT 0

/* Buffer controls */
#define SYS_PMBLIMITR_EL1 sys_reg(3, 0, 9, 10, 0)
-#define SYS_PMBLIMITR_EL1_E_SHIFT 0
-#define SYS_PMBLIMITR_EL1_FM_SHIFT 1
-#define SYS_PMBLIMITR_EL1_FM_MASK 0x3UL
-#define SYS_PMBLIMITR_EL1_FM_STOP_IRQ (0 << SYS_PMBLIMITR_EL1_FM_SHIFT)
+#define PMBLIMITR_EL1_E_SHIFT 0
+#define PMBLIMITR_EL1_FM_SHIFT 1
+#define PMBLIMITR_EL1_FM_MASK GENMASK_ULL(2, 1)
+#define PMBLIMITR_EL1_FM_STOP_IRQ 0

#define SYS_PMBPTR_EL1 sys_reg(3, 0, 9, 10, 1)

/* Buffer error reporting */
#define SYS_PMBSR_EL1 sys_reg(3, 0, 9, 10, 3)
-#define SYS_PMBSR_EL1_COLL_SHIFT 16
-#define SYS_PMBSR_EL1_S_SHIFT 17
-#define SYS_PMBSR_EL1_EA_SHIFT 18
-#define SYS_PMBSR_EL1_DL_SHIFT 19
-#define SYS_PMBSR_EL1_EC_SHIFT 26
-#define SYS_PMBSR_EL1_EC_MASK 0x3fUL
+#define PMBSR_EL1_COLL_SHIFT 16
+#define PMBSR_EL1_S_SHIFT 17
+#define PMBSR_EL1_EA_SHIFT 18
+#define PMBSR_EL1_DL_SHIFT 19
+#define PMBSR_EL1_EC_SHIFT 26
+#define PMBSR_EL1_EC_MASK GENMASK_ULL(31, 26)

-#define SYS_PMBSR_EL1_EC_BUF (0x0UL << SYS_PMBSR_EL1_EC_SHIFT)
-#define SYS_PMBSR_EL1_EC_FAULT_S1 (0x24UL << SYS_PMBSR_EL1_EC_SHIFT)
-#define SYS_PMBSR_EL1_EC_FAULT_S2 (0x25UL << SYS_PMBSR_EL1_EC_SHIFT)
+#define PMBSR_EL1_EC_BUF 0x0UL
+#define PMBSR_EL1_EC_FAULT_S1 0x24UL
+#define PMBSR_EL1_EC_FAULT_S2 0x25UL

-#define SYS_PMBSR_EL1_FAULT_FSC_SHIFT 0
-#define SYS_PMBSR_EL1_FAULT_FSC_MASK 0x3fUL
+#define PMBSR_EL1_FAULT_FSC_SHIFT 0
+#define PMBSR_EL1_FAULT_FSC_MASK 0x3fUL

-#define SYS_PMBSR_EL1_BUF_BSC_SHIFT 0
-#define SYS_PMBSR_EL1_BUF_BSC_MASK 0x3fUL
+#define PMBSR_EL1_BUF_BSC_SHIFT 0
+#define PMBSR_EL1_BUF_BSC_MASK 0x3fUL

-#define SYS_PMBSR_EL1_BUF_BSC_FULL (0x1UL << SYS_PMBSR_EL1_BUF_BSC_SHIFT)
+#define PMBSR_EL1_BUF_BSC_FULL 0x1UL

/*** End of Statistical Profiling Extension ***/

diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index fccf9ec01813..55f80fb93925 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -328,7 +328,7 @@ void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu)
* we may need to check if the host state needs to be saved.
*/
if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_PMSVer_SHIFT) &&
- !(read_sysreg_s(SYS_PMBIDR_EL1) & BIT(SYS_PMBIDR_EL1_P_SHIFT)))
+ !(read_sysreg_s(SYS_PMBIDR_EL1) & BIT(PMBIDR_EL1_P_SHIFT)))
vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_SPE);

/* Check if we have TRBE implemented and available at the host */
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index e17455773b98..2673bde62fad 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -27,7 +27,7 @@ static void __debug_save_spe(u64 *pmscr_el1)
* Check if the host is actually using it ?
*/
reg = read_sysreg_s(SYS_PMBLIMITR_EL1);
- if (!(reg & BIT(SYS_PMBLIMITR_EL1_E_SHIFT)))
+ if (!(reg & BIT(PMBLIMITR_EL1_E_SHIFT)))
return;

/* Yes; save the control register and disable data generation */
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 65cf93dcc8ee..814ed18346b6 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -12,6 +12,7 @@
#define DRVNAME PMUNAME "_pmu"
#define pr_fmt(fmt) DRVNAME ": " fmt

+#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/bug.h>
#include <linux/capability.h>
@@ -282,18 +283,18 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
struct perf_event_attr *attr = &event->attr;
u64 reg = 0;

- reg |= ATTR_CFG_GET_FLD(attr, ts_enable) << SYS_PMSCR_EL1_TS_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, pa_enable) << SYS_PMSCR_EL1_PA_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, pct_enable) << SYS_PMSCR_EL1_PCT_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, ts_enable) << PMSCR_EL1_TS_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, pa_enable) << PMSCR_EL1_PA_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, pct_enable) << PMSCR_EL1_PCT_SHIFT;

if (!attr->exclude_user)
- reg |= BIT(SYS_PMSCR_EL1_E0SPE_SHIFT);
+ reg |= BIT(PMSCR_EL1_E0SPE_SHIFT);

if (!attr->exclude_kernel)
- reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
+ reg |= BIT(PMSCR_EL1_E1SPE_SHIFT);

if (get_spe_event_has_cx(event))
- reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
+ reg |= BIT(PMSCR_EL1_CX_SHIFT);

return reg;
}
@@ -302,8 +303,7 @@ static void arm_spe_event_sanitise_period(struct perf_event *event)
{
struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu);
u64 period = event->hw.sample_period;
- u64 max_period = SYS_PMSIRR_EL1_INTERVAL_MASK
- << SYS_PMSIRR_EL1_INTERVAL_SHIFT;
+ u64 max_period = PMSIRR_EL1_INTERVAL_MASK;

if (period < spe_pmu->min_period)
period = spe_pmu->min_period;
@@ -322,7 +322,7 @@ static u64 arm_spe_event_to_pmsirr(struct perf_event *event)

arm_spe_event_sanitise_period(event);

- reg |= ATTR_CFG_GET_FLD(attr, jitter) << SYS_PMSIRR_EL1_RND_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, jitter) << PMSIRR_EL1_RND_SHIFT;
reg |= event->hw.sample_period;

return reg;
@@ -333,18 +333,18 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event)
struct perf_event_attr *attr = &event->attr;
u64 reg = 0;

- reg |= ATTR_CFG_GET_FLD(attr, load_filter) << SYS_PMSFCR_EL1_LD_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, store_filter) << SYS_PMSFCR_EL1_ST_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, branch_filter) << SYS_PMSFCR_EL1_B_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, load_filter) << PMSFCR_EL1_LD_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, store_filter) << PMSFCR_EL1_ST_SHIFT;
+ reg |= ATTR_CFG_GET_FLD(attr, branch_filter) << PMSFCR_EL1_B_SHIFT;

if (reg)
- reg |= BIT(SYS_PMSFCR_EL1_FT_SHIFT);
+ reg |= BIT(PMSFCR_EL1_FT_SHIFT);

if (ATTR_CFG_GET_FLD(attr, event_filter))
- reg |= BIT(SYS_PMSFCR_EL1_FE_SHIFT);
+ reg |= BIT(PMSFCR_EL1_FE_SHIFT);

if (ATTR_CFG_GET_FLD(attr, min_latency))
- reg |= BIT(SYS_PMSFCR_EL1_FL_SHIFT);
+ reg |= BIT(PMSFCR_EL1_FL_SHIFT);

return reg;
}
@@ -359,7 +359,7 @@ static u64 arm_spe_event_to_pmslatfr(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
return ATTR_CFG_GET_FLD(attr, min_latency)
- << SYS_PMSLATFR_EL1_MINLAT_SHIFT;
+ << PMSLATFR_EL1_MINLAT_SHIFT;
}

static void arm_spe_pmu_pad_buf(struct perf_output_handle *handle, int len)
@@ -511,7 +511,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle,
limit = buf->snapshot ? arm_spe_pmu_next_snapshot_off(handle)
: arm_spe_pmu_next_off(handle);
if (limit)
- limit |= BIT(SYS_PMBLIMITR_EL1_E_SHIFT);
+ limit |= BIT(PMBLIMITR_EL1_E_SHIFT);

limit += (u64)buf->base;
base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf);
@@ -570,28 +570,28 @@ arm_spe_pmu_buf_get_fault_act(struct perf_output_handle *handle)

/* Service required? */
pmbsr = read_sysreg_s(SYS_PMBSR_EL1);
- if (!(pmbsr & BIT(SYS_PMBSR_EL1_S_SHIFT)))
+ if (!(pmbsr & BIT(PMBSR_EL1_S_SHIFT)))
return SPE_PMU_BUF_FAULT_ACT_SPURIOUS;

/*
* If we've lost data, disable profiling and also set the PARTIAL
* flag to indicate that the last record is corrupted.
*/
- if (pmbsr & BIT(SYS_PMBSR_EL1_DL_SHIFT))
+ if (pmbsr & BIT(PMBSR_EL1_DL_SHIFT))
perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED |
PERF_AUX_FLAG_PARTIAL);

/* Report collisions to userspace so that it can up the period */
- if (pmbsr & BIT(SYS_PMBSR_EL1_COLL_SHIFT))
+ if (pmbsr & BIT(PMBSR_EL1_COLL_SHIFT))
perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION);

/* We only expect buffer management events */
- switch (pmbsr & (SYS_PMBSR_EL1_EC_MASK << SYS_PMBSR_EL1_EC_SHIFT)) {
- case SYS_PMBSR_EL1_EC_BUF:
+ switch (FIELD_GET(PMBSR_EL1_EC_MASK, pmbsr)) {
+ case PMBSR_EL1_EC_BUF:
/* Handled below */
break;
- case SYS_PMBSR_EL1_EC_FAULT_S1:
- case SYS_PMBSR_EL1_EC_FAULT_S2:
+ case PMBSR_EL1_EC_FAULT_S1:
+ case PMBSR_EL1_EC_FAULT_S2:
err_str = "Unexpected buffer fault";
goto out_err;
default:
@@ -600,9 +600,8 @@ arm_spe_pmu_buf_get_fault_act(struct perf_output_handle *handle)
}

/* Buffer management event */
- switch (pmbsr &
- (SYS_PMBSR_EL1_BUF_BSC_MASK << SYS_PMBSR_EL1_BUF_BSC_SHIFT)) {
- case SYS_PMBSR_EL1_BUF_BSC_FULL:
+ switch (FIELD_GET(PMBSR_EL1_BUF_BSC_MASK, pmbsr)) {
+ case PMBSR_EL1_BUF_BSC_FULL:
ret = SPE_PMU_BUF_FAULT_ACT_OK;
goto out_stop;
default:
@@ -717,23 +716,23 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
return -EINVAL;

reg = arm_spe_event_to_pmsfcr(event);
- if ((reg & BIT(SYS_PMSFCR_EL1_FE_SHIFT)) &&
+ if ((reg & BIT(PMSFCR_EL1_FE_SHIFT)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_EVT))
return -EOPNOTSUPP;

- if ((reg & BIT(SYS_PMSFCR_EL1_FT_SHIFT)) &&
+ if ((reg & BIT(PMSFCR_EL1_FT_SHIFT)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_TYP))
return -EOPNOTSUPP;

- if ((reg & BIT(SYS_PMSFCR_EL1_FL_SHIFT)) &&
+ if ((reg & BIT(PMSFCR_EL1_FL_SHIFT)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
return -EOPNOTSUPP;

set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
- (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
- BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
+ (reg & (BIT(PMSCR_EL1_PA_SHIFT) |
+ BIT(PMSCR_EL1_PCT_SHIFT))))
return -EACCES;

return 0;
@@ -971,14 +970,14 @@ static void __arm_spe_pmu_dev_probe(void *info)

/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
- if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT)) {
+ if (reg & BIT(PMBIDR_EL1_P_SHIFT)) {
dev_err(dev,
"profiling buffer owned by higher exception level\n");
return;
}

/* Minimum alignment. If it's out-of-range, then fail the probe */
- fld = reg >> SYS_PMBIDR_EL1_ALIGN_SHIFT & SYS_PMBIDR_EL1_ALIGN_MASK;
+ fld = (reg & PMBIDR_EL1_ALIGN_MASK) >> PMBIDR_EL1_ALIGN_SHIFT;
spe_pmu->align = 1 << fld;
if (spe_pmu->align > SZ_2K) {
dev_err(dev, "unsupported PMBIDR.Align [%d] on CPU %d\n",
@@ -988,26 +987,26 @@ static void __arm_spe_pmu_dev_probe(void *info)

/* It's now safe to read PMSIDR and figure out what we've got */
reg = read_sysreg_s(SYS_PMSIDR_EL1);
- if (reg & BIT(SYS_PMSIDR_EL1_FE_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_FE_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_FILT_EVT;

- if (reg & BIT(SYS_PMSIDR_EL1_FT_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_FT_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_FILT_TYP;

- if (reg & BIT(SYS_PMSIDR_EL1_FL_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_FL_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_FILT_LAT;

- if (reg & BIT(SYS_PMSIDR_EL1_ARCHINST_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_ARCHINST_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_ARCH_INST;

- if (reg & BIT(SYS_PMSIDR_EL1_LDS_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_LDS_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_LDS;

- if (reg & BIT(SYS_PMSIDR_EL1_ERND_SHIFT))
+ if (reg & BIT(PMSIDR_EL1_ERND_SHIFT))
spe_pmu->features |= SPE_PMU_FEAT_ERND;

/* This field has a spaced out encoding, so just use a look-up */
- fld = reg >> SYS_PMSIDR_EL1_INTERVAL_SHIFT & SYS_PMSIDR_EL1_INTERVAL_MASK;
+ fld = (reg & PMSIDR_EL1_INTERVAL_MASK) >> PMSIDR_EL1_INTERVAL_SHIFT;
switch (fld) {
case 0:
spe_pmu->min_period = 256;
@@ -1039,7 +1038,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
}

/* Maximum record size. If it's out-of-range, then fail the probe */
- fld = reg >> SYS_PMSIDR_EL1_MAXSIZE_SHIFT & SYS_PMSIDR_EL1_MAXSIZE_MASK;
+ fld = (reg & PMSIDR_EL1_MAXSIZE_MASK) >> PMSIDR_EL1_MAXSIZE_SHIFT;
spe_pmu->max_record_sz = 1 << fld;
if (spe_pmu->max_record_sz > SZ_2K || spe_pmu->max_record_sz < 16) {
dev_err(dev, "unsupported PMSIDR_EL1.MaxSize [%d] on CPU %d\n",
@@ -1047,7 +1046,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
return;
}

- fld = reg >> SYS_PMSIDR_EL1_COUNTSIZE_SHIFT & SYS_PMSIDR_EL1_COUNTSIZE_MASK;
+ fld = (reg & PMSIDR_EL1_COUNTSIZE_MASK) >> PMSIDR_EL1_COUNTSIZE_SHIFT;
switch (fld) {
default:
dev_warn(dev, "unknown PMSIDR_EL1.CountSize [%d]; assuming 2\n",

--
b4 0.11.0-dev

2022-10-19 19:17:31

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 5/7] perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event

Arm SPEv1.2 (Armv8.7/v9.2) adds a new event, 'not taken', in bit 6 of
the PMSEVFR_EL1 register. Update arm_spe_pmsevfr_res0() to support the
additional event.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- Update for v6.1 sysreg generated header changes
---
arch/arm64/include/asm/sysreg.h | 2 ++
drivers/perf/arm_spe_pmu.c | 4 +++-
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index d002dd00e53e..06231e896832 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -242,6 +242,8 @@
BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
#define PMSEVFR_EL1_RES0_V1P1 \
(PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))
+#define PMSEVFR_EL1_RES0_V1P2 \
+ (PMSEVFR_EL1_RES0_V1P1 & ~BIT_ULL(6))

/* Buffer error reporting */
#define PMBSR_EL1_FAULT_FSC_SHIFT PMBSR_EL1_MSS_SHIFT
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 9b4bd72087ea..0b9b847919d0 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -677,9 +677,11 @@ static u64 arm_spe_pmsevfr_res0(u16 pmsver)
case ID_AA64DFR0_EL1_PMSVer_IMP:
return PMSEVFR_EL1_RES0_IMP;
case ID_AA64DFR0_EL1_PMSVer_V1P1:
+ return PMSEVFR_EL1_RES0_V1P1;
+ case ID_AA64DFR0_EL1_PMSVer_V1P2:
/* Return the highest version we support in default */
default:
- return PMSEVFR_EL1_RES0_V1P1;
+ return PMSEVFR_EL1_RES0_V1P2;
}
}


--
b4 0.11.0-dev

2022-10-19 19:31:28

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 7/7] perf: arm_spe: Add support for SPEv1.2 inverted event filtering

Arm SPEv1.2 (Arm v8.7/v9.2) adds a new feature called Inverted Event
Filter which excludes samples matching the event filter. The feature
mirrors the existing event filter in PMSEVFR_EL1 adding a new register,
PMSNEVFR_EL1, which has the same event bit assignments.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- Update for auto generated register defines
- Avoid accessing SYS_PMSNEVFR_EL1 on < v8.7
---
drivers/perf/arm_spe_pmu.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 0b9b847919d0..ab5fcf02a4ca 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -85,6 +85,7 @@ struct arm_spe_pmu {
#define SPE_PMU_FEAT_ARCH_INST (1UL << 3)
#define SPE_PMU_FEAT_LDS (1UL << 4)
#define SPE_PMU_FEAT_ERND (1UL << 5)
+#define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6)
#define SPE_PMU_FEAT_DEV_PROBED (1UL << 63)
u64 features;

@@ -202,6 +203,10 @@ static const struct attribute_group arm_spe_pmu_cap_group = {
#define ATTR_CFG_FLD_min_latency_LO 0
#define ATTR_CFG_FLD_min_latency_HI 11

+#define ATTR_CFG_FLD_inv_event_filter_CFG config3 /* PMSNEVFR_EL1 */
+#define ATTR_CFG_FLD_inv_event_filter_LO 0
+#define ATTR_CFG_FLD_inv_event_filter_HI 63
+
/* Why does everything I do descend into this? */
#define __GEN_PMU_FORMAT_ATTR(cfg, lo, hi) \
(lo) == (hi) ? #cfg ":" #lo "\n" : #cfg ":" #lo "-" #hi
@@ -232,6 +237,7 @@ GEN_PMU_FORMAT_ATTR(branch_filter);
GEN_PMU_FORMAT_ATTR(load_filter);
GEN_PMU_FORMAT_ATTR(store_filter);
GEN_PMU_FORMAT_ATTR(event_filter);
+GEN_PMU_FORMAT_ATTR(inv_event_filter);
GEN_PMU_FORMAT_ATTR(min_latency);

static struct attribute *arm_spe_pmu_formats_attr[] = {
@@ -243,12 +249,27 @@ static struct attribute *arm_spe_pmu_formats_attr[] = {
&format_attr_load_filter.attr,
&format_attr_store_filter.attr,
&format_attr_event_filter.attr,
+ &format_attr_inv_event_filter.attr,
&format_attr_min_latency.attr,
NULL,
};

+static umode_t arm_spe_pmu_format_attr_is_visible(struct kobject *kobj,
+ struct attribute *attr,
+ int unused)
+ {
+ struct device *dev = kobj_to_dev(kobj);
+ struct arm_spe_pmu *spe_pmu = dev_get_drvdata(dev);
+
+ if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT))
+ return 0;
+
+ return attr->mode;
+}
+
static const struct attribute_group arm_spe_pmu_format_group = {
.name = "format",
+ .is_visible = arm_spe_pmu_format_attr_is_visible,
.attrs = arm_spe_pmu_formats_attr,
};

@@ -343,6 +364,9 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event)
if (ATTR_CFG_GET_FLD(attr, event_filter))
reg |= PMSFCR_EL1_FE;

+ if (ATTR_CFG_GET_FLD(attr, inv_event_filter))
+ reg |= PMSFCR_EL1_FnE;
+
if (ATTR_CFG_GET_FLD(attr, min_latency))
reg |= PMSFCR_EL1_FL;

@@ -355,6 +379,12 @@ static u64 arm_spe_event_to_pmsevfr(struct perf_event *event)
return ATTR_CFG_GET_FLD(attr, event_filter);
}

+static u64 arm_spe_event_to_pmsnevfr(struct perf_event *event)
+{
+ struct perf_event_attr *attr = &event->attr;
+ return ATTR_CFG_GET_FLD(attr, inv_event_filter);
+}
+
static u64 arm_spe_event_to_pmslatfr(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
@@ -703,6 +733,9 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver))
return -EOPNOTSUPP;

+ if (arm_spe_event_to_pmsnevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver))
+ return -EOPNOTSUPP;
+
if (attr->exclude_idle)
return -EOPNOTSUPP;

@@ -721,6 +754,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!(spe_pmu->features & SPE_PMU_FEAT_FILT_EVT))
return -EOPNOTSUPP;

+ if ((FIELD_GET(PMSFCR_EL1_FnE, reg)) &&
+ !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT))
+ return -EOPNOTSUPP;
+
if ((FIELD_GET(PMSFCR_EL1_FT, reg)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_TYP))
return -EOPNOTSUPP;
@@ -756,6 +793,11 @@ static void arm_spe_pmu_start(struct perf_event *event, int flags)
reg = arm_spe_event_to_pmsevfr(event);
write_sysreg_s(reg, SYS_PMSEVFR_EL1);

+ if (spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT) {
+ reg = arm_spe_event_to_pmsnevfr(event);
+ write_sysreg_s(reg, SYS_PMSNEVFR_EL1);
+ }
+
reg = arm_spe_event_to_pmslatfr(event);
write_sysreg_s(reg, SYS_PMSLATFR_EL1);

@@ -990,6 +1032,9 @@ static void __arm_spe_pmu_dev_probe(void *info)
if (FIELD_GET(PMSIDR_EL1_FE, reg))
spe_pmu->features |= SPE_PMU_FEAT_FILT_EVT;

+ if (FIELD_GET(PMSIDR_EL1_FnE, reg))
+ spe_pmu->features |= SPE_PMU_FEAT_INV_FILT_EVT;
+
if (FIELD_GET(PMSIDR_EL1_FT, reg))
spe_pmu->features |= SPE_PMU_FEAT_FILT_TYP;


--
b4 0.11.0-dev

2022-10-19 19:31:31

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 3/7] arm64/sysreg: Convert SPE registers to automatic generation

Convert all the SPE register defines to automatic generation. No
functional changes.

New registers and fields for SPEv1.2 are added with the conversion.

Some of the PMBSR MSS field defines are kept as the automatic generation
has no way to create multiple names for the same register bits. The
meaning of the MSS field depends on other bits.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- New patch
---
arch/arm64/include/asm/sysreg.h | 91 ++-----------------------------
arch/arm64/tools/sysreg | 116 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 121 insertions(+), 86 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 8df8a0a51273..d002dd00e53e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -237,99 +237,18 @@
#define SYS_PAR_EL1_FST GENMASK(6, 1)

/*** Statistical Profiling Extension ***/
-/* ID registers */
-#define SYS_PMSIDR_EL1 sys_reg(3, 0, 9, 9, 7)
-#define PMSIDR_EL1_FE_SHIFT 0
-#define PMSIDR_EL1_FT_SHIFT 1
-#define PMSIDR_EL1_FL_SHIFT 2
-#define PMSIDR_EL1_ARCHINST_SHIFT 3
-#define PMSIDR_EL1_LDS_SHIFT 4
-#define PMSIDR_EL1_ERND_SHIFT 5
-#define PMSIDR_EL1_INTERVAL_SHIFT 8
-#define PMSIDR_EL1_INTERVAL_MASK GENMASK_ULL(11, 8)
-#define PMSIDR_EL1_MAXSIZE_SHIFT 12
-#define PMSIDR_EL1_MAXSIZE_MASK GENMASK_ULL(15, 12)
-#define PMSIDR_EL1_COUNTSIZE_SHIFT 16
-#define PMSIDR_EL1_COUNTSIZE_MASK GENMASK_ULL(19, 16)
-
-#define SYS_PMBIDR_EL1 sys_reg(3, 0, 9, 10, 7)
-#define PMBIDR_EL1_ALIGN_SHIFT 0
-#define PMBIDR_EL1_ALIGN_MASK 0xfU
-#define PMBIDR_EL1_P_SHIFT 4
-#define PMBIDR_EL1_F_SHIFT 5
-
-/* Sampling controls */
-#define SYS_PMSCR_EL1 sys_reg(3, 0, 9, 9, 0)
-#define PMSCR_EL1_E0SPE_SHIFT 0
-#define PMSCR_EL1_E1SPE_SHIFT 1
-#define PMSCR_EL1_CX_SHIFT 3
-#define PMSCR_EL1_PA_SHIFT 4
-#define PMSCR_EL1_TS_SHIFT 5
-#define PMSCR_EL1_PCT_SHIFT 6
-
-#define SYS_PMSCR_EL2 sys_reg(3, 4, 9, 9, 0)
-#define PMSCR_EL2_E0HSPE_SHIFT 0
-#define PMSCR_EL2_E2SPE_SHIFT 1
-#define PMSCR_EL2_CX_SHIFT 3
-#define PMSCR_EL2_PA_SHIFT 4
-#define PMSCR_EL2_TS_SHIFT 5
-#define PMSCR_EL2_PCT_SHIFT 6
-
-#define SYS_PMSICR_EL1 sys_reg(3, 0, 9, 9, 2)
-
-#define SYS_PMSIRR_EL1 sys_reg(3, 0, 9, 9, 3)
-#define PMSIRR_EL1_RND_SHIFT 0
-#define PMSIRR_EL1_INTERVAL_SHIFT 8
-#define PMSIRR_EL1_INTERVAL_MASK GENMASK_ULL(31, 8)
-
-/* Filtering controls */
-#define SYS_PMSNEVFR_EL1 sys_reg(3, 0, 9, 9, 1)
-
-#define SYS_PMSFCR_EL1 sys_reg(3, 0, 9, 9, 4)
-#define PMSFCR_EL1_FE_SHIFT 0
-#define PMSFCR_EL1_FT_SHIFT 1
-#define PMSFCR_EL1_FL_SHIFT 2
-#define PMSFCR_EL1_B_SHIFT 16
-#define PMSFCR_EL1_LD_SHIFT 17
-#define PMSFCR_EL1_ST_SHIFT 18
-
-#define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
#define PMSEVFR_EL1_RES0_IMP \
(GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\
BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
#define PMSEVFR_EL1_RES0_V1P1 \
(PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))

-#define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
-#define PMSLATFR_EL1_MINLAT_SHIFT 0
-
-/* Buffer controls */
-#define SYS_PMBLIMITR_EL1 sys_reg(3, 0, 9, 10, 0)
-#define PMBLIMITR_EL1_E_SHIFT 0
-#define PMBLIMITR_EL1_FM_SHIFT 1
-#define PMBLIMITR_EL1_FM_MASK GENMASK_ULL(2, 1)
-#define PMBLIMITR_EL1_FM_STOP_IRQ 0
-
-#define SYS_PMBPTR_EL1 sys_reg(3, 0, 9, 10, 1)
-
/* Buffer error reporting */
-#define SYS_PMBSR_EL1 sys_reg(3, 0, 9, 10, 3)
-#define PMBSR_EL1_COLL_SHIFT 16
-#define PMBSR_EL1_S_SHIFT 17
-#define PMBSR_EL1_EA_SHIFT 18
-#define PMBSR_EL1_DL_SHIFT 19
-#define PMBSR_EL1_EC_SHIFT 26
-#define PMBSR_EL1_EC_MASK GENMASK_ULL(31, 26)
-
-#define PMBSR_EL1_EC_BUF 0x0UL
-#define PMBSR_EL1_EC_FAULT_S1 0x24UL
-#define PMBSR_EL1_EC_FAULT_S2 0x25UL
-
-#define PMBSR_EL1_FAULT_FSC_SHIFT 0
-#define PMBSR_EL1_FAULT_FSC_MASK 0x3fUL
-
-#define PMBSR_EL1_BUF_BSC_SHIFT 0
-#define PMBSR_EL1_BUF_BSC_MASK 0x3fUL
+#define PMBSR_EL1_FAULT_FSC_SHIFT PMBSR_EL1_MSS_SHIFT
+#define PMBSR_EL1_FAULT_FSC_MASK PMBSR_EL1_MSS_MASK
+
+#define PMBSR_EL1_BUF_BSC_SHIFT PMBSR_EL1_MSS_SHIFT
+#define PMBSR_EL1_BUF_BSC_MASK PMBSR_EL1_MSS_MASK

#define PMBSR_EL1_BUF_BSC_FULL 0x1UL

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 384757a7eda9..2fbfe625dacc 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -854,6 +854,111 @@ Sysreg FAR_EL1 3 0 6 0 0
Field 63:0 ADDR
EndSysreg

+Sysreg PMSCR_EL1 3 0 9 9 0
+Res0 63:8
+Field 7:6 PCT
+Field 5 TS
+Field 4 PA
+Field 3 CX
+Res0 2
+Field 1 E1SPE
+Field 0 E0SPE
+EndSysreg
+
+Sysreg PMSNEVFR_EL1 3 0 9 9 1
+Field 63:0 E
+EndSysreg
+
+Sysreg PMSICR_EL1 3 0 9 9 2
+Field 63:56 ECOUNT
+Res0 55:32
+Field 31:0 COUNT
+EndSysreg
+
+Sysreg PMSIRR_EL1 3 0 9 9 3
+Res0 63:32
+Field 31:8 INTERVAL
+Res0 7:1
+Field 0 RND
+EndSysreg
+
+Sysreg PMSFCR_EL1 3 0 9 9 4
+Res0 63:19
+Field 18 ST
+Field 17 LD
+Field 16 B
+Res0 15:4
+Field 3 FnE
+Field 2 FL
+Field 1 FT
+Field 0 FE
+EndSysreg
+
+Sysreg PMSEVFR_EL1 3 0 9 9 5
+Field 63:0 E
+EndSysreg
+
+Sysreg PMSLATFR_EL1 3 0 9 9 6
+Res0 63:16
+Field 15:0 MINLAT
+EndSysreg
+
+Sysreg PMSIDR_EL1 3 0 9 9 7
+Res0 63:25
+Field 24 PBT
+Field 23:20 FORMAT
+Field 19:16 COUNTSIZE
+Field 15:12 MAXSIZE
+Field 11:8 INTERVAL
+Res0 7
+Field 6 FnE
+Field 5 ERND
+Field 4 LDS
+Field 3 ARCHINST
+Field 2 FL
+Field 1 FT
+Field 0 FE
+EndSysreg
+
+Sysreg PMBLIMITR_EL1 3 0 9 10 0
+Field 63:12 LIMIT
+Res0 11:6
+Field 5 PMFZ
+Res0 4:3
+Enum 2:1 FM
+ 0b0000 STOP_IRQ
+EndEnum
+Field 0 E
+EndSysreg
+
+Sysreg PMBPTR_EL1 3 0 9 10 1
+Field 63:0 PTR
+EndSysreg
+
+Sysreg PMBSR_EL1 3 0 9 10 3
+Res0 63:32
+Enum 31:26 EC
+ 0b000000 BUF
+ 0b100100 FAULT_S1
+ 0b100101 FAULT_S2
+EndEnum
+Res0 25:20
+Field 19 DL
+Field 18 EA
+Field 17 S
+Field 16 COLL
+Field 15:0 MSS
+EndSysreg
+
+Sysreg PMBIDR_EL1 3 0 9 10 7
+Res0 63:12
+Field 11:8 EA
+Res0 7:6
+Field 5 F
+Field 4 P
+Field 3:0 ALIGN
+EndSysreg
+
SysregFields CONTEXTIDR_ELx
Res0 63:32
Field 31:0 PROCID
@@ -1008,6 +1113,17 @@ Sysreg FAR_EL2 3 4 6 0 0
Field 63:0 ADDR
EndSysreg

+Sysreg PMSCR_EL2 3 4 9 9 0
+Res0 63:8
+Field 7:6 PCT
+Field 5 TS
+Field 4 PA
+Field 3 CX
+Res0 2
+Field 1 E2SPE
+Field 0 E0HSPE
+EndSysreg
+
Sysreg CONTEXTIDR_EL2 3 4 13 0 1
Fields CONTEXTIDR_ELx
EndSysreg

--
b4 0.11.0-dev

2022-10-19 19:32:59

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 6/7] perf: Add perf_event_attr::config3

Arm SPEv1.2 adds another 64-bits of event filtering control. As the
existing perf_event_attr::configN fields are all used up for SPE PMU, an
additional field is needed. Add a new 'config3' field.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- Drop tools/ side update
---
include/uapi/linux/perf_event.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 85be78e0e7f6..b2b1d7b54097 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -374,6 +374,7 @@ enum perf_event_read_format {
#define PERF_ATTR_SIZE_VER5 112 /* add: aux_watermark */
#define PERF_ATTR_SIZE_VER6 120 /* add: aux_sample_size */
#define PERF_ATTR_SIZE_VER7 128 /* add: sig_data */
+#define PERF_ATTR_SIZE_VER8 136 /* add: config3 */

/*
* Hardware event_id to monitor via a performance monitoring event:
@@ -515,6 +516,8 @@ struct perf_event_attr {
* truncated accordingly on 32 bit architectures.
*/
__u64 sig_data;
+
+ __u64 config3; /* extension of config2 */
};

/*

--
b4 0.11.0-dev

2022-10-19 19:37:38

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 4/7] perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors

Now that generated sysregs are in place, update the register field
accesses. The use of BIT() is no longer needed with the new defines. Use
FIELD_GET and FIELD_PREP instead of open coding masking and shifting.

No functional change.

Signed-off-by: Rob Herring <[email protected]>
---
drivers/perf/arm_spe_pmu.c | 70 ++++++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 36 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 814ed18346b6..9b4bd72087ea 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -283,18 +283,18 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
struct perf_event_attr *attr = &event->attr;
u64 reg = 0;

- reg |= ATTR_CFG_GET_FLD(attr, ts_enable) << PMSCR_EL1_TS_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, pa_enable) << PMSCR_EL1_PA_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, pct_enable) << PMSCR_EL1_PCT_SHIFT;
+ reg |= FIELD_PREP(PMSCR_EL1_TS, ATTR_CFG_GET_FLD(attr, ts_enable));
+ reg |= FIELD_PREP(PMSCR_EL1_PA, ATTR_CFG_GET_FLD(attr, pa_enable));
+ reg |= FIELD_PREP(PMSCR_EL1_PCT, ATTR_CFG_GET_FLD(attr, pct_enable));

if (!attr->exclude_user)
- reg |= BIT(PMSCR_EL1_E0SPE_SHIFT);
+ reg |= PMSCR_EL1_E0SPE;

if (!attr->exclude_kernel)
- reg |= BIT(PMSCR_EL1_E1SPE_SHIFT);
+ reg |= PMSCR_EL1_E1SPE;

if (get_spe_event_has_cx(event))
- reg |= BIT(PMSCR_EL1_CX_SHIFT);
+ reg |= PMSCR_EL1_CX;

return reg;
}
@@ -322,7 +322,7 @@ static u64 arm_spe_event_to_pmsirr(struct perf_event *event)

arm_spe_event_sanitise_period(event);

- reg |= ATTR_CFG_GET_FLD(attr, jitter) << PMSIRR_EL1_RND_SHIFT;
+ reg |= FIELD_PREP(PMSIRR_EL1_RND, ATTR_CFG_GET_FLD(attr, jitter));
reg |= event->hw.sample_period;

return reg;
@@ -333,18 +333,18 @@ static u64 arm_spe_event_to_pmsfcr(struct perf_event *event)
struct perf_event_attr *attr = &event->attr;
u64 reg = 0;

- reg |= ATTR_CFG_GET_FLD(attr, load_filter) << PMSFCR_EL1_LD_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, store_filter) << PMSFCR_EL1_ST_SHIFT;
- reg |= ATTR_CFG_GET_FLD(attr, branch_filter) << PMSFCR_EL1_B_SHIFT;
+ reg |= FIELD_PREP(PMSFCR_EL1_LD, ATTR_CFG_GET_FLD(attr, load_filter));
+ reg |= FIELD_PREP(PMSFCR_EL1_ST, ATTR_CFG_GET_FLD(attr, store_filter));
+ reg |= FIELD_PREP(PMSFCR_EL1_B, ATTR_CFG_GET_FLD(attr, branch_filter));

if (reg)
- reg |= BIT(PMSFCR_EL1_FT_SHIFT);
+ reg |= PMSFCR_EL1_FT;

if (ATTR_CFG_GET_FLD(attr, event_filter))
- reg |= BIT(PMSFCR_EL1_FE_SHIFT);
+ reg |= PMSFCR_EL1_FE;

if (ATTR_CFG_GET_FLD(attr, min_latency))
- reg |= BIT(PMSFCR_EL1_FL_SHIFT);
+ reg |= PMSFCR_EL1_FL;

return reg;
}
@@ -358,8 +358,7 @@ static u64 arm_spe_event_to_pmsevfr(struct perf_event *event)
static u64 arm_spe_event_to_pmslatfr(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
- return ATTR_CFG_GET_FLD(attr, min_latency)
- << PMSLATFR_EL1_MINLAT_SHIFT;
+ return FIELD_PREP(PMSLATFR_EL1_MINLAT, ATTR_CFG_GET_FLD(attr, min_latency));
}

static void arm_spe_pmu_pad_buf(struct perf_output_handle *handle, int len)
@@ -511,7 +510,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle,
limit = buf->snapshot ? arm_spe_pmu_next_snapshot_off(handle)
: arm_spe_pmu_next_off(handle);
if (limit)
- limit |= BIT(PMBLIMITR_EL1_E_SHIFT);
+ limit |= PMBLIMITR_EL1_E;

limit += (u64)buf->base;
base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf);
@@ -570,23 +569,23 @@ arm_spe_pmu_buf_get_fault_act(struct perf_output_handle *handle)

/* Service required? */
pmbsr = read_sysreg_s(SYS_PMBSR_EL1);
- if (!(pmbsr & BIT(PMBSR_EL1_S_SHIFT)))
+ if (!FIELD_GET(PMBSR_EL1_S, pmbsr))
return SPE_PMU_BUF_FAULT_ACT_SPURIOUS;

/*
* If we've lost data, disable profiling and also set the PARTIAL
* flag to indicate that the last record is corrupted.
*/
- if (pmbsr & BIT(PMBSR_EL1_DL_SHIFT))
+ if (FIELD_GET(PMBSR_EL1_DL, pmbsr))
perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED |
PERF_AUX_FLAG_PARTIAL);

/* Report collisions to userspace so that it can up the period */
- if (pmbsr & BIT(PMBSR_EL1_COLL_SHIFT))
+ if (FIELD_GET(PMBSR_EL1_COLL, pmbsr))
perf_aux_output_flag(handle, PERF_AUX_FLAG_COLLISION);

/* We only expect buffer management events */
- switch (FIELD_GET(PMBSR_EL1_EC_MASK, pmbsr)) {
+ switch (FIELD_GET(PMBSR_EL1_EC, pmbsr)) {
case PMBSR_EL1_EC_BUF:
/* Handled below */
break;
@@ -716,23 +715,22 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
return -EINVAL;

reg = arm_spe_event_to_pmsfcr(event);
- if ((reg & BIT(PMSFCR_EL1_FE_SHIFT)) &&
+ if ((FIELD_GET(PMSFCR_EL1_FE, reg)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_EVT))
return -EOPNOTSUPP;

- if ((reg & BIT(PMSFCR_EL1_FT_SHIFT)) &&
+ if ((FIELD_GET(PMSFCR_EL1_FT, reg)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_TYP))
return -EOPNOTSUPP;

- if ((reg & BIT(PMSFCR_EL1_FL_SHIFT)) &&
+ if ((FIELD_GET(PMSFCR_EL1_FL, reg)) &&
!(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
return -EOPNOTSUPP;

set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
- (reg & (BIT(PMSCR_EL1_PA_SHIFT) |
- BIT(PMSCR_EL1_PCT_SHIFT))))
+ (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)))
return -EACCES;

return 0;
@@ -970,14 +968,14 @@ static void __arm_spe_pmu_dev_probe(void *info)

/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
- if (reg & BIT(PMBIDR_EL1_P_SHIFT)) {
+ if (FIELD_GET(PMBIDR_EL1_P, reg)) {
dev_err(dev,
"profiling buffer owned by higher exception level\n");
return;
}

/* Minimum alignment. If it's out-of-range, then fail the probe */
- fld = (reg & PMBIDR_EL1_ALIGN_MASK) >> PMBIDR_EL1_ALIGN_SHIFT;
+ fld = FIELD_GET(PMBIDR_EL1_ALIGN, reg);
spe_pmu->align = 1 << fld;
if (spe_pmu->align > SZ_2K) {
dev_err(dev, "unsupported PMBIDR.Align [%d] on CPU %d\n",
@@ -987,26 +985,26 @@ static void __arm_spe_pmu_dev_probe(void *info)

/* It's now safe to read PMSIDR and figure out what we've got */
reg = read_sysreg_s(SYS_PMSIDR_EL1);
- if (reg & BIT(PMSIDR_EL1_FE_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_FE, reg))
spe_pmu->features |= SPE_PMU_FEAT_FILT_EVT;

- if (reg & BIT(PMSIDR_EL1_FT_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_FT, reg))
spe_pmu->features |= SPE_PMU_FEAT_FILT_TYP;

- if (reg & BIT(PMSIDR_EL1_FL_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_FL, reg))
spe_pmu->features |= SPE_PMU_FEAT_FILT_LAT;

- if (reg & BIT(PMSIDR_EL1_ARCHINST_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_ARCHINST, reg))
spe_pmu->features |= SPE_PMU_FEAT_ARCH_INST;

- if (reg & BIT(PMSIDR_EL1_LDS_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_LDS, reg))
spe_pmu->features |= SPE_PMU_FEAT_LDS;

- if (reg & BIT(PMSIDR_EL1_ERND_SHIFT))
+ if (FIELD_GET(PMSIDR_EL1_ERND, reg))
spe_pmu->features |= SPE_PMU_FEAT_ERND;

/* This field has a spaced out encoding, so just use a look-up */
- fld = (reg & PMSIDR_EL1_INTERVAL_MASK) >> PMSIDR_EL1_INTERVAL_SHIFT;
+ fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg);
switch (fld) {
case 0:
spe_pmu->min_period = 256;
@@ -1038,7 +1036,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
}

/* Maximum record size. If it's out-of-range, then fail the probe */
- fld = (reg & PMSIDR_EL1_MAXSIZE_MASK) >> PMSIDR_EL1_MAXSIZE_SHIFT;
+ fld = FIELD_GET(PMSIDR_EL1_MAXSIZE, reg);
spe_pmu->max_record_sz = 1 << fld;
if (spe_pmu->max_record_sz > SZ_2K || spe_pmu->max_record_sz < 16) {
dev_err(dev, "unsupported PMSIDR_EL1.MaxSize [%d] on CPU %d\n",
@@ -1046,7 +1044,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
return;
}

- fld = (reg & PMSIDR_EL1_COUNTSIZE_MASK) >> PMSIDR_EL1_COUNTSIZE_SHIFT;
+ fld = FIELD_GET(PMSIDR_EL1_COUNTSIZE, reg);
switch (fld) {
default:
dev_warn(dev, "unknown PMSIDR_EL1.CountSize [%d]; assuming 2\n",

--
b4 0.11.0-dev

2022-10-19 19:54:58

by Rob Herring (Arm)

[permalink] [raw]
Subject: [PATCH v2 1/7] perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines

Similar to commit 121a8fc088f1 ("arm64/sysreg: Use feature numbering for
PMU and SPE revisions") use feature numbering instead of architecture
versions for the PMSEVFR_EL1 Res0 defines.

Signed-off-by: Rob Herring <[email protected]>
---
v2:
- New patch
---
arch/arm64/include/asm/sysreg.h | 6 +++---
drivers/perf/arm_spe_pmu.c | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7d301700d1a9..9a4cf12e3e16 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -294,11 +294,11 @@
#define SYS_PMSFCR_EL1_ST_SHIFT 18

#define SYS_PMSEVFR_EL1 sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0_8_2 \
+#define PMSEVFR_EL1_RES0_IMP \
(GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\
BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
-#define SYS_PMSEVFR_EL1_RES0_8_3 \
- (SYS_PMSEVFR_EL1_RES0_8_2 & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))
+#define PMSEVFR_EL1_RES0_V1P1 \
+ (PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))

#define SYS_PMSLATFR_EL1 sys_reg(3, 0, 9, 9, 6)
#define SYS_PMSLATFR_EL1_MINLAT_SHIFT 0
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 00e3a637f7b6..65cf93dcc8ee 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -677,11 +677,11 @@ static u64 arm_spe_pmsevfr_res0(u16 pmsver)
{
switch (pmsver) {
case ID_AA64DFR0_EL1_PMSVer_IMP:
- return SYS_PMSEVFR_EL1_RES0_8_2;
+ return PMSEVFR_EL1_RES0_IMP;
case ID_AA64DFR0_EL1_PMSVer_V1P1:
/* Return the highest version we support in default */
default:
- return SYS_PMSEVFR_EL1_RES0_8_3;
+ return PMSEVFR_EL1_RES0_V1P1;
}
}


--
b4 0.11.0-dev

2022-10-20 15:06:07

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] arm64/sysreg: Convert SPE registers to automatic generation

On Wed, Oct 19, 2022 at 02:11:26PM -0500, Rob Herring wrote:
> Convert all the SPE register defines to automatic generation. No
> functional changes.
>
> New registers and fields for SPEv1.2 are added with the conversion.
>
> Some of the PMBSR MSS field defines are kept as the automatic generation
> has no way to create multiple names for the same register bits. The
> meaning of the MSS field depends on other bits.

A few small things below from checking against DDI0487I.a, nothing
major:

> +Sysreg PMSCR_EL1 3 0 9 9 0
> +Res0 63:8
> +Field 7:6 PCT

Given the potential multiple meanings depending on both integration and
runtime configuration representing this as a field seems sensible.

> +Sysreg PMSNEVFR_EL1 3 0 9 9 1
> +Field 63:0 E
> +EndSysreg

While this does look rather different to the spec it does appear to be a
sensible interpretation - the intent is clearly to have a mask of
feature bits.

> +Sysreg PMSIDR_EL1 3 0 9 9 7

> +Field 23:20 FORMAT
> +Field 19:16 COUNTSIZE
> +Field 15:12 MAXSIZE
> +Field 11:8 INTERVAL

These should really be enums.

> +Sysreg PMBLIMITR_EL1 3 0 9 10 0
> +Enum 2:1 FM
> + 0b0000 STOP_IRQ
> +EndEnum

DDI0487I.a also defines

0b01 DISCARD

from FEAT_SPEv1p2. Also this is a two bit field so 0b00 would be a bit
better.

> +Sysreg PMBSR_EL1 3 0 9 10 3
> +Res0 63:32
> +Enum 31:26 EC
> + 0b000000 BUF
> + 0b100100 FAULT_S1
> + 0b100101 FAULT_S2
> +EndEnum

DDI0487I.a also has

0b011110 FAULT_GPC
0b011111 IMP_DEF

(the former from FEAT_RME).

> +Sysreg PMBIDR_EL1 3 0 9 10 7
> +Res0 63:12
> +Field 11:8 EA

This looks like it should be described as an enum.

> +Field 3:0 ALIGN

This could potentially also be an enum.

> +Sysreg PMSCR_EL2 3 4 9 9 0
> +Res0 63:8
> +Field 7:6 PCT

This lookslike it should be an enum.


Attachments:
(No filename) (1.81 kB)
signature.asc (499.00 B)
Download all attachments

2022-10-20 20:24:29

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] arm64/sysreg: Convert SPE registers to automatic generation

On Thu, Oct 20, 2022 at 9:33 AM Mark Brown <[email protected]> wrote:
>
> On Wed, Oct 19, 2022 at 02:11:26PM -0500, Rob Herring wrote:
> > Convert all the SPE register defines to automatic generation. No
> > functional changes.
> >
> > New registers and fields for SPEv1.2 are added with the conversion.
> >
> > Some of the PMBSR MSS field defines are kept as the automatic generation
> > has no way to create multiple names for the same register bits. The
> > meaning of the MSS field depends on other bits.
>
> A few small things below from checking against DDI0487I.a, nothing
> major:

[...]

> > +Sysreg PMSIDR_EL1 3 0 9 9 7
>
> > +Field 23:20 FORMAT
> > +Field 19:16 COUNTSIZE
> > +Field 15:12 MAXSIZE
> > +Field 11:8 INTERVAL
>
> These should really be enums.

Okay for COUNTSIZE and INTERVAL.

The only defined value for FORMAT is 'Format 0'. I assume next will be
'Format 1'. I don't think defines with the value in the name are too
useful. When this is used, we'll just be passing the value to
userspace via either sysfs or perf aux buffer.

MAXSIZE is the power of 2 encoded size and we just convert to bytes
(1<<MAXSIZE). So I can add them, but they will never be used.

> > +Sysreg PMBLIMITR_EL1 3 0 9 10 0
> > +Enum 2:1 FM
> > + 0b0000 STOP_IRQ
> > +EndEnum
>
> DDI0487I.a also defines
>
> 0b01 DISCARD

STOP_IRQ doesn't seem like the best name either. It's not used, so I
changed it to 'FILL'.

> > +Sysreg PMBIDR_EL1 3 0 9 10 7
> > +Res0 63:12
> > +Field 11:8 EA
>
> This looks like it should be described as an enum.

0b0000 Not_Described
0b0001 Ignored
0b0010 SError

What's the preferred case here?

>
> > +Field 3:0 ALIGN
>
> This could potentially also be an enum.

Another power of 2 encoding.

> > +Sysreg PMSCR_EL2 3 4 9 9 0
> > +Res0 63:8
> > +Field 7:6 PCT
>
> This lookslike it should be an enum.

Humm, PCT is a bit tricky. Here's what I came up with:

Enum 7:6 PCT
0b00 VIRT
0b01 PHYS
0b11 GUEST
EndEnum

Again, CAPS or CarrotCase for new things?

Rob

2022-10-21 11:00:58

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] arm64/sysreg: Convert SPE registers to automatic generation

On Thu, Oct 20, 2022 at 02:51:02PM -0500, Rob Herring wrote:
> On Thu, Oct 20, 2022 at 9:33 AM Mark Brown <[email protected]> wrote:

> > > +Field 11:8 EA

> > This looks like it should be described as an enum.

> 0b0000 Not_Described
> 0b0001 Ignored
> 0b0010 SError

> What's the preferred case here?

Either option is good for case I think, it's a taste question - going
with upper case is more the kernel coding style so fine there, though
sometimes it makes things hard to read.


Attachments:
(No filename) (532.00 B)
signature.asc (499.00 B)
Download all attachments

2022-11-04 12:05:01

by James Clark

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] perf: Arm SPEv1.2 support



On 19/10/2022 20:11, Rob Herring wrote:
> This series adds support for Arm SPEv1.2 which is part of the
> Armv8.7/Armv9.2 architecture. There's 2 new features that affect the
> kernel: a new event filter bit, branch 'not taken', and an inverted
> event filter register.
>
> Since this support adds new registers and fields, first the SPE register
> defines are converted to automatic generation.
>
> Note that the 'config3' addition in sysfs format files causes SPE to
> break. A stable fix e552b7be12ed ("perf: Skip and warn on unknown format
> 'configN' attrs") landed in v6.1-rc1.
>
> The perf tool side changes are available here[1].
>
> Tested on FVP.
>
> [1] https://lore.kernel.org/all/[email protected]/
>

LGTM. Tested with [1] applied and on N1SDP (where it isn't supported),
and on the FVP. Enabling all the inverted filters results in no trace
and other combinations work as expected.

James

> Signed-off-by: Rob Herring <[email protected]>
> ---
> Changes in v2:
> - Convert the SPE register defines to automatic generation
> - Fixed access to SYS_PMSNEVFR_EL1 when not present
> - Rebase on v6.1-rc1
> - Link to v1: https://lore.kernel.org/r/[email protected]
>
> ---
> Rob Herring (7):
> perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines
> arm64: Drop SYS_ from SPE register defines
> arm64/sysreg: Convert SPE registers to automatic generation
> perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors
> perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event
> perf: Add perf_event_attr::config3
> perf: arm_spe: Add support for SPEv1.2 inverted event filtering
>
> arch/arm64/include/asm/el2_setup.h | 6 +-
> arch/arm64/include/asm/sysreg.h | 99 +++------------------------
> arch/arm64/kvm/debug.c | 2 +-
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 2 +-
> arch/arm64/tools/sysreg | 116 +++++++++++++++++++++++++++++++
> drivers/perf/arm_spe_pmu.c | 136 ++++++++++++++++++++++++-------------
> include/uapi/linux/perf_event.h | 3 +
> 7 files changed, 224 insertions(+), 140 deletions(-)
> ---
> base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
> change-id: 20220825-arm-spe-v8-7-fedf04e16f23
>
> Best regards,

2022-11-04 15:25:02

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] perf: Arm SPEv1.2 support

On Fri, Nov 4, 2022 at 6:30 AM James Clark <[email protected]> wrote:
>
>
>
> On 19/10/2022 20:11, Rob Herring wrote:
> > This series adds support for Arm SPEv1.2 which is part of the
> > Armv8.7/Armv9.2 architecture. There's 2 new features that affect the
> > kernel: a new event filter bit, branch 'not taken', and an inverted
> > event filter register.
> >
> > Since this support adds new registers and fields, first the SPE register
> > defines are converted to automatic generation.
> >
> > Note that the 'config3' addition in sysfs format files causes SPE to
> > break. A stable fix e552b7be12ed ("perf: Skip and warn on unknown format
> > 'configN' attrs") landed in v6.1-rc1.
> >
> > The perf tool side changes are available here[1].
> >
> > Tested on FVP.
> >
> > [1] https://lore.kernel.org/all/[email protected]/
> >
>
> LGTM. Tested with [1] applied and on N1SDP (where it isn't supported),
> and on the FVP. Enabling all the inverted filters results in no trace
> and other combinations work as expected.

Thanks, I'll take that as a 'Tested-by' tag if that's okay.

In the future, please be explicit with tags. Tools such as b4 and
patchwork will recognize them and add them automatically.

Rob