2021-02-25 19:40:27

by Suzuki K Poulose

Subject: [PATCH v4 00/19] arm64: coresight: Add support for ETE and TRBE


This series enables two upcoming trace features, the Embedded Trace
Extension (ETE) and the Trace Buffer Extension (TRBE). It applies on
linux-next/master (from tag next-20210222), and is also available here [0].

Patches 1 & 2: UABI updates for perf AUX flag format. We reserve
a byte for advertising the format of the buffer when the PMU could
support different formats. The CoreSight PMUs could use Frame formatted
data and Raw format of the trace source.

Patches 3 - 5: Fixes for arm64 KVM hypervisor to align with the architecture.
Patches 6 - 7: Add the architecture definitions for trace and TRBE.
Patch 8 : Adds the necessary changes for enabling TRBE access to host
from the early initialisation (VHE and nVHE). Also adds support for nVHE hyp
to save/restore the TRBE context of the host during a trip to the guest.

Patches 9 - 19: CoreSight driver specific changes and DT bindings for
ETE and TRBE support


ETE is the PE (CPU) trace unit for CPUs, implementing future architecture
extensions. ETE overlaps with the ETMv4 architecture, with additions to
support the newer architecture features and some restrictions on the
supported features w.r.t ETMv4. The ETE support is added by extending the
ETMv4 driver to recognise the ETE and handle the features as exposed by the
TRCIDRx registers. ETE only supports system instruction access from the
host CPU. The ETE could be integrated with a TRBE (see below), or with the
legacy CoreSight trace bus (e.g., ETRs). Thus the ETE follows the same
firmware description as the ETMs and requires a node per instance.

The Trace Buffer Extension (TRBE) implements a per-CPU trace buffer, which
is accessible via the system registers and can be combined with the ETE to
provide a 1:1 configuration of source and sink. TRBE is represented here
as a CoreSight sink. The primary reason is that the ETE source could also
work with other traditional CoreSight sink devices. As TRBE captures the
trace data produced by ETE, it cannot work alone.

The TRBE representation here has some distinct deviations from a traditional
CoreSight sink device. The CoreSight path between ETE and TRBE is not built
during boot by looking at the respective DT or ACPI entries.

Unlike traditional sinks, TRBE can generate interrupts to signal, among
other things, that the buffer has filled. The interrupt is a PPI and should
be communicated from the platform. The DT or ACPI entry representing TRBE
should have the PPI number for a given platform. During a perf session, the
TRBE IRQ handler should capture the trace into the perf auxiliary buffer
before restarting the TRBE. The system registers used here to configure ETE
and TRBE are described at the link below.

https://developer.arm.com/docs/ddi0601/g/aarch64-system-registers.

[0] https://gitlab.arm.com/linux-arm/linux-skp/-/tree/coresight/ete/v4/next

Changes in V4:

- ETE and TRBE changes have been captured in the respective patches
- Better support for nVHE
- Re-ordered and split the patches to keep the changes for the
generic/arm64 tree separate from the CoreSight driver specific changes.
- Fixes for KVM handling of Trace/SPE


Changes in V3:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

- Rebased on coresight/next
- Changed DT bindings for ETE
- Included additional patches for arm64 nvhe, perf aux buffer flags etc
- TRBE changes have been captured in the respective patches

Changes in V2:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

- Converted both ETE and TRBE DT bindings into Yaml
- TRBE changes have been captured in the respective patches

Changes in V1:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

- There are not many ETE changes from Suzuki apart from splitting of the ETE DTS patch
- TRBE changes have been captured in the respective patches

Changes in RFC:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Linu Cherian <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]


Anshuman Khandual (3):
arm64: Add TRBE definitions
coresight: core: Add support for dedicated percpu sinks
coresight: sink: Add TRBE driver

Suzuki K Poulose (16):
perf: aux: Add flags for the buffer format
perf: aux: Add CoreSight PMU buffer formats
kvm: arm64: Hide system instruction access to Trace registers
kvm: arm64: nvhe: Save the SPE context early
kvm: arm64: Disable guest access to trace filter controls
arm64: Add support for trace synchronization barrier
arm64: kvm: Enable access to TRBE support for host
coresight: etm4x: Move ETM to prohibited region for disable
coresight: etm-perf: Allow an event to use different sinks
coresight: Do not scan for graph if none is present
coresight: etm4x: Add support for PE OS lock
coresight: ete: Add support for ETE sysreg access
coresight: ete: Add support for ETE tracing
dts: bindings: Document device tree bindings for ETE
coresight: etm-perf: Handle stale output handles
dts: bindings: Document device tree bindings for Arm TRBE

.../testing/sysfs-bus-coresight-devices-trbe | 14 +
.../devicetree/bindings/arm/ete.yaml | 71 +
.../devicetree/bindings/arm/trbe.yaml | 49 +
.../trace/coresight/coresight-trbe.rst | 38 +
arch/arm64/include/asm/barrier.h | 1 +
arch/arm64/include/asm/el2_setup.h | 13 +
arch/arm64/include/asm/kvm_arm.h | 3 +
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/asm/kvm_hyp.h | 5 +
arch/arm64/include/asm/sysreg.h | 50 +
arch/arm64/kernel/cpufeature.c | 1 -
arch/arm64/kernel/hyp-stub.S | 3 +-
arch/arm64/kvm/debug.c | 6 +-
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 54 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 13 +-
drivers/hwtracing/coresight/Kconfig | 24 +-
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-core.c | 29 +-
.../hwtracing/coresight/coresight-etm-perf.c | 119 +-
.../coresight/coresight-etm4x-core.c | 161 ++-
.../coresight/coresight-etm4x-sysfs.c | 19 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 83 +-
.../hwtracing/coresight/coresight-platform.c | 6 +
drivers/hwtracing/coresight/coresight-priv.h | 3 +
drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
include/linux/coresight.h | 12 +
include/uapi/linux/perf_event.h | 13 +-
28 files changed, 2028 insertions(+), 67 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml
create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h

--
2.24.1


2021-02-25 19:41:45

by Suzuki K Poulose

Subject: [PATCH v4 03/19] kvm: arm64: Hide system instruction access to Trace registers

Currently we advertise the ID_AA64DFR0_EL1.TRACEVER for the guest,
when the trace register accesses are trapped (CPTR_EL2.TTA == 1).
So, the guest will get an undefined instruction if it trusts the
ID registers and accesses one of the trace registers.
Let's be nice to the guest and hide the feature to avoid
unexpected behavior.

Even though this can be done at KVM sysreg emulation layer,
we do this by removing the TRACEVER from the sanitised feature
register field. This is fine as long as the ETM drivers
can handle the individual trace units separately, even
when there are differences among the CPUs.
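
As a rough illustration of what the guest ends up seeing, the sketch below
clears the TraceVer field from an ID_AA64DFR0_EL1 value. The helper names
are hypothetical and this is a user-space model, not the cpufeature code;
it assumes a field dropped from the sanitised table reads as zero.

```c
#include <stdint.h>

/* TraceVer occupies bits [7:4] of ID_AA64DFR0_EL1 (shift of 4). */
#define ID_AA64DFR0_TRACEVER_SHIFT 4
#define ID_FIELD_MASK              UINT64_C(0xf)

/* Hypothetical helper: extract the 4-bit TraceVer field. */
static unsigned int dfr0_tracever(uint64_t dfr0)
{
	return (unsigned int)((dfr0 >> ID_AA64DFR0_TRACEVER_SHIFT) & ID_FIELD_MASK);
}

/*
 * Hypothetical helper: model of the value seen once TraceVer is hidden.
 * All other fields are left untouched.
 */
static uint64_t dfr0_hide_tracever(uint64_t dfr0)
{
	return dfr0 & ~(ID_FIELD_MASK << ID_AA64DFR0_TRACEVER_SHIFT);
}
```

With an input of 0x16 (DebugVer 0x6, TraceVer 0x1) the hidden value is 0x06,
so a guest that trusts the ID register no longer attempts trace register
accesses.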

Cc: Marc Zyngier <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Mark Rutland <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
New patch
---
arch/arm64/kernel/cpufeature.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 066030717a4c..a4698f09bf32 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -383,7 +383,6 @@ static const struct arm64_ftr_bits ftr_id_aa64dfr0[] = {
* of support.
*/
S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64DFR0_PMUVER_SHIFT, 4, 0),
- ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_TRACEVER_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_DEBUGVER_SHIFT, 4, 0x6),
ARM64_FTR_END,
};
--
2.24.1

2021-02-25 19:41:55

by Suzuki K Poulose

Subject: [PATCH v4 01/19] perf: aux: Add flags for the buffer format

Allocate a byte for advertising the PMU specific format type
of the given AUX record. A PMU could end up providing hardware
trace data in multiple formats in a single session.

e.g, The format of hardware buffer produced by CoreSight ETM
PMU depends on the type of the "sink" device used for collection
for an event (Traditional TMC-ETR/Bs with formatting or
TRBEs without any formatting).

# Boring story of why this is needed. Goto The_End_of_Story for skipping.

CoreSight ETM trace allows instruction level tracing of Arm CPUs.
The ETM generates the CPU execution trace and pumps it into the CoreSight
AMBA Trace Bus, where it is collected by a different CoreSight component
(traditionally CoreSight TMC-ETR/ETB/ETF), called the "sink".
Important to note that there is no guarantee that every CPU has
a dedicated sink. Thus multiple ETMs could pump the trace data
into the same "sink" and thus they apply additional formatting
of the trace data for the user to decode it properly and attribute
the trace data to the corresponding ETM.

However, with the introduction of the Arm Trace Buffer Extension (TRBE),
we now have a dedicated per-CPU architected sink for collecting the
trace. Since the TRBE is always per-CPU, it doesn't apply any formatting
to the trace. The support for this driver is under review [1].

Now a system could have a per-cpu TRBE and one or more shared
TMC-ETRs on the system. A user could choose a "specific" sink
for a perf session (e.g, a TMC-ETR) or the driver could automatically
select the nearest sink for a given ETM. It is possible that
some ETMs could end up using TMC-ETR (e.g, if the TRBE is not
usable on the CPU) while the others using TRBE in a single
perf session. Thus we now have "formatted" trace collected
from TMC-ETR and "unformatted" trace collected from TRBE.
However, we don't get into a situation where a single event
could end up using TMC-ETR & TRBE. i.e, any AUX buffer is
guaranteed to be either RAW or FORMATTED, but not a mix
of both.

As for perf decoding, we need to know the type of the data
in the individual AUX buffers, so that the tool can set up the
"OpenCSD" (library for decoding CoreSight trace) decoder
instance appropriately. Thus the perf.data file must contain
the hints for the tool to decode the data correctly.

Since this is a runtime variable, and perf tool doesn't have
a control on what sink gets used (in case of automatic sink
selection), we need this information made available from
the PMU driver for each AUX record.

# The_End_of_Story

Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Mathieu Poirier <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
include/uapi/linux/perf_event.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ad15e40d7f5d..f006eeab6f0e 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1156,10 +1156,11 @@ enum perf_callchain_context {
/**
* PERF_RECORD_AUX::flags bits
*/
-#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */
-#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */
-#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */
-#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */
+#define PERF_AUX_FLAG_TRUNCATED 0x01 /* record was truncated to fit */
+#define PERF_AUX_FLAG_OVERWRITE 0x02 /* snapshot from overwrite mode */
+#define PERF_AUX_FLAG_PARTIAL 0x04 /* record contains gaps */
+#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */
+#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */

#define PERF_FLAG_FD_NO_GROUP (1UL << 0)
#define PERF_FLAG_FD_OUTPUT (1UL << 1)
--
2.24.1

2021-02-25 19:42:04

by Suzuki K Poulose

Subject: [PATCH v4 04/19] kvm: arm64: nvhe: Save the SPE context early

The nvhe hyp saves the SPE context, flushing any unwritten
data before we switch to the guest. But this operation is
performed way too late, because :
- The ownership of the SPE is transferred to EL2. i.e,
using EL2 translations. (MDCR_EL2_E2PB == 0)
- The guest Stage1 is loaded.

Thus the flush could end up translating a host EL1 virtual
address with the EL2 translations. Fix this by moving the
SPE context save earlier, i.e., save the context before we
load the guest stage1 and before we change the ownership
to EL2.

The restore path is doing the right thing.
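
The ordering rule described above can be modelled as a small check over a
sequence of steps. This is only a sketch: the enum and the function are
hypothetical names for illustration and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the steps that matter to the SPE flush. */
enum step {
	STEP_SAVE_HOST_SPE,     /* flush and save the host SPE context */
	STEP_OWNERSHIP_TO_EL2,  /* buffer ownership moves to EL2 */
	STEP_LOAD_GUEST_STAGE1, /* guest stage1 translation is loaded */
	STEP_OTHER,             /* anything unrelated */
};

/*
 * Returns true when the host SPE save happens before both the ownership
 * change and the guest stage1 load. A sequence containing none of these
 * steps is trivially fine.
 */
static bool spe_save_order_ok(const enum step *seq, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (seq[i] == STEP_SAVE_HOST_SPE)
			return true;
		if (seq[i] == STEP_OWNERSHIP_TO_EL2 ||
		    seq[i] == STEP_LOAD_GUEST_STAGE1)
			return false;
	}
	return true;
}
```

The old flow corresponds to a sequence where the save comes last, which the
check rejects; the flow after this patch puts the save first.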

Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow")
Cc: [email protected]
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexandru Elisei <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
New patch.
---
arch/arm64/include/asm/kvm_hyp.h | 5 +++++
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++--
arch/arm64/kvm/hyp/nvhe/switch.c | 12 +++++++++++-
3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index c0450828378b..385bd7dd3d39 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -83,6 +83,11 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt);
void __debug_switch_to_guest(struct kvm_vcpu *vcpu);
void __debug_switch_to_host(struct kvm_vcpu *vcpu);

+#ifdef __KVM_NVHE_HYPERVISOR__
+void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
+void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
+#endif
+
void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);

diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index 91a711aa8382..f401724f12ef 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -58,16 +58,24 @@ static void __debug_restore_spe(u64 pmscr_el1)
write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
}

-void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
+void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
/* Disable and flush SPE data generation */
__debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
+}
+
+void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
+{
__debug_switch_to_guest_common(vcpu);
}

-void __debug_switch_to_host(struct kvm_vcpu *vcpu)
+void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
__debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
+}
+
+void __debug_switch_to_host(struct kvm_vcpu *vcpu)
+{
__debug_switch_to_host_common(vcpu);
}

diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index f3d0e9eca56c..10eed66136a0 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -192,6 +192,15 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);

__sysreg_save_state_nvhe(host_ctxt);
+ /*
+ * For nVHE, we must save and disable any SPE
+ * buffers, as the translation regime is going
+ * to be loaded with that of the guest. And we must
+ * save host context for SPE, before we change the
+ * ownership to EL2 (via MDCR_EL2_E2PB == 0) and before
+ * we load guest Stage1.
+ */
+ __debug_save_host_buffers_nvhe(vcpu);

__adjust_pc(vcpu);

@@ -234,11 +243,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
__fpsimd_save_fpexc32(vcpu);

+ __debug_switch_to_host(vcpu);
/*
* This must come after restoring the host sysregs, since a non-VHE
* system may enable SPE here and make use of the TTBRs.
*/
- __debug_switch_to_host(vcpu);
+ __debug_restore_host_buffers_nvhe(vcpu);

if (pmu_switch_needed)
__pmu_switch_to_host(host_ctxt);
--
2.24.1

2021-02-25 19:42:17

by Suzuki K Poulose

Subject: [PATCH v4 02/19] perf: aux: Add CoreSight PMU buffer formats

CoreSight PMU supports aux-buffer for the ETM tracing. The trace
generated by the ETM (associated with individual CPUs, like Intel PT)
is captured by a separate IP (CoreSight TMC-ETR/ETF until now).

The TMC-ETR applies formatting of the raw ETM trace data, as it
can collect traces from multiple ETMs, with the TraceID to indicate
the source of a given trace packet.

Arm Trace Buffer Extension is a new "sink" IP, attached to individual
CPUs, and thus does not apply additional formatting like the TMC-ETR.

Additionally, a system could have both TRBE *and* TMC-ETR for
the trace collection. e.g, TMC-ETR could be used as a single
trace buffer to collect data from multiple ETMs to correlate
the traces from different CPUs. It is possible to have a
perf session where some events end up collecting the trace
in TMC-ETR while the others in TRBE. Thus we need a way
to identify the type of the trace for each AUX record.

Define the trace formats exported by the CoreSight PMU.
We don't define the flags following the "ETM" as this
information is available to the user when issuing
the session. What is missing is the additional
formatting applied by the "sink", which is decided
at runtime and which the user may not have control over.

So we define :
- CORESIGHT format (indicates the Frame format)
- RAW format (indicates the format of the source)

The default value is CORESIGHT format for all the records
(i.e., == 0). Add the RAW format for others that use
raw format.
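
For a consumer of PERF_RECORD_AUX, the check could look roughly like the
sketch below. The three macros are copied from the hunks in this series;
the helper function is a hypothetical name and not part of the perf tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the perf_event.h changes in patches 1 and 2. */
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK       0xff00
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW       0x0100

/*
 * Hypothetical helper: true if the AUX record carries raw (unformatted)
 * trace, false if it carries CoreSight frame formatted trace. Only the
 * format byte is examined; the low flag bits are ignored.
 */
static bool aux_record_is_raw(uint64_t flags)
{
	return (flags & PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK) ==
	       PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW;
}
```

Records written by older kernels have the format byte clear, so they are
treated as CORESIGHT format, which is why that value was chosen as the
default.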

Cc: Peter Zijlstra <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes from previous:
- Split from the coresight driver specific code
for ease of merging
---
include/uapi/linux/perf_event.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f006eeab6f0e..63971eaef127 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1162,6 +1162,10 @@ enum perf_callchain_context {
#define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */

+/* CoreSight PMU AUX buffer formats */
+#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */
+#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */
+
#define PERF_FLAG_FD_NO_GROUP (1UL << 0)
#define PERF_FLAG_FD_OUTPUT (1UL << 1)
#define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */
--
2.24.1

2021-02-25 19:45:34

by Suzuki K Poulose

Subject: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
move the ETM to trace prohibited region using TRFCR, while disabling.
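
The save, prohibit, restore pattern used in the diff can be sketched in
user space as below. A plain variable stands in for SYS_TRFCR_EL1, the
function names are hypothetical, and the bit positions (E0TRE at bit 0,
ExTRE at bit 1) are my reading of the architecture rather than something
shown in this patch.

```c
#include <stdint.h>

/* Assumed bit positions of the trace enable controls in TRFCR_ELx. */
#define TRFCR_ELx_E0TRE (UINT64_C(1) << 0)
#define TRFCR_ELx_ExTRE (UINT64_C(1) << 1)

/* Stand-in for the real system register; illustration only. */
static uint64_t fake_trfcr;

/* Clear both trace enables, leaving every other field as it was. */
static uint64_t trfcr_prohibit(uint64_t trfcr)
{
	return trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE);
}

/*
 * Model of the disable path: remember the old value, enter the
 * prohibited region, run the (optional) disable callback, then put the
 * old value back. Returns the value that was in effect during disable.
 */
static uint64_t disable_with_prohibit(void (*disable_trace_unit)(void))
{
	uint64_t saved = fake_trfcr;
	uint64_t during;

	fake_trfcr = trfcr_prohibit(saved);
	during = fake_trfcr;
	if (disable_trace_unit)
		disable_trace_unit();
	fake_trfcr = saved;

	return during;
}
```

Starting from 0x63, the value in effect while disabling is 0x60 and the
register is back to 0x63 afterwards.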

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
New patch
---
.../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 15016f757828..00297906669c 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -31,6 +31,7 @@
#include <linux/pm_runtime.h>
#include <linux/property.h>

+#include <asm/barrier.h>
#include <asm/sections.h>
#include <asm/sysreg.h>
#include <asm/local.h>
@@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
static void etm4_disable_hw(void *info)
{
u32 control;
+ u64 trfcr;
struct etmv4_drvdata *drvdata = info;
struct etmv4_config *config = &drvdata->config;
struct coresight_device *csdev = drvdata->csdev;
@@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
/* EN, bit[0] Trace unit enable bit */
control &= ~0x1;

+ /*
+ * If the CPU supports v8.4 Trace filter Control,
+ * set the ETM to trace prohibited region.
+ */
+ if (drvdata->trfc) {
+ trfcr = read_sysreg_s(SYS_TRFCR_EL1);
+ write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
+ SYS_TRFCR_EL1);
+ isb();
+ }
/*
* Make sure everything completes before disabling, as recommended
* by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
@@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
*/
dsb(sy);
isb();
+ /* Trace synchronization barrier, is a nop if not supported */
+ tsb_csync();
etm4x_relaxed_write32(csa, control, TRCPRGCTLR);

/* wait for TRCSTATR.PMSTABLE to go to '1' */
if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
dev_err(etm_dev,
"timeout while waiting for PM stable Trace Status\n");
+ if (drvdata->trfc)
+ write_sysreg_s(trfcr, SYS_TRFCR_EL1);

/* read the status of the single shot comparators */
for (i = 0; i < drvdata->nr_ss_cmp; i++) {
@@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
return false;
}

-static void cpu_enable_tracing(void)
+static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
{
u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
u64 trfcr;
@@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
return;

+ drvdata->trfc = true;
/*
* If the CPU supports v8.4 SelfHosted Tracing, enable
* tracing at the kernel EL and EL0, forcing to use the
@@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
/* NUMCNTR, bits[30:28] number of counters available for tracing */
drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
etm4_cs_lock(drvdata, csa);
- cpu_enable_tracing();
+ cpu_enable_tracing(drvdata);
}

static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index 0af60571aa23..f6478ef642bf 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -862,6 +862,7 @@ struct etmv4_save_state {
* @nooverflow: Indicate if overflow prevention is supported.
* @atbtrig: If the implementation can support ATB triggers
* @lpoverride: If the implementation can support low-power state over.
+ * @trfc: If the implementation supports Arm v8.4 trace filter controls.
* @config: structure holding configuration parameters.
* @save_state: State to be preserved across power loss
* @state_needs_restore: True when there is context to restore after PM exit
@@ -912,6 +913,7 @@ struct etmv4_drvdata {
bool nooverflow;
bool atbtrig;
bool lpoverride;
+ bool trfc;
struct etmv4_config config;
struct etmv4_save_state *save_state;
bool state_needs_restore;
--
2.24.1

2021-02-25 19:45:51

by Suzuki K Poulose

Subject: [PATCH v4 13/19] coresight: ete: Add support for ETE sysreg access

Add support for handling the system registers for Embedded Trace
Extensions (ETE). ETE shares most of the registers with ETMv4 except
for some and also adds some new registers. Re-arrange the ETMv4x list
to share the common definitions and add the ETE sysreg support.
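
The list sharing relies on the usual X-macro trick: one list macro is
expanded with different CASE_<op> helpers. The sketch below is a cut-down,
user-space version with only four registers and simplified macro names, so
it is an illustration of the technique and not the driver's actual lists.

```c
#include <stdbool.h>
#include <stdint.h>

/* A few register offsets, for illustration. */
#define TRCPRGCTLR  0x004
#define TRCPROCSELR 0x008
#define TRCSTATR    0x00C
#define TRCRSR      0x028

/* Expands to a bare case label; the first argument is ignored. */
#define CASE_NOP(ignored, x) case (x):

/* Registers common to both, then the ones specific to each trace unit. */
#define COMMON_LIST(op, val)     \
	CASE_##op((val), TRCPRGCTLR) \
	CASE_##op((val), TRCSTATR)
#define ETM4X_ONLY_LIST(op, val) CASE_##op((val), TRCPROCSELR)
#define ETE_ONLY_LIST(op, val)   CASE_##op((val), TRCRSR)

/* True if the offset is in the (reduced) ETMv4 sysreg list. */
static bool etm4x_has_sysreg(uint32_t offset)
{
	switch (offset) {
	COMMON_LIST(NOP, 0)
	ETM4X_ONLY_LIST(NOP, 0)
		return true;
	default:
		return false;
	}
}

/* True if the offset is in the (reduced) ETE sysreg list. */
static bool ete_has_sysreg(uint32_t offset)
{
	switch (offset) {
	COMMON_LIST(NOP, 0)
	ETE_ONLY_LIST(NOP, 0)
		return true;
	default:
		return false;
	}
}
```

Swapping NOP for READ or WRITE helpers gives the accessor switch
statements from the same lists, which is what keeps the ETMv4 and ETE
variants from drifting apart.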

Cc: Mike Leach <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Fix alignment and switch pr_warn_ratelimited()
- Move NOP case list macro here
---
.../coresight/coresight-etm4x-core.c | 32 +++++++++++
drivers/hwtracing/coresight/coresight-etm4x.h | 54 +++++++++++++++----
2 files changed, 77 insertions(+), 9 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 35802caca32a..e406b6e6843b 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -115,6 +115,38 @@ void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
}
}

+u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
+{
+ u64 res = 0;
+
+ switch (offset) {
+ ETE_READ_CASES(res)
+ default :
+ pr_warn_ratelimited("ete: trying to read unsupported register @%x\n",
+ offset);
+ }
+
+ if (!_relaxed)
+ __iormb(res); /* Imitate the !relaxed I/O helpers */
+
+ return res;
+}
+
+void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
+{
+ if (!_relaxed)
+ __iowmb(); /* Imitate the !relaxed I/O helpers */
+ if (!_64bit)
+ val &= GENMASK(31, 0);
+
+ switch (offset) {
+ ETE_WRITE_CASES(val)
+ default :
+ pr_warn_ratelimited("ete: trying to write to unsupported register @%x\n",
+ offset);
+ }
+}
+
static void etm_detect_os_lock(struct etmv4_drvdata *drvdata,
struct csdev_access *csa)
{
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index 5b961c5b78d1..157fb1ae7e64 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -29,6 +29,7 @@
#define TRCAUXCTLR 0x018
#define TRCEVENTCTL0R 0x020
#define TRCEVENTCTL1R 0x024
+#define TRCRSR 0x028
#define TRCSTALLCTLR 0x02C
#define TRCTSCTLR 0x030
#define TRCSYNCPR 0x034
@@ -49,6 +50,7 @@
#define TRCSEQRSTEVR 0x118
#define TRCSEQSTR 0x11C
#define TRCEXTINSELR 0x120
+#define TRCEXTINSELRn(n) (0x120 + (n * 4)) /* n = 0-3 */
#define TRCCNTRLDVRn(n) (0x140 + (n * 4)) /* n = 0-3 */
#define TRCCNTCTLRn(n) (0x150 + (n * 4)) /* n = 0-3 */
#define TRCCNTVRn(n) (0x160 + (n * 4)) /* n = 0-3 */
@@ -160,10 +162,22 @@
#define CASE_NOP(__unused, x) \
case (x): /* fall through */

+#define ETE_ONLY_SYSREG_LIST(op, val) \
+ CASE_##op((val), TRCRSR) \
+ CASE_##op((val), TRCEXTINSELRn(1)) \
+ CASE_##op((val), TRCEXTINSELRn(2)) \
+ CASE_##op((val), TRCEXTINSELRn(3))
+
/* List of registers accessible via System instructions */
-#define ETM_SYSREG_LIST(op, val) \
- CASE_##op((val), TRCPRGCTLR) \
+#define ETM4x_ONLY_SYSREG_LIST(op, val) \
CASE_##op((val), TRCPROCSELR) \
+ CASE_##op((val), TRCVDCTLR) \
+ CASE_##op((val), TRCVDSACCTLR) \
+ CASE_##op((val), TRCVDARCCTLR) \
+ CASE_##op((val), TRCOSLAR)
+
+#define ETM_COMMON_SYSREG_LIST(op, val) \
+ CASE_##op((val), TRCPRGCTLR) \
CASE_##op((val), TRCSTATR) \
CASE_##op((val), TRCCONFIGR) \
CASE_##op((val), TRCAUXCTLR) \
@@ -180,9 +194,6 @@
CASE_##op((val), TRCVIIECTLR) \
CASE_##op((val), TRCVISSCTLR) \
CASE_##op((val), TRCVIPCSSCTLR) \
- CASE_##op((val), TRCVDCTLR) \
- CASE_##op((val), TRCVDSACCTLR) \
- CASE_##op((val), TRCVDARCCTLR) \
CASE_##op((val), TRCSEQEVRn(0)) \
CASE_##op((val), TRCSEQEVRn(1)) \
CASE_##op((val), TRCSEQEVRn(2)) \
@@ -277,7 +288,6 @@
CASE_##op((val), TRCSSPCICRn(5)) \
CASE_##op((val), TRCSSPCICRn(6)) \
CASE_##op((val), TRCSSPCICRn(7)) \
- CASE_##op((val), TRCOSLAR) \
CASE_##op((val), TRCOSLSR) \
CASE_##op((val), TRCACVRn(0)) \
CASE_##op((val), TRCACVRn(1)) \
@@ -369,12 +379,38 @@
CASE_##op((val), TRCPIDR2) \
CASE_##op((val), TRCPIDR3)

-#define ETM4x_READ_SYSREG_CASES(res) ETM_SYSREG_LIST(READ, (res))
-#define ETM4x_WRITE_SYSREG_CASES(val) ETM_SYSREG_LIST(WRITE, (val))
+#define ETM4x_READ_SYSREG_CASES(res) \
+ ETM_COMMON_SYSREG_LIST(READ, (res)) \
+ ETM4x_ONLY_SYSREG_LIST(READ, (res))
+
+#define ETM4x_WRITE_SYSREG_CASES(val) \
+ ETM_COMMON_SYSREG_LIST(WRITE, (val)) \
+ ETM4x_ONLY_SYSREG_LIST(WRITE, (val))
+
+#define ETM_COMMON_SYSREG_LIST_CASES \
+ ETM_COMMON_SYSREG_LIST(NOP, __unused)
+
+#define ETM4x_ONLY_SYSREG_LIST_CASES \
+ ETM4x_ONLY_SYSREG_LIST(NOP, __unused)
+
+#define ETM4x_SYSREG_LIST_CASES \
+ ETM_COMMON_SYSREG_LIST_CASES \
+ ETM4x_ONLY_SYSREG_LIST(NOP, __unused)

-#define ETM4x_SYSREG_LIST_CASES ETM_SYSREG_LIST(NOP, __unused)
#define ETM4x_MMAP_LIST_CASES ETM_MMAP_LIST(NOP, __unused)

+/* ETE only supports system register access */
+#define ETE_READ_CASES(res) \
+ ETM_COMMON_SYSREG_LIST(READ, (res)) \
+ ETE_ONLY_SYSREG_LIST(READ, (res))
+
+#define ETE_WRITE_CASES(val) \
+ ETM_COMMON_SYSREG_LIST(WRITE, (val)) \
+ ETE_ONLY_SYSREG_LIST(WRITE, (val))
+
+#define ETE_ONLY_SYSREG_LIST_CASES \
+ ETE_ONLY_SYSREG_LIST(NOP, __unused)
+
#define read_etm4x_sysreg_offset(offset, _64bit) \
({ \
u64 __val; \
--
2.24.1

2021-02-25 19:46:26

by Suzuki K Poulose

Subject: [PATCH v4 12/19] coresight: etm4x: Add support for PE OS lock

ETE may not implement the OS lock and instead could rely on
the PE OS Lock for the trace unit access. This is indicated
by TRCOSLSR.OSLM == 0b100. Add support for handling the
PE OS lock.
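
Since OSLM is split across TRCOSLSR[4:3] and TRCOSLSR[0], decoding it
takes a shift and a merge. The sketch below mirrors the ETM_OSLSR_OSLM()
macro from the diff in plain C; the function names are hypothetical.

```c
#include <stdint.h>

/* OS Lock models, as listed in the header change of this patch. */
#define ETM_OSLOCK_NI      0x0 /* not implemented */
#define ETM_OSLOCK_PRESENT 0x2 /* trace OS Lock implemented */
#define ETM_OSLOCK_PE      0x4 /* controlled by the PE OS Lock */

/* OSLM[2:1] come from TRCOSLSR[4:3], OSLM[0] from TRCOSLSR[0]. */
static unsigned int oslsr_to_oslm(uint32_t oslsr)
{
	return ((oslsr & 0x18u) >> 2) | (oslsr & 0x1u);
}

/* Human readable name for a decoded model, for logging. */
static const char *oslock_model_name(unsigned int oslm)
{
	switch (oslm) {
	case ETM_OSLOCK_NI:
		return "not implemented";
	case ETM_OSLOCK_PRESENT:
		return "trace OS lock";
	case ETM_OSLOCK_PE:
		return "PE OS lock";
	default:
		return "unsupported";
	}
}
```

A TRCOSLSR value of 0x10 decodes to 0x4 (PE OS lock) and 0x08 decodes to
0x2 (trace OS lock); other bits of the register, such as bit 1, do not
affect the result.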

Cc: Mike Leach <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
.../coresight/coresight-etm4x-core.c | 50 +++++++++++++++----
drivers/hwtracing/coresight/coresight-etm4x.h | 15 ++++++
2 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 00297906669c..35802caca32a 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -115,30 +115,59 @@ void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
}
}

-static void etm4_os_unlock_csa(struct etmv4_drvdata *drvdata, struct csdev_access *csa)
+static void etm_detect_os_lock(struct etmv4_drvdata *drvdata,
+ struct csdev_access *csa)
{
- /* Writing 0 to TRCOSLAR unlocks the trace registers */
- etm4x_relaxed_write32(csa, 0x0, TRCOSLAR);
- drvdata->os_unlock = true;
+ u32 oslsr = etm4x_relaxed_read32(csa, TRCOSLSR);
+
+ drvdata->os_lock_model = ETM_OSLSR_OSLM(oslsr);
+}
+
+static void etm_write_os_lock(struct etmv4_drvdata *drvdata,
+ struct csdev_access *csa, u32 val)
+{
+ val = !!val;
+
+ switch (drvdata->os_lock_model) {
+ case ETM_OSLOCK_PRESENT:
+ etm4x_relaxed_write32(csa, val, TRCOSLAR);
+ break;
+ case ETM_OSLOCK_PE:
+ write_sysreg_s(val, SYS_OSLAR_EL1);
+ break;
+ default:
+ pr_warn_once("CPU%d: Unsupported Trace OSLock model: %x\n",
+ smp_processor_id(), drvdata->os_lock_model);
+ fallthrough;
+ case ETM_OSLOCK_NI:
+ return;
+ }
isb();
}

+static inline void etm4_os_unlock_csa(struct etmv4_drvdata *drvdata,
+ struct csdev_access *csa)
+{
+ WARN_ON(drvdata->cpu != smp_processor_id());
+
+ /* Writing 0 to OS Lock unlocks the trace unit registers */
+ etm_write_os_lock(drvdata, csa, 0x0);
+ drvdata->os_unlock = true;
+}
+
static void etm4_os_unlock(struct etmv4_drvdata *drvdata)
{
if (!WARN_ON(!drvdata->csdev))
etm4_os_unlock_csa(drvdata, &drvdata->csdev->access);
-
}

static void etm4_os_lock(struct etmv4_drvdata *drvdata)
{
if (WARN_ON(!drvdata->csdev))
return;
-
- /* Writing 0x1 to TRCOSLAR locks the trace registers */
- etm4x_relaxed_write32(&drvdata->csdev->access, 0x1, TRCOSLAR);
+ /* Writing 0x1 to OS Lock locks the trace registers */
+ etm_write_os_lock(drvdata, &drvdata->csdev->access, 0x1);
drvdata->os_unlock = false;
- isb();
}

static void etm4_cs_lock(struct etmv4_drvdata *drvdata,
@@ -937,6 +966,9 @@ static void etm4_init_arch_data(void *info)
if (!etm4_init_csdev_access(drvdata, csa))
return;

+ /* Detect the support for OS Lock before we actually use it */
+ etm_detect_os_lock(drvdata, csa);
+
/* Make sure all registers are accessible */
etm4_os_unlock_csa(drvdata, csa);
etm4_cs_unlock(drvdata, csa);
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index f6478ef642bf..5b961c5b78d1 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -505,6 +505,20 @@
ETM_MODE_EXCL_KERN | \
ETM_MODE_EXCL_USER)

+/*
+ * TRCOSLSR.OSLM advertises the OS Lock model.
+ * OSLM[2:0] = TRCOSLSR[4:3,0]
+ *
+ * 0b000 - Trace OS Lock is not implemented.
+ * 0b010 - Trace OS Lock is implemented.
+ * 0b100 - Trace OS Lock is not implemented, unit is controlled by PE OS Lock.
+ */
+#define ETM_OSLOCK_NI 0b000
+#define ETM_OSLOCK_PRESENT 0b010
+#define ETM_OSLOCK_PE 0b100
+
+#define ETM_OSLSR_OSLM(oslsr) ((((oslsr) & GENMASK(4, 3)) >> 2) | (oslsr & 0x1))
+
/*
* TRCDEVARCH Bit field definitions
* Bits[31:21] - ARCHITECT = Always Arm Ltd.
@@ -898,6 +912,7 @@ struct etmv4_drvdata {
u8 s_ex_level;
u8 ns_ex_level;
u8 q_support;
+ u8 os_lock_model;
bool sticky_enable;
bool boot_enable;
bool os_unlock;
--
2.24.1
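
For reference, a user-space sketch of how the OSLM field is gathered from
TRCOSLSR, following the comment added to coresight-etm4x.h above
(OSLM[2:0] = TRCOSLSR[4:3,0]). This is not part of the patch:
oslsr_to_oslm() is a hypothetical helper name used only for illustration,
and the constants are spelled in hex to stay portable.

```c
#include <stdint.h>

/* OS Lock models, mirroring the values this patch adds to coresight-etm4x.h. */
#define OSLOCK_NI      0x0u  /* 0b000: Trace OS Lock not implemented */
#define OSLOCK_PRESENT 0x2u  /* 0b010: Trace OS Lock implemented */
#define OSLOCK_PE      0x4u  /* 0b100: unit controlled by the PE OS Lock */

/*
 * Hypothetical helper (illustration only): move TRCOSLSR bits [4:3]
 * into OSLM[2:1] and keep TRCOSLSR bit [0] as OSLM[0]. Bit [1] of
 * TRCOSLSR is not part of OSLM and is dropped.
 */
static inline uint32_t oslsr_to_oslm(uint32_t oslsr)
{
	return ((oslsr & 0x18u) >> 2) | (oslsr & 0x1u);
}
```

With this layout a TRCOSLSR value with only bit 3 set decodes to the
"implemented" model, and one with only bit 4 set decodes to the PE OS Lock
model.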

2021-02-25 19:47:36

by Suzuki K Poulose

Subject: [PATCH v4 06/19] arm64: Add support for trace synchronization barrier

tsb csync synchronizes the trace operation of instructions.
The instruction is a nop when FEAT_TRF is not implemented.
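
As an illustration of why the instruction can be emitted as "hint #18", here
is a small user-space sketch that computes the A64 encoding of a HINT
instruction. It assumes the usual A64 HINT layout (imm7 in bits [11:5] on
top of the NOP opcode); a64_hint_encoding() is a hypothetical helper, not
something this patch adds.

```c
#include <stdint.h>

/* A64 "HINT #0" (NOP) opcode; HINT #imm7 places imm7 at bits [11:5]. */
#define A64_HINT_BASE 0xd503201fu

/* Hypothetical helper (illustration only): encode HINT #imm7. */
static inline uint32_t a64_hint_encoding(uint32_t imm7)
{
	return A64_HINT_BASE | ((imm7 & 0x7fu) << 5);
}
```

Under that assumption psb csync (hint #17), tsb csync (hint #18) and csdb
(hint #20) all sit in the hint space, which is consistent with older CPUs
treating them as nops.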

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
New patch, split from the TRBE driver for ease of merging
---
arch/arm64/include/asm/barrier.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index c3009b0e5239..5a8367a2b868 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -23,6 +23,7 @@
#define dsb(opt) asm volatile("dsb " #opt : : : "memory")

#define psb_csync() asm volatile("hint #17" : : : "memory")
+#define tsb_csync() asm volatile("hint #18" : : : "memory")
#define csdb() asm volatile("hint #20" : : : "memory")

#define spec_bar() asm volatile(ALTERNATIVE("dsb nsh\nisb\n", \
--
2.24.1

2021-02-25 19:47:47

by Suzuki K Poulose

Subject: [PATCH v4 05/19] kvm: arm64: Disable guest access to trace filter controls

Disable guest access to the Trace Filter control registers.
We do not advertise the Trace filter feature to the guest
(ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest
can still access the TRFCR_EL1 unless we trap it.

This will also make sure that the guest cannot fiddle with
the filtering controls set by a nvhe host.
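
A user-space sketch of the resulting guest MDCR_EL2 value, for illustration
only. MDCR_EL2_TTRF and MDCR_EL2_TPMS match the definitions visible in this
patch; the remaining bit positions are my recollection of kvm_arm.h and
should be treated as assumptions. guest_mdcr_el2() is a hypothetical name.

```c
#include <stdint.h>

#define MDCR_EL2_TTRF      (UINT64_C(1) << 19) /* from this patch */
#define MDCR_EL2_TPMS      (UINT64_C(1) << 14) /* from kvm_arm.h context */
#define MDCR_EL2_TDRA      (UINT64_C(1) << 11) /* assumed bit position */
#define MDCR_EL2_TDOSA     (UINT64_C(1) << 10) /* assumed bit position */
#define MDCR_EL2_TPM       (UINT64_C(1) << 6)  /* assumed bit position */
#define MDCR_EL2_TPMCR     (UINT64_C(1) << 5)  /* assumed bit position */
#define MDCR_EL2_HPMN_MASK UINT64_C(0x1f)      /* assumed field mask */

/*
 * Hypothetical helper (illustration only): keep HPMN from the host
 * value and set the trap bits, now including TTRF so that guest
 * accesses to TRFCR_EL1 trap to EL2.
 */
static inline uint64_t guest_mdcr_el2(uint64_t host_mdcr_el2)
{
	uint64_t mdcr = host_mdcr_el2 & MDCR_EL2_HPMN_MASK;

	mdcr |= MDCR_EL2_TPM | MDCR_EL2_TPMS | MDCR_EL2_TTRF |
		MDCR_EL2_TPMCR | MDCR_EL2_TDRA | MDCR_EL2_TDOSA;
	return mdcr;
}
```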

Cc: Marc Zyngier <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
New patch
---
arch/arm64/include/asm/kvm_arm.h | 1 +
arch/arm64/kvm/debug.c | 2 ++
2 files changed, 3 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 4e90c2debf70..94d4025acc0b 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -278,6 +278,7 @@
#define CPTR_EL2_DEFAULT CPTR_EL2_RES1

/* Hyp Debug Configuration Register bits */
+#define MDCR_EL2_TTRF (1 << 19)
#define MDCR_EL2_TPMS (1 << 14)
#define MDCR_EL2_E2PB_MASK (UL(0x3))
#define MDCR_EL2_E2PB_SHIFT (UL(12))
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 7a7e425616b5..dbc890511631 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -89,6 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
* - Debug ROM Address (MDCR_EL2_TDRA)
* - OS related registers (MDCR_EL2_TDOSA)
* - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
+ * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
*
* Additionally, KVM only traps guest accesses to the debug registers if
* the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -112,6 +113,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
MDCR_EL2_TPMS |
+ MDCR_EL2_TTRF |
MDCR_EL2_TPMCR |
MDCR_EL2_TDRA |
MDCR_EL2_TDOSA);
--
2.24.1

2021-02-25 19:47:49

by Suzuki K Poulose

Subject: [PATCH v4 11/19] coresight: Do not scan for graph if none is present

If a graph node is not found for a given node, of_graph_get_next_endpoint()
will emit the following error message:

OF: graph: no port node found in /<node_name>

If the given component doesn't have any explicit connections (e.g.,
ETE) we can simply skip the graph parsing. For any legacy component
where the graph is mandatory, the device remains unusable, as it was
before this patch. Updating the DT bindings to YAML and enabling
the schema checks can detect such issues with the DT.

Cc: Mike Leach <[email protected]>
Cc: Leo Yan <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
drivers/hwtracing/coresight/coresight-platform.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-platform.c b/drivers/hwtracing/coresight/coresight-platform.c
index 3629b7885aca..c594f45319fc 100644
--- a/drivers/hwtracing/coresight/coresight-platform.c
+++ b/drivers/hwtracing/coresight/coresight-platform.c
@@ -90,6 +90,12 @@ static void of_coresight_get_ports_legacy(const struct device_node *node,
struct of_endpoint endpoint;
int in = 0, out = 0;

+ /*
+ * Avoid warnings in of_graph_get_next_endpoint()
+ * if the device doesn't have any graph connections
+ */
+ if (!of_graph_is_present(node))
+ return;
do {
ep = of_graph_get_next_endpoint(node, ep);
if (!ep)
--
2.24.1

2021-02-25 19:47:54

by Suzuki K Poulose

Subject: [PATCH v4 14/19] coresight: ete: Add support for ETE tracing

Add ETE as one of the device types supported by the ETM4x
driver. The devices are named following the existing
convention, as ete<N>.

ETE mandates that the trace resource status register is programmed
before the tracing is turned on. For the moment simply write to
it indicating TraceActive.
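
To illustrate the naming and version adjustment, a user-space sketch of how
a device would be described. It assumes the arch value packs the major
version in bits [7:4] and the minor in bits [3:0], as the existing
ETM_ARCH_* macros appear to do; describe_trace_unit() and the ARCH_* names
are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: major in bits [7:4], minor in bits [3:0]. */
#define ARCH_VERSION(major, minor) \
	((((major) & 0xfu) << 4) | ((minor) & 0xfu))
#define ARCH_MAJOR(arch)	(((arch) >> 4) & 0xfu)
#define ARCH_MINOR(arch)	((arch) & 0xfu)
#define ARCH_ETE		ARCH_VERSION(5, 0)

static inline int is_ete(uint8_t arch)
{
	return arch >= ARCH_ETE;
}

/*
 * Hypothetical helper (illustration only): format "<type><cpu> v<maj>.<min>".
 * ETE v1 reports major version 5, so subtract 4 before printing.
 */
static inline int describe_trace_unit(char *buf, size_t len,
				      uint8_t arch, int cpu)
{
	unsigned int major = ARCH_MAJOR(arch);
	unsigned int minor = ARCH_MINOR(arch);
	const char *type = "etm";

	if (is_ete(arch)) {
		type = "ete";
		major -= 4;
	}
	return snprintf(buf, len, "%s%d v%u.%u", type, cpu, major, minor);
}
```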

Cc: Mike Leach <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Addressed style related comments
- Moved sysreg list macro definition to the previous patch
---
drivers/hwtracing/coresight/Kconfig | 10 ++--
.../coresight/coresight-etm4x-core.c | 58 ++++++++++++++-----
.../coresight/coresight-etm4x-sysfs.c | 19 +++++-
drivers/hwtracing/coresight/coresight-etm4x.h | 12 ++++
4 files changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index 7b44ba22cbe1..f154ae7e705d 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -97,15 +97,15 @@ config CORESIGHT_SOURCE_ETM3X
module will be called coresight-etm3x.

config CORESIGHT_SOURCE_ETM4X
- tristate "CoreSight Embedded Trace Macrocell 4.x driver"
+ tristate "CoreSight ETMv4.x / ETE driver"
depends on ARM64
select CORESIGHT_LINKS_AND_SINKS
select PID_IN_CONTEXTIDR
help
- This driver provides support for the ETM4.x tracer module, tracing the
- instructions that a processor is executing. This is primarily useful
- for instruction level tracing. Depending on the implemented version
- data tracing may also be available.
+ This driver provides support for the CoreSight Embedded Trace Macrocell
+ version 4.x and the Embedded Trace Extensions (ETE). Both are CPU tracer
+ modules, tracing the instructions that a processor is executing. This is
+ primarily useful for instruction level tracing.

To compile this driver as a module, choose M here: the
module will be called coresight-etm4x.
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index e406b6e6843b..2cf048e82e0f 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -433,6 +433,13 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR);
}

+ /*
+ * ETE mandates that the TRCRSR is written to before
+ * enabling it.
+ */
+ if (etm4x_is_ete(drvdata))
+ etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
+
/* Enable the trace unit */
etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);

@@ -894,13 +901,24 @@ static bool etm4_init_sysreg_access(struct etmv4_drvdata *drvdata,
* ETMs implementing sysreg access must implement TRCDEVARCH.
*/
devarch = read_etm4x_sysreg_const_offset(TRCDEVARCH);
- if ((devarch & ETM_DEVARCH_ID_MASK) != ETM_DEVARCH_ETMv4x_ARCH)
+ switch (devarch & ETM_DEVARCH_ID_MASK) {
+ case ETM_DEVARCH_ETMv4x_ARCH:
+ *csa = (struct csdev_access) {
+ .io_mem = false,
+ .read = etm4x_sysreg_read,
+ .write = etm4x_sysreg_write,
+ };
+ break;
+ case ETM_DEVARCH_ETE_ARCH:
+ *csa = (struct csdev_access) {
+ .io_mem = false,
+ .read = ete_sysreg_read,
+ .write = ete_sysreg_write,
+ };
+ break;
+ default:
return false;
- *csa = (struct csdev_access) {
- .io_mem = false,
- .read = etm4x_sysreg_read,
- .write = etm4x_sysreg_write,
- };
+ }

drvdata->arch = etm_devarch_to_arch(devarch);
return true;
@@ -1841,6 +1859,8 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
struct etmv4_drvdata *drvdata;
struct coresight_desc desc = { 0 };
struct etm4_init_arg init_arg = { 0 };
+ u8 major, minor;
+ char *type_name;

drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
if (!drvdata)
@@ -1867,10 +1887,6 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
if (drvdata->cpu < 0)
return drvdata->cpu;

- desc.name = devm_kasprintf(dev, GFP_KERNEL, "etm%d", drvdata->cpu);
- if (!desc.name)
- return -ENOMEM;
-
init_arg.drvdata = drvdata;
init_arg.csa = &desc.access;
init_arg.pid = etm_pid;
@@ -1887,6 +1903,22 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
fwnode_property_present(dev_fwnode(dev), "qcom,skip-power-up"))
drvdata->skip_power_up = true;

+ major = ETM_ARCH_MAJOR_VERSION(drvdata->arch);
+ minor = ETM_ARCH_MINOR_VERSION(drvdata->arch);
+
+ if (etm4x_is_ete(drvdata)) {
+ type_name = "ete";
+ /* ETE v1 has major version == 0b101. Adjust this for logging.*/
+ major -= 4;
+ } else {
+ type_name = "etm";
+ }
+
+ desc.name = devm_kasprintf(dev, GFP_KERNEL,
+ "%s%d", type_name, drvdata->cpu);
+ if (!desc.name)
+ return -ENOMEM;
+
etm4_init_trace_id(drvdata);
etm4_set_default(&drvdata->config);

@@ -1914,9 +1946,8 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)

etmdrvdata[drvdata->cpu] = drvdata;

- dev_info(&drvdata->csdev->dev, "CPU%d: ETM v%d.%d initialized\n",
- drvdata->cpu, ETM_ARCH_MAJOR_VERSION(drvdata->arch),
- ETM_ARCH_MINOR_VERSION(drvdata->arch));
+ dev_info(&drvdata->csdev->dev, "CPU%d: %s v%d.%d initialized\n",
+ drvdata->cpu, type_name, major, minor);

if (boot_enable) {
coresight_enable(drvdata->csdev);
@@ -2059,6 +2090,7 @@ static struct amba_driver etm4x_amba_driver = {

static const struct of_device_id etm4_sysreg_match[] = {
{ .compatible = "arm,coresight-etm4x-sysreg" },
+ { .compatible = "arm,embedded-trace-extension" },
{}
};

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
index 0995a10790f4..007bad9e7ad8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
@@ -2374,12 +2374,20 @@ static inline bool
etm4x_register_implemented(struct etmv4_drvdata *drvdata, u32 offset)
{
switch (offset) {
- ETM4x_SYSREG_LIST_CASES
+ ETM_COMMON_SYSREG_LIST_CASES
/*
- * Registers accessible via system instructions are always
- * implemented.
+ * Common registers to ETE & ETM4x accessible via system
+ * instructions are always implemented.
*/
return true;
+
+ ETM4x_ONLY_SYSREG_LIST_CASES
+ /*
+ * We only support etm4x and ete. So if the device is not
+ * ETE, it must be ETMv4x.
+ */
+ return !etm4x_is_ete(drvdata);
+
ETM4x_MMAP_LIST_CASES
/*
* Registers accessible only via memory-mapped registers
@@ -2389,8 +2397,13 @@ etm4x_register_implemented(struct etmv4_drvdata *drvdata, u32 offset)
* coresight_register() and the csdev is not initialized
* until that is done. So rely on the drvdata->base to
* detect if we have a memory mapped access.
+ * Also ETE doesn't implement memory mapped access, thus
+ * it is sufficient to check that we are using mmio.
*/
return !!drvdata->base;
+
+ ETE_ONLY_SYSREG_LIST_CASES
+ return etm4x_is_ete(drvdata);
}

return false;
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index 157fb1ae7e64..e5b79bdb9851 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -128,6 +128,8 @@
#define TRCCIDR2 0xFF8
#define TRCCIDR3 0xFFC

+#define TRCRSR_TA BIT(12)
+
/*
* System instructions to access ETM registers.
* See ETMv4.4 spec ARM IHI0064F section 4.3.6 System instructions
@@ -591,11 +593,14 @@
((ETM_DEVARCH_MAKE_ARCHID_ARCH_VER(major)) | ETM_DEVARCH_ARCHID_ARCH_PART(0xA13))

#define ETM_DEVARCH_ARCHID_ETMv4x ETM_DEVARCH_MAKE_ARCHID(0x4)
+#define ETM_DEVARCH_ARCHID_ETE ETM_DEVARCH_MAKE_ARCHID(0x5)

#define ETM_DEVARCH_ID_MASK \
(ETM_DEVARCH_ARCHITECT_MASK | ETM_DEVARCH_ARCHID_MASK | ETM_DEVARCH_PRESENT)
#define ETM_DEVARCH_ETMv4x_ARCH \
(ETM_DEVARCH_ARCHITECT_ARM | ETM_DEVARCH_ARCHID_ETMv4x | ETM_DEVARCH_PRESENT)
+#define ETM_DEVARCH_ETE_ARCH \
+ (ETM_DEVARCH_ARCHITECT_ARM | ETM_DEVARCH_ARCHID_ETE | ETM_DEVARCH_PRESENT)

#define TRCSTATR_IDLE_BIT 0
#define TRCSTATR_PMSTABLE_BIT 1
@@ -685,6 +690,8 @@
#define ETM_ARCH_MINOR_VERSION(arch) ((arch) & 0xfU)

#define ETM_ARCH_V4 ETM_ARCH_VERSION(4, 0)
+#define ETM_ARCH_ETE ETM_ARCH_VERSION(5, 0)
+
/* Interpretation of resource numbers change at ETM v4.3 architecture */
#define ETM_ARCH_V4_3 ETM_ARCH_VERSION(4, 3)

@@ -993,4 +1000,9 @@ void etm4_config_trace_mode(struct etmv4_config *config);

u64 etm4x_sysreg_read(u32 offset, bool _relaxed, bool _64bit);
void etm4x_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit);
+
+static inline bool etm4x_is_ete(struct etmv4_drvdata *drvdata)
+{
+ return drvdata->arch >= ETM_ARCH_ETE;
+}
#endif
--
2.24.1

2021-02-25 19:47:55

by Suzuki K Poulose

Subject: [PATCH v4 10/19] coresight: etm-perf: Allow an event to use different sinks

When a sink is not specified by the user, the etm perf driver
finds a suitable sink automatically, based on the first ETM
where this event could be scheduled. Then we allocate the
sink buffer based on the selected sink. This is fine for a
CPU bound event as the "sink" is always guaranteed to be
reachable from the ETM (as this is the only ETM where the
event is going to be scheduled). However, if we have a thread
bound event, the event could be scheduled on any of the ETMs
on the system. In this case, currently we automatically select
a sink and exclude any ETMs that cannot reach the selected
sink. This is problematic especially for 1x1 configurations.
We end up tracing the event only on the "first" ETM,
as the default sink is local to the first ETM and unreachable
from the rest. However, we could allow the other ETMs to
trace if they all have a sink that is compatible with the
"selected" sink and can use the sink buffer. This can be
easily done by verifying that they are all driven by the
same driver and match the same subtype. Please note
that at any time there can be only one ETM tracing the event.

Adding support for different types of sinks for a single
event is complex and is not something that we expect
on a sane configuration.
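
A simplified user-space model of the selection described above, for
illustration only. The struct and select_cpus() are hypothetical stand-ins
(the real code works on struct coresight_device and a cpumask, and also
builds the path for each CPU); only the compatibility rule is taken from
this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a sink device: its subtype and the driver ops it uses. */
struct sink {
	int subtype;
	const void *ops;
};

/* Same rule as the patch: same subtype and same driver ops. */
static bool sinks_compatible(const struct sink *a, const struct sink *b)
{
	if (!a || !b)
		return false;
	return a->subtype == b->subtype && a->ops == b->ops;
}

/*
 * Hypothetical model: default_sink[cpu] is the default sink reachable
 * from that CPU's ETM (NULL if none). CPUs without a sink, or with a
 * sink incompatible with the previously accepted one, are dropped
 * from the mask. Returns the last accepted sink, which would be used
 * to allocate the buffer.
 */
static const struct sink *select_cpus(const struct sink *const *default_sink,
				      size_t ncpus, unsigned long *mask)
{
	const struct sink *last = NULL;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		const struct sink *s = default_sink[cpu];

		if (!(*mask & (1UL << cpu)))
			continue;
		if (!s || (last && !sinks_compatible(last, s))) {
			*mask &= ~(1UL << cpu);
			continue;
		}
		last = s;
	}
	return last;
}
```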

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Tested-by: Linu Cherian <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Rename sinks_match => sinks_compatible
- Tighten the check by matching the sink subtype
- Use user_sink instead of "sink_forced" and clean up the code (Mathieu)
- More comments, better commit description
---
.../hwtracing/coresight/coresight-etm-perf.c | 60 ++++++++++++++++---
1 file changed, 52 insertions(+), 8 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 0f603b4094f2..aa0974bd265b 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -232,6 +232,25 @@ static void etm_free_aux(void *data)
schedule_work(&event_data->work);
}

+/*
+ * Check if two given sinks are compatible with each other,
+ * so that they can use the same sink buffers, when an event
+ * moves around.
+ */
+static bool sinks_compatible(struct coresight_device *a,
+ struct coresight_device *b)
+{
+ if (!a || !b)
+ return false;
+ /*
+ * If the sinks are of the same subtype and driven
+ * by the same driver, we can use the same buffer
+ * on these sinks.
+ */
+ return (a->subtype.sink_subtype == b->subtype.sink_subtype) &&
+ (sink_ops(a) == sink_ops(b));
+}
+
static void *etm_setup_aux(struct perf_event *event, void **pages,
int nr_pages, bool overwrite)
{
@@ -239,6 +258,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
int cpu = event->cpu;
cpumask_t *mask;
struct coresight_device *sink = NULL;
+ struct coresight_device *user_sink = NULL, *last_sink = NULL;
struct etm_event_data *event_data = NULL;

event_data = alloc_event_data(cpu);
@@ -249,7 +269,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
/* First get the selected sink from user space. */
if (event->attr.config2) {
id = (u32)event->attr.config2;
- sink = coresight_get_sink_by_id(id);
+ sink = user_sink = coresight_get_sink_by_id(id);
}

mask = &event_data->mask;
@@ -277,14 +297,33 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
}

/*
- * No sink provided - look for a default sink for one of the
- * devices. At present we only support topology where all CPUs
- * use the same sink [N:1], so only need to find one sink. The
- * coresight_build_path later will remove any CPU that does not
- * attach to the sink, or if we have not found a sink.
+ * No sink provided - look for a default sink for all the ETMs,
+ * where this event can be scheduled.
+ * We allocate the sink specific buffers only once for this
+ * event. If the ETMs have different default sink devices, we
+ * can only use a single "type" of sink as the event can carry
+ * only one sink specific buffer. Thus we have to make sure
+ * that the sinks are of the same type and driven by the same
+ * driver, as the one we allocate the buffer for. As such
+ * we choose the first sink and check if the remaining ETMs
+ * have a compatible default sink. We don't trace on a CPU
+ * if the sink is not compatible.
*/
- if (!sink)
+ if (!user_sink) {
+ /* Find the default sink for this ETM */
sink = coresight_find_default_sink(csdev);
+ if (!sink) {
+ cpumask_clear_cpu(cpu, mask);
+ continue;
+ }
+
+ /* Check if this sink compatible with the last sink */
+ if (last_sink && !sinks_compatible(last_sink, sink)) {
+ cpumask_clear_cpu(cpu, mask);
+ continue;
+ }
+ last_sink = sink;
+ }

/*
* Building a path doesn't enable it, it simply builds a
@@ -312,7 +351,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
goto err;

- /* Allocate the sink buffer for this session */
+ /*
+ * Allocate the sink buffer for this session. All the sinks
+ * where this event can be scheduled are ensured to be of the
+ * same type. Thus the same sink configuration is used by the
+ * sinks.
+ */
event_data->snk_config =
sink_ops(sink)->alloc_buffer(sink, event, pages,
nr_pages, overwrite);
--
2.24.1

2021-02-25 19:48:41

by Suzuki K Poulose

Subject: [PATCH v4 07/19] arm64: Add TRBE definitions

From: Anshuman Khandual <[email protected]>

This adds TRBE related registers and corresponding feature macros.
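
Note that the *_MASK macros below are defined relative to bit 0, so a field
is read by shifting first and masking afterwards. A user-space sketch under
that reading; the helper names are hypothetical and the masks are written
out as plain constants instead of GENMASK().

```c
#include <stdint.h>

#define TRBSR_EC_MASK		UINT64_C(0x3f)		/* GENMASK(5, 0) */
#define TRBSR_EC_SHIFT		26
#define TRBLIMITR_LIMIT_MASK	((UINT64_C(1) << 52) - 1) /* GENMASK_ULL(51, 0) */
#define TRBLIMITR_LIMIT_SHIFT	12

/* Hypothetical helper: extract the exception class from a TRBSR value. */
static inline uint64_t trbsr_ec(uint64_t trbsr)
{
	return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
}

/* Hypothetical helper: recover the limit address from a TRBLIMITR value. */
static inline uint64_t trblimitr_limit_addr(uint64_t trblimitr)
{
	uint64_t limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) &
			 TRBLIMITR_LIMIT_MASK;

	return limit << TRBLIMITR_LIMIT_SHIFT;
}
```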

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
arch/arm64/include/asm/sysreg.h | 50 +++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index dfd4edbfe360..6470d783ea59 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -333,6 +333,55 @@

/*** End of Statistical Profiling Extension ***/

+/*
+ * TRBE Registers
+ */
+#define SYS_TRBLIMITR_EL1 sys_reg(3, 0, 9, 11, 0)
+#define SYS_TRBPTR_EL1 sys_reg(3, 0, 9, 11, 1)
+#define SYS_TRBBASER_EL1 sys_reg(3, 0, 9, 11, 2)
+#define SYS_TRBSR_EL1 sys_reg(3, 0, 9, 11, 3)
+#define SYS_TRBMAR_EL1 sys_reg(3, 0, 9, 11, 4)
+#define SYS_TRBTRG_EL1 sys_reg(3, 0, 9, 11, 6)
+#define SYS_TRBIDR_EL1 sys_reg(3, 0, 9, 11, 7)
+
+#define TRBLIMITR_LIMIT_MASK GENMASK_ULL(51, 0)
+#define TRBLIMITR_LIMIT_SHIFT 12
+#define TRBLIMITR_NVM BIT(5)
+#define TRBLIMITR_TRIG_MODE_MASK GENMASK(1, 0)
+#define TRBLIMITR_TRIG_MODE_SHIFT 3
+#define TRBLIMITR_FILL_MODE_MASK GENMASK(1, 0)
+#define TRBLIMITR_FILL_MODE_SHIFT 1
+#define TRBLIMITR_ENABLE BIT(0)
+#define TRBPTR_PTR_MASK GENMASK_ULL(63, 0)
+#define TRBPTR_PTR_SHIFT 0
+#define TRBBASER_BASE_MASK GENMASK_ULL(51, 0)
+#define TRBBASER_BASE_SHIFT 12
+#define TRBSR_EC_MASK GENMASK(5, 0)
+#define TRBSR_EC_SHIFT 26
+#define TRBSR_IRQ BIT(22)
+#define TRBSR_TRG BIT(21)
+#define TRBSR_WRAP BIT(20)
+#define TRBSR_ABORT BIT(18)
+#define TRBSR_STOP BIT(17)
+#define TRBSR_MSS_MASK GENMASK(15, 0)
+#define TRBSR_MSS_SHIFT 0
+#define TRBSR_BSC_MASK GENMASK(5, 0)
+#define TRBSR_BSC_SHIFT 0
+#define TRBSR_FSC_MASK GENMASK(5, 0)
+#define TRBSR_FSC_SHIFT 0
+#define TRBMAR_SHARE_MASK GENMASK(1, 0)
+#define TRBMAR_SHARE_SHIFT 8
+#define TRBMAR_OUTER_MASK GENMASK(3, 0)
+#define TRBMAR_OUTER_SHIFT 4
+#define TRBMAR_INNER_MASK GENMASK(3, 0)
+#define TRBMAR_INNER_SHIFT 0
+#define TRBTRG_TRG_MASK GENMASK(31, 0)
+#define TRBTRG_TRG_SHIFT 0
+#define TRBIDR_FLAG BIT(5)
+#define TRBIDR_PROG BIT(4)
+#define TRBIDR_ALIGN_MASK GENMASK(3, 0)
+#define TRBIDR_ALIGN_SHIFT 0
+
#define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1)
#define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)

@@ -835,6 +884,7 @@
#define ID_AA64MMFR2_CNP_SHIFT 0

/* id_aa64dfr0 */
+#define ID_AA64DFR0_TRBE_SHIFT 44
#define ID_AA64DFR0_TRACE_FILT_SHIFT 40
#define ID_AA64DFR0_DOUBLELOCK_SHIFT 36
#define ID_AA64DFR0_PMSVER_SHIFT 32
--
2.24.1

2021-02-25 19:49:11

by Suzuki K Poulose

Subject: [PATCH v4 08/19] arm64: kvm: Enable access to TRBE support for host

For a nvhe host, the EL2 must allow the EL1&0 translation
regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must
be saved/restored over a trip to the guest. Also, before
entering the guest, we must flush any trace data if the
TRBE was enabled. And we must prohibit the generation
of trace while we are in EL1 by clearing the TRFCR_EL1.

For vhe, the EL2 must prevent the EL1 access to the Trace
Buffer.
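
A user-space model of the two pieces described above, for illustration
only: the E2TB field handling and the TRFCR_EL1 save/restore around a guest
run. The E2TB mask and shift match this patch; struct cpu_model and the
helper names are hypothetical, and the barriers used by the real code
(isb(), tsb_csync(), dsb(nsh)) are only noted in comments.

```c
#include <stdint.h>

#define MDCR_EL2_E2TB_MASK	UINT64_C(0x3)
#define MDCR_EL2_E2TB_SHIFT	24

/* nVHE host: E2TB == 0b11, the EL1&0 regime owns the trace buffer. */
static inline uint64_t mdcr_allow_el1_trbe(uint64_t mdcr)
{
	return mdcr | (MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT);
}

/* VHE host or guest entry: E2TB == 0b00, no EL1 access to the buffer. */
static inline uint64_t mdcr_deny_el1_trbe(uint64_t mdcr)
{
	return mdcr & ~(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT);
}

/* Hypothetical stand-in for the CPU state touched by the save/restore. */
struct cpu_model {
	uint64_t trfcr_el1;
	int trbe_enabled;
};

/* Before entering the guest: remember the filter and prohibit trace. */
static inline void save_trace(struct cpu_model *cpu, uint64_t *saved)
{
	*saved = 0;
	if (!cpu->trbe_enabled)
		return;
	*saved = cpu->trfcr_el1;
	cpu->trfcr_el1 = 0;
	/* The real code follows with isb(), tsb_csync() and dsb(nsh). */
}

/* Back on the host: restore the filter only if something was saved. */
static inline void restore_trace(struct cpu_model *cpu, uint64_t saved)
{
	if (!saved)
		return;
	cpu->trfcr_el1 = saved;
}
```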

Cc: Will Deacon <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
cc: Anshuman Khandual <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes
- Rebased to linux-next.
- Re-enable TRBE access on host restore.
- For nvhe, flush the trace and prohibit the
trace while we are at guest.
---
arch/arm64/include/asm/el2_setup.h | 13 +++++++++
arch/arm64/include/asm/kvm_arm.h | 2 ++
arch/arm64/include/asm/kvm_host.h | 2 ++
arch/arm64/kernel/hyp-stub.S | 3 ++-
arch/arm64/kvm/debug.c | 6 ++---
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++
arch/arm64/kvm/hyp/nvhe/switch.c | 1 +
7 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index d77d358f9395..bda918948471 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -65,6 +65,19 @@
// use EL1&0 translation.

.Lskip_spe_\@:
+ /* Trace buffer */
+ ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
+ cbz x0, .Lskip_trace_\@ // Skip if TraceBuffer is not present
+
+ mrs_s x0, SYS_TRBIDR_EL1
+ and x0, x0, TRBIDR_PROG
+ cbnz x0, .Lskip_trace_\@ // If TRBE is available at EL2
+
+ mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
+ orr x2, x2, x0 // allow the EL1&0 translation
+ // to own it.
+
+.Lskip_trace_\@:
msr mdcr_el2, x2 // Configure debug traps
.endm

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 94d4025acc0b..692c9049befa 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -278,6 +278,8 @@
#define CPTR_EL2_DEFAULT CPTR_EL2_RES1

/* Hyp Debug Configuration Register bits */
+#define MDCR_EL2_E2TB_MASK (UL(0x3))
+#define MDCR_EL2_E2TB_SHIFT (UL(24))
#define MDCR_EL2_TTRF (1 << 19)
#define MDCR_EL2_TPMS (1 << 14)
#define MDCR_EL2_E2PB_MASK (UL(0x3))
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d10e6527f7d..80d0a1a82a4c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -315,6 +315,8 @@ struct kvm_vcpu_arch {
struct kvm_guest_debug_arch regs;
/* Statistical profiling extension */
u64 pmscr_el1;
+ /* Self-hosted trace */
+ u64 trfcr_el1;
} host_debug_state;

/* VGIC state */
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 678cd2c618ee..ba84502b582a 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
mrs_s x0, SYS_VBAR_EL12
msr vbar_el1, x0

- // Use EL2 translations for SPE and disable access from EL1
+ // Use EL2 translations for SPE & TRBE and disable access from EL1
mrs x0, mdcr_el2
bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
+ bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
msr mdcr_el2, x0

// Transfer the MM state from EL1 to EL2
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index dbc890511631..7b16f42d39f4 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
* - Debug ROM Address (MDCR_EL2_TDRA)
* - OS related registers (MDCR_EL2_TDOSA)
* - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
- * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
+ * - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
*
* Additionally, KVM only traps guest accesses to the debug registers if
* the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);

/*
- * This also clears MDCR_EL2_E2PB_MASK to disable guest access
- * to the profiling buffer.
+ * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
+ * to disable guest access to the profiling and trace buffers
*/
vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
index f401724f12ef..9499e18dd28f 100644
--- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1)
write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
}

+static void __debug_save_trace(u64 *trfcr_el1)
+{
+
+ *trfcr_el1 = 0;
+
+ /* Check if we have TRBE */
+ if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
+ ID_AA64DFR0_TRBE_SHIFT))
+ return;
+
+ /* Check we can access the TRBE */
+ if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
+ return;
+
+ /* Check if the TRBE is enabled */
+ if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
+ return;
+ /*
+ * Prohibit trace generation while we are in guest.
+ * Since access to TRFCR_EL1 is trapped, the guest can't
+ * modify the filtering set by the host.
+ */
+ *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
+ write_sysreg_s(0, SYS_TRFCR_EL1);
+ isb();
+ /* Drain the trace buffer to memory */
+ tsb_csync();
+ dsb(nsh);
+}
+
+static void __debug_restore_trace(u64 trfcr_el1)
+{
+ if (!trfcr_el1)
+ return;
+
+ /* Restore trace filter controls */
+ write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
+}
+
void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
/* Disable and flush SPE data generation */
__debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
+ /* Disable and flush Self-Hosted Trace generation */
+ __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1);
}

void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
@@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
{
__debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
+ __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1);
}

void __debug_switch_to_host(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 10eed66136a0..d6ea5c8b5551 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)

mdcr_el2 &= MDCR_EL2_HPMN_MASK;
mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
+ mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;

write_sysreg(mdcr_el2, mdcr_el2);
if (is_protected_kvm_enabled())
--
2.24.1

2021-02-25 19:50:49

by Suzuki K Poulose

Subject: [PATCH v4 16/19] coresight: etm-perf: Handle stale output handles

The context associated with an ETM for a given perf event
includes:
- handle -> the perf output handle for the AUX buffer.
- the path for the trace components
- the buffer config for the sink.

The path and the buffer config are part of the "aux_priv" data
(etm_event_data) setup by the setup_aux() callback, and made available
via perf_get_aux(handle).

Now with a sink supporting IRQ, the sink could "end" an output
handle when the buffer reaches the programmed limit and would try
to restart a handle. This could fail if there is not enough
space left in the AUX buffer (e.g, the userspace has not consumed
the data). This leaves the "handle" disconnected from the "event"
and also the "perf_get_aux()" cleared. This all happens within
the sink driver, without the etm_perf driver being aware.
Now when the event is actually stopped, etm_event_stop()
will need to access the "event_data". But since the handle
is not valid anymore, we lose the information to stop the
"trace" path. So, we need a reliable way to access the etm_event_data
even when the handle may not be active.

This patch replaces the per_cpu handle array with a per_cpu context
for the ETM, which tracks the "handle" as well as the "etm_event_data".
The context notes the etm_event_data at etm_event_start() and clears
it at etm_event_stop(). This makes sure that we don't access a
stale "etm_event_data" as we are guaranteed that it is not
freed by free_aux() as long as the event is active and tracing.
It also provides us with access to the critical information
needed to wind up a session even in the absence of an active
output_handle.

This is not an issue for the legacy sinks, as none of them supports
an IRQ and their handles are managed centrally by etm-perf.
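
The tracking described above can be modelled outside the kernel. The
following is a minimal userspace sketch, not the driver code: the types
event_data and output_handle and the helpers ctxt_start(),
sink_drop_handle() and ctxt_stop() are hypothetical stand-ins for
etm_event_data, perf_output_handle, etm_event_start(), the sink IRQ
path and etm_event_stop(). It only illustrates why keeping a second
reference to the event data lets the stop path work after the sink has
dropped the handle.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for etm_event_data (path + sink config). */
struct event_data {
	int path_id;
};

/* Hypothetical stand-in for perf_output_handle. */
struct output_handle {
	void *event;              /* non-NULL while bound to a perf event */
	struct event_data *aux;   /* what perf_get_aux() would hand back */
};

/* Per-CPU ETM context: the handle plus a private event_data reference. */
struct etm_ctxt {
	struct output_handle handle;
	struct event_data *event_data;
};

/* Models the start path: bind the handle and note the event_data. */
static bool ctxt_start(struct etm_ctxt *ctxt, void *event,
		       struct event_data *data)
{
	if (ctxt->event_data)     /* tracking is out of sync, refuse */
		return false;
	ctxt->handle.event = event;
	ctxt->handle.aux = data;
	ctxt->event_data = data;
	return true;
}

/* Models a sink that ended the handle from its IRQ and could not restart. */
static void sink_drop_handle(struct etm_ctxt *ctxt)
{
	ctxt->handle.event = NULL;
	ctxt->handle.aux = NULL;
}

/* Models the stop path: returns the event_data needed to stop the path. */
static struct event_data *ctxt_stop(struct etm_ctxt *ctxt)
{
	struct event_data *data;

	/* If the handle is still bound, it must agree with our tracking. */
	if (ctxt->handle.event && ctxt->handle.aux != ctxt->event_data)
		return NULL;

	data = ctxt->event_data;
	ctxt->event_data = NULL;  /* this ETM is stopping the trace */
	return data;
}
```

In this model, ctxt_stop() still returns the event data after
sink_drop_handle() has cleared the handle, which is the situation the
patch is meant to handle.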

Cc: Mathieu Poirier <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mike Leach <[email protected]>
Reviewed-by: Mike Leach <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes :
- Added WARN_ON() as suggested by Mathieu
---
.../hwtracing/coresight/coresight-etm-perf.c | 59 +++++++++++++++++--
1 file changed, 54 insertions(+), 5 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index aa0974bd265b..f123c26b9f54 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -24,7 +24,26 @@
static struct pmu etm_pmu;
static bool etm_perf_up;

-static DEFINE_PER_CPU(struct perf_output_handle, ctx_handle);
+/*
+ * An ETM context for a running event includes the perf aux handle
+ * and aux_data. For ETM, the aux_data (etm_event_data), consists of
+ * the trace path and the sink configuration. The event data is accessible
+ * via perf_get_aux(handle). However, a sink could "end" a perf output
+ * handle via the IRQ handler. And if the "sink" encounters a failure
+ * to "begin" another session (e.g due to lack of space in the buffer),
+ * the handle will be cleared. Thus, the event_data may not be accessible
+ * from the handle when we get to the etm_event_stop(), which is required
+ * for stopping the trace path. The event_data is guaranteed to stay alive
+ * until "free_aux()", which cannot happen as long as the event is active on
+ * the ETM. Thus the event_data for the session must be part of the ETM context
+ * to make sure we can disable the trace path.
+ */
+struct etm_ctxt {
+ struct perf_output_handle handle;
+ struct etm_event_data *event_data;
+};
+
+static DEFINE_PER_CPU(struct etm_ctxt, etm_ctxt);
static DEFINE_PER_CPU(struct coresight_device *, csdev_src);

/*
@@ -376,13 +395,18 @@ static void etm_event_start(struct perf_event *event, int flags)
{
int cpu = smp_processor_id();
struct etm_event_data *event_data;
- struct perf_output_handle *handle = this_cpu_ptr(&ctx_handle);
+ struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt);
+ struct perf_output_handle *handle = &ctxt->handle;
struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);
struct list_head *path;

if (!csdev)
goto fail;

+ /* Have we messed up our tracking ? */
+ if (WARN_ON(ctxt->event_data))
+ goto fail;
+
/*
* Deal with the ring buffer API and get a handle on the
* session's information.
@@ -418,6 +442,8 @@ static void etm_event_start(struct perf_event *event, int flags)
if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
goto fail_disable_path;

+ /* Save the event_data for this ETM */
+ ctxt->event_data = event_data;
out:
return;

@@ -436,13 +462,30 @@ static void etm_event_stop(struct perf_event *event, int mode)
int cpu = smp_processor_id();
unsigned long size;
struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);
- struct perf_output_handle *handle = this_cpu_ptr(&ctx_handle);
- struct etm_event_data *event_data = perf_get_aux(handle);
+ struct etm_ctxt *ctxt = this_cpu_ptr(&etm_ctxt);
+ struct perf_output_handle *handle = &ctxt->handle;
+ struct etm_event_data *event_data;
struct list_head *path;

+ /*
+ * If we still have access to the event_data via handle,
+ * confirm that we haven't messed up the tracking.
+ */
+ if (handle->event &&
+ WARN_ON(perf_get_aux(handle) != ctxt->event_data))
+ return;
+
+ event_data = ctxt->event_data;
+ /* Clear the event_data as this ETM is stopping the trace. */
+ ctxt->event_data = NULL;
+
if (event->hw.state == PERF_HES_STOPPED)
return;

+ /* We must have a valid event_data for a running event */
+ if (WARN_ON(!event_data))
+ return;
+
if (!csdev)
return;

@@ -460,7 +503,13 @@ static void etm_event_stop(struct perf_event *event, int mode)
/* tell the core */
event->hw.state = PERF_HES_STOPPED;

- if (mode & PERF_EF_UPDATE) {
+ /*
+ * If the handle is not bound to an event anymore
+ * (e.g, the sink driver was unable to restart the
+ * handle due to lack of buffer space), we don't
+ * have to do anything here.
+ */
+ if (handle->event && (mode & PERF_EF_UPDATE)) {
if (WARN_ON_ONCE(handle->event != event))
return;

--
2.24.1

2021-02-25 19:51:13

by Suzuki K Poulose

Subject: [PATCH v4 19/19] dts: bindings: Document device tree bindings for Arm TRBE

Document the device tree bindings for Trace Buffer Extension (TRBE).

Cc: Anshuman Khandual <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: [email protected]
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
.../devicetree/bindings/arm/trbe.yaml | 49 +++++++++++++++++++
1 file changed, 49 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/trbe.yaml

diff --git a/Documentation/devicetree/bindings/arm/trbe.yaml b/Documentation/devicetree/bindings/arm/trbe.yaml
new file mode 100644
index 000000000000..4402d7bfd1fc
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/trbe.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+# Copyright 2021, Arm Ltd
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/arm/trbe.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: ARM Trace Buffer Extensions
+
+maintainers:
+ - Anshuman Khandual <[email protected]>
+
+description: |
+ Arm Trace Buffer Extension (TRBE) is a per CPU component
+ for storing trace generated on the CPU to memory. It is
+ accessed via CPU system registers. The software can verify
+ if it is permitted to use the component by checking the
+ TRBIDR register.
+
+properties:
+ $nodename:
+ const: "trbe"
+ compatible:
+ items:
+ - const: arm,trace-buffer-extension
+
+ interrupts:
+ description: |
+ Exactly 1 PPI must be listed. For heterogeneous systems where
+ TRBE is only supported on a subset of the CPUs, please consult
+ the arm,gic-v3 binding for details on describing a PPI partition.
+ maxItems: 1
+
+required:
+ - compatible
+ - interrupts
+
+additionalProperties: false
+
+examples:
+
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ trbe {
+ compatible = "arm,trace-buffer-extension";
+ interrupts = <GIC_PPI 15 IRQ_TYPE_LEVEL_HIGH>;
+ };
+...
--
2.24.1

2021-02-25 19:51:46

by Suzuki K Poulose

Subject: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

Document the device tree bindings for Embedded Trace Extensions.
ETE can be connected to legacy coresight components and thus
could optionally contain a connection graph as described by
the CoreSight bindings.

Cc: [email protected]
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Rob Herring <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Fix out-ports definition
---
.../devicetree/bindings/arm/ete.yaml | 71 +++++++++++++++++++
1 file changed, 71 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml

diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml
new file mode 100644
index 000000000000..35a42d92bf97
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/ete.yaml
@@ -0,0 +1,71 @@
+# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
+# Copyright 2021, Arm Ltd
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/arm/ete.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: ARM Embedded Trace Extensions
+
+maintainers:
+ - Suzuki K Poulose <[email protected]>
+ - Mathieu Poirier <[email protected]>
+
+description: |
+ Arm Embedded Trace Extension (ETE) is a per CPU trace component that
+ allows tracing the CPU execution. It overlaps with the CoreSight ETMv4
+ architecture and has extended support for future architecture changes.
+ The trace generated by the ETE could be stored via legacy CoreSight
+ components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer
+ Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to
+ legacy CoreSight components, a node must be listed per instance, along
+ with any optional connection graph as per the coresight bindings.
+ See bindings/arm/coresight.txt.
+
+properties:
+ $nodename:
+ pattern: "^ete([0-9a-f]+)$"
+ compatible:
+ items:
+ - const: arm,embedded-trace-extension
+
+ cpu:
+ description: |
+ Handle to the cpu this ETE is bound to.
+ $ref: /schemas/types.yaml#/definitions/phandle
+
+ out-ports:
+ description: |
+ Output connections from the ETE to legacy CoreSight trace bus.
+ $ref: /schemas/graph.yaml#/properties/port
+
+required:
+ - compatible
+ - cpu
+
+additionalProperties: false
+
+examples:
+
+# An ETE node without legacy CoreSight connections
+ - |
+ ete0 {
+ compatible = "arm,embedded-trace-extension";
+ cpu = <&cpu_0>;
+ };
+# An ETE node with legacy CoreSight connections
+ - |
+ ete1 {
+ compatible = "arm,embedded-trace-extension";
+ cpu = <&cpu_1>;
+
+ out-ports { /* legacy coresight connection */
+ port {
+ ete1_out_port: endpoint {
+ remote-endpoint = <&funnel_in_port0>;
+ };
+ };
+ };
+ };
+
+...
--
2.24.1

2021-02-25 19:52:04

by Suzuki K Poulose

Subject: [PATCH v4 18/19] coresight: sink: Add TRBE driver

From: Anshuman Khandual <[email protected]>

Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
accessible via the system registers. The TRBE supports different addressing
modes including CPU virtual address and buffer modes including the circular
buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
access to the trace buffer could be prohibited by a higher exception level
(EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
private interrupt (PPI) on address translation errors and when the buffer
is full. The overall implementation here is inspired by the Arm SPE driver.
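
As a rough illustration of how the limit pointer relates to the perf
AUX buffer offsets, the sketch below models two of the steps the driver
performs in normal (non-snapshot) mode: padding the head up to the
write pointer alignment, and choosing a page aligned limit that does
not run into unconsumed data. It is a simplified userspace model under
stated assumptions: the names sketch_align_pad() and
sketch_limit_offset() and the 4K SKETCH_PAGE_SIZE are hypothetical, the
wakeup handling is left out, and the caller is assumed to have already
handled the "no space left" case (head == tail with size 0).

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size, illustration only */

/*
 * Number of padding bytes needed to move @head up to the next multiple
 * of @align (the implementation defined write pointer alignment).
 */
static uint64_t sketch_align_pad(uint64_t head, uint64_t align)
{
	uint64_t aligned = ((head + align - 1) / align) * align;

	return aligned - head;
}

/*
 * @head and @tail are byte offsets into a buffer of @bufsize bytes.
 * Returns the limit offset the TRBE could be programmed with, or 0 when
 * there is no usable room between head and the page aligned limit.
 */
static uint64_t sketch_limit_offset(uint64_t head, uint64_t tail,
				    uint64_t bufsize)
{
	/* head >= tail: write up to the end of the buffer, then wrap via IRQ */
	uint64_t limit = bufsize;

	/* head < tail: stop at the page boundary at or before the tail */
	if (head < tail)
		limit = tail & ~(SKETCH_PAGE_SIZE - 1);

	/* The limit must lie beyond head, otherwise nothing can be written */
	return limit > head ? limit : 0;
}
```

For a four page buffer, head = 100 and tail = 9000 give a limit of
8192, while head = 4200 and tail = 8000 give 0 because both offsets
fall within the same page.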

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
- Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
- Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
- Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
- Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
- Dropped the space after type casting before vmap()
- Return 0 instead of EINVAL in arm_trbe_update_buffer()
- Add a comment in trbe_handle_overflow()
- Add a comment in arm_trbe_cpu_startup()
- Unregister coresight TRBE device when not supported
- Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
- Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
- Replaced and modified trbe_drain_and_disable_local() in IRQ handler
- Updated arm_trbe_update_buffer() for handling a missing interrupt
- Dropped kfree() for all devm_xxx() allocated buffer
- Dropped additional blank line in documentation coresight/coresight-trbe.rst
- Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
- Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
- Added comment for irq_work_run()
- Updated comment for minimum buffer length in arm_trbe_alloc_buffer()
- Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
- Fixed indentation in arm_trbe_probe_cpuhp()
- Added static for arm_trbe_free_buffer()
- Added comment for trbe_base element in trbe_buf structure
- Dropped IS_ERR() check from vmap() returned pointer
- Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
- Changed TRBE device names from arm_trbeX to just trbeX
- Dropped unused argument perf_output_handle from trbe_get_fault_act()
- Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
- Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
- Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
- Changed error return codes from arm_trbe_probe_irq()
- Changed error return codes from arm_trbe_device_probe()
- Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
- Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
- Changed coresight_register() in arm_trbe_probe_coresight_cpu()
- Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
- Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
- Changed trbe_dbm as trbe_flag including its sysfs interface
- Handle race between update_buffer & IRQ handler
- Rework and split the TRBE probe to avoid lockdep due to memory allocation
from IPI calls (via coresight_register())
- Fix handle->head update for snapshot mode.
---
.../testing/sysfs-bus-coresight-devices-trbe | 14 +
.../trace/coresight/coresight-trbe.rst | 38 +
drivers/hwtracing/coresight/Kconfig | 14 +
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
6 files changed, 1369 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h

diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
new file mode 100644
index 000000000000..ad3bbc6fa751
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
@@ -0,0 +1,14 @@
+What: /sys/bus/coresight/devices/trbe<cpu>/align
+Date: March 2021
+KernelVersion: 5.13
+Contact: Anshuman Khandual <[email protected]>
+Description: (Read) Shows the TRBE write pointer alignment. This value
+ is fetched from the TRBIDR register.
+
+What: /sys/bus/coresight/devices/trbe<cpu>/flag
+Date: March 2021
+KernelVersion: 5.13
+Contact: Anshuman Khandual <[email protected]>
+Description: (Read) Shows if TRBE updates in the memory are with access
+ and dirty flag updates as well. This value is fetched from
+ the TRBIDR register.
diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst
new file mode 100644
index 000000000000..b9928ef148da
--- /dev/null
+++ b/Documentation/trace/coresight/coresight-trbe.rst
@@ -0,0 +1,38 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Trace Buffer Extension (TRBE).
+==============================
+
+ :Author: Anshuman Khandual <[email protected]>
+ :Date: November 2020
+
+Hardware Description
+--------------------
+
+Trace Buffer Extension (TRBE) is a percpu hardware which captures in system
+memory, CPU traces generated from a corresponding percpu tracing unit. This
+gets plugged in as a coresight sink device because the corresponding trace
+generators (ETE), are plugged in as source device.
+
+The TRBE is not compliant to CoreSight architecture specifications, but is
+driven via the CoreSight driver framework to support the ETE (which is
+CoreSight compliant) integration.
+
+Sysfs files and directories
+---------------------------
+
+The TRBE devices appear on the existing coresight bus alongside the other
+coresight devices::
+
+ >$ ls /sys/bus/coresight/devices
+ trbe0 trbe1 trbe2 trbe3
+
+The ``trbe<N>`` named TRBEs are associated with a CPU::
+
+ >$ ls /sys/bus/coresight/devices/trbe0/
+ align flag
+
+*Key file items are:-*
+ * ``align``: TRBE write pointer alignment
+ * ``flag``: TRBE updates memory with access and dirty flags
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index f154ae7e705d..84530fd80998 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS
CTI trigger connections between this and other devices.These
registers are not used in normal operation and can leave devices in
an inconsistent state.
+
+config CORESIGHT_TRBE
+ tristate "Trace Buffer Extension (TRBE) driver"
+ depends on ARM64 && CORESIGHT_SOURCE_ETM4X
+ help
+ This driver provides support for percpu Trace Buffer Extension (TRBE).
+ TRBE always needs to be used along with its corresponding percpu ETE
+ component. ETE generates trace data which is then captured with TRBE.
+ Unlike traditional sink devices, TRBE is a CPU feature accessible via
+ system registers. But its explicit dependency on the trace unit (ETE)
+ requires it to be plugged in as a coresight sink device.
+
+ To compile this driver as a module, choose M here: the module will be
+ called coresight-trbe.
endif
diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
index f20e357758d1..d60816509755 100644
--- a/drivers/hwtracing/coresight/Makefile
+++ b/drivers/hwtracing/coresight/Makefile
@@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o
obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o
+obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o
coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \
coresight-cti-sysfs.o
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
new file mode 100644
index 000000000000..41a012b525bb
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -0,0 +1,1149 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight
+ * sink device which could then pair with an appropriate per-cpu coresight source
+ * device (ETE) thus generating required trace data. Trace can be enabled
+ * via the perf framework.
+ *
+ * Copyright (C) 2020 ARM Ltd.
+ *
+ * Author: Anshuman Khandual <[email protected]>
+ */
+#define DRVNAME "arm_trbe"
+
+#define pr_fmt(fmt) DRVNAME ": " fmt
+
+#include <asm/barrier.h>
+#include "coresight-trbe.h"
+
+#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
+
+/*
+ * A padding packet that will help the user space tools
+ * in skipping relevant sections in the captured trace
+ * data which could not be decoded. TRBE doesn't support
+ * formatting the trace data, unlike the legacy CoreSight
+ * sinks and thus we use ETE trace packets to pad the
+ * sections of the buffer.
+ */
+#define ETE_IGNORE_PACKET 0x70
+
+/*
+ * Minimum amount of meaningful trace will contain:
+ * A-Sync, Trace Info, Trace On, Address, Atom.
+ * This is about 44bytes of ETE trace. To be on
+ * the safer side, we assume 64bytes is the minimum
+ * space required for a meaningful session, before
+ * we hit a "WRAP" event.
+ */
+#define TRBE_TRACE_MIN_BUF_SIZE 64
+
+enum trbe_fault_action {
+ TRBE_FAULT_ACT_WRAP,
+ TRBE_FAULT_ACT_SPURIOUS,
+ TRBE_FAULT_ACT_FATAL,
+};
+
+struct trbe_buf {
+ /*
+ * Even though trbe_base represents vmap()
+ * mapped allocated buffer's start address,
+ * it's being kept as unsigned long for various
+ * arithmetic and comparison operations &
+ * also to be consistent with trbe_write &
+ * trbe_limit sibling pointers.
+ */
+ unsigned long trbe_base;
+ unsigned long trbe_limit;
+ unsigned long trbe_write;
+ int nr_pages;
+ void **pages;
+ bool snapshot;
+ struct trbe_cpudata *cpudata;
+};
+
+struct trbe_cpudata {
+ bool trbe_flag;
+ u64 trbe_align;
+ int cpu;
+ enum cs_mode mode;
+ struct trbe_buf *buf;
+ struct trbe_drvdata *drvdata;
+};
+
+struct trbe_drvdata {
+ struct trbe_cpudata __percpu *cpudata;
+ struct perf_output_handle __percpu **handle;
+ struct hlist_node hotplug_node;
+ int irq;
+ cpumask_t supported_cpus;
+ enum cpuhp_state trbe_online;
+ struct platform_device *pdev;
+};
+
+static int trbe_alloc_node(struct perf_event *event)
+{
+ if (event->cpu == -1)
+ return NUMA_NO_NODE;
+ return cpu_to_node(event->cpu);
+}
+
+static void trbe_drain_buffer(void)
+{
+ tsb_csync();
+ dsb(nsh);
+}
+
+static void trbe_drain_and_disable_local(void)
+{
+ u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
+
+ trbe_drain_buffer();
+
+ /*
+ * Disable the TRBE without clearing LIMITPTR which
+ * might be required for fetching the buffer limits.
+ */
+ trblimitr &= ~TRBLIMITR_ENABLE;
+ write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
+ isb();
+}
+
+static void trbe_reset_local(void)
+{
+ trbe_drain_and_disable_local();
+ write_sysreg_s(0, SYS_TRBLIMITR_EL1);
+ write_sysreg_s(0, SYS_TRBPTR_EL1);
+ write_sysreg_s(0, SYS_TRBBASER_EL1);
+ write_sysreg_s(0, SYS_TRBSR_EL1);
+}
+
+static void trbe_stop_and_truncate_event(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+
+ /*
+ * We cannot proceed with the buffer collection and we
+ * do not have any data for the current session. The
+ * etm_perf driver expects to close out the aux_buffer
+ * at event_stop(). So disable the TRBE here and leave
+ * the update_buffer() to return a 0 size.
+ */
+ trbe_drain_and_disable_local();
+ perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+ *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
+}
+
+/*
+ * TRBE Buffer Management
+ *
+ * The TRBE buffer spans from the base pointer till the limit pointer. When enabled,
+ * it starts writing trace data from the write pointer onward till the limit pointer.
+ * When the write pointer reaches the address just before the limit pointer, it gets
+ * wrapped around again to the base pointer. This is called a TRBE wrap event, which
+ * generates a maintenance interrupt when operated in WRAP or FILL mode. This driver
+ * uses FILL mode, where the TRBE stops the trace collection at wrap event. The IRQ
+ * handler updates the AUX buffer and re-enables the TRBE with updated WRITE and
+ * LIMIT pointers.
+ *
+ * Wrap around with an IRQ
+ * ------ < ------ < ------- < ----- < -----
+ * | |
+ * ------ > ------ > ------- > ----- > -----
+ *
+ * +---------------+-----------------------+
+ * | | |
+ * +---------------+-----------------------+
+ * Base Pointer Write Pointer Limit Pointer
+ *
+ * The base and limit pointers always need to be PAGE_SIZE aligned. But the write
+ * pointer can be aligned to the implementation defined TRBE trace buffer alignment
+ * as captured in trbe_cpudata->trbe_align.
+ *
+ *
+ * head tail wakeup
+ * +---------------------------------------+----- ~ ~ ------
+ * |$$$$$$$|################|$$$$$$$$$$$$$$| |
+ * +---------------------------------------+----- ~ ~ ------
+ * Base Pointer Write Pointer Limit Pointer
+ *
+ * The perf_output_handle indices (head, tail, wakeup) are monotonically increasing
+ * values which track all the driver writes and user reads from the perf auxiliary
+ * buffer. Generally [head..tail] is the area where the driver can write into unless
+ * the wakeup is behind the tail. Enabled TRBE buffer span needs to be adjusted and
+ * configured depending on the perf_output_handle indices, so that the driver does
+ * not overwrite areas in the perf auxiliary buffer which are being or yet to be
+ * consumed from the user space. The enabled TRBE buffer area is a moving subset of
+ * the allocated perf auxiliary buffer.
+ */
+static void trbe_pad_buf(struct perf_output_handle *handle, int len)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ u64 head = PERF_IDX2OFF(handle->head, buf);
+
+ memset((void *)buf->trbe_base + head, ETE_IGNORE_PACKET, len);
+ if (!buf->snapshot)
+ perf_aux_output_skip(handle, len);
+}
+
+static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+
+ /*
+ * The ETE trace has alignment synchronization packets allowing
+ * the decoder to reset in case of an overflow or corruption.
+ * So we can use the entire buffer for the snapshot mode.
+ */
+ return buf->nr_pages * PAGE_SIZE;
+}
+
+/*
+ * TRBE Limit Calculation
+ *
+ * The following markers are used to illustrate various TRBE buffer situations.
+ *
+ * $$$$ - Data area, unconsumed captured trace data, not to be overridden
+ * #### - Free area, enabled, trace will be written
+ * %%%% - Free area, disabled, trace will not be written
+ * ==== - Free area, padded with ETE_IGNORE_PACKET, trace will be skipped
+ */
+static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ struct trbe_cpudata *cpudata = buf->cpudata;
+ const u64 bufsize = buf->nr_pages * PAGE_SIZE;
+ u64 limit = bufsize;
+ u64 head, tail, wakeup;
+
+ head = PERF_IDX2OFF(handle->head, buf);
+
+ /*
+ * head
+ * ------->|
+ * |
+ * head TRBE align tail
+ * +----|-------|---------------|-------+
+ * |$$$$|=======|###############|$$$$$$$|
+ * +----|-------|---------------|-------+
+ * trbe_base trbe_base + nr_pages
+ *
+ * Perf aux buffer output head position can be misaligned depending on
+ * various factors including user space reads. In case misaligned, head
+ * needs to be aligned before TRBE can be configured. Pad the alignment
+ * gap with ETE_IGNORE_PACKET bytes that will be ignored by user tools
+ * and skip this section thus advancing the head.
+ */
+ if (!IS_ALIGNED(head, cpudata->trbe_align)) {
+ unsigned long delta = roundup(head, cpudata->trbe_align) - head;
+
+ delta = min(delta, handle->size);
+ trbe_pad_buf(handle, delta);
+ head = PERF_IDX2OFF(handle->head, buf);
+ }
+
+ /*
+ * head = tail (size = 0)
+ * +----|-------------------------------+
+ * |$$$$|$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ |
+ * +----|-------------------------------+
+ * trbe_base trbe_base + nr_pages
+ *
+ * Perf aux buffer does not have any space for the driver to write into.
+ * Just communicate trace truncation event to the user space by marking
+ * it with PERF_AUX_FLAG_TRUNCATED.
+ */
+ if (!handle->size) {
+ perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+ return 0;
+ }
+
+ /* Compute the tail and wakeup indices now that we've aligned head */
+ tail = PERF_IDX2OFF(handle->head + handle->size, buf);
+ wakeup = PERF_IDX2OFF(handle->wakeup, buf);
+
+ /*
+ * Let's calculate the buffer area which TRBE could write into. There
+ * are three possible scenarios here. Limit needs to be aligned with
+ * PAGE_SIZE per the TRBE requirement. Always avoid clobbering the
+ * unconsumed data.
+ *
+ * 1) head < tail
+ *
+ * head tail
+ * +----|-----------------------|-------+
+ * |$$$$|#######################|$$$$$$$|
+ * +----|-----------------------|-------+
+ * trbe_base limit trbe_base + nr_pages
+ *
+ * TRBE could write into [head..tail] area. Unless the tail is right at
+ * the end of the buffer, neither a wrap around nor an IRQ is expected
+ * while being enabled.
+ *
+ * 2) head == tail
+ *
+ * head = tail (size > 0)
+ * +----|-------------------------------+
+ * |%%%%|###############################|
+ * +----|-------------------------------+
+ * trbe_base limit = trbe_base + nr_pages
+ *
+ * TRBE should just write into [head..base + nr_pages] area even though
+ * the entire buffer is empty. Reason being, when the trace reaches the
+ * end of the buffer, it will just wrap around with an IRQ giving an
+ * opportunity to reconfigure the buffer.
+ *
+ * 3) tail < head
+ *
+ * tail head
+ * +----|-----------------------|-------+
+ * |%%%%|$$$$$$$$$$$$$$$$$$$$$$$|#######|
+ * +----|-----------------------|-------+
+ * trbe_base limit = trbe_base + nr_pages
+ *
+ * TRBE should just write into [head..base + nr_pages] area even though
+ * the [trbe_base..tail] is also empty. Reason being, when the trace
+ * reaches the end of the buffer, it will just wrap around with an IRQ
+ * giving an opportunity to reconfigure the buffer.
+ */
+ if (head < tail)
+ limit = round_down(tail, PAGE_SIZE);
+
+ /*
+ * Wakeup may be arbitrarily far into the future. If it's not in the
+ * current generation, either we'll wrap before hitting it, or it's
+ * in the past and has been handled already.
+ *
+ * If there's a wakeup before we wrap, arrange to be woken up by the
+ * page boundary following it. Keep the tail boundary if that's lower.
+ *
+ * head wakeup tail
+ * +----|---------------|-------|-------+
+ * |$$$$|###############|%%%%%%%|$$$$$$$|
+ * +----|---------------|-------|-------+
+ * trbe_base limit trbe_base + nr_pages
+ */
+ if (handle->wakeup < (handle->head + handle->size) && head <= wakeup)
+ limit = min(limit, round_up(wakeup, PAGE_SIZE));
+
+ /*
+ * There are two situations when this can happen i.e. the limit is before
+ * the head and hence TRBE cannot be configured.
+ *
+ * 1) head < tail (aligned down with PAGE_SIZE) and also they are both
+ * within the same PAGE size range.
+ *
+ * PAGE_SIZE
+ * |----------------------|
+ *
+ * limit head tail
+ * +------------|------|--------|-------+
+ * |$$$$$$$$$$$$$$$$$$$|========|$$$$$$$|
+ * +------------|------|--------|-------+
+ * trbe_base trbe_base + nr_pages
+ *
+ * 2) head < wakeup (aligned up with PAGE_SIZE) < tail and also both
+ * head and wakeup are within same PAGE size range.
+ *
+ * PAGE_SIZE
+ * |----------------------|
+ *
+ * limit head wakeup tail
+ * +----|------|-------|--------|-------+
+ * |$$$$$$$$$$$|=======|========|$$$$$$$|
+ * +----|------|-------|--------|-------+
+ * trbe_base trbe_base + nr_pages
+ */
+ if (limit > head)
+ return limit;
+
+ trbe_pad_buf(handle, handle->size);
+ perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+ return 0;
+}
+
+static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ u64 limit = __trbe_normal_offset(handle);
+ u64 head = PERF_IDX2OFF(handle->head, buf);
+
+ /*
+ * If the head is too close to the limit and we don't
+ * have space for a meaningful run, we would rather pad it
+ * and start fresh.
+ */
+ if (limit && (limit - head < TRBE_TRACE_MIN_BUF_SIZE)) {
+ trbe_pad_buf(handle, limit - head);
+ limit = __trbe_normal_offset(handle);
+ }
+ return limit;
+}
+
+static unsigned long compute_trbe_buffer_limit(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ unsigned long offset;
+
+ if (buf->snapshot)
+ offset = trbe_snapshot_offset(handle);
+ else
+ offset = trbe_normal_offset(handle);
+ return buf->trbe_base + offset;
+}
+
+static void clr_trbe_status(void)
+{
+ u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
+
+ WARN_ON(is_trbe_enabled());
+ trbsr &= ~TRBSR_IRQ;
+ trbsr &= ~TRBSR_TRG;
+ trbsr &= ~TRBSR_WRAP;
+ trbsr &= ~(TRBSR_EC_MASK << TRBSR_EC_SHIFT);
+ trbsr &= ~(TRBSR_BSC_MASK << TRBSR_BSC_SHIFT);
+ trbsr &= ~TRBSR_STOP;
+ write_sysreg_s(trbsr, SYS_TRBSR_EL1);
+}
+
+static void set_trbe_limit_pointer_enabled(unsigned long addr)
+{
+ u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
+
+ WARN_ON(!IS_ALIGNED(addr, (1UL << TRBLIMITR_LIMIT_SHIFT)));
+ WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
+
+ trblimitr &= ~TRBLIMITR_NVM;
+ trblimitr &= ~(TRBLIMITR_FILL_MODE_MASK << TRBLIMITR_FILL_MODE_SHIFT);
+ trblimitr &= ~(TRBLIMITR_TRIG_MODE_MASK << TRBLIMITR_TRIG_MODE_SHIFT);
+ trblimitr &= ~(TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT);
+
+ /*
+ * Fill trace buffer mode is used here while configuring the
+ * TRBE for trace capture. In this particular mode, the trace
+ * collection is stopped and a maintenance interrupt is raised
+ * when the current write pointer wraps. This pause in trace
+ * collection gives the software an opportunity to capture the
+ * trace data in the interrupt handler, before reconfiguring
+ * the TRBE.
+ */
+ trblimitr |= (TRBE_FILL_MODE_FILL & TRBLIMITR_FILL_MODE_MASK) << TRBLIMITR_FILL_MODE_SHIFT;
+
+ /*
+ * Trigger mode is not used here while configuring the TRBE for
+ * the trace capture. Hence just keep this in the ignore mode.
+ */
+ trblimitr |= (TRBE_TRIG_MODE_IGNORE & TRBLIMITR_TRIG_MODE_MASK) <<
+ TRBLIMITR_TRIG_MODE_SHIFT;
+ trblimitr |= (addr & PAGE_MASK);
+
+ trblimitr |= TRBLIMITR_ENABLE;
+ write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
+
+ /* Synchronize the TRBE enable event */
+ isb();
+}
+
+static void trbe_enable_hw(struct trbe_buf *buf)
+{
+ WARN_ON(buf->trbe_write < buf->trbe_base);
+ WARN_ON(buf->trbe_write >= buf->trbe_limit);
+ set_trbe_disabled();
+ isb();
+ clr_trbe_status();
+ set_trbe_base_pointer(buf->trbe_base);
+ set_trbe_write_pointer(buf->trbe_write);
+
+ /*
+ * Synchronize all the register updates
+ * till now before enabling the TRBE.
+ */
+ isb();
+ set_trbe_limit_pointer_enabled(buf->trbe_limit);
+}
+
+static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
+{
+ int ec = get_trbe_ec(trbsr);
+ int bsc = get_trbe_bsc(trbsr);
+
+ WARN_ON(is_trbe_running(trbsr));
+ if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
+ return TRBE_FAULT_ACT_FATAL;
+
+ if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
+ return TRBE_FAULT_ACT_FATAL;
+
+ if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
+ if (get_trbe_write_pointer() == get_trbe_base_pointer())
+ return TRBE_FAULT_ACT_WRAP;
+ }
+ return TRBE_FAULT_ACT_SPURIOUS;
+}
+
+static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
+ struct perf_event *event, void **pages,
+ int nr_pages, bool snapshot)
+{
+ struct trbe_buf *buf;
+ struct page **pglist;
+ int i;
+
+ /*
+ * TRBE LIMIT and TRBE WRITE pointers must be page aligned. But with
+ * just a single page, there would not be any room left while writing
+ * into a partially filled TRBE buffer after the page size alignment.
+ * Hence restrict the minimum buffer size to two pages.
+ */
+ if (nr_pages < 2)
+ return NULL;
+
+ buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event));
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
+
+ pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL);
+ if (!pglist) {
+ kfree(buf);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ for (i = 0; i < nr_pages; i++)
+ pglist[i] = virt_to_page(pages[i]);
+
+ buf->trbe_base = (unsigned long)vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
+ if (!buf->trbe_base) {
+ kfree(pglist);
+ kfree(buf);
+ return ERR_PTR(-ENOMEM);
+ }
+ buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
+ buf->trbe_write = buf->trbe_base;
+ buf->snapshot = snapshot;
+ buf->nr_pages = nr_pages;
+ buf->pages = pages;
+ kfree(pglist);
+ return buf;
+}
+
+static void arm_trbe_free_buffer(void *config)
+{
+ struct trbe_buf *buf = config;
+
+ vunmap((void *)buf->trbe_base);
+ kfree(buf);
+}
+
+static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
+ struct perf_output_handle *handle,
+ void *config)
+{
+ struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+ struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
+ struct trbe_buf *buf = config;
+ enum trbe_fault_action act;
+ unsigned long size, offset;
+ unsigned long write, base, status;
+ unsigned long flags;
+
+ WARN_ON(buf->cpudata != cpudata);
+ WARN_ON(cpudata->cpu != smp_processor_id());
+ WARN_ON(cpudata->drvdata != drvdata);
+ if (cpudata->mode != CS_MODE_PERF)
+ return 0;
+
+ perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
+
+ /*
+ * We are about to disable the TRBE. And this could in turn
+ * fill up the buffer, triggering an IRQ. This could be consumed
+ * by the PE asynchronously, causing a race here against
+ * the IRQ handler in closing out the handle. So, let us
+ * make sure the IRQ can't trigger while we are collecting
+ * the buffer. We also make sure that a WRAP event is handled
+ * accordingly.
+ */
+ local_irq_save(flags);
+
+ /*
+ * If the TRBE was disabled due to lack of space in the AUX buffer or a
+ * spurious fault, the driver leaves it disabled, truncating the buffer.
+ * Since the etm_perf driver expects to close out the AUX buffer, the
+ * TRBE driver skips it. Thus, just pass in a size of 0 here to indicate
+ * that the buffer was truncated.
+ */
+ if (!is_trbe_enabled()) {
+ size = 0;
+ goto done;
+ }
+ /*
+ * The perf handle structure needs to be shared with the TRBE IRQ handler
+ * for capturing trace data and restarting the handle. A crash on a stale
+ * handle reference is possible if the etm event is stopped while a TRBE
+ * IRQ is also being processed. This happens due to the release of the
+ * perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
+ * the TRBE here will ensure that no IRQ could be generated when the perf
+ * handle gets freed in etm_event_stop().
+ */
+ trbe_drain_and_disable_local();
+ write = get_trbe_write_pointer();
+ base = get_trbe_base_pointer();
+
+ /* Check if there is a pending interrupt and handle it here */
+ status = read_sysreg_s(SYS_TRBSR_EL1);
+ if (is_trbe_irq(status)) {
+
+ /*
+ * Now that we are handling the IRQ here, clear the IRQ
+ * from the status, to let the irq handler know that it
+ * is taken care of.
+ */
+ clr_trbe_irq();
+ isb();
+
+ act = trbe_get_fault_act(status);
+ /*
+ * If this was not due to a WRAP event, we have some
+ * errors and as such the buffer is empty.
+ */
+ if (act != TRBE_FAULT_ACT_WRAP) {
+ size = 0;
+ goto done;
+ }
+
+ /*
+ * Otherwise, the buffer is full and the write pointer
+ * has reached base. Adjust this back to the Limit pointer
+ * for correct size.
+ */
+ write = get_trbe_limit_pointer();
+ }
+
+ offset = write - base;
+ if (WARN_ON_ONCE(offset < PERF_IDX2OFF(handle->head, buf)))
+ size = 0;
+ else
+ size = offset - PERF_IDX2OFF(handle->head, buf);
+
+done:
+ local_irq_restore(flags);
+
+ if (buf->snapshot)
+ handle->head += size;
+ return size;
+}
+
+static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
+{
+ struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+ struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
+ struct perf_output_handle *handle = data;
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+
+ WARN_ON(cpudata->cpu != smp_processor_id());
+ WARN_ON(cpudata->drvdata != drvdata);
+ if (mode != CS_MODE_PERF)
+ return -EINVAL;
+
+ *this_cpu_ptr(drvdata->handle) = handle;
+ cpudata->buf = buf;
+ cpudata->mode = mode;
+ buf->cpudata = cpudata;
+ buf->trbe_limit = compute_trbe_buffer_limit(handle);
+ buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
+ if (buf->trbe_limit == buf->trbe_base) {
+ trbe_stop_and_truncate_event(handle);
+ return 0;
+ }
+ trbe_enable_hw(buf);
+ return 0;
+}
+
+static int arm_trbe_disable(struct coresight_device *csdev)
+{
+ struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+ struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
+ struct trbe_buf *buf = cpudata->buf;
+
+ WARN_ON(buf->cpudata != cpudata);
+ WARN_ON(cpudata->cpu != smp_processor_id());
+ WARN_ON(cpudata->drvdata != drvdata);
+ if (cpudata->mode != CS_MODE_PERF)
+ return -EINVAL;
+
+ trbe_drain_and_disable_local();
+ buf->cpudata = NULL;
+ cpudata->buf = NULL;
+ cpudata->mode = CS_MODE_DISABLED;
+ return 0;
+}
+
+static void trbe_handle_spurious(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+
+ buf->trbe_limit = compute_trbe_buffer_limit(handle);
+ buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
+ if (buf->trbe_limit == buf->trbe_base) {
+ trbe_drain_and_disable_local();
+ return;
+ }
+ trbe_enable_hw(buf);
+}
+
+static void trbe_handle_overflow(struct perf_output_handle *handle)
+{
+ struct perf_event *event = handle->event;
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ unsigned long offset, size;
+ struct etm_event_data *event_data;
+
+ offset = get_trbe_limit_pointer() - get_trbe_base_pointer();
+ size = offset - PERF_IDX2OFF(handle->head, buf);
+ if (buf->snapshot)
+ handle->head += size;
+
+ perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
+ perf_aux_output_end(handle, size);
+ event_data = perf_aux_output_begin(handle, event);
+ if (!event_data) {
+ /*
+ * We are unable to restart the trace collection,
+ * thus leave the TRBE disabled. The etm-perf driver
+ * is able to detect this with a disconnected handle
+ * (handle->event = NULL).
+ */
+ trbe_drain_and_disable_local();
+ *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
+ return;
+ }
+ buf->trbe_limit = compute_trbe_buffer_limit(handle);
+ buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
+ if (buf->trbe_limit == buf->trbe_base) {
+ trbe_stop_and_truncate_event(handle);
+ return;
+ }
+ *this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
+ trbe_enable_hw(buf);
+}
+
+static bool is_perf_trbe(struct perf_output_handle *handle)
+{
+ struct trbe_buf *buf = etm_perf_sink_config(handle);
+ struct trbe_cpudata *cpudata = buf->cpudata;
+ struct trbe_drvdata *drvdata = cpudata->drvdata;
+ int cpu = smp_processor_id();
+
+ WARN_ON(buf->trbe_base != get_trbe_base_pointer());
+ WARN_ON(buf->trbe_limit != get_trbe_limit_pointer());
+
+ if (cpudata->mode != CS_MODE_PERF)
+ return false;
+
+ if (cpudata->cpu != cpu)
+ return false;
+
+ if (!cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+ return false;
+
+ return true;
+}
+
+static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
+{
+ struct perf_output_handle **handle_ptr = dev;
+ struct perf_output_handle *handle = *handle_ptr;
+ enum trbe_fault_action act;
+ u64 status;
+
+ /*
+ * Ensure the trace is visible to the CPUs and
+ * any external aborts have been resolved.
+ */
+ trbe_drain_and_disable_local();
+
+ status = read_sysreg_s(SYS_TRBSR_EL1);
+ /*
+ * If the pending IRQ was handled by update_buffer callback
+ * we have nothing to do here.
+ */
+ if (!is_trbe_irq(status))
+ return IRQ_NONE;
+
+ clr_trbe_irq();
+ isb();
+
+ if (WARN_ON_ONCE(!handle) || !perf_get_aux(handle))
+ return IRQ_NONE;
+
+ if (!is_perf_trbe(handle))
+ return IRQ_NONE;
+
+ /*
+ * Ensure perf callbacks have completed, which may disable
+ * the trace buffer in response to a TRUNCATION flag.
+ */
+ irq_work_run();
+
+ act = trbe_get_fault_act(status);
+ switch (act) {
+ case TRBE_FAULT_ACT_WRAP:
+ trbe_handle_overflow(handle);
+ break;
+ case TRBE_FAULT_ACT_SPURIOUS:
+ trbe_handle_spurious(handle);
+ break;
+ case TRBE_FAULT_ACT_FATAL:
+ trbe_stop_and_truncate_event(handle);
+ break;
+ }
+ return IRQ_HANDLED;
+}
+
+static const struct coresight_ops_sink arm_trbe_sink_ops = {
+ .enable = arm_trbe_enable,
+ .disable = arm_trbe_disable,
+ .alloc_buffer = arm_trbe_alloc_buffer,
+ .free_buffer = arm_trbe_free_buffer,
+ .update_buffer = arm_trbe_update_buffer,
+};
+
+static const struct coresight_ops arm_trbe_cs_ops = {
+ .sink_ops = &arm_trbe_sink_ops,
+};
+
+static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
+
+ return sprintf(buf, "%llx\n", cpudata->trbe_align);
+}
+static DEVICE_ATTR_RO(align);
+
+static ssize_t flag_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
+
+ return sprintf(buf, "%d\n", cpudata->trbe_flag);
+}
+static DEVICE_ATTR_RO(flag);
+
+static struct attribute *arm_trbe_attrs[] = {
+ &dev_attr_align.attr,
+ &dev_attr_flag.attr,
+ NULL,
+};
+
+static const struct attribute_group arm_trbe_group = {
+ .attrs = arm_trbe_attrs,
+};
+
+static const struct attribute_group *arm_trbe_groups[] = {
+ &arm_trbe_group,
+ NULL,
+};
+
+static void arm_trbe_enable_cpu(void *info)
+{
+ struct trbe_drvdata *drvdata = info;
+
+ trbe_reset_local();
+ enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
+}
+
+static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
+{
+ struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+ struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
+ struct coresight_desc desc = { 0 };
+ struct device *dev;
+
+ if (WARN_ON(trbe_csdev))
+ return;
+
+ dev = &cpudata->drvdata->pdev->dev;
+ desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
+ if (!desc.name)
+ goto cpu_clear;
+
+ desc.type = CORESIGHT_DEV_TYPE_SINK;
+ desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
+ desc.ops = &arm_trbe_cs_ops;
+ desc.pdata = dev_get_platdata(dev);
+ desc.groups = arm_trbe_groups;
+ desc.dev = dev;
+ trbe_csdev = coresight_register(&desc);
+ if (IS_ERR(trbe_csdev))
+ goto cpu_clear;
+
+ dev_set_drvdata(&trbe_csdev->dev, cpudata);
+ coresight_set_percpu_sink(cpu, trbe_csdev);
+ return;
+cpu_clear:
+ cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
+}
+
+static void arm_trbe_probe_cpu(void *info)
+{
+ struct trbe_drvdata *drvdata = info;
+ int cpu = smp_processor_id();
+ struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+ u64 trbidr;
+
+ if (WARN_ON(!cpudata))
+ goto cpu_clear;
+
+ if (!is_trbe_available()) {
+ pr_err("TRBE is not implemented on cpu %d\n", cpu);
+ goto cpu_clear;
+ }
+
+ trbidr = read_sysreg_s(SYS_TRBIDR_EL1);
+ if (!is_trbe_programmable(trbidr)) {
+ pr_err("TRBE is owned in higher exception level on cpu %d\n", cpu);
+ goto cpu_clear;
+ }
+
+ cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
+ if (cpudata->trbe_align > SZ_2K) {
+ pr_err("Unsupported alignment on cpu %d\n", cpu);
+ goto cpu_clear;
+ }
+ cpudata->trbe_flag = get_trbe_flag_update(trbidr);
+ cpudata->cpu = cpu;
+ cpudata->drvdata = drvdata;
+ return;
+cpu_clear:
+ cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
+}
+
+static void arm_trbe_remove_coresight_cpu(void *info)
+{
+ int cpu = smp_processor_id();
+ struct trbe_drvdata *drvdata = info;
+ struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+ struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
+
+ disable_percpu_irq(drvdata->irq);
+ trbe_reset_local();
+ if (trbe_csdev) {
+ coresight_unregister(trbe_csdev);
+ cpudata->drvdata = NULL;
+ coresight_set_percpu_sink(cpu, NULL);
+ }
+}
+
+static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata)
+{
+ int cpu;
+
+ drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata));
+ if (!drvdata->cpudata)
+ return -ENOMEM;
+
+ for_each_cpu(cpu, &drvdata->supported_cpus) {
+ smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1);
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+ arm_trbe_register_coresight_cpu(drvdata, cpu);
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+ smp_call_function_single(cpu, arm_trbe_enable_cpu, drvdata, 1);
+ }
+ return 0;
+}
+
+static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
+{
+ int cpu;
+
+ for_each_cpu(cpu, &drvdata->supported_cpus)
+ smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
+ free_percpu(drvdata->cpudata);
+ return 0;
+}
+
+static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
+{
+ struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
+
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
+
+ /*
+ * If this CPU was not probed for TRBE,
+ * initialize it now.
+ */
+ if (!coresight_get_percpu_sink(cpu)) {
+ arm_trbe_probe_cpu(drvdata);
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+ arm_trbe_register_coresight_cpu(drvdata, cpu);
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
+ arm_trbe_enable_cpu(drvdata);
+ } else {
+ arm_trbe_enable_cpu(drvdata);
+ }
+ }
+ return 0;
+}
+
+static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
+{
+ struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
+
+ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
+ disable_percpu_irq(drvdata->irq);
+ trbe_reset_local();
+ }
+ return 0;
+}
+
+static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata)
+{
+ enum cpuhp_state trbe_online;
+ int ret;
+
+ trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
+ arm_trbe_cpu_startup, arm_trbe_cpu_teardown);
+ if (trbe_online < 0)
+ return trbe_online;
+
+ ret = cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node);
+ if (ret) {
+ cpuhp_remove_multi_state(trbe_online);
+ return ret;
+ }
+ drvdata->trbe_online = trbe_online;
+ return 0;
+}
+
+static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata)
+{
+ cpuhp_remove_multi_state(drvdata->trbe_online);
+}
+
+static int arm_trbe_probe_irq(struct platform_device *pdev,
+ struct trbe_drvdata *drvdata)
+{
+ int ret;
+
+ drvdata->irq = platform_get_irq(pdev, 0);
+ if (drvdata->irq < 0) {
+ pr_err("IRQ not found for the platform device\n");
+ return drvdata->irq;
+ }
+
+ if (!irq_is_percpu(drvdata->irq)) {
+ pr_err("IRQ is not a PPI\n");
+ return -EINVAL;
+ }
+
+ if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus))
+ return -EINVAL;
+
+ drvdata->handle = alloc_percpu(typeof(*drvdata->handle));
+ if (!drvdata->handle)
+ return -ENOMEM;
+
+ ret = request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle);
+ if (ret) {
+ free_percpu(drvdata->handle);
+ return ret;
+ }
+ return 0;
+}
+
+static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
+{
+ free_percpu_irq(drvdata->irq, drvdata->handle);
+ free_percpu(drvdata->handle);
+}
+
+static int arm_trbe_device_probe(struct platform_device *pdev)
+{
+ struct coresight_platform_data *pdata;
+ struct trbe_drvdata *drvdata;
+ struct device *dev = &pdev->dev;
+ int ret;
+
+ drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
+ if (!drvdata)
+ return -ENOMEM;
+
+ pdata = coresight_get_platform_data(dev);
+ if (IS_ERR(pdata))
+ return PTR_ERR(pdata);
+
+ dev_set_drvdata(dev, drvdata);
+ dev->platform_data = pdata;
+ drvdata->pdev = pdev;
+ ret = arm_trbe_probe_irq(pdev, drvdata);
+ if (ret)
+ return ret;
+
+ ret = arm_trbe_probe_coresight(drvdata);
+ if (ret)
+ goto probe_failed;
+
+ ret = arm_trbe_probe_cpuhp(drvdata);
+ if (ret)
+ goto cpuhp_failed;
+
+ return 0;
+cpuhp_failed:
+ arm_trbe_remove_coresight(drvdata);
+probe_failed:
+ arm_trbe_remove_irq(drvdata);
+ return ret;
+}
+
+static int arm_trbe_device_remove(struct platform_device *pdev)
+{
+ struct trbe_drvdata *drvdata = platform_get_drvdata(pdev);
+
+ arm_trbe_remove_cpuhp(drvdata);
+ arm_trbe_remove_coresight(drvdata);
+ arm_trbe_remove_irq(drvdata);
+ return 0;
+}
+
+static const struct of_device_id arm_trbe_of_match[] = {
+ { .compatible = "arm,trace-buffer-extension"},
+ {},
+};
+MODULE_DEVICE_TABLE(of, arm_trbe_of_match);
+
+static struct platform_driver arm_trbe_driver = {
+ .driver = {
+ .name = DRVNAME,
+ .of_match_table = of_match_ptr(arm_trbe_of_match),
+ .suppress_bind_attrs = true,
+ },
+ .probe = arm_trbe_device_probe,
+ .remove = arm_trbe_device_remove,
+};
+
+static int __init arm_trbe_init(void)
+{
+ int ret;
+
+ if (arm64_kernel_unmapped_at_el0()) {
+ pr_err("TRBE wouldn't work if kernel gets unmapped at EL0\n");
+ return -EOPNOTSUPP;
+ }
+
+ ret = platform_driver_register(&arm_trbe_driver);
+ if (!ret)
+ return 0;
+
+ pr_err("Error registering %s platform driver\n", DRVNAME);
+ return ret;
+}
+
+static void __exit arm_trbe_exit(void)
+{
+ platform_driver_unregister(&arm_trbe_driver);
+}
+module_init(arm_trbe_init);
+module_exit(arm_trbe_exit);
+
+MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
+MODULE_DESCRIPTION("Arm Trace Buffer Extension (TRBE) driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
new file mode 100644
index 000000000000..499b846ccfee
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-trbe.h
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This contains all required hardware related helper functions for
+ * Trace Buffer Extension (TRBE) driver in the coresight framework.
+ *
+ * Copyright (C) 2020 ARM Ltd.
+ *
+ * Author: Anshuman Khandual <[email protected]>
+ */
+#include <linux/coresight.h>
+#include <linux/device.h>
+#include <linux/irq.h>
+#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+
+#include "coresight-etm-perf.h"
+
+static inline bool is_trbe_available(void)
+{
+ u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+ unsigned int trbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_TRBE_SHIFT);
+
+ return trbe >= 0b0001;
+}
+
+static inline bool is_trbe_enabled(void)
+{
+ u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
+
+ return trblimitr & TRBLIMITR_ENABLE;
+}
+
+#define TRBE_EC_OTHERS 0
+#define TRBE_EC_STAGE1_ABORT 36
+#define TRBE_EC_STAGE2_ABORT 37
+
+static inline int get_trbe_ec(u64 trbsr)
+{
+ return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
+}
+
+#define TRBE_BSC_NOT_STOPPED 0
+#define TRBE_BSC_FILLED 1
+#define TRBE_BSC_TRIGGERED 2
+
+static inline int get_trbe_bsc(u64 trbsr)
+{
+ return (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK;
+}
+
+static inline void clr_trbe_irq(void)
+{
+ u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
+
+ trbsr &= ~TRBSR_IRQ;
+ write_sysreg_s(trbsr, SYS_TRBSR_EL1);
+}
+
+static inline bool is_trbe_irq(u64 trbsr)
+{
+ return trbsr & TRBSR_IRQ;
+}
+
+static inline bool is_trbe_trg(u64 trbsr)
+{
+ return trbsr & TRBSR_TRG;
+}
+
+static inline bool is_trbe_wrap(u64 trbsr)
+{
+ return trbsr & TRBSR_WRAP;
+}
+
+static inline bool is_trbe_abort(u64 trbsr)
+{
+ return trbsr & TRBSR_ABORT;
+}
+
+static inline bool is_trbe_running(u64 trbsr)
+{
+ return !(trbsr & TRBSR_STOP);
+}
+
+#define TRBE_TRIG_MODE_STOP 0
+#define TRBE_TRIG_MODE_IRQ 1
+#define TRBE_TRIG_MODE_IGNORE 3
+
+#define TRBE_FILL_MODE_FILL 0
+#define TRBE_FILL_MODE_WRAP 1
+#define TRBE_FILL_MODE_CIRCULAR_BUFFER 3
+
+static inline void set_trbe_disabled(void)
+{
+ u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
+
+ trblimitr &= ~TRBLIMITR_ENABLE;
+ write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
+}
+
+static inline bool get_trbe_flag_update(u64 trbidr)
+{
+ return trbidr & TRBIDR_FLAG;
+}
+
+static inline bool is_trbe_programmable(u64 trbidr)
+{
+ return !(trbidr & TRBIDR_PROG);
+}
+
+static inline int get_trbe_address_align(u64 trbidr)
+{
+ return (trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK;
+}
+
+static inline unsigned long get_trbe_write_pointer(void)
+{
+ return read_sysreg_s(SYS_TRBPTR_EL1);
+}
+
+static inline void set_trbe_write_pointer(unsigned long addr)
+{
+ WARN_ON(is_trbe_enabled());
+ write_sysreg_s(addr, SYS_TRBPTR_EL1);
+}
+
+static inline unsigned long get_trbe_limit_pointer(void)
+{
+ u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
+ unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
+ unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
+
+ WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
+ return addr;
+}
+
+static inline unsigned long get_trbe_base_pointer(void)
+{
+ u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
+ unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
+
+ WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
+ return addr;
+}
+
+static inline void set_trbe_base_pointer(unsigned long addr)
+{
+ WARN_ON(is_trbe_enabled());
+ WARN_ON(!IS_ALIGNED(addr, (1UL << TRBBASER_BASE_SHIFT)));
+ WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
+ write_sysreg_s(addr, SYS_TRBBASER_EL1);
+}
--
2.24.1
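
The decision made by trbe_get_fault_act() in the patch above can be modelled as a pure function, which may help when reviewing the interrupt path. What follows is a minimal userspace sketch, not the driver code. classify_trbe_fault() is a hypothetical name, the write and base pointers are passed in as arguments instead of being read from system registers, and the TRBSR_EL1 bit positions are assumed for illustration only (the authoritative values belong to the architecture definitions patch of this series).

```c
#include <stdint.h>

/*
 * Assumed TRBSR_EL1 field layout, for illustration only. The authoritative
 * definitions come from the architecture definitions patch in this series.
 */
#define TRBSR_EC_SHIFT   26
#define TRBSR_EC_MASK    0x3fULL
#define TRBSR_IRQ        (1ULL << 22)
#define TRBSR_TRG        (1ULL << 21)
#define TRBSR_WRAP       (1ULL << 20)
#define TRBSR_ABORT      (1ULL << 18)
#define TRBSR_STOP       (1ULL << 17)
#define TRBSR_BSC_SHIFT  0
#define TRBSR_BSC_MASK   0x3fULL

/* Values taken from coresight-trbe.h in the patch above. */
#define TRBE_EC_OTHERS        0
#define TRBE_EC_STAGE1_ABORT  36
#define TRBE_EC_STAGE2_ABORT  37
#define TRBE_BSC_FILLED       1

enum trbe_fault_action {
	TRBE_FAULT_ACT_WRAP,
	TRBE_FAULT_ACT_SPURIOUS,
	TRBE_FAULT_ACT_FATAL,
};

/*
 * Hypothetical pure-function variant of trbe_get_fault_act(): the write and
 * base pointers are parameters rather than system register reads.
 */
static enum trbe_fault_action
classify_trbe_fault(uint64_t trbsr, uint64_t write, uint64_t base)
{
	unsigned int ec = (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
	unsigned int bsc = (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK;

	/* A trigger event or an abort is treated as unrecoverable. */
	if (trbsr & (TRBSR_TRG | TRBSR_ABORT))
		return TRBE_FAULT_ACT_FATAL;

	/* Stage 1 and stage 2 faults are treated as fatal as well. */
	if (ec == TRBE_EC_STAGE1_ABORT || ec == TRBE_EC_STAGE2_ABORT)
		return TRBE_FAULT_ACT_FATAL;

	/* A buffer-full event leaves the write pointer wrapped to base. */
	if ((trbsr & TRBSR_WRAP) && ec == TRBE_EC_OTHERS &&
	    bsc == TRBE_BSC_FILLED && write == base)
		return TRBE_FAULT_ACT_WRAP;

	/* Anything else is handled as a spurious interrupt. */
	return TRBE_FAULT_ACT_SPURIOUS;
}
```

Only the WRAP outcome leads to the overflow handler; a status with WRAP set but a write pointer that has not returned to base falls through to the spurious case, matching the order of the checks in the driver.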

2021-02-25 19:52:22

by Suzuki K Poulose

Subject: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

From: Anshuman Khandual <[email protected]>

Add support for dedicated sinks that are bound to individual CPUs (e.g.,
TRBE). To allow quicker access to the sink for a given CPU-bound source,
keep a percpu array of the sink devices. Also, add support for building
a path to the CPU local sink from the ETM.

This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM.
This new sink type is exclusively available and can only work with percpu
source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.

This defines a percpu structure that accommodates a single coresight_device
which can be used to store an initialized instance from a sink driver. As
these sinks are exclusively linked and dependent on corresponding percpu
source devices, they should also be the default sink device during a perf
session.

Outward device connections are scanned while establishing paths between a
source and a sink device. But such connections are not present for certain
percpu source and sink devices which are exclusively linked and dependent.
Build the path directly and skip connection scanning for such devices.

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Tested-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
[Moved the set/get percpu sink APIs from TRBE patch to here]
Signed-off-by: Suzuki K Poulose <[email protected]>
---
Changes:
- Export methods to set/get percpu sinks for fixing module
build for TRBE
- Addressed coding style comments (Suzuki)
- Check status of _coresight_build_path() (Mathieu)
---
drivers/hwtracing/coresight/coresight-core.c | 29 ++++++++++++++++++--
drivers/hwtracing/coresight/coresight-priv.h | 3 ++
include/linux/coresight.h | 12 ++++++++
3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 0062c8935653..55c645616bf6 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -23,6 +23,7 @@
#include "coresight-priv.h"

static DEFINE_MUTEX(coresight_mutex);
+DEFINE_PER_CPU(struct coresight_device *, csdev_sink);

/**
* struct coresight_node - elements of a path, from source to sink
@@ -70,6 +71,18 @@ void coresight_remove_cti_ops(void)
}
EXPORT_SYMBOL_GPL(coresight_remove_cti_ops);

+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev)
+{
+ per_cpu(csdev_sink, cpu) = csdev;
+}
+EXPORT_SYMBOL_GPL(coresight_set_percpu_sink);
+
+struct coresight_device *coresight_get_percpu_sink(int cpu)
+{
+ return per_cpu(csdev_sink, cpu);
+}
+EXPORT_SYMBOL_GPL(coresight_get_percpu_sink);
+
static int coresight_id_match(struct device *dev, void *data)
{
int trace_id, i_trace_id;
@@ -784,6 +797,14 @@ static int _coresight_build_path(struct coresight_device *csdev,
if (csdev == sink)
goto out;

+ if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
+ sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) {
+ if (_coresight_build_path(sink, sink, path) == 0) {
+ found = true;
+ goto out;
+ }
+ }
+
/* Not a sink - recursively explore each port found on this element */
for (i = 0; i < csdev->pdata->nr_outport; i++) {
struct coresight_device *child_dev;
@@ -999,8 +1020,12 @@ coresight_find_default_sink(struct coresight_device *csdev)
int depth = 0;

/* look for a default sink if we have not found for this device */
- if (!csdev->def_sink)
- csdev->def_sink = coresight_find_sink(csdev, &depth);
+ if (!csdev->def_sink) {
+ if (coresight_is_percpu_source(csdev))
+ csdev->def_sink = per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev));
+ if (!csdev->def_sink)
+ csdev->def_sink = coresight_find_sink(csdev, &depth);
+ }
return csdev->def_sink;
}

diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f5f654ea2994..ff1dd2092ac5 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -232,4 +232,7 @@ coresight_find_csdev_by_fwnode(struct fwnode_handle *r_fwnode);
void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
struct coresight_device *ect_csdev);

+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev);
+struct coresight_device *coresight_get_percpu_sink(int cpu);
+
#endif
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 976ec2697610..8a3a3c199087 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -50,6 +50,7 @@ enum coresight_dev_subtype_sink {
CORESIGHT_DEV_SUBTYPE_SINK_PORT,
CORESIGHT_DEV_SUBTYPE_SINK_BUFFER,
CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM,
+ CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM,
};

enum coresight_dev_subtype_link {
@@ -428,6 +429,17 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
csa->write(val, offset, false, true);
}

+static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
+{
+ return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
+ (csdev->subtype.source_subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_PROC);
+}
+
+static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
+{
+ return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
+ (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
+}
#else /* !CONFIG_64BIT */

static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
--
2.24.1
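
The default-sink selection added to coresight_find_default_sink() above can be illustrated with a small userspace model. This is a sketch under stated assumptions: a plain array stands in for the DEFINE_PER_CPU() variable, struct sink stands in for struct coresight_device, and the result of the topology search done by coresight_find_sink() is passed in as a caller-supplied fallback. All names below are hypothetical.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this model */

/* Stand-in for struct coresight_device. */
struct sink {
	const char *name;
};

/* Stand-in for DEFINE_PER_CPU(struct coresight_device *, csdev_sink). */
static struct sink *percpu_sink[NR_CPUS_SKETCH];

static void set_percpu_sink(int cpu, struct sink *s)
{
	percpu_sink[cpu] = s;
}

static struct sink *get_percpu_sink(int cpu)
{
	return percpu_sink[cpu];
}

/*
 * Mirrors the order used in the patch: prefer the sink bound to the
 * source's CPU, and use the topology search result (modelled here by
 * 'fallback') only when no percpu sink has been registered.
 */
static struct sink *find_default_sink(int cpu, struct sink *fallback)
{
	struct sink *s = get_percpu_sink(cpu);

	return s ? s : fallback;
}
```

A CPU without a registered percpu sink keeps its existing default, so platforms that only have the traditional CoreSight sinks see no change in behaviour in this model.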

2021-02-25 22:37:42

by kernel test robot

Subject: Re: [PATCH v4 13/19] coresight: ete: Add support for ETE sysreg access

Hi Suzuki,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210225]
[cannot apply to kvmarm/next arm64/for-next/core tip/perf/core v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6fbd6cf85a3be127454a1ad58525a3adcf8612ab
config: arm64-randconfig-r032-20210225 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/0day-ci/linux/commit/66c402c1fecfcacd92971f7c4ef6ee17f8243745
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
git checkout 66c402c1fecfcacd92971f7c4ef6ee17f8243745
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/hwtracing/coresight/coresight-etm4x-core.c:118:5: warning: no previous prototype for function 'ete_sysreg_read' [-Wmissing-prototypes]
u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
^
drivers/hwtracing/coresight/coresight-etm4x-core.c:118:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
^
static
>> drivers/hwtracing/coresight/coresight-etm4x-core.c:135:6: warning: no previous prototype for function 'ete_sysreg_write' [-Wmissing-prototypes]
void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
^
drivers/hwtracing/coresight/coresight-etm4x-core.c:135:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
^
static
2 warnings generated.
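
For reference, the two usual ways to address a -Wmissing-prototypes warning are sketched below. This is an illustrative example rather than the fix chosen for this series: the demo_* functions and the register values are hypothetical. Either the function is given internal linkage with static, or a declaration is made visible before the definition (normally through a shared header).

```c
#include <stdint.h>

/*
 * Option 1: the helper is used only in this file, so give it internal
 * linkage. -Wmissing-prototypes does not fire for static functions.
 */
static uint64_t demo_sysreg_read(uint32_t offset)
{
	/* Hypothetical register contents, keyed by offset. */
	switch (offset) {
	case 0x004:
		return 0x1;
	case 0x008:
		return 0x100000002ULL;
	default:
		return 0;
	}
}

/*
 * Option 2: the helper is called from other files. Declare it before the
 * definition; in a real tree this declaration would live in a header that
 * both the definition and the callers include.
 */
uint64_t demo_sysreg_read_masked(uint32_t offset, int is_64bit);

uint64_t demo_sysreg_read_masked(uint32_t offset, int is_64bit)
{
	uint64_t val = demo_sysreg_read(offset);

	/* Keep only the low 32 bits for a 32-bit access. */
	return is_64bit ? val : (val & 0xffffffffULL);
}
```

Which option applies depends on whether the function is meant to be called from outside its translation unit, as the compiler note in the report suggests.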


vim +/ete_sysreg_read +118 drivers/hwtracing/coresight/coresight-etm4x-core.c

  117
> 118  u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
  119  {
  120          u64 res = 0;
  121
  122          switch (offset) {
  123          ETE_READ_CASES(res)
  124          default :
  125                  pr_warn_ratelimited("ete: trying to read unsupported register @%x\n",
  126                                      offset);
  127          }
  128
  129          if (!_relaxed)
  130                  __iormb(res); /* Imitate the !relaxed I/O helpers */
  131
  132          return res;
  133  }
  134
> 135  void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
  136  {
  137          if (!_relaxed)
  138                  __iowmb(); /* Imitate the !relaxed I/O helpers */
  139          if (!_64bit)
  140                  val &= GENMASK(31, 0);
  141
  142          switch (offset) {
  143          ETE_WRITE_CASES(val)
  144          default :
  145                  pr_warn_ratelimited("ete: trying to write to unsupported register @%x\n",
  146                                      offset);
  147          }
  148  }
  149

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2021-02-26 06:28:12

by kernel test robot

Subject: Re: [PATCH v4 13/19] coresight: ete: Add support for ETE sysreg access

Hi Suzuki,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20210226]
[cannot apply to kvmarm/next arm64/for-next/core tip/perf/core v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6fbd6cf85a3be127454a1ad58525a3adcf8612ab
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/66c402c1fecfcacd92971f7c4ef6ee17f8243745
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
git checkout 66c402c1fecfcacd92971f7c4ef6ee17f8243745
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/hwtracing/coresight/coresight-etm4x-core.c:118:5: warning: no previous prototype for 'ete_sysreg_read' [-Wmissing-prototypes]
118 | u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
| ^~~~~~~~~~~~~~~
>> drivers/hwtracing/coresight/coresight-etm4x-core.c:135:6: warning: no previous prototype for 'ete_sysreg_write' [-Wmissing-prototypes]
135 | void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
| ^~~~~~~~~~~~~~~~


vim +/ete_sysreg_read +118 drivers/hwtracing/coresight/coresight-etm4x-core.c

  117
> 118  u64 ete_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
  119  {
  120          u64 res = 0;
  121
  122          switch (offset) {
  123          ETE_READ_CASES(res)
  124          default :
  125                  pr_warn_ratelimited("ete: trying to read unsupported register @%x\n",
  126                                      offset);
  127          }
  128
  129          if (!_relaxed)
  130                  __iormb(res); /* Imitate the !relaxed I/O helpers */
  131
  132          return res;
  133  }
  134
> 135  void ete_sysreg_write(u64 val, u32 offset, bool _relaxed, bool _64bit)
  136  {
  137          if (!_relaxed)
  138                  __iowmb(); /* Imitate the !relaxed I/O helpers */
  139          if (!_64bit)
  140                  val &= GENMASK(31, 0);
  141
  142          switch (offset) {
  143          ETE_WRITE_CASES(val)
  144          default :
  145                  pr_warn_ratelimited("ete: trying to write to unsupported register @%x\n",
  146                                      offset);
  147          }
  148  }
  149

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2021-02-26 06:37:19

by kernel test robot

Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

Hi Suzuki,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on next-20210226]
[cannot apply to kvmarm/next arm64/for-next/core tip/perf/core v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6fbd6cf85a3be127454a1ad58525a3adcf8612ab
config: arm-randconfig-r024-20210225 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm cross compiling tool for clang build
# apt-get install binutils-arm-linux-gnueabi
# https://github.com/0day-ci/linux/commit/c37564326cdf11e0839eae06c1bfead47d3e5775
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
git checkout c37564326cdf11e0839eae06c1bfead47d3e5775
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> drivers/hwtracing/coresight/coresight-core.c:800:6: error: implicit declaration of function 'coresight_is_percpu_source' [-Werror,-Wimplicit-function-declaration]
if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
^
>> drivers/hwtracing/coresight/coresight-core.c:800:43: error: implicit declaration of function 'coresight_is_percpu_sink' [-Werror,-Wimplicit-function-declaration]
if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
^
drivers/hwtracing/coresight/coresight-core.c:1024:7: error: implicit declaration of function 'coresight_is_percpu_source' [-Werror,-Wimplicit-function-declaration]
if (coresight_is_percpu_source(csdev))
^
3 errors generated.


vim +/coresight_is_percpu_source +800 drivers/hwtracing/coresight/coresight-core.c

  775
  776  /**
  777   * _coresight_build_path - recursively build a path from a @csdev to a sink.
  778   * @csdev: The device to start from.
  779   * @sink: The final sink we want in this path.
  780   * @path: The list to add devices to.
  781   *
  782   * The tree of Coresight device is traversed until an activated sink is
  783   * found. From there the sink is added to the list along with all the
  784   * devices that led to that point - the end result is a list from source
  785   * to sink. In that list the source is the first device and the sink the
  786   * last one.
  787   */
  788  static int _coresight_build_path(struct coresight_device *csdev,
  789                                   struct coresight_device *sink,
  790                                   struct list_head *path)
  791  {
  792          int i, ret;
  793          bool found = false;
  794          struct coresight_node *node;
  795
  796          /* An activated sink has been found. Enqueue the element */
  797          if (csdev == sink)
  798                  goto out;
  799
> 800          if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
  801              sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) {
  802                  if (_coresight_build_path(sink, sink, path) == 0) {
  803                          found = true;
  804                          goto out;
  805                  }
  806          }
  807
  808          /* Not a sink - recursively explore each port found on this element */
  809          for (i = 0; i < csdev->pdata->nr_outport; i++) {
  810                  struct coresight_device *child_dev;
  811
  812                  child_dev = csdev->pdata->conns[i].child_dev;
  813                  if (child_dev &&
  814                      _coresight_build_path(child_dev, sink, path) == 0) {
  815                          found = true;
  816                          break;
  817                  }
  818          }
  819
  820          if (!found)
  821                  return -ENODEV;
  822
  823  out:
  824          /*
  825           * A path from this element to a sink has been found. The elements
  826           * leading to the sink are already enqueued, all that is left to do
  827           * is tell the PM runtime core we need this element and add a node
  828           * for it.
  829           */
  830          ret = coresight_grab_device(csdev);
  831          if (ret)
  832                  return ret;
  833
  834          node = kzalloc(sizeof(struct coresight_node), GFP_KERNEL);
  835          if (!node)
  836                  return -ENOMEM;
  837
  838          node->csdev = csdev;
  839          list_add(&node->link, path);
  840
  841          return 0;
  842  }
  843

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]



2021-03-01 13:57:58

by Suzuki K Poulose

Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

On 2/26/21 6:34 AM, kernel test robot wrote:
> Hi Suzuki,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [also build test ERROR on next-20210226]
> [cannot apply to kvmarm/next arm64/for-next/core tip/perf/core v5.11]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
> base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6fbd6cf85a3be127454a1ad58525a3adcf8612ab
> config: arm-randconfig-r024-20210225 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install arm cross compiling tool for clang build
> # apt-get install binutils-arm-linux-gnueabi
> # https://github.com/0day-ci/linux/commit/c37564326cdf11e0839eae06c1bfead47d3e5775
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
> git checkout c37564326cdf11e0839eae06c1bfead47d3e5775
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>

Thanks for the report. The following fixup should clear this :


---8>---



diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 8a3a3c199087..85008a65e21f 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -429,6 +429,33 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
csa->write(val, offset, false, true);
}

+#else /* !CONFIG_64BIT */
+
+static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
+ u32 offset)
+{
+ WARN_ON(1);
+ return 0;
+}
+
+static inline u64 csdev_access_read64(struct csdev_access *csa, u32 offset)
+{
+ WARN_ON(1);
+ return 0;
+}
+
+static inline void csdev_access_relaxed_write64(struct csdev_access *csa,
+ u64 val, u32 offset)
+{
+ WARN_ON(1);
+}
+
+static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 offset)
+{
+ WARN_ON(1);
+}
+#endif /* CONFIG_64BIT */
+
static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
{
return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
@@ -440,32 +467,6 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
(csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
}
-#else /* !CONFIG_64BIT */
-
-static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
- u32 offset)
-{
- WARN_ON(1);
- return 0;
-}
-
-static inline u64 csdev_access_read64(struct csdev_access *csa, u32 offset)
-{
- WARN_ON(1);
- return 0;
-}
-
-static inline void csdev_access_relaxed_write64(struct csdev_access *csa,
- u64 val, u32 offset)
-{
- WARN_ON(1);
-}
-
-static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 offset)
-{
- WARN_ON(1);
-}
-#endif /* CONFIG_64BIT */

extern struct coresight_device *
coresight_register(struct coresight_desc *desc);
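
The effect of this fixup can be illustrated outside the kernel. In the minimal sketch below, DEMO_64BIT is a hypothetical stand-in for CONFIG_64BIT and the structure is reduced to the fields the helper inspects; all demo_* names are assumptions made for illustration. The only point is that a helper placed after the #endif is declared in both configurations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical stand-ins for the coresight types. */
enum demo_dev_type { DEMO_TYPE_SINK, DEMO_TYPE_SOURCE };
enum demo_sink_subtype { DEMO_SINK_SYSMEM, DEMO_SINK_PERCPU_SYSMEM };

struct demo_device {
	enum demo_dev_type type;
	enum demo_sink_subtype sink_subtype;
};

#ifdef DEMO_64BIT
/* The 64-bit accessors would live in this branch. */
static inline int demo_has_64bit_access(void) { return 1; }
#else /* !DEMO_64BIT */
/* The 32-bit stubs would live in this branch. */
static inline int demo_has_64bit_access(void) { return 0; }
#endif /* DEMO_64BIT */

/*
 * Defined after the #endif, so it is declared whether or not DEMO_64BIT
 * is set. Placing it inside the #ifdef branch instead would hide it from
 * 32-bit builds and produce an implicit-declaration error at the call
 * site.
 */
static inline bool demo_is_percpu_sink(const struct demo_device *dev)
{
	return dev && dev->type == DEMO_TYPE_SINK &&
	       dev->sink_subtype == DEMO_SINK_PERCPU_SYSMEM;
}
```

Building the sketch with and without -DDEMO_64BIT should succeed either way, which mirrors what the arm32 randconfig build needs.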

2021-03-01 14:14:00

by Suzuki K Poulose

Subject: [PATCH v4.1 17/19] coresight: core: Add support for dedicated percpu sinks

From: Anshuman Khandual <[email protected]>

Add support for dedicated sinks that are bound to individual CPUs (e.g.,
TRBE). To allow quicker access to the sink for a given CPU-bound source,
keep a percpu array of the sink devices. Also, add support for building
a path to the CPU local sink from the ETM.

This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM.
This new sink type is exclusively available and can only work with percpu
source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.

This defines a percpu structure that accommodates a single coresight_device
which can be used to store an initialized instance from a sink driver. As
these sinks are exclusively linked and dependent on corresponding percpu
source devices, they should also be the default sink device during a perf
session.

Outwards device connections are scanned while establishing paths between a
source and a sink device. But such connections are not present for certain
percpu source and sink devices which are exclusively linked and dependent.
Build the path directly and skip connection scanning for such devices.

Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Tested-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
[Moved the set/get percpu sink APIs from TRBE patch to here
Fix build break on arm32]
Signed-off-by: Suzuki K Poulose <[email protected]>
---

Changes since v4:
- Fix build break on arm32 reported by [email protected]
Changes:
- Export methods to set/get percpu sinks for fixing module
build for TRBE
- Addressed coding style comments (Suzuki)
- Check status of _coresight_build_path() (Mathieu)
---
drivers/hwtracing/coresight/coresight-core.c | 29 ++++++++++++++++++--
drivers/hwtracing/coresight/coresight-priv.h | 3 ++
include/linux/coresight.h | 13 +++++++++
3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 0062c8935653..55c645616bf6 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -23,6 +23,7 @@
#include "coresight-priv.h"

static DEFINE_MUTEX(coresight_mutex);
+DEFINE_PER_CPU(struct coresight_device *, csdev_sink);

/**
* struct coresight_node - elements of a path, from source to sink
@@ -70,6 +71,18 @@ void coresight_remove_cti_ops(void)
}
EXPORT_SYMBOL_GPL(coresight_remove_cti_ops);

+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev)
+{
+ per_cpu(csdev_sink, cpu) = csdev;
+}
+EXPORT_SYMBOL_GPL(coresight_set_percpu_sink);
+
+struct coresight_device *coresight_get_percpu_sink(int cpu)
+{
+ return per_cpu(csdev_sink, cpu);
+}
+EXPORT_SYMBOL_GPL(coresight_get_percpu_sink);
+
static int coresight_id_match(struct device *dev, void *data)
{
int trace_id, i_trace_id;
@@ -784,6 +797,14 @@ static int _coresight_build_path(struct coresight_device *csdev,
if (csdev == sink)
goto out;

+ if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
+ sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) {
+ if (_coresight_build_path(sink, sink, path) == 0) {
+ found = true;
+ goto out;
+ }
+ }
+
/* Not a sink - recursively explore each port found on this element */
for (i = 0; i < csdev->pdata->nr_outport; i++) {
struct coresight_device *child_dev;
@@ -999,8 +1020,12 @@ coresight_find_default_sink(struct coresight_device *csdev)
int depth = 0;

/* look for a default sink if we have not found for this device */
- if (!csdev->def_sink)
- csdev->def_sink = coresight_find_sink(csdev, &depth);
+ if (!csdev->def_sink) {
+ if (coresight_is_percpu_source(csdev))
+ csdev->def_sink = per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev));
+ if (!csdev->def_sink)
+ csdev->def_sink = coresight_find_sink(csdev, &depth);
+ }
return csdev->def_sink;
}

diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f5f654ea2994..ff1dd2092ac5 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -232,4 +232,7 @@ coresight_find_csdev_by_fwnode(struct fwnode_handle *r_fwnode);
void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
struct coresight_device *ect_csdev);

+void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev);
+struct coresight_device *coresight_get_percpu_sink(int cpu);
+
#endif
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 976ec2697610..85008a65e21f 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -50,6 +50,7 @@ enum coresight_dev_subtype_sink {
CORESIGHT_DEV_SUBTYPE_SINK_PORT,
CORESIGHT_DEV_SUBTYPE_SINK_BUFFER,
CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM,
+ CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM,
};

enum coresight_dev_subtype_link {
@@ -455,6 +456,18 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
}
#endif /* CONFIG_64BIT */

+static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
+{
+ return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
+ (csdev->subtype.source_subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_PROC);
+}
+
+static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
+{
+ return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
+ (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
+}
+
extern struct coresight_device *
coresight_register(struct coresight_desc *desc);
extern void coresight_unregister(struct coresight_device *csdev);
--
2.24.1
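
The default-sink selection that this patch adds to coresight_find_default_sink() can be modelled in plain C. This is a userspace sketch under stated assumptions: a fixed-size array stands in for the DEFINE_PER_CPU variable, DEMO_NR_CPUS and every demo_* name are hypothetical, every source in the model is treated as a per-CPU source, and the generic sink search is reduced to a stored fallback pointer.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4   /* hypothetical CPU count for the model */

struct demo_sink { const char *name; };

struct demo_source {
	int cpu;                       /* CPU this source is bound to */
	struct demo_sink *def_sink;    /* cached default sink */
	struct demo_sink *fallback;    /* stand-in for coresight_find_sink() */
};

/* Array standing in for the per-CPU csdev_sink variable. */
static struct demo_sink *demo_percpu_sink[DEMO_NR_CPUS];

static void demo_set_percpu_sink(int cpu, struct demo_sink *sink)
{
	demo_percpu_sink[cpu] = sink;
}

/* Prefer the CPU-local sink; otherwise fall back to the generic search. */
static struct demo_sink *demo_find_default_sink(struct demo_source *src)
{
	if (!src->def_sink) {
		src->def_sink = demo_percpu_sink[src->cpu];
		if (!src->def_sink)
			src->def_sink = src->fallback;
	}
	return src->def_sink;
}
```

In this model a source on a CPU with a registered per-CPU sink gets that sink, and a source on any other CPU keeps the result of the fallback search, which matches the two nested checks in the hunk above.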

2021-03-01 18:02:21

by Alexandru Elisei

Subject: Re: [PATCH v4 04/19] kvm: arm64: nvhe: Save the SPE context early

Hello Suzuki,

On 2/25/21 7:35 PM, Suzuki K Poulose wrote:
> The nvhe hyp saves the SPE context, flushing any unwritten

Perhaps that can be reworded to "The nVHE world switch code saves [..]".

Also, according to my understanding of "context", that means saving *all* the
related registers. But KVM saves only *one* register, PMSCR_EL1. I think stating
that KVM disables sampling and drains the buffer would be more accurate.

> data before we switch to the guest. But this operation is
> performed way too late, because :
> - The ownership of the SPE is transferred to EL2. i.e,

I think the Arm ARM defines only the ownership of the SPE *buffer* (buffer owning
regime), not the ownership of SPE as a whole.

> using EL2 translations. (MDCR_EL2_E2PB == 0)

I think "EL2 translations" is a bit too vague, EL2 stage 1 translation would be more
precise, since we're talking only about the nVHE case.

> - The guest Stage1 is loaded.
>
> Thus the flush could use the host EL1 virtual address,
> but use the EL2 translations instead. Fix this by

It's not *could*, it's *will*. The SPE buffer will use the buffer pointer
programmed by the host at EL1, but will attempt to translate it using EL2 stage 1
translation, where it's (probably) not mapped.

> and before we change the ownership to EL2.

Same comment about ownership.

> The restore path is doing the right thing.
>
> Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow")
> Cc: [email protected]
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexandru Elisei <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> New patch.
> ---
> arch/arm64/include/asm/kvm_hyp.h | 5 +++++
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++--
> arch/arm64/kvm/hyp/nvhe/switch.c | 12 +++++++++++-
> 3 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index c0450828378b..385bd7dd3d39 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -83,6 +83,11 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt);
> void __debug_switch_to_guest(struct kvm_vcpu *vcpu);
> void __debug_switch_to_host(struct kvm_vcpu *vcpu);
>
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
> +void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
> +#endif
> +
> void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
> void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> index 91a711aa8382..f401724f12ef 100644
> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> @@ -58,16 +58,24 @@ static void __debug_restore_spe(u64 pmscr_el1)
> write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
> }
>
> -void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
> +void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> /* Disable and flush SPE data generation */
> __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
> +}
> +
> +void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
> +{
> __debug_switch_to_guest_common(vcpu);
> }
>
> -void __debug_switch_to_host(struct kvm_vcpu *vcpu)
> +void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
> +}
> +
> +void __debug_switch_to_host(struct kvm_vcpu *vcpu)
> +{
> __debug_switch_to_host_common(vcpu);
> }
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index f3d0e9eca56c..10eed66136a0 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -192,6 +192,15 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
>
> __sysreg_save_state_nvhe(host_ctxt);
> + /*
> + * For nVHE, we must save and disable any SPE
> + * buffers, as the translation regime is going

I'm not sure what "save and disable any SPE buffers" means. The code definitely
doesn't save anything related to the SPE buffer. Maybe you're trying to say that
it must drain the buffer contents to memory? Also, SPE has only *one* buffer.

> + * to be loaded with that of the guest. And we must
> + * save host context for SPE, before we change the
> + * ownership to EL2 (via MDCR_EL2_E2PB == 0) and before

Same comments about "save host context for SPE" (from what I understand that
"context" means, KVM doesn't do that) and ownership (SPE doesn't have an
ownership, the SPE buffer has an owning translation regime).

> + * we load guest Stage1.
> + */
> + __debug_save_host_buffers_nvhe(vcpu);
>
> __adjust_pc(vcpu);
>
> @@ -234,11 +243,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
> __fpsimd_save_fpexc32(vcpu);
>
> + __debug_switch_to_host(vcpu);
> /*
> * This must come after restoring the host sysregs, since a non-VHE
> * system may enable SPE here and make use of the TTBRs.
> */
> - __debug_switch_to_host(vcpu);
> + __debug_restore_host_buffers_nvhe(vcpu);
>
> if (pmu_switch_needed)
> __pmu_switch_to_host(host_ctxt);

The patch looks correct to me. There are several things that are wrong with the
way KVM drains the SPE buffer in __debug_switch_to_guest():

1. The buffer is drained after the guest's stage 1 is loaded in
__sysreg_restore_state_nvhe() -> __sysreg_restore_el1_state().

2. The buffer is drained after HCR_EL2.VM is set in __activate_traps() ->
___activate_traps(), which means that the buffer would have used the guest's stage
1 + host's stage 2 for address translation if not for 3 below.

3. And finally, the buffer is drained after MDCR_EL2.E2PB is set to 0b00 in
__activate_traps() -> __activate_traps_common() (vcpu->arch.mdcr_el2 is computed
in kvm_arch_vcpu_ioctl_run() -> kvm_arm_setup_debug() before __kvm_vcpu_run()),
which means that the buffer will end up using the EL2 stage 1 translation because
of the ISB after sampling is disabled.

Your fix looks correct to me, we drain the buffer and disable event sampling
before we start restoring any of the state associated with the guest, and we
re-enable profiling after we restore all the host's state relevant for profiling.
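
The ordering constraint described above can be written down as a toy model. The sketch below only checks the relative order of steps in a sequence; the step comments refer to functions named in this thread, but the enum, the demo_* helpers and the simplified three-step sequence are hypothetical, and no hardware state is touched.

```c
#include <stddef.h>

enum demo_step {
	DEMO_SAVE_HOST_BUFFERS,     /* __debug_save_host_buffers_nvhe() */
	DEMO_RESTORE_GUEST_SYSREGS, /* __sysreg_restore_state_nvhe() */
	DEMO_ACTIVATE_TRAPS         /* __activate_traps() */
};

/* Simplified entry sequence after the patch: drain SPE first. */
static const enum demo_step demo_entry_sequence[] = {
	DEMO_SAVE_HOST_BUFFERS,
	DEMO_RESTORE_GUEST_SYSREGS,
	DEMO_ACTIVATE_TRAPS,
};

/* Return the position of a step in a sequence, or -1 if absent. */
static int demo_position(const enum demo_step *seq, size_t n,
			 enum demo_step s)
{
	for (size_t i = 0; i < n; i++)
		if (seq[i] == s)
			return (int)i;
	return -1;
}

/* The drain is early enough only if it precedes both guest-state steps. */
static int demo_drain_is_early(const enum demo_step *seq, size_t n)
{
	int save = demo_position(seq, n, DEMO_SAVE_HOST_BUFFERS);

	return save >= 0 &&
	       save < demo_position(seq, n, DEMO_RESTORE_GUEST_SYSREGS) &&
	       save < demo_position(seq, n, DEMO_ACTIVATE_TRAPS);
}
```

Feeding the model a sequence in which the drain comes last, as in the pre-patch flow, makes demo_drain_is_early() return 0.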

Thanks,

Alex

2021-03-02 21:53:31

by Suzuki K Poulose

Subject: Re: [PATCH v4 04/19] kvm: arm64: nvhe: Save the SPE context early

Hi Alex

On 3/1/21 4:32 PM, Alexandru Elisei wrote:
> Hello Suzuki,
>
> On 2/25/21 7:35 PM, Suzuki K Poulose wrote:
>> The nvhe hyp saves the SPE context, flushing any unwritten
>
> Perhaps that can be reworded to "The nVHE world switch code saves [..]".
>

Sure

> Also, according to my understanding of "context", that means saving *all* the
> related registers. But KVM saves only *one* register, PMSCR_EL1. I think stating
> that KVM disables sampling and drains the buffer would be more accurate.

I understand that the "context" meaning could be interpreted differently. It could
also mean necessary registers. In this case, as such the guest can't access the SPE,
thus saving the "state" of the SPE only involves saving the PMSCR and restoring
the same. But when we get to enabling the Guest access, this would mean
saving the other registers too. But yes, your suggestion is clearer, will use
that.

>
>> data before we switch to the guest. But this operation is
>> performed way too late, because :
>> - The ownership of the SPE is transferred to EL2. i.e,
>
> I think the Arm ARM defines only the ownership of the SPE *buffer* (buffer owning
> regime), not the ownership of SPE as a whole.

True. While it means the buffer ownership, all registers except PMBIDR are
inaccessible to an EL if the buffer is not accessible (i.e., the ownership is
with a higher EL).

>
>> using EL2 translations. (MDCR_EL2_E2PB == 0)
>
> I think "EL2 translations" is a bit too vague, EL2 stage 1 translation would be more
> precise, since we're talking only about the nVHE case.

True.

>
>> - The guest Stage1 is loaded.
>>
>> Thus the flush could use the host EL1 virtual address,
>> but use the EL2 translations instead. Fix this by
>
> It's not *could*, it's *will*. The SPE buffer will use the buffer pointer

Well, if there was nothing to flush, or if the SPE had flushed any data before
we entered EL2, then there would be nothing left to flush.

> programmed by the host at EL1, but will attempt to translate it using EL2 stage 1
> translation, where it's (probably) not mapped.
>
>> and before we change the ownership to EL2.
>
> Same comment about ownership.
>
>> The restore path is doing the right thing.
>>
>> Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow")
>> Cc: [email protected]
>> Cc: Christoffer Dall <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Mark Rutland <[email protected]>
>> Cc: Alexandru Elisei <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> New patch.
>> ---
>> arch/arm64/include/asm/kvm_hyp.h | 5 +++++
>> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++--
>> arch/arm64/kvm/hyp/nvhe/switch.c | 12 +++++++++++-
>> 3 files changed, 26 insertions(+), 3 deletions(-)
>>

>>
>> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
>> index f3d0e9eca56c..10eed66136a0 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
>> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
>> @@ -192,6 +192,15 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>> pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
>>
>> __sysreg_save_state_nvhe(host_ctxt);
>> + /*
>> + * For nVHE, we must save and disable any SPE
>> + * buffers, as the translation regime is going
>
> I'm not sure what "save and disable any SPE buffers" means. The code definitely
> doesn't save anything related to the SPE buffer. Maybe you're trying to say that

Agreed, this could be clearer. "save" refers to the state of the SPE buffer, not the
entire buffer as such. It does save PMSCR_EL1, which controls where profiling
is permitted. In turn, we disable profiling at EL1&0, preventing any further
generation of data written to the buffer.
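
As a rough illustration of "save the control register, then stop data generation", the sketch below models PMSCR_EL1 as a plain variable. It is a userspace simulation under assumptions: demo_pmscr_el1 and both helpers are hypothetical, and the real code also needs barriers and a buffer drain that cannot be reproduced here.

```c
#include <stdint.h>

/* Plain variable standing in for the PMSCR_EL1 system register. */
static uint64_t demo_pmscr_el1;

/* Save the host value and clear the register to stop new samples. */
static void demo_save_spe(uint64_t *saved)
{
	*saved = demo_pmscr_el1;
	demo_pmscr_el1 = 0;
	/* Real hardware needs barriers and a buffer drain at this point. */
}

/* Put the host value back; a saved value of 0 means nothing to restore. */
static void demo_restore_spe(uint64_t saved)
{
	if (saved)
		demo_pmscr_el1 = saved;
}
```

The model keeps only the one register discussed here, which is why "state" in this context reduces to a single 64-bit value.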

> it must drain the buffer contents to memory? Also, SPE has only *one* buffer.
>

The details on what we do exactly are already in the function where
we take the action. So, we don't need to explain those here. The
comment here is there to give a notice on the dependency on other context
operations, in case someone tries to move this code around.

>> + * to be loaded with that of the guest. And we must
>> + * save host context for SPE, before we change the
>> + * ownership to EL2 (via MDCR_EL2_E2PB == 0) and before
>
> Same comments about "save host context for SPE" (from what I understand that
> "context" means, KVM doesn't do that) and ownership (SPE doesn't have an
> ownership, the SPE buffer has an owning translation regime).
>
>> + * we load guest Stage1.
>> + */
>> + __debug_save_host_buffers_nvhe(vcpu);
>>
>> __adjust_pc(vcpu);
>>
>> @@ -234,11 +243,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
>> __fpsimd_save_fpexc32(vcpu);
>>
>> + __debug_switch_to_host(vcpu);
>> /*
>> * This must come after restoring the host sysregs, since a non-VHE
>> * system may enable SPE here and make use of the TTBRs.
>> */
>> - __debug_switch_to_host(vcpu);
>> + __debug_restore_host_buffers_nvhe(vcpu);
>>
>> if (pmu_switch_needed)
>> __pmu_switch_to_host(host_ctxt);
>
> The patch looks correct to me. There are several things that are wrong with the
> way KVM drains the SPE buffer in __debug_switch_to_guest():
>
> 1. The buffer is drained after the guest's stage 1 is loaded in
> __sysreg_restore_state_nvhe() -> __sysreg_restore_el1_state().
>
> 2. The buffer is drained after HCR_EL2.VM is set in __activate_traps() ->
> ___activate_traps(), which means that the buffer would have used the guest's stage
> 1 + host's stage 2 for address translation if not for 3 below.
>
> 3. And finally, the buffer is drained after MDCR_EL2.E2PB is set to 0b00 in
> __activate_traps() -> __activate_traps_common() (vcpu->arch.mdcr_el2 is computed
> in kvm_arch_vcpu_ioctl_run() -> kvm_arm_setup_debug() before __kvm_vcpu_run()),
> which means that the buffer will end up using the EL2 stage 1 translation because
> of the ISB after sampling is disabled.
>

Correct.

> Your fix looks correct to me, we drain the buffer and disable event sampling
> before we start restoring any of the state associated with the guest, and we
> re-enable profiling after we restore all the host's state relevant for profiling.

Thanks for the review.

Cheers
Suzuki

2021-03-02 21:53:32

by Marc Zyngier

Subject: Re: [PATCH v4 04/19] kvm: arm64: nvhe: Save the SPE context early

Hi Suzuki,

On 2021-03-02 10:01, Suzuki K Poulose wrote:
> Hi Alex
>
> On 3/1/21 4:32 PM, Alexandru Elisei wrote:
>> Hello Suzuki,
>>
>> On 2/25/21 7:35 PM, Suzuki K Poulose wrote:
>>> The nvhe hyp saves the SPE context, flushing any unwritten
>>
>> Perhaps that can be reworded to "The nVHE world switch code saves
>> [..]".
>>
>
> Sure
>
>> Also, according to my understanding of "context", that means saving
>> *all* the
>> related registers. But KVM saves only *one* register, PMSCR_EL1. I
>> think stating
>> that KVM disables sampling and drains the buffer would be more
>> accurate.
>
> I understand that the "context" meaning could be interpreted
> differently. It could
> also mean necessary registers. In this case, as such the guest can't
> access the SPE,
> thus saving the "state" of the SPE only involves saving the PMSCR and
> restoring
> the same. But when we get to enabling the Guest access, this would
> mean
> saving the other registers too. But yes, your suggestion is clearer,
> will use
> that.
>
>>
>>> data before we switch to the guest. But this operation is
>>> performed way too late, because :
>>> - The ownership of the SPE is transferred to EL2. i.e,
>>
>> I think the Arm ARM defines only the ownership of the SPE *buffer*
>> (buffer owning
>> regime), not the ownership of SPE as a whole.
>
> True. While it means the buffer ownership, all registers except the
> PMBIDR is
> inaccessible to an EL, if the buffer is not accessible (i.e, the
> ownership is
> with a higher EL).
>
>>
>>> using EL2 translations. (MDCR_EL2_E2PB == 0)
>>
>> I think "EL2 translations" is bit too vague, EL2 stage 1 translation
>> would be more
>> precise, since we're talking only about the nVHE case.
>
> True.
>
>>
>>> - The guest Stage1 is loaded.
>>>
>>> Thus the flush could use the host EL1 virtual address,
>>> but use the EL2 translations instead. Fix this by
>>
>> It's not *could*, it's *will*. The SPE buffer will use the buffer
>> pointer
>
> Well, if there was nothing to flush, or if the SPE had flushed any data
> before
> we entered the EL2, then there wouldn't be anything left with the
> flush.
>
>> programmed by the host at EL1, but will attempt to translate it using
>> EL2 stage 1
>> translation, where it's (probably) not mapped.
>>
>>> and before we change the ownership to EL2.
>>
>> Same comment about ownership.
>>
>>> The restore path is doing the right thing.
>>>
>>> Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore
>>> flow")
>>> Cc: [email protected]
>>> Cc: Christoffer Dall <[email protected]>
>>> Cc: Marc Zyngier <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Mark Rutland <[email protected]>
>>> Cc: Alexandru Elisei <[email protected]>
>>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>> ---
>>> New patch.
>>> ---
>>> arch/arm64/include/asm/kvm_hyp.h | 5 +++++
>>> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++--
>>> arch/arm64/kvm/hyp/nvhe/switch.c | 12 +++++++++++-
>>> 3 files changed, 26 insertions(+), 3 deletions(-)
>>>
>
>>> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> index f3d0e9eca56c..10eed66136a0 100644
>>> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> @@ -192,6 +192,15 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>>> pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
>>> __sysreg_save_state_nvhe(host_ctxt);
>>> + /*
>>> + * For nVHE, we must save and disable any SPE
>>> + * buffers, as the translation regime is going
>>
>> I'm not sure what "save and disable any SPE buffers" means. The code
>> definitely
>> doesn't save anything related to the SPE buffer. Maybe you're trying
>> to say that
>
> Agreed, this could be clearer. "save" implies the state of the SPE
> buffer, not the
> entire buffer as such. It does save the PMSCR_EL1, which controls
> where the profiling
> is permitted. In turn, we disable the profiling at EL1&0, preventing
> the any further
> generation of data written to the buffer.
>
>> it must drain the buffer contents to memory? Also, SPE has only *one*
>> buffer.
>>
>
> The details on what we do exactly are already in the function where
> we take the action. So, we don't need to explain those here. The
> comment here is there to give a notice on the dependency on other
> context
> operations, in case someone tries to move this code around.
>
>>> + * to be loaded with that of the guest. And we must
>>> + * save host context for SPE, before we change the
>>> + * ownership to EL2 (via MDCR_EL2_E2PB == 0) and before
>>
>> Same comments about "save host context for SPE" (from what I
>> understand that
>> "context" means, KVM doesn't do that) and ownership (SPE doesn't have
>> an
>> ownership, the SPE buffer has an owning translation regime).
>>
>>> + * we load guest Stage1.
>>> + */
>>> + __debug_save_host_buffers_nvhe(vcpu);
>>> __adjust_pc(vcpu);
>>> @@ -234,11 +243,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>>> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
>>> __fpsimd_save_fpexc32(vcpu);
>>> + __debug_switch_to_host(vcpu);
>>> /*
>>> * This must come after restoring the host sysregs, since a
>>> non-VHE
>>> * system may enable SPE here and make use of the TTBRs.
>>> */
>>> - __debug_switch_to_host(vcpu);
>>> + __debug_restore_host_buffers_nvhe(vcpu);
>>> if (pmu_switch_needed)
>>> __pmu_switch_to_host(host_ctxt);
>>
>> The patch looks correct to me. There are several things that are wrong
>> with the
>> way KVM drains the SPE buffer in __debug_switch_to_guest():
>>
>> 1. The buffer is drained after the guest's stage 1 is loaded in
>> __sysreg_restore_state_nvhe() -> __sysreg_restore_el1_state().
>>
>> 2. The buffer is drained after HCR_EL2.VM is set in __activate_traps()
>> ->
>> ___activate_traps(), which means that the buffer would have use the
>> guest's stage
>> 1 + host's stage 2 for address translation if not 3 below.
>>
>> 3. And finally, the buffer is drained after MDCR_EL2.E2PB is set to
>> 0b00 in
>> __activate_traps() -> __activate_traps_common() (vcpu->arch.mdcr_el2
>> is computed
>> in kvm_arch_vcpu_ioctl_run() -> kvm_arm_setup_debug() before
>> __kvm_vcpu_run(),
>> which means that the buffer will end up using the EL2 stage 1
>> translation because
>> of the ISB after sampling is disabled.
>>
>
> Correct.
>
>> Your fix looks correct to me, we drain the buffer and disable event
>> sampling
>> before we start restoring any of the state associated with the guest,
>> and we
>> re-enable profiling after we restore all the host's state relevant for
>> profiling.
>
> Thanks for the review.

If you respin this single patch quickly to address the cosmetic issues
pointed out by Alex, I can queue this as part of the first batch of
fixes for 5.12.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2021-03-02 22:57:30

by Alexandru Elisei

[permalink] [raw]
Subject: Re: [PATCH v4 04/19] kvm: arm64: nvhe: Save the SPE context early

Hello Suzuki,

On 3/2/21 10:01 AM, Suzuki K Poulose wrote:
> Hi Alex
>
> On 3/1/21 4:32 PM, Alexandru Elisei wrote:
>> Hello Suzuki,
>>
>> On 2/25/21 7:35 PM, Suzuki K Poulose wrote:
>>> The nvhe hyp saves the SPE context, flushing any unwritten
>>
>> Perhaps that can be reworded to "The nVHE world switch code saves [..]".
>>
>
> Sure
>
>> Also, according to my understanding of "context", that means saving *all* the
>> related registers. But KVM saves only *one* register, PMSCR_EL1. I think stating
>> that KVM disables sampling and drains the buffer would be more accurate.
>
> I understand that the "context" meaning could be interpreted differently. It could
> also mean necessary registers. In this case, as such the guest can't access the
> SPE,
> thus saving the "state" of the SPE only involves saving the PMSCR and restoring
> the same. But when we get to to enabling the Guest access, this would mean
> saving the other registers too. But yes, your suggestion is clearer, will use
> that.
>
>>
>>> data before we switch to the guest. But this operation is
>>> performed way too late, because :
>>>    - The ownership of the SPE is transferred to EL2. i.e,
>>
>> I think the Arm ARM defines only the ownership of the SPE *buffer* (buffer owning
>> regime), not the ownership of SPE as a whole.
>
> True. While it means the buffer ownership, all registers except the PMBIDR is
> inaccessible to an EL, if the buffer is not accessible (i.e, the ownership is
> with a higher EL).
>
>>
>>>      using EL2 translations. (MDCR_EL2_E2PB == 0)
>>
>> I think "EL2 translations" is bit too vague, EL2 stage 1 translation would be more
>> precise, since we're talking only about the nVHE case.
>
> True.
>
>>
>>>    - The guest Stage1 is loaded.
>>>
>>> Thus the flush could use the host EL1 virtual address,
>>> but use the EL2 translations instead. Fix this by
>>
>> It's not *could*, it's *will*. The SPE buffer will use the buffer pointer
>
> Well, if there was nothing to flush, or if the SPE had flushed any data before
> we entered the EL2, then there wouldn't be anything left with the flush.
>
>> programmed by the host at EL1, but will attempt to translate it using EL2 stage 1
>> translation, where it's (probably) not mapped.
>>
>>> and before we change the ownership to EL2.
>>
>> Same comment about ownership.
>>
>>> The restore path is doing the right thing.
>>>
>>> Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow")
>>> Cc: [email protected]
>>> Cc: Christoffer Dall <[email protected]>
>>> Cc: Marc Zyngier <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Mark Rutland <[email protected]>
>>> Cc: Alexandru Elisei <[email protected]>
>>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>> ---
>>> New patch.
>>> ---
>>>   arch/arm64/include/asm/kvm_hyp.h   |  5 +++++
>>>   arch/arm64/kvm/hyp/nvhe/debug-sr.c | 12 ++++++++++--
>>>   arch/arm64/kvm/hyp/nvhe/switch.c   | 12 +++++++++++-
>>>   3 files changed, 26 insertions(+), 3 deletions(-)
>>>
>
>>>   diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> index f3d0e9eca56c..10eed66136a0 100644
>>> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
>>> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
>>> @@ -192,6 +192,15 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>>>       pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
>>>         __sysreg_save_state_nvhe(host_ctxt);
>>> +    /*
>>> +     * For nVHE, we must save and disable any SPE
>>> +     * buffers, as the translation regime is going
>>
>> I'm not sure what "save and disable any SPE buffers" means. The code definitely
>> doesn't save anything related to the SPE buffer. Maybe you're trying to say that
>
> Agreed, this could be clearer. "save" implies the state of the SPE buffer, not the
> entire buffer as such. It does save the PMSCR_EL1, which controls where the
> profiling
> is permitted. In turn, we disable the profiling at EL1&0, preventing the any
> further
> generation of data written to the buffer.
>
>> it must drain the buffer contents to memory? Also, SPE has only *one* buffer.
>>
>
> The details on what we do exactly are already in the function where
> we take the action. So, we don't need to explain those here. The
> comment here is there to give a notice on the dependency on other context
> operations, in case someone tries to move this code around.
>
>>> +     * to be loaded with that of the guest. And we must
>>> +     * save host context for SPE, before we change the
>>> +     * ownership to EL2 (via MDCR_EL2_E2PB == 0)  and before
>>
>> Same comments about "save host context for SPE" (from what I understand that
>> "context" means, KVM doesn't do that) and ownership (SPE doesn't have an
>> ownership, the SPE buffer has an owning translation regime).
>>
>>> +     * we load guest Stage1.
>>> +     */
>>> +    __debug_save_host_buffers_nvhe(vcpu);
>>>         __adjust_pc(vcpu);
>>>   @@ -234,11 +243,12 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>>>       if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED)
>>>           __fpsimd_save_fpexc32(vcpu);
>>>   +    __debug_switch_to_host(vcpu);
>>>       /*
>>>        * This must come after restoring the host sysregs, since a non-VHE
>>>        * system may enable SPE here and make use of the TTBRs.
>>>        */
>>> -    __debug_switch_to_host(vcpu);
>>> +    __debug_restore_host_buffers_nvhe(vcpu);
>>>         if (pmu_switch_needed)
>>>           __pmu_switch_to_host(host_ctxt);
>>
>> The patch looks correct to me. There are several things that are wrong with the
>> way KVM drains the SPE buffer in __debug_switch_to_guest():
>>
>> 1. The buffer is drained after the guest's stage 1 is loaded in
>> __sysreg_restore_state_nvhe() -> __sysreg_restore_el1_state().
>>
>> 2. The buffer is drained after HCR_EL2.VM is set in __activate_traps() ->
>> ___activate_traps(), which means that the buffer would have use the guest's stage
>> 1 + host's stage 2 for address translation if not 3 below.
>>
>> 3. And finally, the buffer is drained after MDCR_EL2.E2PB is set to 0b00 in
>> __activate_traps() -> __activate_traps_common() (vcpu->arch.mdcr_el2 is computed
>> in kvm_arch_vcpu_ioctl_run() -> kvm_arm_setup_debug() before __kvm_vcpu_run(),
>> which means that the buffer will end up using the EL2 stage 1 translation because
>> of the ISB after sampling is disabled.
>>
>
> Correct.
>
>> Your fix looks correct to me, we drain the buffer and disable event sampling
>> before we start restoring any of the state associated with the guest, and we
>> re-enable profiling after we restore all the host's state relevant for profiling.
>
> Thanks for the review.

I agree with your suggestions, when you respin this patch please add my R-b:

Reviewed-by: Alexandru Elisei <[email protected]>

Thanks,

Alex
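
For readers following the ordering argument above, here is a minimal
sketch in plain C of why the drain has to happen before any guest state
is loaded. It is a toy model, not kernel code: the struct, the enum and
every function name are hypothetical, and the three flags simply stand
in for the three steps listed in the review (guest stage 1 loaded,
HCR_EL2.VM set, MDCR_EL2.E2PB cleared).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: which translation regime would a drain use? */
enum regime {
	REGIME_HOST_EL1,	/* host EL1&0 stage 1, what the host programmed for */
	REGIME_GUEST_EL1,	/* guest stage 1 (plus stage 2 once HCR_EL2.VM is set) */
	REGIME_EL2_STAGE1,	/* MDCR_EL2.E2PB == 0b00, buffer owned by EL2 */
};

struct switch_state {
	bool guest_stage1_loaded;	/* step 1 in the review */
	bool stage2_enabled;		/* step 2 in the review */
	bool buffer_owned_by_el2;	/* step 3 in the review */
};

/* EL2 ownership wins; otherwise any guest state redirects the walk. */
enum regime drain_regime(const struct switch_state *s)
{
	assert(s);
	if (s->buffer_owned_by_el2)
		return REGIME_EL2_STAGE1;
	if (s->guest_stage1_loaded || s->stage2_enabled)
		return REGIME_GUEST_EL1;
	return REGIME_HOST_EL1;
}

/* Old order: all guest state is already in place when the drain runs. */
enum regime drain_late(void)
{
	struct switch_state s = { true, true, true };

	return drain_regime(&s);
}

/* Fixed order: the drain runs while the host state is still intact. */
enum regime drain_early(void)
{
	struct switch_state s = { false, false, false };

	return drain_regime(&s);
}
```

In this model the late drain resolves the host's EL1 buffer pointer
through EL2 stage 1, while the early drain uses the host regime the
pointer was programmed for.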

2021-03-04 06:10:08

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks



On 3/1/21 7:24 PM, Suzuki K Poulose wrote:
> On 2/26/21 6:34 AM, kernel test robot wrote:
>> Hi Suzuki,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on linus/master]
>> [also build test ERROR on next-20210226]
>> [cannot apply to kvmarm/next arm64/for-next/core tip/perf/core v5.11]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 6fbd6cf85a3be127454a1ad58525a3adcf8612ab
>> config: arm-randconfig-r024-20210225 (attached as .config)
>> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a921aaf789912d981cbb2036bdc91ad7289e1523)
>> reproduce (this is a W=1 build):
>>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>         chmod +x ~/bin/make.cross
>>         # install arm cross compiling tool for clang build
>>         # apt-get install binutils-arm-linux-gnueabi
>>         # https://github.com/0day-ci/linux/commit/c37564326cdf11e0839eae06c1bfead47d3e5775
>>         git remote add linux-review https://github.com/0day-ci/linux
>>         git fetch --no-tags linux-review Suzuki-K-Poulose/arm64-coresight-Add-support-for-ETE-and-TRBE/20210226-035447
>>         git checkout c37564326cdf11e0839eae06c1bfead47d3e5775
>>         # save the attached .config to linux build tree
>>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm
>>
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <[email protected]>
>
> Thanks for the report. The following fixup should clear this :
>
>
> ---8>---
>
>
>
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 8a3a3c199087..85008a65e21f 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -429,6 +429,33 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
>          csa->write(val, offset, false, true);
>  }
>
> +#else    /* !CONFIG_64BIT */
> +
> +static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
> +                          u32 offset)
> +{
> +    WARN_ON(1);
> +    return 0;
> +}
> +
> +static inline u64 csdev_access_read64(struct csdev_access *csa, u32 offset)
> +{
> +    WARN_ON(1);
> +    return 0;
> +}
> +
> +static inline void csdev_access_relaxed_write64(struct csdev_access *csa,
> +                        u64 val, u32 offset)
> +{
> +    WARN_ON(1);
> +}
> +
> +static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 offset)
> +{
> +    WARN_ON(1);
> +}
> +#endif    /* CONFIG_64BIT */
> +
>  static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
>  {
>      return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
> @@ -440,32 +467,6 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
>      return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
>             (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
>  }
> -#else    /* !CONFIG_64BIT */
> -
> -static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
> -                          u32 offset)
> -{
> -    WARN_ON(1);
> -    return 0;
> -}
> -
> -static inline u64 csdev_access_read64(struct csdev_access *csa, u32 offset)
> -{
> -    WARN_ON(1);
> -    return 0;
> -}
> -
> -static inline void csdev_access_relaxed_write64(struct csdev_access *csa,
> -                        u64 val, u32 offset)
> -{
> -    WARN_ON(1);
> -}
> -
> -static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 offset)
> -{
> -    WARN_ON(1);
> -}
> -#endif    /* CONFIG_64BIT */
>
>  extern struct coresight_device *
>  coresight_register(struct coresight_desc *desc);

Agreed, these new helpers should be available in general and not restricted to 64BIT.
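
As a rough illustration of the layout the fixup ends up with, the sketch
below keeps only the 64-bit accessors inside the conditional and defines
the type-check helper after the #endif, so every build sees it. All
names here are simplified stand-ins (SKETCH_64BIT plays the role of
CONFIG_64BIT, and the structs are not the real coresight types); the
kernel stubs also call WARN_ON(1), which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the definitions in include/linux/coresight.h */
enum dev_type { DEV_TYPE_SINK, DEV_TYPE_SOURCE };
enum sink_subtype { SINK_SUBTYPE_BUFFER, SINK_SUBTYPE_PERCPU_SYSMEM };

struct dev {
	enum dev_type type;
	enum sink_subtype sink_subtype;
};

#ifdef SKETCH_64BIT	/* hypothetical stand-in for CONFIG_64BIT */
/* Real 64-bit accessor, only meaningful on 64-bit builds */
uint64_t access_read64(const volatile uint64_t *reg)
{
	assert(reg);
	return *reg;
}
#else	/* !SKETCH_64BIT */
/* Stub for 32-bit builds: there is no 64-bit access, so return 0 */
uint64_t access_read64(const volatile uint64_t *reg)
{
	(void)reg;
	return 0;
}
#endif	/* SKETCH_64BIT */

/* Defined after the #endif, so it exists regardless of the word size */
bool is_percpu_sink(const struct dev *d)
{
	return d && d->type == DEV_TYPE_SINK &&
	       d->sink_subtype == SINK_SUBTYPE_PERCPU_SYSMEM;
}
```

With the helper inside the #else branch, as in the original patch, a
64-bit build would not have seen it at all; after the move only the
accessor bodies depend on the word size.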

2021-03-06 21:08:30

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

On Thu, Feb 25, 2021 at 07:35:39PM +0000, Suzuki K Poulose wrote:
> Document the device tree bindings for Embedded Trace Extensions.
> ETE can be connected to legacy coresight components and thus
> could optionally contain a connection graph as described by
> the CoreSight bindings.
>
> Cc: [email protected]
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Rob Herring <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Fix out-ports defintion
> ---
> .../devicetree/bindings/arm/ete.yaml | 71 +++++++++++++++++++
> 1 file changed, 71 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
>
> diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml
> new file mode 100644
> index 000000000000..35a42d92bf97
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/ete.yaml
> @@ -0,0 +1,71 @@
> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> +# Copyright 2021, Arm Ltd
> +%YAML 1.2
> +---
> +$id: "http://devicetree.org/schemas/arm/ete.yaml#"
> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> +
> +title: ARM Embedded Trace Extensions
> +
> +maintainers:
> + - Suzuki K Poulose <[email protected]>
> + - Mathieu Poirier <[email protected]>
> +
> +description: |
> + Arm Embedded Trace Extension(ETE) is a per CPU trace component that
> + allows tracing the CPU execution. It overlaps with the CoreSight ETMv4
> + architecture and has extended support for future architecture changes.
> + The trace generated by the ETE could be stored via legacy CoreSight
> + components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer
> + Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to
> + legacy CoreSight components, a node must be listed per instance, along
> + with any optional connection graph as per the coresight bindings.
> + See bindings/arm/coresight.txt.
> +
> +properties:
> + $nodename:
> + pattern: "^ete([0-9a-f]+)$"
> + compatible:
> + items:
> + - const: arm,embedded-trace-extension
> +
> + cpu:
> + description: |
> + Handle to the cpu this ETE is bound to.
> + $ref: /schemas/types.yaml#/definitions/phandle
> +
> + out-ports:
> + description: |
> + Output connections from the ETE to legacy CoreSight trace bus.
> + $ref: /schemas/graph.yaml#/properties/port

s/port/ports/

And then you need:

    properties:
      port:
        description: what this port is
        $ref: /schemas/graph.yaml#/properties/port

> +
> +required:
> + - compatible
> + - cpu
> +
> +additionalProperties: false
> +
> +examples:
> +
> +# An ETE node without legacy CoreSight connections
> + - |
> + ete0 {
> + compatible = "arm,embedded-trace-extension";
> + cpu = <&cpu_0>;
> + };
> +# An ETE node with legacy CoreSight connections
> + - |
> + ete1 {
> + compatible = "arm,embedded-trace-extension";
> + cpu = <&cpu_1>;
> +
> + out-ports { /* legacy coresight connection */
> + port {
> + ete1_out_port: endpoint {
> + remote-endpoint = <&funnel_in_port0>;
> + };
> + };
> + };
> + };
> +
> +...
> --
> 2.24.1
>

2021-03-08 17:28:00

by Mike Leach

[permalink] [raw]
Subject: Re: [PATCH v4 10/19] coresight: etm-perf: Allow an event to use different sinks

On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>
> When a sink is not specified by the user, the etm perf driver
> finds a suitable sink automatically, based on the first ETM
> where this event could be scheduled. Then we allocate the
> sink buffer based on the selected sink. This is fine for a
> CPU bound event as the "sink" is always guaranteed to be
> reachable from the ETM (as this is the only ETM where the
> event is going to be scheduled). However, if we have a thread
> bound event, the event could be scheduled on any of the ETMs
> on the system. In this case, currently we automatically select
> a sink and exclude any ETMs that cannot reach the selected
> sink. This is problematic especially for 1x1 configurations.
> We end up in tracing the event only on the "first" ETM,
> as the default sink is local to the first ETM and unreachable
> from the rest. However, we could allow the other ETMs to
> trace if they all have a sink that is compatible with the
> "selected" sink and can use the sink buffer. This can be
> easily done by verifying that they are all driven by the
> same driver and matches the same subtype. Please note
> that at anytime there can be only one ETM tracing the event.
>
> Adding support for different types of sinks for a single
> event is complex and is not something that we expect
> on a sane configuration.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Tested-by: Linu Cherian <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Rename sinks_match => sinks_compatible
> - Tighten the check by matching the sink subtype
> - Use user_sink instead of "sink_forced" and clean up the code (Mathieu)
> - More comments, better commit description
> ---
> .../hwtracing/coresight/coresight-etm-perf.c | 60 ++++++++++++++++---
> 1 file changed, 52 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 0f603b4094f2..aa0974bd265b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -232,6 +232,25 @@ static void etm_free_aux(void *data)
> schedule_work(&event_data->work);
> }
>
> +/*
> + * Check if two given sinks are compatible with each other,
> + * so that they can use the same sink buffers, when an event
> + * moves around.
> + */
> +static bool sinks_compatible(struct coresight_device *a,
> + struct coresight_device *b)
> +{
> + if (!a || !b)
> + return false;
> + /*
> + * If the sinks are of the same subtype and driven
> + * by the same driver, we can use the same buffer
> + * on these sinks.
> + */
> + return (a->subtype.sink_subtype == b->subtype.sink_subtype) &&
> + (sink_ops(a) == sink_ops(b));
> +}
> +
> static void *etm_setup_aux(struct perf_event *event, void **pages,
> int nr_pages, bool overwrite)
> {
> @@ -239,6 +258,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> int cpu = event->cpu;
> cpumask_t *mask;
> struct coresight_device *sink = NULL;
> + struct coresight_device *user_sink = NULL, *last_sink = NULL;
> struct etm_event_data *event_data = NULL;
>
> event_data = alloc_event_data(cpu);
> @@ -249,7 +269,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> /* First get the selected sink from user space. */
> if (event->attr.config2) {
> id = (u32)event->attr.config2;
> - sink = coresight_get_sink_by_id(id);
> + sink = user_sink = coresight_get_sink_by_id(id);
> }
>
> mask = &event_data->mask;
> @@ -277,14 +297,33 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /*
> - * No sink provided - look for a default sink for one of the
> - * devices. At present we only support topology where all CPUs
> - * use the same sink [N:1], so only need to find one sink. The
> - * coresight_build_path later will remove any CPU that does not
> - * attach to the sink, or if we have not found a sink.
> + * No sink provided - look for a default sink for all the ETMs,
> + * where this event can be scheduled.
> + * We allocate the sink specific buffers only once for this
> + * event. If the ETMs have different default sink devices, we
> + * can only use a single "type" of sink as the event can carry
> + * only one sink specific buffer. Thus we have to make sure
> + * that the sinks are of the same type and driven by the same
> + * driver, as the one we allocate the buffer for. As such
> + * we choose the first sink and check if the remaining ETMs
> + * have a compatible default sink. We don't trace on a CPU
> + * if the sink is not compatible.
> */
> - if (!sink)
> + if (!user_sink) {
> + /* Find the default sink for this ETM */
> sink = coresight_find_default_sink(csdev);
> + if (!sink) {
> + cpumask_clear_cpu(cpu, mask);
> + continue;
> + }
> +
> + /* Check if this sink compatible with the last sink */
> + if (last_sink && !sinks_compatible(last_sink, sink)) {
> + cpumask_clear_cpu(cpu, mask);
> + continue;
> + }
> + last_sink = sink;
> + }
>
> /*
> * Building a path doesn't enable it, it simply builds a
> @@ -312,7 +351,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
> goto err;
>
> - /* Allocate the sink buffer for this session */
> + /*
> + * Allocate the sink buffer for this session. All the sinks
> + * where this event can be scheduled are ensured to be of the
> + * same type. Thus the same sink configuration is used by the
> + * sinks.
> + */
> event_data->snk_config =
> sink_ops(sink)->alloc_buffer(sink, event, pages,
> nr_pages, overwrite);
> --
> 2.24.1
>

Reviewed-by: Mike Leach <[email protected]>


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
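
The default-sink selection described in the commit message can be
sketched outside the kernel as below. This is a simplified model under
stated assumptions: struct sink, select_cpus() and NR_CPUS are
hypothetical, the "ops" pointer stands in for sink_ops(), and a bitmask
replaces the cpumask. Only the compatibility rule (same subtype, same
driver) and the "compare against the last accepted sink" loop follow the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* illustrative CPU count */

struct sink {
	int subtype;		/* stands in for subtype.sink_subtype */
	const void *ops;	/* stands in for sink_ops(): identifies the driver */
};

/* Two sinks can share a buffer if subtype and driver both match. */
bool sinks_compatible(const struct sink *a, const struct sink *b)
{
	if (!a || !b)
		return false;
	return a->subtype == b->subtype && a->ops == b->ops;
}

/*
 * default_sink[cpu] is the default sink of that CPU's ETM, or NULL.
 * Returns a bitmask of the CPUs the event is allowed to trace on.
 */
unsigned int select_cpus(const struct sink *const *default_sink)
{
	const struct sink *last_sink = NULL;
	unsigned int mask = 0;
	int cpu;

	assert(default_sink);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		const struct sink *sink = default_sink[cpu];

		if (!sink)
			continue;	/* no default sink: drop this CPU */
		if (last_sink && !sinks_compatible(last_sink, sink))
			continue;	/* cannot share the buffer: drop this CPU */
		last_sink = sink;
		mask |= 1u << cpu;
	}
	return mask;
}
```

In a 1x1 configuration each CPU has its own sink instance, but all of
them share a subtype and a driver, so every CPU stays in the mask rather
than only the first one.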

2021-03-08 17:28:14

by Mike Leach

[permalink] [raw]
Subject: Re: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>
> If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
> move the ETM to trace prohibited region using TRFCR, while disabling.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> New patch
> ---
> .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
> 2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 15016f757828..00297906669c 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -31,6 +31,7 @@
> #include <linux/pm_runtime.h>
> #include <linux/property.h>
>
> +#include <asm/barrier.h>
> #include <asm/sections.h>
> #include <asm/sysreg.h>
> #include <asm/local.h>
> @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
> static void etm4_disable_hw(void *info)
> {
> u32 control;
> + u64 trfcr;
> struct etmv4_drvdata *drvdata = info;
> struct etmv4_config *config = &drvdata->config;
> struct coresight_device *csdev = drvdata->csdev;
> @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
> /* EN, bit[0] Trace unit enable bit */
> control &= ~0x1;
>
> + /*
> + * If the CPU supports v8.4 Trace filter Control,
> + * set the ETM to trace prohibited region.
> + */
> + if (drvdata->trfc) {
> + trfcr = read_sysreg_s(SYS_TRFCR_EL1);
> + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
> + SYS_TRFCR_EL1);
> + isb();
> + }
> /*
> * Make sure everything completes before disabling, as recommended
> * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
> @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
> */
> dsb(sy);
> isb();
> + /* Trace synchronization barrier, is a nop if not supported */
> + tsb_csync();
> etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
>
> /* wait for TRCSTATR.PMSTABLE to go to '1' */
> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
> dev_err(etm_dev,
> "timeout while waiting for PM stable Trace Status\n");
> + if (drvdata->trfc)
> + write_sysreg_s(trfcr, SYS_TRFCR_EL1);
>
> /* read the status of the single shot comparators */
> for (i = 0; i < drvdata->nr_ss_cmp; i++) {
> @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
> return false;
> }
>
> -static void cpu_enable_tracing(void)
> +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
> {
> u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
> u64 trfcr;
> @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
> if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
> return;
>
> + drvdata->trfc = true;
> /*
> * If the CPU supports v8.4 SelfHosted Tracing, enable
> * tracing at the kernel EL and EL0, forcing to use the
> @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
> /* NUMCNTR, bits[30:28] number of counters available for tracing */
> drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
> etm4_cs_lock(drvdata, csa);
> - cpu_enable_tracing();
> + cpu_enable_tracing(drvdata);
> }
>
> static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index 0af60571aa23..f6478ef642bf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -862,6 +862,7 @@ struct etmv4_save_state {
> * @nooverflow: Indicate if overflow prevention is supported.
> * @atbtrig: If the implementation can support ATB triggers
> * @lpoverride: If the implementation can support low-power state over.
> + * @trfc: If the implementation supports Arm v8.4 trace filter controls.
> * @config: structure holding configuration parameters.
> * @save_state: State to be preserved across power loss
> * @state_needs_restore: True when there is context to restore after PM exit
> @@ -912,6 +913,7 @@ struct etmv4_drvdata {
> bool nooverflow;
> bool atbtrig;
> bool lpoverride;
> + bool trfc;
> struct etmv4_config config;
> struct etmv4_save_state *save_state;
> bool state_needs_restore;
> --
> 2.24.1
>

Reviewed-by: Mike Leach <[email protected]>


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
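
The save/clear/restore pattern around the disable can be modelled in
user space as follows. This is only a sketch: the register is a plain
struct field, the barriers are reduced to comments, and struct trace_cpu
and disable_trace_unit() are hypothetical names. The bit positions
(E0TRE at bit 0, the kernel-EL enable at bit 1) reflect my reading of
the architecture and should be checked against the Arm ARM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRFCR_E0TRE	(UINT64_C(1) << 0)	/* trace allowed at EL0 (assumed bit) */
#define TRFCR_ExTRE	(UINT64_C(1) << 1)	/* trace allowed at the kernel EL (assumed bit) */

/* Simulated per-CPU trace state; not the driver's data structures */
struct trace_cpu {
	uint64_t trfcr;			/* stands in for TRFCR_EL1 */
	bool has_trfc;			/* mirrors drvdata->trfc */
	bool unit_enabled;
	bool permitted_at_disable;	/* was trace still permitted when disabled? */
};

void disable_trace_unit(struct trace_cpu *cpu)
{
	uint64_t saved = 0;

	assert(cpu);
	if (cpu->has_trfc) {
		/* Move to the trace prohibited region before disabling */
		saved = cpu->trfcr;
		cpu->trfcr = saved & ~(TRFCR_ExTRE | TRFCR_E0TRE);
		/* the driver issues an isb() here */
	}

	/* dsb(sy); isb(); tsb_csync(); then the TRCPRGCTLR write */
	cpu->permitted_at_disable =
		(cpu->trfcr & (TRFCR_ExTRE | TRFCR_E0TRE)) != 0;
	cpu->unit_enabled = false;

	/* Restore the original filter settings once the unit is stable */
	if (cpu->has_trfc)
		cpu->trfcr = saved;
}
```

The point of the model is that the filter bits are only cleared for the
duration of the disable, so the value seen afterwards is unchanged.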

2021-03-08 17:28:27

by Mike Leach

[permalink] [raw]
Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

Hi,

On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>
> From: Anshuman Khandual <[email protected]>
>
> Add support for dedicated sinks that are bound to individual CPUs. (e.g,
> TRBE). To allow quicker access to the sink for a given CPU bound source,
> keep a percpu array of the sink devices. Also, add support for building
> a path to the CPU local sink from the ETM.
>
> This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM.
> This new sink type is exclusively available and can only work with percpu
> source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.
>

Minor nit: FEAT_TRBE architecturally guarantees a compatible FEAT_ETE
source. However, _all_ CPU sources have
CORESIGHT_DEV_SUBTYPE_SOURCE_PROC set: ETMv3.x, PTM, ETMv4.x and ETE
alike.
In the code that follows, coresight_is_percpu_source() checks for any
type of CPU source, not specifically the FEAT_ETE type, which is fine
as we then check the CPU and whether it has a TRBE.
So the simplifications made to the code since the first couple of patch
sets make this explanation slightly misleading. It could do with
adjusting if the set is re-spun.
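
To illustrate the point, here is a minimal user-space sketch of the
check. It is not the kernel code: the types, the NR_CPUS value and the
function names are hypothetical stand-ins, and the percpu variable is
modelled as a plain array. The source test accepts any CPU-bound source;
the pairing is decided by the percpu sink lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

enum dev_type { DEV_TYPE_SINK, DEV_TYPE_SOURCE };
enum dev_subtype {
	SUBTYPE_SOURCE_PROC,		/* any CPU-bound source */
	SUBTYPE_SINK_SYSMEM,		/* e.g. a shared ETR */
	SUBTYPE_SINK_PERCPU_SYSMEM,	/* e.g. a TRBE */
};

struct cs_dev {
	enum dev_type type;
	enum dev_subtype subtype;
	int cpu;			/* CPU the device is bound to */
};

/* Stand-in for the percpu csdev_sink variable in the patch. */
static struct cs_dev *percpu_sink[NR_CPUS];

static void set_percpu_sink(int cpu, struct cs_dev *sink)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	percpu_sink[cpu] = sink;
}

/* True for ETMv3.x, PTM, ETMv4.x and ETE alike: only the subtype is checked. */
static bool is_percpu_source(const struct cs_dev *d)
{
	return d && d->type == DEV_TYPE_SOURCE &&
	       d->subtype == SUBTYPE_SOURCE_PROC;
}

static bool is_percpu_sink(const struct cs_dev *d)
{
	return d && d->type == DEV_TYPE_SINK &&
	       d->subtype == SUBTYPE_SINK_PERCPU_SYSMEM;
}

/* A direct path is built only to the sink registered for the source's CPU. */
static bool can_build_direct_path(const struct cs_dev *src,
				  const struct cs_dev *sink)
{
	return is_percpu_source(src) && is_percpu_sink(sink) &&
	       sink == percpu_sink[src->cpu];
}
```

With a percpu sink registered for CPU0 only, a source on CPU0 gets the
direct path, while a source on CPU1, or a non-percpu sink, falls back to
the usual connection scan.
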

Reviewed-by: Mike Leach <[email protected]>



> This defines a percpu structure that accommodates a single coresight_device
> which can be used to store an initialized instance from a sink driver. As
> these sinks are exclusively linked and dependent on corresponding percpu
> source devices, they should also be the default sink device during a perf
> session.
>
> Outwards device connections are scanned while establishing paths between a
> source and a sink device. But such connections are not present for certain
> percpu source and sink devices which are exclusively linked and dependent.
> Build the path directly and skip connection scanning for such devices.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Tested-by: Suzuki K Poulose <[email protected]>
> Reviewed-by: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> [Moved the set/get percpu sink APIs from TRBE patch to here]
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Export methods to set/get percpu sinks for fixing module
> build for TRBE
> - Addressed coding style comments (Suzuki)
> - Check status of _coresight_build_path() (Mathieu)
> ---
> drivers/hwtracing/coresight/coresight-core.c | 29 ++++++++++++++++++--
> drivers/hwtracing/coresight/coresight-priv.h | 3 ++
> include/linux/coresight.h | 12 ++++++++
> 3 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 0062c8935653..55c645616bf6 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -23,6 +23,7 @@
> #include "coresight-priv.h"
>
> static DEFINE_MUTEX(coresight_mutex);
> +DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
>
> /**
> * struct coresight_node - elements of a path, from source to sink
> @@ -70,6 +71,18 @@ void coresight_remove_cti_ops(void)
> }
> EXPORT_SYMBOL_GPL(coresight_remove_cti_ops);
>
> +void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev)
> +{
> + per_cpu(csdev_sink, cpu) = csdev;
> +}
> +EXPORT_SYMBOL_GPL(coresight_set_percpu_sink);
> +
> +struct coresight_device *coresight_get_percpu_sink(int cpu)
> +{
> + return per_cpu(csdev_sink, cpu);
> +}
> +EXPORT_SYMBOL_GPL(coresight_get_percpu_sink);
> +
> static int coresight_id_match(struct device *dev, void *data)
> {
> int trace_id, i_trace_id;
> @@ -784,6 +797,14 @@ static int _coresight_build_path(struct coresight_device *csdev,
> if (csdev == sink)
> goto out;
>
> + if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
> + sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) {
> + if (_coresight_build_path(sink, sink, path) == 0) {
> + found = true;
> + goto out;
> + }
> + }
> +
> /* Not a sink - recursively explore each port found on this element */
> for (i = 0; i < csdev->pdata->nr_outport; i++) {
> struct coresight_device *child_dev;
> @@ -999,8 +1020,12 @@ coresight_find_default_sink(struct coresight_device *csdev)
> int depth = 0;
>
> /* look for a default sink if we have not found for this device */
> - if (!csdev->def_sink)
> - csdev->def_sink = coresight_find_sink(csdev, &depth);
> + if (!csdev->def_sink) {
> + if (coresight_is_percpu_source(csdev))
> + csdev->def_sink = per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev));
> + if (!csdev->def_sink)
> + csdev->def_sink = coresight_find_sink(csdev, &depth);
> + }
> return csdev->def_sink;
> }
>
> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> index f5f654ea2994..ff1dd2092ac5 100644
> --- a/drivers/hwtracing/coresight/coresight-priv.h
> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> @@ -232,4 +232,7 @@ coresight_find_csdev_by_fwnode(struct fwnode_handle *r_fwnode);
> void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
> struct coresight_device *ect_csdev);
>
> +void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev);
> +struct coresight_device *coresight_get_percpu_sink(int cpu);
> +
> #endif
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 976ec2697610..8a3a3c199087 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -50,6 +50,7 @@ enum coresight_dev_subtype_sink {
> CORESIGHT_DEV_SUBTYPE_SINK_PORT,
> CORESIGHT_DEV_SUBTYPE_SINK_BUFFER,
> CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM,
> + CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM,
> };
>
> enum coresight_dev_subtype_link {
> @@ -428,6 +429,17 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
> csa->write(val, offset, false, true);
> }
>
> +static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
> +{
> + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
> + (csdev->subtype.source_subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_PROC);
> +}
> +
> +static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
> +{
> + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
> + (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
> +}
> #else /* !CONFIG_64BIT */
>
> static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
> --
> 2.24.1
>


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

2021-03-08 17:30:09

by Mike Leach

Subject: Re: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

Hi Suzuki

Need to add this file and the TRBE bindings file to the ARM/CORESIGHT
section of the MAINTAINERS file.

Regards

Mike



On Sat, 6 Mar 2021 at 21:06, Rob Herring <[email protected]> wrote:
>
> On Thu, Feb 25, 2021 at 07:35:39PM +0000, Suzuki K Poulose wrote:
> > Document the device tree bindings for Embedded Trace Extensions.
> > ETE can be connected to legacy coresight components and thus
> > could optionally contain a connection graph as described by
> > the CoreSight bindings.
> >
> > Cc: [email protected]
> > Cc: Mathieu Poirier <[email protected]>
> > Cc: Mike Leach <[email protected]>
> > Cc: Rob Herring <[email protected]>
> > Signed-off-by: Suzuki K Poulose <[email protected]>
> > ---
> > Changes:
> > - Fix out-ports definition
> > ---
> > .../devicetree/bindings/arm/ete.yaml | 71 +++++++++++++++++++
> > 1 file changed, 71 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml
> > new file mode 100644
> > index 000000000000..35a42d92bf97
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/arm/ete.yaml
> > @@ -0,0 +1,71 @@
> > +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> > +# Copyright 2021, Arm Ltd
> > +%YAML 1.2
> > +---
> > +$id: "http://devicetree.org/schemas/arm/ete.yaml#"
> > +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> > +
> > +title: ARM Embedded Trace Extensions
> > +
> > +maintainers:
> > + - Suzuki K Poulose <[email protected]>
> > + - Mathieu Poirier <[email protected]>
> > +
> > +description: |
> > + Arm Embedded Trace Extension (ETE) is a per CPU trace component that
> > + allows tracing the CPU execution. It overlaps with the CoreSight ETMv4
> > + architecture and has extended support for future architecture changes.
> > + The trace generated by the ETE could be stored via legacy CoreSight
> > + components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer
> > + Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to
> > + legacy CoreSight components, a node must be listed per instance, along
> > + with any optional connection graph as per the coresight bindings.
> > + See bindings/arm/coresight.txt.
> > +
> > +properties:
> > + $nodename:
> > + pattern: "^ete([0-9a-f]+)$"
> > + compatible:
> > + items:
> > + - const: arm,embedded-trace-extension
> > +
> > + cpu:
> > + description: |
> > + Handle to the cpu this ETE is bound to.
> > + $ref: /schemas/types.yaml#/definitions/phandle
> > +
> > + out-ports:
> > + description: |
> > + Output connections from the ETE to legacy CoreSight trace bus.
> > + $ref: /schemas/graph.yaml#/properties/port
>
> s/port/ports/
>
> And then you need:
>
> properties:
> port:
> description: what this port is
> $ref: /schemas/graph.yaml#/properties/port
>
> > +
> > +required:
> > + - compatible
> > + - cpu
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +
> > +# An ETE node without legacy CoreSight connections
> > + - |
> > + ete0 {
> > + compatible = "arm,embedded-trace-extension";
> > + cpu = <&cpu_0>;
> > + };
> > +# An ETE node with legacy CoreSight connections
> > + - |
> > + ete1 {
> > + compatible = "arm,embedded-trace-extension";
> > + cpu = <&cpu_1>;
> > +
> > + out-ports { /* legacy coresight connection */
> > + port {
> > + ete1_out_port: endpoint {
> > + remote-endpoint = <&funnel_in_port0>;
> > + };
> > + };
> > + };
> > + };
> > +
> > +...
> > --
> > 2.24.1
> >



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

2021-03-08 17:30:57

by Mike Leach

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

Hi Suzuki,

On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>
> From: Anshuman Khandual <[email protected]>
>
> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> accessible via the system registers. The TRBE supports different addressing
> modes including CPU virtual address and buffer modes including the circular
> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> access to the trace buffer could be prohibited by a higher exception level
> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> private interrupt (PPI) on address translation errors and when the buffer
> is full. Overall implementation here is inspired by the Arm SPE driver.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
> - Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
> - Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
> - Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
> - Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
> - Dropped the space after type casting before vmap()
> - Return 0 instead of EINVAL in arm_trbe_update_buffer()
> - Add a comment in trbe_handle_overflow()
> - Add a comment in arm_trbe_cpu_startup()
> - Unregister coresight TRBE device when not supported
> - Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
> - Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
> - Replaced and modified trbe_drain_and_disable_local() in IRQ handler
> - Updated arm_trbe_update_buffer() for handling a missing interrupt
> - Dropped kfree() for all devm_xxx() allocated buffer
> - Dropped additional blank line in documentation coresight/coresight-trbe.rst
> - Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> - Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
> - Added comment for irq_work_run()
> - Updated comment for minimum buffer length in arm_trbe_alloc_buffer()
> - Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
> - Fixed indentation in arm_trbe_probe_cpuhp()
> - Added static for arm_trbe_free_buffer()
> - Added comment for trbe_base element in trbe_buf structure
> - Dropped IS_ERR() check from vmap() returned pointer
> - Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
> - Changed TRBE device names from arm_trbeX to just trbeX
> - Dropped unused argument perf_output_handle from trbe_get_fault_act()
> - Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
> - Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
> - Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
> - Changed error return codes from arm_trbe_probe_irq()
> - Changed error return codes from arm_trbe_device_probe()
> - Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
> - Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
> - Changed coresight_register() in arm_trbe_probe_coresight_cpu()
> - Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
> - Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
> - Changed trbe_dbm as trbe_flag including its sysfs interface
> - Handle race between update_buffer & IRQ handler
> - Rework and split the TRBE probe to avoid lockdep due to memory allocation
> from IPI calls (via coresight_register())
> - Fix handle->head update for snapshot mode.
> ---
> .../testing/sysfs-bus-coresight-devices-trbe | 14 +
> .../trace/coresight/coresight-trbe.rst | 38 +
> drivers/hwtracing/coresight/Kconfig | 14 +
> drivers/hwtracing/coresight/Makefile | 1 +
> drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
> drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
> 6 files changed, 1369 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> new file mode 100644
> index 000000000000..ad3bbc6fa751
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> @@ -0,0 +1,14 @@
> +What: /sys/bus/coresight/devices/trbe<cpu>/align
> +Date: March 2021
> +KernelVersion: 5.13
> +Contact: Anshuman Khandual <[email protected]>
> +Description: (Read) Shows the TRBE write pointer alignment. This value
> + is fetched from the TRBIDR register.
> +
> +What: /sys/bus/coresight/devices/trbe<cpu>/flag
> +Date: March 2021
> +KernelVersion: 5.13
> +Contact: Anshuman Khandual <[email protected]>
> +Description: (Read) Shows whether TRBE updates to memory also update
> + the access and dirty flags. This value is fetched from
> + the TRBIDR register.
> diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst
> new file mode 100644
> index 000000000000..b9928ef148da
> --- /dev/null
> +++ b/Documentation/trace/coresight/coresight-trbe.rst
> @@ -0,0 +1,38 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==============================
> +Trace Buffer Extension (TRBE).
> +==============================
> +
> + :Author: Anshuman Khandual <[email protected]>
> + :Date: November 2020
> +
> +Hardware Description
> +--------------------
> +
> +Trace Buffer Extension (TRBE) is per-CPU hardware which captures, in system
> +memory, CPU traces generated by the corresponding per-CPU tracing unit. It
> +is plugged in as a coresight sink device because the corresponding trace
> +generators (ETE) are plugged in as source devices.
> +
> +The TRBE is not compliant with the CoreSight architecture specifications, but
> +is driven via the CoreSight driver framework to support integration with the
> +ETE (which is CoreSight compliant).
> +
> +Sysfs files and directories
> +---------------------------
> +
> +The TRBE devices appear on the existing coresight bus alongside the other
> +coresight devices::
> +
> + >$ ls /sys/bus/coresight/devices
> + trbe0 trbe1 trbe2 trbe3
> +
> +The ``trbe<N>`` named TRBEs are associated with a CPU::
> +
> + >$ ls /sys/bus/coresight/devices/trbe0/
> + align flag
> +
> +*Key file items are:-*
> + * ``align``: TRBE write pointer alignment
> + * ``flag``: TRBE updates memory with access and dirty flags
> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> index f154ae7e705d..84530fd80998 100644
> --- a/drivers/hwtracing/coresight/Kconfig
> +++ b/drivers/hwtracing/coresight/Kconfig
> @@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS
> CTI trigger connections between this and other devices.These
> registers are not used in normal operation and can leave devices in
> an inconsistent state.
> +
> +config CORESIGHT_TRBE
> + tristate "Trace Buffer Extension (TRBE) driver"
> + depends on ARM64 && CORESIGHT_SOURCE_ETM4X
> + help
> + This driver provides support for percpu Trace Buffer Extension (TRBE).
> + TRBE always needs to be used along with its corresponding percpu ETE
> + component. ETE generates trace data which is then captured with TRBE.
> + Unlike traditional sink devices, TRBE is a CPU feature accessible via
> + system registers. But its explicit dependency on the trace unit (ETE)
> + requires it to be plugged in as a coresight sink device.
> +
> + To compile this driver as a module, choose M here: the module will be
> + called coresight-trbe.
> endif
> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> index f20e357758d1..d60816509755 100644
> --- a/drivers/hwtracing/coresight/Makefile
> +++ b/drivers/hwtracing/coresight/Makefile
> @@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
> obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
> obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o
> obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o
> +obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o
> coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \
> coresight-cti-sysfs.o
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> new file mode 100644
> index 000000000000..41a012b525bb
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -0,0 +1,1149 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight
> + * sink device, which can then pair with an appropriate per-cpu coresight
> + * source device (ETE) that generates the required trace data. Trace can be
> + * enabled via the perf framework.
> + *
> + * Copyright (C) 2020 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#define DRVNAME "arm_trbe"
> +
> +#define pr_fmt(fmt) DRVNAME ": " fmt
> +
> +#include <asm/barrier.h>
> +#include "coresight-trbe.h"
> +
> +#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
> +
> +/*
> + * A padding packet that will help the user space tools
> + * in skipping relevant sections in the captured trace
> + * data which could not be decoded. TRBE doesn't support
> + * formatting the trace data, unlike the legacy CoreSight
> + * sinks and thus we use ETE trace packets to pad the
> + * sections of the buffer.
> + */
> +#define ETE_IGNORE_PACKET 0x70
> +
> +/*
> + * Minimum amount of meaningful trace will contain:
> + * A-Sync, Trace Info, Trace On, Address, Atom.
> + * This is about 44 bytes of ETE trace. To be on
> + * the safer side, we assume 64 bytes is the minimum
> + * space required for a meaningful session, before
> + * we hit a "WRAP" event.
> + */
> +#define TRBE_TRACE_MIN_BUF_SIZE 64
> +
> +enum trbe_fault_action {
> + TRBE_FAULT_ACT_WRAP,
> + TRBE_FAULT_ACT_SPURIOUS,
> + TRBE_FAULT_ACT_FATAL,
> +};
> +
> +struct trbe_buf {
> + /*
> + * Even though trbe_base represents the start
> + * address of the vmap() mapped buffer, it is
> + * kept as an unsigned long for arithmetic and
> + * comparison operations, and to be consistent
> + * with the trbe_write and trbe_limit sibling
> + * pointers.
> + */
> + unsigned long trbe_base;
> + unsigned long trbe_limit;
> + unsigned long trbe_write;
> + int nr_pages;
> + void **pages;
> + bool snapshot;
> + struct trbe_cpudata *cpudata;
> +};
> +
> +struct trbe_cpudata {
> + bool trbe_flag;
> + u64 trbe_align;
> + int cpu;
> + enum cs_mode mode;
> + struct trbe_buf *buf;
> + struct trbe_drvdata *drvdata;
> +};
> +
> +struct trbe_drvdata {
> + struct trbe_cpudata __percpu *cpudata;
> + struct perf_output_handle __percpu **handle;
> + struct hlist_node hotplug_node;
> + int irq;
> + cpumask_t supported_cpus;
> + enum cpuhp_state trbe_online;
> + struct platform_device *pdev;
> +};
> +
> +static int trbe_alloc_node(struct perf_event *event)
> +{
> + if (event->cpu == -1)
> + return NUMA_NO_NODE;
> + return cpu_to_node(event->cpu);
> +}
> +
> +static void trbe_drain_buffer(void)
> +{
> + tsb_csync();
> + dsb(nsh);
> +}
> +
> +static void trbe_drain_and_disable_local(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + trbe_drain_buffer();
> +
> + /*
> + * Disable the TRBE without clearing LIMITPTR which
> + * might be required for fetching the buffer limits.
> + */
> + trblimitr &= ~TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> + isb();
> +}
> +
> +static void trbe_reset_local(void)
> +{
> + trbe_drain_and_disable_local();
> + write_sysreg_s(0, SYS_TRBLIMITR_EL1);
> + write_sysreg_s(0, SYS_TRBPTR_EL1);
> + write_sysreg_s(0, SYS_TRBBASER_EL1);
> + write_sysreg_s(0, SYS_TRBSR_EL1);
> +}
> +
> +static void trbe_stop_and_truncate_event(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + /*
> + * We cannot proceed with the buffer collection and we
> + * do not have any data for the current session. The
> + * etm_perf driver expects to close out the aux_buffer
> + * at event_stop(). So disable the TRBE here and leave
> + * the update_buffer() to return a 0 size.
> + */
> + trbe_drain_and_disable_local();
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
> +}
> +
> +/*
> + * TRBE Buffer Management
> + *
> + * The TRBE buffer spans from the base pointer till the limit pointer. When enabled,
> + * it starts writing trace data from the write pointer onward till the limit pointer.
> + * When the write pointer reaches the address just before the limit pointer, it gets
> + * wrapped around again to the base pointer. This is called a TRBE wrap event, which
> + * generates a maintenance interrupt when operated in WRAP or FILL mode. This driver
> + * uses FILL mode, where the TRBE stops the trace collection at wrap event. The IRQ
> + * handler updates the AUX buffer and re-enables the TRBE with updated WRITE and
> + * LIMIT pointers.
> + *
> + * Wrap around with an IRQ
> + * ------ < ------ < ------- < ----- < -----
> + * | |
> + * ------ > ------ > ------- > ----- > -----
> + *
> + * +---------------+-----------------------+
> + * | | |
> + * +---------------+-----------------------+
> + * Base Pointer Write Pointer Limit Pointer
> + *
> + * The base and limit pointers always need to be PAGE_SIZE aligned. But the write
> + * pointer can be aligned to the implementation defined TRBE trace buffer alignment
> + * as captured in trbe_cpudata->trbe_align.
> + *
> + *
> + * head tail wakeup
> + * +---------------------------------------+----- ~ ~ ------
> + * |$$$$$$$|################|$$$$$$$$$$$$$$| |
> + * +---------------------------------------+----- ~ ~ ------
> + * Base Pointer Write Pointer Limit Pointer
> + *
> + * The perf_output_handle indices (head, tail, wakeup) are monotonically increasing
> + * values which track all the driver writes and user reads from the perf auxiliary
> + * buffer. Generally [head..tail] is the area where the driver can write into unless
> + * the wakeup is behind the tail. Enabled TRBE buffer span needs to be adjusted and
> + * configured depending on the perf_output_handle indices, so that the driver does
> + * not overwrite areas in the perf auxiliary buffer which are being, or are yet to
> + * be, consumed from user space. The enabled TRBE buffer area is a moving subset of
> + * the allocated perf auxiliary buffer.
> + */
> +static void trbe_pad_buf(struct perf_output_handle *handle, int len)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + u64 head = PERF_IDX2OFF(handle->head, buf);
> +
> + memset((void *)buf->trbe_base + head, ETE_IGNORE_PACKET, len);
> + if (!buf->snapshot)
> + perf_aux_output_skip(handle, len);
> +}
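
As an aside, the index-to-offset mapping and the alignment padding can
be sketched in user space as follows. This is only an illustration under
stated assumptions, not the driver code: idx_to_off() and pad_to_align()
are hypothetical names, 4K pages are assumed, and the perf bookkeeping
done by perf_aux_output_skip() is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4K pages */
#define SKETCH_IGNORE_PACKET	0x70	/* same value as ETE_IGNORE_PACKET */

/* Map a monotonically increasing perf index to an offset in the ring buffer. */
static uint64_t idx_to_off(uint64_t idx, unsigned int nr_pages)
{
	return idx % ((uint64_t)nr_pages << SKETCH_PAGE_SHIFT);
}

/*
 * Fill the gap between head and the next multiple of align with ignore
 * packets, never padding more than the free space. Returns the number
 * of bytes padded, i.e. how far the head would need to be advanced.
 */
static uint64_t pad_to_align(uint8_t *base, uint64_t head, uint64_t align,
			     uint64_t space)
{
	uint64_t delta;

	/* The mask arithmetic below only works for power-of-two alignments. */
	assert(align && (align & (align - 1)) == 0);

	delta = ((head + align - 1) & ~(align - 1)) - head;
	if (delta > space)
		delta = space;
	memset(base + head, SKETCH_IGNORE_PACKET, delta);
	return delta;
}
```

For a two page buffer, index 3 * 8192 + 100 maps to offset 100, and
padding from offset 100 to a 64 byte alignment writes 28 ignore packets,
leaving the head at offset 128.
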
> +
> +static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + /*
> + * The ETE trace has alignment synchronization packets allowing
> + * the decoder to reset in case of an overflow or corruption.
> + * So we can use the entire buffer for the snapshot mode.
> + */
> + return buf->nr_pages * PAGE_SIZE;
> +}
> +
> +/*
> + * TRBE Limit Calculation
> + *
> + * The following markers are used to illustrate various TRBE buffer situations.
> + *
> + * $$$$ - Data area, unconsumed captured trace data, not to be overridden
> + * #### - Free area, enabled, trace will be written
> + * %%%% - Free area, disabled, trace will not be written
> + * ==== - Free area, padded with ETE_IGNORE_PACKET, trace will be skipped
> + */
> +static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + struct trbe_cpudata *cpudata = buf->cpudata;
> + const u64 bufsize = buf->nr_pages * PAGE_SIZE;
> + u64 limit = bufsize;
> + u64 head, tail, wakeup;
> +
> + head = PERF_IDX2OFF(handle->head, buf);
> +
> + /*
> + * head
> + * ------->|
> + * |
> + * head TRBE align tail
> + * +----|-------|---------------|-------+
> + * |$$$$|=======|###############|$$$$$$$|
> + * +----|-------|---------------|-------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * Perf aux buffer output head position can be misaligned depending on
> + * various factors including user space reads. In case misaligned, head
> + * needs to be aligned before TRBE can be configured. Pad the alignment
> + * gap with ETE_IGNORE_PACKET bytes that will be ignored by user tools
> + * and skip this section thus advancing the head.
> + */
> + if (!IS_ALIGNED(head, cpudata->trbe_align)) {
> + unsigned long delta = roundup(head, cpudata->trbe_align) - head;
> +
> + delta = min(delta, handle->size);
> + trbe_pad_buf(handle, delta);
> + head = PERF_IDX2OFF(handle->head, buf);
> + }
> +
> + /*
> + * head = tail (size = 0)
> + * +----|-------------------------------+
> + * |$$$$|$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ |
> + * +----|-------------------------------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * Perf aux buffer does not have any space for the driver to write into.
> + * Just communicate trace truncation event to the user space by marking
> + * it with PERF_AUX_FLAG_TRUNCATED.
> + */
> + if (!handle->size) {
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + return 0;
> + }
> +
> + /* Compute the tail and wakeup indices now that we've aligned head */
> + tail = PERF_IDX2OFF(handle->head + handle->size, buf);
> + wakeup = PERF_IDX2OFF(handle->wakeup, buf);
> +
> + /*
> + * Let's calculate the buffer area which TRBE could write into. There
> + * are three possible scenarios here. Limit needs to be aligned with
> + * PAGE_SIZE per the TRBE requirement. Always avoid clobbering the
> + * unconsumed data.
> + *
> + * 1) head < tail
> + *
> + * head tail
> + * +----|-----------------------|-------+
> + * |$$$$|#######################|$$$$$$$|
> + * +----|-----------------------|-------+
> + * trbe_base limit trbe_base + nr_pages
> + *
> + * TRBE could write into [head..tail] area. Unless the tail is right at
> + * the end of the buffer, neither a wrap around nor an IRQ is expected
> + * while being enabled.
> + *
> + * 2) head == tail
> + *
> + * head = tail (size > 0)
> + * +----|-------------------------------+
> + * |%%%%|###############################|
> + * +----|-------------------------------+
> + * trbe_base limit = trbe_base + nr_pages
> + *
> + * TRBE should just write into [head..base + nr_pages] area even though
> + * the entire buffer is empty. Reason being, when the trace reaches the
> + * end of the buffer, it will just wrap around with an IRQ giving an
> + * opportunity to reconfigure the buffer.
> + *
> + * 3) tail < head
> + *
> + * tail head
> + * +----|-----------------------|-------+
> + * |%%%%|$$$$$$$$$$$$$$$$$$$$$$$|#######|
> + * +----|-----------------------|-------+
> + * trbe_base limit = trbe_base + nr_pages
> + *
> + * TRBE should just write into [head..base + nr_pages] area even though
> + * the [trbe_base..tail] is also empty. Reason being, when the trace
> + * reaches the end of the buffer, it will just wrap around with an IRQ
> + * giving an opportunity to reconfigure the buffer.
> + */
> + if (head < tail)
> + limit = round_down(tail, PAGE_SIZE);
> +
> + /*
> + * Wakeup may be arbitrarily far into the future. If it's not in the
> + * current generation, either we'll wrap before hitting it, or it's
> + * in the past and has been handled already.
> + *
> + * If there's a wakeup before we wrap, arrange to be woken up by the
> + * page boundary following it. Keep the tail boundary if that's lower.
> + *
> + * head wakeup tail
> + * +----|---------------|-------|-------+
> + * |$$$$|###############|%%%%%%%|$$$$$$$|
> + * +----|---------------|-------|-------+
> + * trbe_base limit trbe_base + nr_pages
> + */
> + if (handle->wakeup < (handle->head + handle->size) && head <= wakeup)
> + limit = min(limit, round_up(wakeup, PAGE_SIZE));
> +
> + /*
> + * There are two situations in which the limit ends up before the
> + * head and hence the TRBE cannot be configured.
> + *
> + * 1) head < tail (aligned down with PAGE_SIZE) and also they are both
> + * within the same PAGE size range.
> + *
> + * PAGE_SIZE
> + * |----------------------|
> + *
> + * limit head tail
> + * +------------|------|--------|-------+
> + * |$$$$$$$$$$$$$$$$$$$|========|$$$$$$$|
> + * +------------|------|--------|-------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * 2) head < wakeup (aligned up with PAGE_SIZE) < tail and also both
> + * head and wakeup are within same PAGE size range.
> + *
> + * PAGE_SIZE
> + * |----------------------|
> + *
> + * limit head wakeup tail
> + * +----|------|-------|--------|-------+
> + * |$$$$$$$$$$$|=======|========|$$$$$$$|
> + * +----|------|-------|--------|-------+
> + * trbe_base trbe_base + nr_pages
> + */
> + if (limit > head)
> + return limit;
> +
> + trbe_pad_buf(handle, handle->size);
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + return 0;
> +}
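
The limit selection above can be condensed into a small user-space
sketch. This is a simplified model rather than the driver code: it
assumes 4K pages and an already aligned head, it leaves out the padding
and the TRUNCATED flagging, and sk_limit_offset() and its parameters are
hypothetical names mirroring handle->head, handle->size and
handle->wakeup.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SIZE	4096ULL		/* assumption: 4K pages */

static uint64_t sk_round_down(uint64_t v)
{
	return v & ~(SK_PAGE_SIZE - 1);
}

static uint64_t sk_round_up(uint64_t v)
{
	return sk_round_down(v + SK_PAGE_SIZE - 1);
}

/*
 * Returns the limit as an offset from the buffer base, or 0 when no
 * usable window exists (the driver pads and flags truncation then).
 */
static uint64_t sk_limit_offset(uint64_t head_idx, uint64_t size,
				uint64_t wakeup_idx, uint64_t bufsize)
{
	uint64_t head, tail, wakeup, limit = bufsize;

	assert(bufsize && bufsize % SK_PAGE_SIZE == 0);

	if (!size)
		return 0;	/* no free space at all */

	head = head_idx % bufsize;
	tail = (head_idx + size) % bufsize;
	wakeup = wakeup_idx % bufsize;

	/* Scenario 1: stop at the page boundary at or before the tail. */
	if (head < tail)
		limit = sk_round_down(tail);

	/* A wakeup inside the window pulls the limit in further. */
	if (wakeup_idx < head_idx + size && head <= wakeup) {
		uint64_t w = sk_round_up(wakeup);

		if (w < limit)
			limit = w;
	}

	/* Scenarios 2 and 3 keep limit == bufsize. */
	return limit > head ? limit : 0;
}
```

With a four page (16384 byte) buffer, head index 4096 and size 8192 give
a limit of 12288; a wakeup at index 5000 pulls that in to 8192; and a
head at 4160 with only 100 bytes free yields 0, because head and tail
share a page.
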
> +
> +static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + u64 limit = __trbe_normal_offset(handle);
> + u64 head = PERF_IDX2OFF(handle->head, buf);
> +
> + /*
> + * If the head is too close to the limit and we don't
> + * have space for a meaningful run, we would rather pad
> + * it and start fresh.
> + */
> + if (limit && (limit - head < TRBE_TRACE_MIN_BUF_SIZE)) {
> + trbe_pad_buf(handle, limit - head);
> + limit = __trbe_normal_offset(handle);
> + }
> + return limit;
> +}
> +
> +static unsigned long compute_trbe_buffer_limit(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + unsigned long offset;
> +
> + if (buf->snapshot)
> + offset = trbe_snapshot_offset(handle);
> + else
> + offset = trbe_normal_offset(handle);
> + return buf->trbe_base + offset;
> +}
> +
> +static void clr_trbe_status(void)
> +{
> + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> +
> + WARN_ON(is_trbe_enabled());
> + trbsr &= ~TRBSR_IRQ;
> + trbsr &= ~TRBSR_TRG;
> + trbsr &= ~TRBSR_WRAP;
> + trbsr &= ~(TRBSR_EC_MASK << TRBSR_EC_SHIFT);
> + trbsr &= ~(TRBSR_BSC_MASK << TRBSR_BSC_SHIFT);
> + trbsr &= ~TRBSR_STOP;
> + write_sysreg_s(trbsr, SYS_TRBSR_EL1);
> +}
> +
> +static void set_trbe_limit_pointer_enabled(unsigned long addr)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBLIMITR_LIMIT_SHIFT)));
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> +
> + trblimitr &= ~TRBLIMITR_NVM;
> + trblimitr &= ~(TRBLIMITR_FILL_MODE_MASK << TRBLIMITR_FILL_MODE_SHIFT);
> + trblimitr &= ~(TRBLIMITR_TRIG_MODE_MASK << TRBLIMITR_TRIG_MODE_SHIFT);
> + trblimitr &= ~(TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT);
> +
> + /*
> + * Fill trace buffer mode is used here while configuring the
> + * TRBE for trace capture. In this particular mode, the trace
> + * collection is stopped and a maintenance interrupt is raised
> + * when the current write pointer wraps. This pause in trace
> + * collection gives the software an opportunity to capture the
> + * trace data in the interrupt handler, before reconfiguring
> + * the TRBE.
> + */
> + trblimitr |= (TRBE_FILL_MODE_FILL & TRBLIMITR_FILL_MODE_MASK) << TRBLIMITR_FILL_MODE_SHIFT;
> +
> + /*
> + * Trigger mode is not used here while configuring the TRBE for
> + * the trace capture. Hence just keep this in the ignore mode.
> + */
> + trblimitr |= (TRBE_TRIG_MODE_IGNORE & TRBLIMITR_TRIG_MODE_MASK) <<
> + TRBLIMITR_TRIG_MODE_SHIFT;
> + trblimitr |= (addr & PAGE_MASK);
> +
> + trblimitr |= TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +
> + /* Synchronize the TRBE enable event */
> + isb();
> +}
> +
> +static void trbe_enable_hw(struct trbe_buf *buf)
> +{
> + WARN_ON(buf->trbe_write < buf->trbe_base);
> + WARN_ON(buf->trbe_write >= buf->trbe_limit);
> + set_trbe_disabled();
> + isb();
> + clr_trbe_status();
> + set_trbe_base_pointer(buf->trbe_base);
> + set_trbe_write_pointer(buf->trbe_write);
> +
> + /*
> + * Synchronize all the register updates
> + * till now before enabling the TRBE.
> + */
> + isb();
> + set_trbe_limit_pointer_enabled(buf->trbe_limit);
> +}
> +
> +static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
> +{
> + int ec = get_trbe_ec(trbsr);
> + int bsc = get_trbe_bsc(trbsr);
> +
> + WARN_ON(is_trbe_running(trbsr));
> + if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> + return TRBE_FAULT_ACT_FATAL;
> +
> + if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> + return TRBE_FAULT_ACT_FATAL;
> +
> + if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
> + if (get_trbe_write_pointer() == get_trbe_base_pointer())
> + return TRBE_FAULT_ACT_WRAP;
> + }
> + return TRBE_FAULT_ACT_SPURIOUS;
> +}
> +
> +static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
> + struct perf_event *event, void **pages,
> + int nr_pages, bool snapshot)
> +{
> + struct trbe_buf *buf;
> + struct page **pglist;
> + int i;
> +
> + /*
> + * TRBE LIMIT and TRBE WRITE pointers must be page aligned. But with
> + * just a single page, there would not be any room left while writing
> + * into a partially filled TRBE buffer after the page size alignment.
> + * Hence restrict the minimum buffer size as two pages.
> + */
> + if (nr_pages < 2)
> + return NULL;
> +
> + buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event));
> + if (!buf)
> + return ERR_PTR(-ENOMEM);
> +
> + pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL);
> + if (!pglist) {
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + for (i = 0; i < nr_pages; i++)
> + pglist[i] = virt_to_page(pages[i]);
> +
> + buf->trbe_base = (unsigned long)vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
> + if (!buf->trbe_base) {
> + kfree(pglist);
> + kfree(buf);
> + return ERR_PTR(buf->trbe_base);
> + }
> + buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
> + buf->trbe_write = buf->trbe_base;
> + buf->snapshot = snapshot;
> + buf->nr_pages = nr_pages;
> + buf->pages = pages;
> + kfree(pglist);
> + return buf;
> +}
> +
> +static void arm_trbe_free_buffer(void *config)
> +{
> + struct trbe_buf *buf = config;
> +
> + vunmap((void *)buf->trbe_base);
> + kfree(buf);
> +}
> +
> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
> + struct perf_output_handle *handle,
> + void *config)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_buf *buf = config;
> + enum trbe_fault_action act;
> + unsigned long size, offset;
> + unsigned long write, base, status;
> + unsigned long flags;
> +
> + WARN_ON(buf->cpudata != cpudata);
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (cpudata->mode != CS_MODE_PERF)
> + return 0;
> +
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> +
> + /*
> + * We are about to disable the TRBE. And this could in turn
> + * fill up the buffer triggering, an IRQ. This could be consumed
> + * by the PE asynchronously, causing a race here against
> + * the IRQ handler in closing out the handle. So, let us
> + * make sure the IRQ can't trigger while we are collecting
> + * the buffer. We also make sure that a WRAP event is handled
> + * accordingly.
> + */
> + local_irq_save(flags);
> +
> + /*
> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
> + * spurious fault, the driver leaves it disabled, truncating the buffer.
> + * Since the etm_perf driver expects to close out the AUX buffer, the
> + * driver skips it. Thus, just pass in 0 size here to indicate that the
> + * buffer was truncated.
> + */
> + if (!is_trbe_enabled()) {
> + size = 0;
> + goto done;
> + }
> + /*
> + * perf handle structure needs to be shared with the TRBE IRQ handler for
> + * capturing trace data and restarting the handle. There is a probability
> + * of an undefined reference based crash when etm event is being stopped
> + * while a TRBE IRQ also getting processed. This happens due the release
> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
> + * the TRBE here will ensure that no IRQ could be generated when the perf
> + * handle gets freed in etm_event_stop().
> + */
> + trbe_drain_and_disable_local();
> + write = get_trbe_write_pointer();
> + base = get_trbe_base_pointer();
> +
> + /* Check if there is a pending interrupt and handle it here */
> + status = read_sysreg_s(SYS_TRBSR_EL1);
> + if (is_trbe_irq(status)) {
> +
> + /*
> + * Now that we are handling the IRQ here, clear the IRQ
> + * from the status, to let the irq handler know that it
> + * is taken care of.
> + */
> + clr_trbe_irq();
> + isb();
> +
> + act = trbe_get_fault_act(status);
> + /*
> + * If this was not due to a WRAP event, we have some
> + * errors and as such buffer is empty.
> + */
> + if (act != TRBE_FAULT_ACT_WRAP) {
> + size = 0;
> + goto done;
> + }

We are using TRBE FILL mode, which halts capture on a full buffer and
triggers the IRQ without disabling the source first.
This means that the mode is inherently lossy (unless, by some unlikely
coincidence, the last byte that caused the wrap was also the last byte
sent from an ETE that was in the process of being disabled).
Therefore we must have a perf_aux_output_flag(handle,
PERF_AUX_FLAG_TRUNCATED) call in here to signal that some trace was
lost, for consistency of operation with ETR etc. and Intel PT.
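
To make the request concrete, here is a minimal userspace sketch of the
flagging I have in mind. It is not the driver code: struct mock_handle,
mock_aux_output_flag() and mock_close_out() are hypothetical stand-ins,
and the helper only models perf_aux_output_flag() as OR-ing flags into
the pending record. The flag values are the uapi TRUNCATED value and the
RAW value from patch 02.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* uapi value, plus the RAW format flag proposed in patch 02 */
#define PERF_AUX_FLAG_TRUNCATED            0x01
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100

/* Hypothetical stand-in for struct perf_output_handle; only flags matter */
struct mock_handle {
	uint64_t aux_flags;
};

/* Models perf_aux_output_flag(): OR the flags into the pending record */
static void mock_aux_output_flag(struct mock_handle *handle, uint64_t flags)
{
	assert(handle);
	handle->aux_flags |= flags;
}

/*
 * Flag a record that is being closed out. A wrap in FILL mode means the
 * source kept generating trace after the TRBE stopped, so the record is
 * marked as truncated in addition to carrying the RAW format.
 */
static void mock_close_out(struct mock_handle *handle, bool wrapped)
{
	mock_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
	if (wrapped)
		mock_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
}
```

In the driver this would amount to one extra perf_aux_output_flag() call
on the TRBE_FAULT_ACT_WRAP path; the exact placement is for the authors
to decide.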

> + /*
> + * Otherwise, the buffer is full and the write pointer
> + * has reached base. Adjust this back to the Limit pointer
> + * for correct size.
> + */
> + write = get_trbe_limit_pointer();
> + }
> +
> + offset = write - base;
> + if (WARN_ON_ONCE(offset < PERF_IDX2OFF(handle->head, buf)))
> + size = 0;
> + else
> + size = offset - PERF_IDX2OFF(handle->head, buf);
> +
> +done:
> + local_irq_restore(flags);
> +
> + if (buf->snapshot)
> + handle->head += size;
> + return size;
> +}
> +
> +static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct perf_output_handle *handle = data;
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (mode != CS_MODE_PERF)
> + return -EINVAL;
> +
> + *this_cpu_ptr(drvdata->handle) = handle;
> + cpudata->buf = buf;
> + cpudata->mode = mode;
> + buf->cpudata = cpudata;
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_stop_and_truncate_event(handle);
> + return 0;
> + }
> + trbe_enable_hw(buf);
> + return 0;
> +}
> +
> +static int arm_trbe_disable(struct coresight_device *csdev)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_buf *buf = cpudata->buf;
> +
> + WARN_ON(buf->cpudata != cpudata);
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (cpudata->mode != CS_MODE_PERF)
> + return -EINVAL;
> +
> + trbe_drain_and_disable_local();
> + buf->cpudata = NULL;
> + cpudata->buf = NULL;
> + cpudata->mode = CS_MODE_DISABLED;
> + return 0;
> +}
> +
> +static void trbe_handle_spurious(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_drain_and_disable_local();
> + return;
> + }
> + trbe_enable_hw(buf);
> +}
> +
> +static void trbe_handle_overflow(struct perf_output_handle *handle)
> +{
> + struct perf_event *event = handle->event;
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + unsigned long offset, size;
> + struct etm_event_data *event_data;
> +
> + offset = get_trbe_limit_pointer() - get_trbe_base_pointer();
> + size = offset - PERF_IDX2OFF(handle->head, buf);
> + if (buf->snapshot)
> + handle->head += size;
> +

Again, a perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED) call is
required here, for the reasons described above.
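
For reference, the size reported for the record closed out on a wrap
follows the arithmetic in the hunk quoted above. A small userspace
sketch of it, assuming 4K pages and assuming PERF_IDX2OFF() reduces the
free-running index modulo the buffer size; mock_idx2off() and
mock_wrap_size() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SHIFT 12 /* assumption: 4K pages */

/* Offset of a free-running AUX index inside a buffer of nr_pages pages */
static uint64_t mock_idx2off(uint64_t idx, unsigned int nr_pages)
{
	assert(nr_pages > 0);
	return idx % ((uint64_t)nr_pages << MOCK_PAGE_SHIFT);
}

/* Bytes between the head offset and the limit pointer when the TRBE wraps */
static uint64_t mock_wrap_size(uint64_t base, uint64_t limit, uint64_t head,
			       unsigned int nr_pages)
{
	uint64_t offset = limit - base;

	return offset - mock_idx2off(head, nr_pages);
}
```

With a four page buffer and head at 0x4800, the head offset is 0x800 and
0x3800 bytes are reported. That is the record which should also carry
the TRUNCATED flag.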


> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> + perf_aux_output_end(handle, size);
> + event_data = perf_aux_output_begin(handle, event);
> + if (!event_data) {
> + /*
> + * We are unable to restart the trace collection,
> + * thus leave the TRBE disabled. The etm-perf driver
> + * is able to detect this with a disconnected handle
> + * (handle->event = NULL).
> + */
> + trbe_drain_and_disable_local();
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
> + return;
> + }
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_stop_and_truncate_event(handle);
> + return;
> + }
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
> + trbe_enable_hw(buf);
> +}
> +
> +static bool is_perf_trbe(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + struct trbe_cpudata *cpudata = buf->cpudata;
> + struct trbe_drvdata *drvdata = cpudata->drvdata;
> + int cpu = smp_processor_id();
> +
> + WARN_ON(buf->trbe_base != get_trbe_base_pointer());
> + WARN_ON(buf->trbe_limit != get_trbe_limit_pointer());
> +
> + if (cpudata->mode != CS_MODE_PERF)
> + return false;
> +
> + if (cpudata->cpu != cpu)
> + return false;
> +
> + if (!cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + return false;
> +
> + return true;
> +}
> +
> +static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
> +{
> + struct perf_output_handle **handle_ptr = dev;
> + struct perf_output_handle *handle = *handle_ptr;
> + enum trbe_fault_action act;
> + u64 status;
> +
> + /*
> + * Ensure the trace is visible to the CPUs and
> + * any external aborts have been resolved.
> + */
> + trbe_drain_and_disable_local();
> +
> + status = read_sysreg_s(SYS_TRBSR_EL1);
> + /*
> + * If the pending IRQ was handled by update_buffer callback
> + * we have nothing to do here.
> + */
> + if (!is_trbe_irq(status))
> + return IRQ_NONE;
> +
> + clr_trbe_irq();
> + isb();
> +
> + if (WARN_ON_ONCE(!handle) || !perf_get_aux(handle))
> + return IRQ_NONE;
> +
> + if (!is_perf_trbe(handle))
> + return IRQ_NONE;
> +
> + /*
> + * Ensure perf callbacks have completed, which may disable
> + * the trace buffer in response to a TRUNCATION flag.
> + */
> + irq_work_run();
> +
> + act = trbe_get_fault_act(status);
> + switch (act) {
> + case TRBE_FAULT_ACT_WRAP:
> + trbe_handle_overflow(handle);
> + break;
> + case TRBE_FAULT_ACT_SPURIOUS:
> + trbe_handle_spurious(handle);
> + break;
> + case TRBE_FAULT_ACT_FATAL:
> + trbe_stop_and_truncate_event(handle);
> + break;
> + }
> + return IRQ_HANDLED;
> +}
> +
> +static const struct coresight_ops_sink arm_trbe_sink_ops = {
> + .enable = arm_trbe_enable,
> + .disable = arm_trbe_disable,
> + .alloc_buffer = arm_trbe_alloc_buffer,
> + .free_buffer = arm_trbe_free_buffer,
> + .update_buffer = arm_trbe_update_buffer,
> +};
> +
> +static const struct coresight_ops arm_trbe_cs_ops = {
> + .sink_ops = &arm_trbe_sink_ops,
> +};
> +
> +static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%llx\n", cpudata->trbe_align);
> +}
> +static DEVICE_ATTR_RO(align);
> +
> +static ssize_t flag_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%d\n", cpudata->trbe_flag);
> +}
> +static DEVICE_ATTR_RO(flag);
> +
> +static struct attribute *arm_trbe_attrs[] = {
> + &dev_attr_align.attr,
> + &dev_attr_flag.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group arm_trbe_group = {
> + .attrs = arm_trbe_attrs,
> +};
> +
> +static const struct attribute_group *arm_trbe_groups[] = {
> + &arm_trbe_group,
> + NULL,
> +};
> +
> +static void arm_trbe_enable_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> +
> + trbe_reset_local();
> + enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
> +}
> +
> +static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
> +{
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> + struct coresight_desc desc = { 0 };
> + struct device *dev;
> +
> + if (WARN_ON(trbe_csdev))
> + return;
> +
> + dev = &cpudata->drvdata->pdev->dev;
> + desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
> + if (IS_ERR(desc.name))
> + goto cpu_clear;
> +
> + desc.type = CORESIGHT_DEV_TYPE_SINK;
> + desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
> + desc.ops = &arm_trbe_cs_ops;
> + desc.pdata = dev_get_platdata(dev);
> + desc.groups = arm_trbe_groups;
> + desc.dev = dev;
> + trbe_csdev = coresight_register(&desc);
> + if (IS_ERR(trbe_csdev))
> + goto cpu_clear;
> +
> + dev_set_drvdata(&trbe_csdev->dev, cpudata);
> + coresight_set_percpu_sink(cpu, trbe_csdev);
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_probe_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> + int cpu = smp_processor_id();
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + u64 trbidr;
> +
> + if (WARN_ON(!cpudata))
> + goto cpu_clear;
> +
> + if (!is_trbe_available()) {
> + pr_err("TRBE is not implemented on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + trbidr = read_sysreg_s(SYS_TRBIDR_EL1);
> + if (!is_trbe_programmable(trbidr)) {
> + pr_err("TRBE is owned in higher exception level on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
> + if (cpudata->trbe_align > SZ_2K) {
> + pr_err("Unsupported alignment on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> + cpudata->trbe_flag = get_trbe_flag_update(trbidr);
> + cpudata->cpu = cpu;
> + cpudata->drvdata = drvdata;
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_remove_coresight_cpu(void *info)
> +{
> + int cpu = smp_processor_id();
> + struct trbe_drvdata *drvdata = info;
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> +
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + if (trbe_csdev) {
> + coresight_unregister(trbe_csdev);
> + cpudata->drvdata = NULL;
> + coresight_set_percpu_sink(cpu, NULL);
> + }
> +}
> +
> +static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata));
> + if (!drvdata->cpudata)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus) {
> + smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + smp_call_function_single(cpu, arm_trbe_enable_cpu, drvdata, 1);
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus)
> + smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> + free_percpu(drvdata->cpudata);
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> +
> + /*
> + * If this CPU was not probed for TRBE,
> + * initialize it now.
> + */
> + if (!coresight_get_percpu_sink(cpu)) {
> + arm_trbe_probe_cpu(drvdata);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_enable_cpu(drvdata);
> + } else {
> + arm_trbe_enable_cpu(drvdata);
> + }
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + enum cpuhp_state trbe_online;
> + int ret;
> +
> + trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
> + arm_trbe_cpu_startup, arm_trbe_cpu_teardown);
> + if (trbe_online < 0)
> + return trbe_online;
> +
> + ret = cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node);
> + if (ret) {
> + cpuhp_remove_multi_state(trbe_online);
> + return ret;
> + }
> + drvdata->trbe_online = trbe_online;
> + return 0;
> +}
> +
> +static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + cpuhp_remove_multi_state(drvdata->trbe_online);
> +}
> +
> +static int arm_trbe_probe_irq(struct platform_device *pdev,
> + struct trbe_drvdata *drvdata)
> +{
> + int ret;
> +
> + drvdata->irq = platform_get_irq(pdev, 0);
> + if (drvdata->irq < 0) {
> + pr_err("IRQ not found for the platform device\n");
> + return drvdata->irq;
> + }
> +
> + if (!irq_is_percpu(drvdata->irq)) {
> + pr_err("IRQ is not a PPI\n");
> + return -EINVAL;
> + }
> +
> + if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus))
> + return -EINVAL;
> +
> + drvdata->handle = alloc_percpu(typeof(*drvdata->handle));
> + if (!drvdata->handle)
> + return -ENOMEM;
> +
> + ret = request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle);
> + if (ret) {
> + free_percpu(drvdata->handle);
> + return ret;
> + }
> + return 0;
> +}
> +
> +static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
> +{
> + free_percpu_irq(drvdata->irq, drvdata->handle);
> + free_percpu(drvdata->handle);
> +}
> +
> +static int arm_trbe_device_probe(struct platform_device *pdev)
> +{
> + struct coresight_platform_data *pdata;
> + struct trbe_drvdata *drvdata;
> + struct device *dev = &pdev->dev;
> + int ret;
> +
> + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
> + if (!drvdata)
> + return -ENOMEM;
> +
> + pdata = coresight_get_platform_data(dev);
> + if (IS_ERR(pdata))
> + return PTR_ERR(pdata);
> +
> + dev_set_drvdata(dev, drvdata);
> + dev->platform_data = pdata;
> + drvdata->pdev = pdev;
> + ret = arm_trbe_probe_irq(pdev, drvdata);
> + if (ret)
> + return ret;
> +
> + ret = arm_trbe_probe_coresight(drvdata);
> + if (ret)
> + goto probe_failed;
> +
> + ret = arm_trbe_probe_cpuhp(drvdata);
> + if (ret)
> + goto cpuhp_failed;
> +
> + return 0;
> +cpuhp_failed:
> + arm_trbe_remove_coresight(drvdata);
> +probe_failed:
> + arm_trbe_remove_irq(drvdata);
> + return ret;
> +}
> +
> +static int arm_trbe_device_remove(struct platform_device *pdev)
> +{
> + struct trbe_drvdata *drvdata = platform_get_drvdata(pdev);
> +
> + arm_trbe_remove_cpuhp(drvdata);
> + arm_trbe_remove_coresight(drvdata);
> + arm_trbe_remove_irq(drvdata);
> + return 0;
> +}
> +
> +static const struct of_device_id arm_trbe_of_match[] = {
> + { .compatible = "arm,trace-buffer-extension"},
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, arm_trbe_of_match);
> +
> +static struct platform_driver arm_trbe_driver = {
> + .driver = {
> + .name = DRVNAME,
> + .of_match_table = of_match_ptr(arm_trbe_of_match),
> + .suppress_bind_attrs = true,
> + },
> + .probe = arm_trbe_device_probe,
> + .remove = arm_trbe_device_remove,
> +};
> +
> +static int __init arm_trbe_init(void)
> +{
> + int ret;
> +
> + if (arm64_kernel_unmapped_at_el0()) {
> + pr_err("TRBE wouldn't work if kernel gets unmapped at EL0\n");
> + return -EOPNOTSUPP;
> + }
> +
> + ret = platform_driver_register(&arm_trbe_driver);
> + if (!ret)
> + return 0;
> +
> + pr_err("Error registering %s platform driver\n", DRVNAME);
> + return ret;
> +}
> +
> +static void __exit arm_trbe_exit(void)
> +{
> + platform_driver_unregister(&arm_trbe_driver);
> +}
> +module_init(arm_trbe_init);
> +module_exit(arm_trbe_exit);
> +
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Arm Trace Buffer Extension (TRBE) driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
> new file mode 100644
> index 000000000000..499b846ccfee
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * This contains all required hardware related helper functions for
> + * Trace Buffer Extension (TRBE) driver in the coresight framework.
> + *
> + * Copyright (C) 2020 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#include <linux/coresight.h>
> +#include <linux/device.h>
> +#include <linux/irq.h>
> +#include <linux/kernel.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#include "coresight-etm-perf.h"
> +
> +static inline bool is_trbe_available(void)
> +{
> + u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
> + unsigned int trbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_TRBE_SHIFT);
> +
> + return trbe >= 0b0001;
> +}
> +
> +static inline bool is_trbe_enabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + return trblimitr & TRBLIMITR_ENABLE;
> +}
> +
> +#define TRBE_EC_OTHERS 0
> +#define TRBE_EC_STAGE1_ABORT 36
> +#define TRBE_EC_STAGE2_ABORT 37
> +
> +static inline int get_trbe_ec(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
> +}
> +
> +#define TRBE_BSC_NOT_STOPPED 0
> +#define TRBE_BSC_FILLED 1
> +#define TRBE_BSC_TRIGGERED 2
> +
> +static inline int get_trbe_bsc(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK;
> +}
> +
> +static inline void clr_trbe_irq(void)
> +{
> + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> +
> + trbsr &= ~TRBSR_IRQ;
> + write_sysreg_s(trbsr, SYS_TRBSR_EL1);
> +}
> +
> +static inline bool is_trbe_irq(u64 trbsr)
> +{
> + return trbsr & TRBSR_IRQ;
> +}
> +
> +static inline bool is_trbe_trg(u64 trbsr)
> +{
> + return trbsr & TRBSR_TRG;
> +}
> +
> +static inline bool is_trbe_wrap(u64 trbsr)
> +{
> + return trbsr & TRBSR_WRAP;
> +}
> +
> +static inline bool is_trbe_abort(u64 trbsr)
> +{
> + return trbsr & TRBSR_ABORT;
> +}
> +
> +static inline bool is_trbe_running(u64 trbsr)
> +{
> + return !(trbsr & TRBSR_STOP);
> +}
> +
> +#define TRBE_TRIG_MODE_STOP 0
> +#define TRBE_TRIG_MODE_IRQ 1
> +#define TRBE_TRIG_MODE_IGNORE 3
> +
> +#define TRBE_FILL_MODE_FILL 0
> +#define TRBE_FILL_MODE_WRAP 1
> +#define TRBE_FILL_MODE_CIRCULAR_BUFFER 3
> +
> +static inline void set_trbe_disabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + trblimitr &= ~TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +}
> +
> +static inline bool get_trbe_flag_update(u64 trbidr)
> +{
> + return trbidr & TRBIDR_FLAG;
> +}
> +
> +static inline bool is_trbe_programmable(u64 trbidr)
> +{
> + return !(trbidr & TRBIDR_PROG);
> +}
> +
> +static inline int get_trbe_address_align(u64 trbidr)
> +{
> + return (trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK;
> +}
> +
> +static inline unsigned long get_trbe_write_pointer(void)
> +{
> + return read_sysreg_s(SYS_TRBPTR_EL1);
> +}
> +
> +static inline void set_trbe_write_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + write_sysreg_s(addr, SYS_TRBPTR_EL1);
> +}
> +
> +static inline unsigned long get_trbe_limit_pointer(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;

Could this not be:
unsigned long addr = trblimitr & (TRBLIMITR_LIMIT_MASK <<
TRBLIMITR_LIMIT_SHIFT);
like the base pointer below?
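
As a quick sanity check that the two forms agree, here is a userspace
sketch. MOCK_GENMASK_ULL() is a local copy of the kernel's GENMASK_ULL()
and the mask and shift values are taken from patch 07; the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace copies of the definitions from patch 07 */
#define MOCK_GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define TRBLIMITR_LIMIT_MASK	MOCK_GENMASK_ULL(51, 0)
#define TRBLIMITR_LIMIT_SHIFT	12

/* Form used in the patch: extract the field, then shift it back */
static uint64_t limit_shift_form(uint64_t trblimitr)
{
	uint64_t limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;

	return limit << TRBLIMITR_LIMIT_SHIFT;
}

/* Suggested form: mask the field in place, as for the base pointer */
static uint64_t limit_mask_form(uint64_t trblimitr)
{
	return trblimitr & (TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT);
}

/* Both forms should clear bits [11:0] and keep bits [63:12] */
static void check_forms_agree(uint64_t trblimitr)
{
	assert(limit_shift_form(trblimitr) == limit_mask_form(trblimitr));
}
```

Both forms drop the enable, fill mode, trigger mode and nVM bits in
[11:0], so the change would only be for consistency with
get_trbe_base_pointer().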

> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline unsigned long get_trbe_base_pointer(void)
> +{
> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline void set_trbe_base_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBBASER_BASE_SHIFT)));
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + write_sysreg_s(addr, SYS_TRBBASER_EL1);
> +}
> --
> 2.24.1
>

Regards

Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

2021-03-16 21:02:57

by Mathieu Poirier

Subject: Re: [PATCH v4 02/19] perf: aux: Add CoreSight PMU buffer formats

On Thu, Feb 25, 2021 at 07:35:26PM +0000, Suzuki K Poulose wrote:
> CoreSight PMU supports aux-buffer for the ETM tracing. The trace
> generated by the ETM (associated with individual CPUs, like Intel PT)
> is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
>
> The TMC-ETR applies formatting of the raw ETM trace data, as it
> can collect traces from multiple ETMs, with the TraceID to indicate
> the source of a given trace packet.
>
> Arm Trace Buffer Extension is new "sink" IP, attached to individual
> CPUs and thus do not provide additional formatting, like TMC-ETR.
>
> Additionally, a system could have both TRBE *and* TMC-ETR for
> the trace collection. e.g, TMC-ETR could be used as a single
> trace buffer to collect data from multiple ETMs to correlate
> the traces from different CPUs. It is possible to have a
> perf session where some events end up collecting the trace
> in TMC-ETR while the others in TRBE. Thus we need a way
> to identify the type of the trace for each AUX record.
>

The gist of this patch is to introduce the formatted and raw trace formats. To
me the above paragraph brings confusion to the changelog, especially since we
don't allow events belonging to the same session to use different types of
sinks. I would simply remove it.

> Define the trace formats exported by the CoreSight PMU.
> We don't define the flags following the "ETM" as this
> information is available to the user when issuing
> the session. What is missing is the additional
> formatting applied by the "sink" which is decided
> at the runtime and the user may not have a control on.
>
> So we define :
> - CORESIGHT format (indicates the Frame format)
> - RAW format (indicates the format of the source)
>
> The default value is CORESIGHT format for all the records
> (i,e == 0). Add the RAW format for others that use
> raw format.
>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Leo Yan <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Reviewed-by: Mike Leach <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes from previous:
> - Split from the coresight driver specific code
> for ease of merging
> ---
> include/uapi/linux/perf_event.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index f006eeab6f0e..63971eaef127 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1162,6 +1162,10 @@ enum perf_callchain_context {
> #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */
> #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */
>
> +/* CoreSight PMU AUX buffer formats */
> +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */
> +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */
> +

Is "CORESIGHT" really a format? We are playing with words and the end result is
the same, but I think PERF_AUX_FLAG_CORESIGHT_FORMAT_FORMATTED would be best, or
even:

#define PERF_AUX_FLAG_CORESIGHT_TRACE_FORMATTED 0x0000 /* Default for backward compatibility */
#define PERF_AUX_FLAG_CORESIGHT_TRACE_RAW 0x0100 /* Raw format of the source */

Regardless, for patches 01 and 02:

Reviewed-by: Mathieu Poirier <[email protected]>
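
Whatever the names end up being, a consumer of the AUX record would
compare the masked format byte rather than test a single bit. A minimal
userspace sketch, using the names and values from this patch as posted;
aux_record_is_raw() is a hypothetical helper, not something perf
provides:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from include/uapi/linux/perf_event.h with this patch applied */
#define PERF_AUX_FLAG_TRUNCATED                  0x01
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK       0xff00
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW       0x0100

/* The format value must fit inside the byte reserved for it */
static_assert((PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW &
	       ~PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK) == 0,
	      "RAW format must lie within the format type mask");

/* True when the record holds raw source trace, false for frame formatted */
static bool aux_record_is_raw(uint64_t flags)
{
	return (flags & PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK) ==
	       PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW;
}
```

Records written by existing sinks carry 0 in that byte, so they keep
decoding as frame formatted, which is the backward compatibility the
comment on the default value refers to.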

> #define PERF_FLAG_FD_NO_GROUP (1UL << 0)
> #define PERF_FLAG_FD_OUTPUT (1UL << 1)
> #define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */
> --
> 2.24.1
>

2021-03-16 21:12:40

by Mathieu Poirier

Subject: Re: [PATCH v4 07/19] arm64: Add TRBE definitions

On Thu, Feb 25, 2021 at 07:35:31PM +0000, Suzuki K Poulose wrote:
> From: Anshuman Khandual <[email protected]>
>
> This adds TRBE related registers and corresponding feature macros.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Reviewed-by: Suzuki K Poulose <[email protected]>
> Reviewed-by: Mike Leach <[email protected]>
> Acked-by: Catalin Marinas <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> arch/arm64/include/asm/sysreg.h | 50 +++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)

Reviewed-by: Mathieu Poirier <[email protected]>

>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index dfd4edbfe360..6470d783ea59 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -333,6 +333,55 @@
>
> /*** End of Statistical Profiling Extension ***/
>
> +/*
> + * TRBE Registers
> + */
> +#define SYS_TRBLIMITR_EL1 sys_reg(3, 0, 9, 11, 0)
> +#define SYS_TRBPTR_EL1 sys_reg(3, 0, 9, 11, 1)
> +#define SYS_TRBBASER_EL1 sys_reg(3, 0, 9, 11, 2)
> +#define SYS_TRBSR_EL1 sys_reg(3, 0, 9, 11, 3)
> +#define SYS_TRBMAR_EL1 sys_reg(3, 0, 9, 11, 4)
> +#define SYS_TRBTRG_EL1 sys_reg(3, 0, 9, 11, 6)
> +#define SYS_TRBIDR_EL1 sys_reg(3, 0, 9, 11, 7)
> +
> +#define TRBLIMITR_LIMIT_MASK GENMASK_ULL(51, 0)
> +#define TRBLIMITR_LIMIT_SHIFT 12
> +#define TRBLIMITR_NVM BIT(5)
> +#define TRBLIMITR_TRIG_MODE_MASK GENMASK(1, 0)
> +#define TRBLIMITR_TRIG_MODE_SHIFT 3
> +#define TRBLIMITR_FILL_MODE_MASK GENMASK(1, 0)
> +#define TRBLIMITR_FILL_MODE_SHIFT 1
> +#define TRBLIMITR_ENABLE BIT(0)
> +#define TRBPTR_PTR_MASK GENMASK_ULL(63, 0)
> +#define TRBPTR_PTR_SHIFT 0
> +#define TRBBASER_BASE_MASK GENMASK_ULL(51, 0)
> +#define TRBBASER_BASE_SHIFT 12
> +#define TRBSR_EC_MASK GENMASK(5, 0)
> +#define TRBSR_EC_SHIFT 26
> +#define TRBSR_IRQ BIT(22)
> +#define TRBSR_TRG BIT(21)
> +#define TRBSR_WRAP BIT(20)
> +#define TRBSR_ABORT BIT(18)
> +#define TRBSR_STOP BIT(17)
> +#define TRBSR_MSS_MASK GENMASK(15, 0)
> +#define TRBSR_MSS_SHIFT 0
> +#define TRBSR_BSC_MASK GENMASK(5, 0)
> +#define TRBSR_BSC_SHIFT 0
> +#define TRBSR_FSC_MASK GENMASK(5, 0)
> +#define TRBSR_FSC_SHIFT 0
> +#define TRBMAR_SHARE_MASK GENMASK(1, 0)
> +#define TRBMAR_SHARE_SHIFT 8
> +#define TRBMAR_OUTER_MASK GENMASK(3, 0)
> +#define TRBMAR_OUTER_SHIFT 4
> +#define TRBMAR_INNER_MASK GENMASK(3, 0)
> +#define TRBMAR_INNER_SHIFT 0
> +#define TRBTRG_TRG_MASK GENMASK(31, 0)
> +#define TRBTRG_TRG_SHIFT 0
> +#define TRBIDR_FLAG BIT(5)
> +#define TRBIDR_PROG BIT(4)
> +#define TRBIDR_ALIGN_MASK GENMASK(3, 0)
> +#define TRBIDR_ALIGN_SHIFT 0
> +
> #define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1)
> #define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)
>
> @@ -835,6 +884,7 @@
> #define ID_AA64MMFR2_CNP_SHIFT 0
>
> /* id_aa64dfr0 */
> +#define ID_AA64DFR0_TRBE_SHIFT 44
> #define ID_AA64DFR0_TRACE_FILT_SHIFT 40
> #define ID_AA64DFR0_DOUBLELOCK_SHIFT 36
> #define ID_AA64DFR0_PMSVER_SHIFT 32
> --
> 2.24.1
>
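
The *_MASK values quoted above are unshifted (for example, TRBSR_EC_MASK is GENMASK(5, 0) with a shift of 26), which suggests a shift-then-mask convention for reading a field. A minimal sketch under that assumption follows; the helper trbe_field() is hypothetical and the masks are written out as plain constants so that the block builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the quoted sysreg.h hunk, with GENMASK() expanded. */
#define TRBSR_EC_MASK		0x3fULL		/* GENMASK(5, 0) */
#define TRBSR_EC_SHIFT		26
#define TRBSR_WRAP		(1ULL << 20)	/* BIT(20) */
#define TRBIDR_ALIGN_MASK	0xfULL		/* GENMASK(3, 0) */
#define TRBIDR_ALIGN_SHIFT	0

/*
 * Hypothetical helper: shift the register value down first, then apply
 * the unshifted mask.
 */
static inline uint64_t trbe_field(uint64_t reg, unsigned int shift,
				  uint64_t mask)
{
	assert(shift < 64);
	return (reg >> shift) & mask;
}
```

Single-bit flags such as TRBSR_WRAP are already in position and can be tested directly with a bitwise AND.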

2021-03-16 21:15:09

by Mathieu Poirier

Subject: Re: [PATCH v4 08/19] arm64: kvm: Enable access to TRBE support for host

On Thu, Feb 25, 2021 at 07:35:32PM +0000, Suzuki K Poulose wrote:
> For a nvhe host, the EL2 must allow the EL1&0 translation
> regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must
> be saved/restored over a trip to the guest. Also, before
> entering the guest, we must flush any trace data if the
> TRBE was enabled. And we must prohibit the generation
> of trace while we are in EL1 by clearing the TRFCR_EL1.
>
> For vhe, the EL2 must prevent the EL1 access to the Trace
> Buffer.
>
> Cc: Will Deacon <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Mark Rutland <[email protected]>
> cc: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes
> - Rebased to linux-next.
> - Re-enable TRBE access on host restore.
> - For nvhe, flush the trace and prohibit the
> trace while we are at guest.
> ---
> arch/arm64/include/asm/el2_setup.h | 13 +++++++++
> arch/arm64/include/asm/kvm_arm.h | 2 ++
> arch/arm64/include/asm/kvm_host.h | 2 ++
> arch/arm64/kernel/hyp-stub.S | 3 ++-
> arch/arm64/kvm/debug.c | 6 ++---
> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++++++++++++++++++++++++++++++
> arch/arm64/kvm/hyp/nvhe/switch.c | 1 +
> 7 files changed, 65 insertions(+), 4 deletions(-)
>

Acked-by: Mathieu Poirier <[email protected]>

> diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
> index d77d358f9395..bda918948471 100644
> --- a/arch/arm64/include/asm/el2_setup.h
> +++ b/arch/arm64/include/asm/el2_setup.h
> @@ -65,6 +65,19 @@
> // use EL1&0 translation.
>
> .Lskip_spe_\@:
> + /* Trace buffer */
> + ubfx x0, x1, #ID_AA64DFR0_TRBE_SHIFT, #4
> + cbz x0, .Lskip_trace_\@ // Skip if TraceBuffer is not present
> +
> + mrs_s x0, SYS_TRBIDR_EL1
> + and x0, x0, TRBIDR_PROG
> + cbnz x0, .Lskip_trace_\@ // If TRBE is available at EL2
> +
> + mov x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
> + orr x2, x2, x0 // allow the EL1&0 translation
> + // to own it.
> +
> +.Lskip_trace_\@:
> msr mdcr_el2, x2 // Configure debug traps
> .endm
>
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 94d4025acc0b..692c9049befa 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -278,6 +278,8 @@
> #define CPTR_EL2_DEFAULT CPTR_EL2_RES1
>
> /* Hyp Debug Configuration Register bits */
> +#define MDCR_EL2_E2TB_MASK (UL(0x3))
> +#define MDCR_EL2_E2TB_SHIFT (UL(24))
> #define MDCR_EL2_TTRF (1 << 19)
> #define MDCR_EL2_TPMS (1 << 14)
> #define MDCR_EL2_E2PB_MASK (UL(0x3))
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 3d10e6527f7d..80d0a1a82a4c 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -315,6 +315,8 @@ struct kvm_vcpu_arch {
> struct kvm_guest_debug_arch regs;
> /* Statistical profiling extension */
> u64 pmscr_el1;
> + /* Self-hosted trace */
> + u64 trfcr_el1;
> } host_debug_state;
>
> /* VGIC state */
> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> index 678cd2c618ee..ba84502b582a 100644
> --- a/arch/arm64/kernel/hyp-stub.S
> +++ b/arch/arm64/kernel/hyp-stub.S
> @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
> mrs_s x0, SYS_VBAR_EL12
> msr vbar_el1, x0
>
> - // Use EL2 translations for SPE and disable access from EL1
> + // Use EL2 translations for SPE & TRBE and disable access from EL1
> mrs x0, mdcr_el2
> bic x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
> + bic x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
> msr mdcr_el2, x0
>
> // Transfer the MM state from EL1 to EL2
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index dbc890511631..7b16f42d39f4 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
> * - Debug ROM Address (MDCR_EL2_TDRA)
> * - OS related registers (MDCR_EL2_TDOSA)
> * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
> - * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
> + * - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
> *
> * Additionally, KVM only traps guest accesses to the debug registers if
> * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
> @@ -107,8 +107,8 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
> trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
>
> /*
> - * This also clears MDCR_EL2_E2PB_MASK to disable guest access
> - * to the profiling buffer.
> + * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
> + * to disable guest access to the profiling and trace buffers
> */
> vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
> vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> index f401724f12ef..9499e18dd28f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
> @@ -58,10 +58,51 @@ static void __debug_restore_spe(u64 pmscr_el1)
> write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
> }
>
> +static void __debug_save_trace(u64 *trfcr_el1)
> +{
> +
> + *trfcr_el1 = 0;
> +
> + /* Check if we have TRBE */
> + if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
> + ID_AA64DFR0_TRBE_SHIFT))
> + return;
> +
> + /* Check we can access the TRBE */
> + if ((read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_PROG))
> + return;
> +
> + /* Check if the TRBE is enabled */
> + if (!(read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_ENABLE))
> + return;
> + /*
> + * Prohibit trace generation while we are in guest.
> + * Since access to TRFCR_EL1 is trapped, the guest can't
> + * modify the filtering set by the host.
> + */
> + *trfcr_el1 = read_sysreg_s(SYS_TRFCR_EL1);
> + write_sysreg_s(0, SYS_TRFCR_EL1);
> + isb();
> + /* Drain the trace buffer to memory */
> + tsb_csync();
> + dsb(nsh);
> +}
> +
> +static void __debug_restore_trace(u64 trfcr_el1)
> +{
> + if (!trfcr_el1)
> + return;
> +
> + /* Restore trace filter controls */
> + write_sysreg_s(trfcr_el1, SYS_TRFCR_EL1);
> +}
> +
> void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> /* Disable and flush SPE data generation */
> __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
> + /* Disable and flush Self-Hosted Trace generation */
> + __debug_save_trace(&vcpu->arch.host_debug_state.trfcr_el1);
> }
>
> void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
> @@ -72,6 +113,7 @@ void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
> void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu)
> {
> __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
> + __debug_restore_trace(vcpu->arch.host_debug_state.trfcr_el1);
> }
>
> void __debug_switch_to_host(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 10eed66136a0..d6ea5c8b5551 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -95,6 +95,7 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
>
> mdcr_el2 &= MDCR_EL2_HPMN_MASK;
> mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
> + mdcr_el2 |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
>
> write_sysreg(mdcr_el2, mdcr_el2);
> if (is_protected_kvm_enabled())
> --
> 2.24.1
>
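
The two MDCR_EL2.E2TB settings described in the commit message (0b11 so that an nVHE host at EL1 owns the trace buffer, cleared for VHE) come down to setting or clearing one two-bit field. A minimal sketch, using the mask and shift from the quoted kvm_arm.h hunk; the helper names are hypothetical and operate on a plain value rather than the real system register.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted kvm_arm.h hunk. */
#define MDCR_EL2_E2TB_MASK	0x3ULL
#define MDCR_EL2_E2TB_SHIFT	24

/* Hypothetical helper: E2TB = 0b11, the EL1&0 regime owns the trace buffer. */
static inline uint64_t mdcr_el2_allow_el1_trbe(uint64_t mdcr)
{
	return mdcr | (MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT);
}

/* Hypothetical helper: E2TB = 0b00, EL2 owns the trace buffer. */
static inline uint64_t mdcr_el2_deny_el1_trbe(uint64_t mdcr)
{
	uint64_t cleared = mdcr & ~(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT);

	/* No E2TB bit may survive the clear. */
	assert(((cleared >> MDCR_EL2_E2TB_SHIFT) & MDCR_EL2_E2TB_MASK) == 0);
	return cleared;
}
```

The first helper corresponds to the orr in el2_setup.h and the |= in __deactivate_traps(); the second corresponds to the bic in mutate_to_vhe.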

2021-03-16 21:24:49

by Mathieu Poirier

Subject: Re: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

On Thu, Feb 25, 2021 at 07:35:33PM +0000, Suzuki K Poulose wrote:
> If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
> move the ETM to trace prohibited region using TRFCR, while disabling.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> New patch

I would ask you to refrain from introducing new patches. Otherwise the goal
posts keep on moving with every revision and we'll never get through. Fixes and
enhancements can come in later patchsets.

> ---
> .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
> 2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 15016f757828..00297906669c 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -31,6 +31,7 @@
> #include <linux/pm_runtime.h>
> #include <linux/property.h>
>
> +#include <asm/barrier.h>
> #include <asm/sections.h>
> #include <asm/sysreg.h>
> #include <asm/local.h>
> @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
> static void etm4_disable_hw(void *info)
> {
> u32 control;
> + u64 trfcr;
> struct etmv4_drvdata *drvdata = info;
> struct etmv4_config *config = &drvdata->config;
> struct coresight_device *csdev = drvdata->csdev;
> @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
> /* EN, bit[0] Trace unit enable bit */
> control &= ~0x1;
>
> + /*
> + * If the CPU supports v8.4 Trace filter Control,
> + * set the ETM to trace prohibited region.
> + */
> + if (drvdata->trfc) {
> + trfcr = read_sysreg_s(SYS_TRFCR_EL1);
> + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
> + SYS_TRFCR_EL1);
> + isb();
> + }
> /*
> * Make sure everything completes before disabling, as recommended
> * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
> @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
> */
> dsb(sy);
> isb();
> + /* Trace synchronization barrier, is a nop if not supported */
> + tsb_csync();
> etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
>
> /* wait for TRCSTATR.PMSTABLE to go to '1' */
> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
> dev_err(etm_dev,
> "timeout while waiting for PM stable Trace Status\n");
> + if (drvdata->trfc)
> + write_sysreg_s(trfcr, SYS_TRFCR_EL1);

drvdata->trfc is invariably set to true in cpu_enable_tracing() and as such
testing for it is not required.

>
> /* read the status of the single shot comparators */
> for (i = 0; i < drvdata->nr_ss_cmp; i++) {
> @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
> return false;
> }
>
> -static void cpu_enable_tracing(void)
> +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
> {
> u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
> u64 trfcr;
> @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
> if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
> return;
>
> + drvdata->trfc = true;
> /*
> * If the CPU supports v8.4 SelfHosted Tracing, enable
> * tracing at the kernel EL and EL0, forcing to use the
> @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
> /* NUMCNTR, bits[30:28] number of counters available for tracing */
> drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
> etm4_cs_lock(drvdata, csa);
> - cpu_enable_tracing();
> + cpu_enable_tracing(drvdata);

At least for this patch, the above three hunks aren't needed.

> }
>
> static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index 0af60571aa23..f6478ef642bf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -862,6 +862,7 @@ struct etmv4_save_state {
> * @nooverflow: Indicate if overflow prevention is supported.
> * @atbtrig: If the implementation can support ATB triggers
> * @lpoverride: If the implementation can support low-power state over.
> + * @trfc: If the implementation supports Arm v8.4 trace filter controls.
> * @config: structure holding configuration parameters.
> * @save_state: State to be preserved across power loss
> * @state_needs_restore: True when there is context to restore after PM exit
> @@ -912,6 +913,7 @@ struct etmv4_drvdata {
> bool nooverflow;
> bool atbtrig;
> bool lpoverride;
> + bool trfc;

Nor is this one.

> struct etmv4_config config;
> struct etmv4_save_state *save_state;
> bool state_needs_restore;
> --
> 2.24.1
>

2021-03-16 21:25:04

by Mathieu Poirier

Subject: Re: [PATCH v4 10/19] coresight: etm-perf: Allow an event to use different sinks

On Thu, Feb 25, 2021 at 07:35:34PM +0000, Suzuki K Poulose wrote:
> When a sink is not specified by the user, the etm perf driver
> finds a suitable sink automatically, based on the first ETM
> where this event could be scheduled. Then we allocate the
> sink buffer based on the selected sink. This is fine for a
> CPU bound event as the "sink" is always guaranteed to be
> reachable from the ETM (as this is the only ETM where the
> event is going to be scheduled). However, if we have a thread
> bound event, the event could be scheduled on any of the ETMs
> on the system. In this case, currently we automatically select
> a sink and exclude any ETMs that cannot reach the selected
> sink. This is problematic especially for 1x1 configurations.
> We end up in tracing the event only on the "first" ETM,
> as the default sink is local to the first ETM and unreachable
> from the rest. However, we could allow the other ETMs to
> trace if they all have a sink that is compatible with the
> "selected" sink and can use the sink buffer. This can be
> easily done by verifying that they are all driven by the
> same driver and matches the same subtype. Please note
> that at anytime there can be only one ETM tracing the event.
>
> Adding support for different types of sinks for a single
> event is complex and is not something that we expect
> on a sane configuration.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Tested-by: Linu Cherian <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Rename sinks_match => sinks_compatible
> - Tighten the check by matching the sink subtype
> - Use user_sink instead of "sink_forced" and clean up the code (Mathieu)
> - More comments, better commit description
> ---
> .../hwtracing/coresight/coresight-etm-perf.c | 60 ++++++++++++++++---
> 1 file changed, 52 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 0f603b4094f2..aa0974bd265b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -232,6 +232,25 @@ static void etm_free_aux(void *data)
> schedule_work(&event_data->work);
> }
>
> +/*
> + * Check if two given sinks are compatible with each other,
> + * so that they can use the same sink buffers, when an event
> + * moves around.
> + */
> +static bool sinks_compatible(struct coresight_device *a,
> + struct coresight_device *b)
> +{
> + if (!a || !b)
> + return false;
> + /*
> + * If the sinks are of the same subtype and driven
> + * by the same driver, we can use the same buffer
> + * on these sinks.
> + */
> + return (a->subtype.sink_subtype == b->subtype.sink_subtype) &&
> + (sink_ops(a) == sink_ops(b));

Ok

> +}
> +
> static void *etm_setup_aux(struct perf_event *event, void **pages,
> int nr_pages, bool overwrite)
> {
> @@ -239,6 +258,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> int cpu = event->cpu;
> cpumask_t *mask;
> struct coresight_device *sink = NULL;
> + struct coresight_device *user_sink = NULL, *last_sink = NULL;
> struct etm_event_data *event_data = NULL;
>
> event_data = alloc_event_data(cpu);
> @@ -249,7 +269,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> /* First get the selected sink from user space. */
> if (event->attr.config2) {
> id = (u32)event->attr.config2;
> - sink = coresight_get_sink_by_id(id);
> + sink = user_sink = coresight_get_sink_by_id(id);
> }
>
> mask = &event_data->mask;
> @@ -277,14 +297,33 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /*
> - * No sink provided - look for a default sink for one of the
> - * devices. At present we only support topology where all CPUs
> - * use the same sink [N:1], so only need to find one sink. The
> - * coresight_build_path later will remove any CPU that does not
> - * attach to the sink, or if we have not found a sink.
> + * No sink provided - look for a default sink for all the ETMs,
> + * where this event can be scheduled.
> + * We allocate the sink specific buffers only once for this
> + * event. If the ETMs have different default sink devices, we
> + * can only use a single "type" of sink as the event can carry
> + * only one sink specific buffer. Thus we have to make sure
> + * that the sinks are of the same type and driven by the same
> + * driver, as the one we allocate the buffer for. As such
> + * we choose the first sink and check if the remaining ETMs
> + * have a compatible default sink. We don't trace on a CPU
> + * if the sink is not compatible.
> */
> - if (!sink)
> + if (!user_sink) {
> + /* Find the default sink for this ETM */
> sink = coresight_find_default_sink(csdev);
> + if (!sink) {
> + cpumask_clear_cpu(cpu, mask);
> + continue;
> + }
> +
> + /* Check if this sink compatible with the last sink */
> + if (last_sink && !sinks_compatible(last_sink, sink)) {
> + cpumask_clear_cpu(cpu, mask);
> + continue;
> + }
> + last_sink = sink;

This is much better.

I thought about something when I first looked at this patch in the previous
revision... With the above we are changing the behavior of the CS framework for
systems that have one sink per CPU _cluster_, but for once it is for the better.

With this patch coresight_find_default_sink() is called for every CPU,
allowing CPUs in the second cluster to find a valid path and be included in the
trace session. Before this patch CPUs in the second cluster couldn't
establish a valid path to the sink since it was only reachable from the first
cluster.

Reviewed-by: Mathieu Poirier <[email protected]>

More comments to come tomorrow.

Thanks,
Mathieu
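
The per-CPU selection discussed above can be modelled outside the kernel. This is a minimal sketch, not the driver code: struct mock_sink and select_cpus() are hypothetical stand-ins for struct coresight_device and the loop in etm_setup_aux(), and the array of default sinks replaces the calls to coresight_find_default_sink().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct coresight_device. */
struct mock_sink {
	int subtype;		/* models subtype.sink_subtype */
	const void *ops;	/* models sink_ops(), i.e. the driver */
};

/* Same rule as the quoted sinks_compatible(): same subtype, same driver. */
static bool sinks_compatible(const struct mock_sink *a,
			     const struct mock_sink *b)
{
	if (!a || !b)
		return false;
	return a->subtype == b->subtype && a->ops == b->ops;
}

/*
 * Return a bitmask of the CPUs allowed to trace. A CPU is dropped when it
 * has no default sink, or when its sink is incompatible with the last
 * accepted one.
 */
static unsigned long select_cpus(const struct mock_sink *const *default_sink,
				 size_t ncpus)
{
	const struct mock_sink *last_sink = NULL;
	unsigned long mask = 0;
	size_t cpu;

	assert(ncpus <= sizeof(mask) * CHAR_BIT);

	for (cpu = 0; cpu < ncpus; cpu++) {
		const struct mock_sink *sink = default_sink[cpu];

		if (!sink)
			continue;
		if (last_sink && !sinks_compatible(last_sink, sink))
			continue;
		last_sink = sink;
		mask |= 1UL << cpu;
	}
	return mask;
}
```

With one sink per cluster, every CPU in a cluster resolves to its own cluster's sink, so CPUs in the second cluster stay in the mask as long as both sinks share a subtype and a driver.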

> + }
>
> /*
> * Building a path doesn't enable it, it simply builds a
> @@ -312,7 +351,12 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> if (!sink_ops(sink)->alloc_buffer || !sink_ops(sink)->free_buffer)
> goto err;
>
> - /* Allocate the sink buffer for this session */
> + /*
> + * Allocate the sink buffer for this session. All the sinks
> + * where this event can be scheduled are ensured to be of the
> + * same type. Thus the same sink configuration is used by the
> + * sinks.
> + */
> event_data->snk_config =
> sink_ops(sink)->alloc_buffer(sink, event, pages,
> nr_pages, overwrite);
> --
> 2.24.1
>

2021-03-17 10:48:19

by Suzuki K Poulose

Subject: Re: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

Hi Mathieu

On 3/16/21 7:30 PM, Mathieu Poirier wrote:
> On Thu, Feb 25, 2021 at 07:35:33PM +0000, Suzuki K Poulose wrote:
>> If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
>> move the ETM to trace prohibited region using TRFCR, while disabling.
>>
>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Cc: Anshuman Khandual <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> New patch
>
> I would ask you to refrain from introducing new patches. Otherwise the goal
> posts keep on moving with every revision and we'll never get through. Fixes and
> enhancements can come in later patchsets.
>

While I agree that this is a fix and a new patch, it is also consistent with
what we do in the nvhe hypervisor to disable tracing while we enter the guest,
by using the Trace filter controls.
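
The pattern shared by the driver and the nVHE hypervisor is: save TRFCR_EL1, clear the trace-enable bits, do the work, then write the saved value back. A minimal sketch of that sequence follows. Two assumptions are made here: the bit positions (ExTRE as bit 1, E0TRE as bit 0) are not shown in this thread, and a plain variable, fake_trfcr_el1, stands in for the system register; the barriers (isb, tsb_csync) are omitted because they have no user-space equivalent.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; the definitions are not part of the quoted hunks. */
#define TRFCR_ELx_ExTRE	(1ULL << 1)
#define TRFCR_ELx_E0TRE	(1ULL << 0)

/* Hypothetical stand-in for the TRFCR_EL1 system register. */
static uint64_t fake_trfcr_el1;

/* Move to the trace prohibited region; return the value to restore later. */
static uint64_t trace_prohibit(void)
{
	uint64_t saved = fake_trfcr_el1;

	fake_trfcr_el1 = saved & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE);
	assert(!(fake_trfcr_el1 & (TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE)));
	return saved;
}

/* Put back whatever filtering was in place before trace_prohibit(). */
static void trace_restore(uint64_t saved)
{
	fake_trfcr_el1 = saved;
}
```

Only the two enable bits are cleared, so any other fields in the register keep their value across the prohibited window.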

>> ---
>> .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
>> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
>> 2 files changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index 15016f757828..00297906669c 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -31,6 +31,7 @@
>> #include <linux/pm_runtime.h>
>> #include <linux/property.h>
>>
>> +#include <asm/barrier.h>
>> #include <asm/sections.h>
>> #include <asm/sysreg.h>
>> #include <asm/local.h>
>> @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
>> static void etm4_disable_hw(void *info)
>> {
>> u32 control;
>> + u64 trfcr;
>> struct etmv4_drvdata *drvdata = info;
>> struct etmv4_config *config = &drvdata->config;
>> struct coresight_device *csdev = drvdata->csdev;
>> @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
>> /* EN, bit[0] Trace unit enable bit */
>> control &= ~0x1;
>>
>> + /*
>> + * If the CPU supports v8.4 Trace filter Control,
>> + * set the ETM to trace prohibited region.
>> + */
>> + if (drvdata->trfc) {
>> + trfcr = read_sysreg_s(SYS_TRFCR_EL1);
>> + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
>> + SYS_TRFCR_EL1);
>> + isb();
>> + }
>> /*
>> * Make sure everything completes before disabling, as recommended
>> * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
>> @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
>> */
>> dsb(sy);
>> isb();
>> + /* Trace synchronization barrier, is a nop if not supported */
>> + tsb_csync();
>> etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
>>
>> /* wait for TRCSTATR.PMSTABLE to go to '1' */
>> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
>> dev_err(etm_dev,
>> "timeout while waiting for PM stable Trace Status\n");
>> + if (drvdata->trfc)
>> + write_sysreg_s(trfcr, SYS_TRFCR_EL1);
>
> drvdata->trfc is invariably set to true in cpu_enable_tracing() and as such
> testing for it is not required.

That is not true. This is only set when the CPU supports trace filtering.
So, this is more of a capability field for the CPU where the ETM is bound.
Only v8.4+ CPUs implement trace filtering controls.

Cheers
Suzuki
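
The capability check referred to here reads a 4-bit ID field from ID_AA64DFR0_EL1 and treats a nonzero value as "implemented". A minimal sketch of that test, using the shifts quoted earlier in the thread; id_field() and cpu_has_trace_filter() are hypothetical names that model cpuid_feature_extract_unsigned_field() on a plain value instead of the real register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shifts copied from the quoted sysreg.h hunk. */
#define ID_AA64DFR0_TRBE_SHIFT		44
#define ID_AA64DFR0_TRACE_FILT_SHIFT	40

/* Hypothetical helper: extract an unsigned 4-bit ID register field. */
static inline unsigned int id_field(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & 0xf);
}

/* A nonzero TraceFilt field means the CPU has trace filter controls. */
static inline bool cpu_has_trace_filter(uint64_t dfr0)
{
	return id_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT) != 0;
}
```

On a CPU where the field reads as zero, cpu_enable_tracing() returns before setting drvdata->trfc, which is why the flag has to be tested at disable time.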


>
>>
>> /* read the status of the single shot comparators */
>> for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>> @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
>> return false;
>> }
>>
>> -static void cpu_enable_tracing(void)
>> +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
>> {
>> u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
>> u64 trfcr;
>> @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
>> if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
>> return;
>>
>> + drvdata->trfc = true;
>> /*
>> * If the CPU supports v8.4 SelfHosted Tracing, enable
>> * tracing at the kernel EL and EL0, forcing to use the
>> @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
>> /* NUMCNTR, bits[30:28] number of counters available for tracing */
>> drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
>> etm4_cs_lock(drvdata, csa);
>> - cpu_enable_tracing();
>> + cpu_enable_tracing(drvdata);
>
> At least for this patch, the above three hunks aren't needed.
>
>> }
>>
>> static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
>> index 0af60571aa23..f6478ef642bf 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
>> @@ -862,6 +862,7 @@ struct etmv4_save_state {
>> * @nooverflow: Indicate if overflow prevention is supported.
>> * @atbtrig: If the implementation can support ATB triggers
>> * @lpoverride: If the implementation can support low-power state over.
>> + * @trfc: If the implementation supports Arm v8.4 trace filter controls.
>> * @config: structure holding configuration parameters.
>> * @save_state: State to be preserved across power loss
>> * @state_needs_restore: True when there is context to restore after PM exit
>> @@ -912,6 +913,7 @@ struct etmv4_drvdata {
>> bool nooverflow;
>> bool atbtrig;
>> bool lpoverride;
>> + bool trfc;
>
> Nor is this one.
>
>> struct etmv4_config config;
>> struct etmv4_save_state *save_state;
>> bool state_needs_restore;
>> --
>> 2.24.1
>>

2021-03-17 10:49:59

by Suzuki K Poulose

Subject: Re: [PATCH v4 10/19] coresight: etm-perf: Allow an event to use different sinks

On 3/16/21 8:23 PM, Mathieu Poirier wrote:
> On Thu, Feb 25, 2021 at 07:35:34PM +0000, Suzuki K Poulose wrote:
>> When a sink is not specified by the user, the etm perf driver
>> finds a suitable sink automatically, based on the first ETM
>> where this event could be scheduled. Then we allocate the
>> sink buffer based on the selected sink. This is fine for a
>> CPU bound event as the "sink" is always guaranteed to be
>> reachable from the ETM (as this is the only ETM where the
>> event is going to be scheduled). However, if we have a thread
>> bound event, the event could be scheduled on any of the ETMs
>> on the system. In this case, currently we automatically select
>> a sink and exclude any ETMs that cannot reach the selected
>> sink. This is problematic especially for 1x1 configurations.
>> We end up in tracing the event only on the "first" ETM,
>> as the default sink is local to the first ETM and unreachable
>> from the rest. However, we could allow the other ETMs to
>> trace if they all have a sink that is compatible with the
>> "selected" sink and can use the sink buffer. This can be
>> easily done by verifying that they are all driven by the
>> same driver and matches the same subtype. Please note
>> that at anytime there can be only one ETM tracing the event.
>>
>> Adding support for different types of sinks for a single
>> event is complex and is not something that we expect
>> on a sane configuration.
>>
>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Tested-by: Linu Cherian <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> Changes:
>> - Rename sinks_match => sinks_compatible
>> - Tighten the check by matching the sink subtype
>> - Use user_sink instead of "sink_forced" and clean up the code (Mathieu)
>> - More comments, better commit description
>> ---
>> .../hwtracing/coresight/coresight-etm-perf.c | 60 ++++++++++++++++---
>> 1 file changed, 52 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index 0f603b4094f2..aa0974bd265b 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -232,6 +232,25 @@ static void etm_free_aux(void *data)
>> schedule_work(&event_data->work);
>> }
>>
>> +/*
>> + * Check if two given sinks are compatible with each other,
>> + * so that they can use the same sink buffers, when an event
>> + * moves around.
>> + */
>> +static bool sinks_compatible(struct coresight_device *a,
>> + struct coresight_device *b)
>> +{
>> + if (!a || !b)
>> + return false;
>> + /*
>> + * If the sinks are of the same subtype and driven
>> + * by the same driver, we can use the same buffer
>> + * on these sinks.
>> + */
>> + return (a->subtype.sink_subtype == b->subtype.sink_subtype) &&
>> + (sink_ops(a) == sink_ops(b));
>
> Ok
>
>> +}
>> +
>> static void *etm_setup_aux(struct perf_event *event, void **pages,
>> int nr_pages, bool overwrite)
>> {
>> @@ -239,6 +258,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>> int cpu = event->cpu;
>> cpumask_t *mask;
>> struct coresight_device *sink = NULL;
>> + struct coresight_device *user_sink = NULL, *last_sink = NULL;
>> struct etm_event_data *event_data = NULL;
>>
>> event_data = alloc_event_data(cpu);
>> @@ -249,7 +269,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>> /* First get the selected sink from user space. */
>> if (event->attr.config2) {
>> id = (u32)event->attr.config2;
>> - sink = coresight_get_sink_by_id(id);
>> + sink = user_sink = coresight_get_sink_by_id(id);
>> }
>>
>> mask = &event_data->mask;
>> @@ -277,14 +297,33 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>> }
>>
>> /*
>> - * No sink provided - look for a default sink for one of the
>> - * devices. At present we only support topology where all CPUs
>> - * use the same sink [N:1], so only need to find one sink. The
>> - * coresight_build_path later will remove any CPU that does not
>> - * attach to the sink, or if we have not found a sink.
>> + * No sink provided - look for a default sink for all the ETMs,
>> + * where this event can be scheduled.
>> + * We allocate the sink specific buffers only once for this
>> + * event. If the ETMs have different default sink devices, we
>> + * can only use a single "type" of sink as the event can carry
>> + * only one sink specific buffer. Thus we have to make sure
>> + * that the sinks are of the same type and driven by the same
>> + * driver, as the one we allocate the buffer for. As such
>> + * we choose the first sink and check if the remaining ETMs
>> + * have a compatible default sink. We don't trace on a CPU
>> + * if the sink is not compatible.
>> */
>> - if (!sink)
>> + if (!user_sink) {
>> + /* Find the default sink for this ETM */
>> sink = coresight_find_default_sink(csdev);
>> + if (!sink) {
>> + cpumask_clear_cpu(cpu, mask);
>> + continue;
>> + }
>> +
>> + /* Check if this sink compatible with the last sink */
>> + if (last_sink && !sinks_compatible(last_sink, sink)) {
>> + cpumask_clear_cpu(cpu, mask);
>> + continue;
>> + }
>> + last_sink = sink;
>
> This is much better.
>
> I thought about something when I first looked at this patch in the previous
> revision... With the above we are changing the behavior of the CS framework for
> systems that have one sink per CPU _clusters_, but for once it is for the better.
>
> With this patch coresight_find_default_sink() is called for every CPU,
> allowing CPUs in the second cluster to find a valid path and be included in the
> trace session. Before this patch CPUs in the second cluster couldn't
> establish a valid path to the sink since it was only reachable from the first
> cluster.

Exactly. That is the whole purpose of this patch, i.e., to allow tracing on all
CPUs with a per-cpu sink configuration.
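
A minimal user-space sketch of the per-CPU selection loop discussed above.
It is not the driver code: struct sink, select_cpus() and the plain bitmask
are hypothetical stand-ins for struct coresight_device, the loop in
etm_setup_aux() and cpumask_t, and the compatibility rule (same type, same
driver) is assumed from the comment in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct coresight_device. */
struct sink {
	int type;	/* sink subtype, e.g. per-CPU system memory */
	int driver;	/* identifies the driver that owns the sink */
};

/* Assumed rule: same type and same driver can share one sink buffer. */
static bool sinks_compatible(const struct sink *a, const struct sink *b)
{
	return a->type == b->type && a->driver == b->driver;
}

/*
 * Look up the default sink of every CPU set in 'mask'. A CPU is dropped
 * from the mask if it has no default sink, or if its sink is not
 * compatible with the last sink that was accepted.
 */
static unsigned int select_cpus(unsigned int mask,
				const struct sink *def_sink[], int nr_cpus)
{
	const struct sink *last_sink = NULL;
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		const struct sink *sink = def_sink[cpu];

		if (!(mask & (1u << cpu)))
			continue;
		if (!sink || (last_sink && !sinks_compatible(last_sink, sink))) {
			mask &= ~(1u << cpu);	/* do not trace on this CPU */
			continue;
		}
		last_sink = sink;
	}
	return mask;
}
```

With one compatible sink per CPU every CPU stays in the mask, which is the
per-cpu sink case the patch is meant to enable.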

>
> Reviewed-by: Mathieu Poirier <[email protected]>

Thanks

Suzuki

2021-03-17 19:33:21

by Mathieu Poirier

Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

On Thu, Feb 25, 2021 at 07:35:41PM +0000, Suzuki K Poulose wrote:
> From: Anshuman Khandual <[email protected]>
>
> Add support for dedicated sinks that are bound to individual CPUs (e.g.,
> TRBE). To allow quicker access to the sink for a given CPU bound source,
> keep a percpu array of the sink devices. Also, add support for building
> a path to the CPU local sink from the ETM.
>
> This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM.
> This new sink type is exclusively available and can only work with percpu
> source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.
>
> This defines a percpu structure that accommodates a single coresight_device
> which can be used to store an initialized instance from a sink driver. As
> these sinks are exclusively linked and dependent on corresponding percpu
> source devices, they should also be the default sink device during a perf
> session.
>
> Outwards device connections are scanned while establishing paths between a
> source and a sink device. But such connections are not present for certain
> percpu source and sink devices which are exclusively linked and dependent.
> Build the path directly and skip connection scanning for such devices.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Tested-by: Suzuki K Poulose <[email protected]>
> Reviewed-by: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> [Moved the set/get percpu sink APIs from TRBE patch to here]
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Export methods to set/get percpu sinks for fixing module
> build for TRBE
> - Addressed coding style comments (Suzuki)
> - Check status of _coresight_build_path() (Mathieu)
> ---
> drivers/hwtracing/coresight/coresight-core.c | 29 ++++++++++++++++++--
> drivers/hwtracing/coresight/coresight-priv.h | 3 ++
> include/linux/coresight.h | 12 ++++++++
> 3 files changed, 42 insertions(+), 2 deletions(-)

Reviewed-by: Mathieu Poirier <[email protected]>

>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 0062c8935653..55c645616bf6 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -23,6 +23,7 @@
> #include "coresight-priv.h"
>
> static DEFINE_MUTEX(coresight_mutex);
> +DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
>
> /**
> * struct coresight_node - elements of a path, from source to sink
> @@ -70,6 +71,18 @@ void coresight_remove_cti_ops(void)
> }
> EXPORT_SYMBOL_GPL(coresight_remove_cti_ops);
>
> +void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev)
> +{
> + per_cpu(csdev_sink, cpu) = csdev;
> +}
> +EXPORT_SYMBOL_GPL(coresight_set_percpu_sink);
> +
> +struct coresight_device *coresight_get_percpu_sink(int cpu)
> +{
> + return per_cpu(csdev_sink, cpu);
> +}
> +EXPORT_SYMBOL_GPL(coresight_get_percpu_sink);
> +
> static int coresight_id_match(struct device *dev, void *data)
> {
> int trace_id, i_trace_id;
> @@ -784,6 +797,14 @@ static int _coresight_build_path(struct coresight_device *csdev,
> if (csdev == sink)
> goto out;
>
> + if (coresight_is_percpu_source(csdev) && coresight_is_percpu_sink(sink) &&
> + sink == per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev))) {
> + if (_coresight_build_path(sink, sink, path) == 0) {
> + found = true;
> + goto out;
> + }
> + }
> +
> /* Not a sink - recursively explore each port found on this element */
> for (i = 0; i < csdev->pdata->nr_outport; i++) {
> struct coresight_device *child_dev;
> @@ -999,8 +1020,12 @@ coresight_find_default_sink(struct coresight_device *csdev)
> int depth = 0;
>
> /* look for a default sink if we have not found for this device */
> - if (!csdev->def_sink)
> - csdev->def_sink = coresight_find_sink(csdev, &depth);
> + if (!csdev->def_sink) {
> + if (coresight_is_percpu_source(csdev))
> + csdev->def_sink = per_cpu(csdev_sink, source_ops(csdev)->cpu_id(csdev));
> + if (!csdev->def_sink)
> + csdev->def_sink = coresight_find_sink(csdev, &depth);
> + }
> return csdev->def_sink;
> }
>
> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> index f5f654ea2994..ff1dd2092ac5 100644
> --- a/drivers/hwtracing/coresight/coresight-priv.h
> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> @@ -232,4 +232,7 @@ coresight_find_csdev_by_fwnode(struct fwnode_handle *r_fwnode);
> void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
> struct coresight_device *ect_csdev);
>
> +void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev);
> +struct coresight_device *coresight_get_percpu_sink(int cpu);
> +
> #endif
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 976ec2697610..8a3a3c199087 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -50,6 +50,7 @@ enum coresight_dev_subtype_sink {
> CORESIGHT_DEV_SUBTYPE_SINK_PORT,
> CORESIGHT_DEV_SUBTYPE_SINK_BUFFER,
> CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM,
> + CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM,
> };
>
> enum coresight_dev_subtype_link {
> @@ -428,6 +429,17 @@ static inline void csdev_access_write64(struct csdev_access *csa, u64 val, u32 o
> csa->write(val, offset, false, true);
> }
>
> +static inline bool coresight_is_percpu_source(struct coresight_device *csdev)
> +{
> + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) &&
> + (csdev->subtype.source_subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_PROC);
> +}
> +
> +static inline bool coresight_is_percpu_sink(struct coresight_device *csdev)
> +{
> + return csdev && (csdev->type == CORESIGHT_DEV_TYPE_SINK) &&
> + (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM);
> +}
> #else /* !CONFIG_64BIT */
>
> static inline u64 csdev_access_relaxed_read64(struct csdev_access *csa,
> --
> 2.24.1
>
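
A minimal user-space sketch of the per-CPU sink lookup added by this patch,
under stated assumptions: struct csdev, the fixed NR_CPUS array and the
bus_sink fallback argument are hypothetical stand-ins for
struct coresight_device, the DEFINE_PER_CPU() variable and
coresight_find_sink(). It only illustrates the order of the checks.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumed small, fixed CPU count for the sketch */

enum dev_type { DEV_SOURCE_PROC, DEV_SINK_SYSMEM, DEV_SINK_PERCPU_SYSMEM };

/* Hypothetical stand-in for struct coresight_device. */
struct csdev {
	enum dev_type type;
	int cpu;			/* CPU a per-CPU device is bound to */
	struct csdev *def_sink;		/* cached default sink, if any */
};

/* Plain array standing in for the per-CPU csdev_sink variable. */
static struct csdev *csdev_sink[NR_CPUS];

static void set_percpu_sink(int cpu, struct csdev *sink)
{
	csdev_sink[cpu] = sink;
}

static struct csdev *get_percpu_sink(int cpu)
{
	return csdev_sink[cpu];
}

/*
 * Prefer the sink bound to the source's CPU; fall back to the sink
 * found over the trace bus (passed in here as 'bus_sink').
 */
static struct csdev *find_default_sink(struct csdev *src, struct csdev *bus_sink)
{
	if (!src->def_sink) {
		if (src->type == DEV_SOURCE_PROC)
			src->def_sink = get_percpu_sink(src->cpu);
		if (!src->def_sink)
			src->def_sink = bus_sink;
	}
	return src->def_sink;
}

/* True when the path can be built directly, without scanning connections. */
static bool is_direct_path(const struct csdev *src, const struct csdev *sink)
{
	return src->type == DEV_SOURCE_PROC &&
	       sink->type == DEV_SINK_PERCPU_SYSMEM &&
	       sink == csdev_sink[src->cpu];
}
```

Caching the result in def_sink mirrors the patch: the per-CPU lookup only
runs the first time a default sink is requested for a source.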

2021-03-17 21:23:45

by Mathieu Poirier

Subject: Re: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

On Wed, Mar 17, 2021 at 10:44:51AM +0000, Suzuki K Poulose wrote:
> Hi Mathieu
>
> On 3/16/21 7:30 PM, Mathieu Poirier wrote:
> > On Thu, Feb 25, 2021 at 07:35:33PM +0000, Suzuki K Poulose wrote:
> > > If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
> > > move the ETM to trace prohibited region using TRFCR, while disabling.
> > >
> > > Cc: Mathieu Poirier <[email protected]>
> > > Cc: Mike Leach <[email protected]>
> > > Cc: Anshuman Khandual <[email protected]>
> > > Signed-off-by: Suzuki K Poulose <[email protected]>
> > > ---
> > > New patch
> >
> > I would ask you to refrain from introducing new patches. Otherwise the goal
> > posts keep on moving with every revision and we'll never get through. Fixes and
> > enhancements can come in later patchsets.
> >
>
> While I agree that this is a fix and a new patch, it also attests what
> we do in the nvhe hypervisor to disable tracing while we enter the guest, by
> using the Trace filter controls.
>
> > > ---
> > > .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
> > > drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
> > > 2 files changed, 21 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > > index 15016f757828..00297906669c 100644
> > > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > > @@ -31,6 +31,7 @@
> > > #include <linux/pm_runtime.h>
> > > #include <linux/property.h>
> > > +#include <asm/barrier.h>
> > > #include <asm/sections.h>
> > > #include <asm/sysreg.h>
> > > #include <asm/local.h>
> > > @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
> > > static void etm4_disable_hw(void *info)
> > > {
> > > u32 control;
> > > + u64 trfcr;
> > > struct etmv4_drvdata *drvdata = info;
> > > struct etmv4_config *config = &drvdata->config;
> > > struct coresight_device *csdev = drvdata->csdev;
> > > @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
> > > /* EN, bit[0] Trace unit enable bit */
> > > control &= ~0x1;
> > > + /*
> > > + * If the CPU supports v8.4 Trace filter Control,
> > > + * set the ETM to trace prohibited region.
> > > + */
> > > + if (drvdata->trfc) {
> > > + trfcr = read_sysreg_s(SYS_TRFCR_EL1);
> > > + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
> > > + SYS_TRFCR_EL1);
> > > + isb();
> > > + }
> > > /*
> > > * Make sure everything completes before disabling, as recommended
> > > * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
> > > @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
> > > */
> > > dsb(sy);
> > > isb();
> > > + /* Trace synchronization barrier, is a nop if not supported */
> > > + tsb_csync();
> > > etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
> > > /* wait for TRCSTATR.PMSTABLE to go to '1' */
> > > if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
> > > dev_err(etm_dev,
> > > "timeout while waiting for PM stable Trace Status\n");
> > > + if (drvdata->trfc)
> > > + write_sysreg_s(trfcr, SYS_TRFCR_EL1);
> >
> > drvdata->trfc is invariably set to true in cpu_enable_tracing() and as such
> > testing for it is not required.
>
> That is not true. This is only set when the CPU supports trace filtering.
> So, this is more of a capability field for the CPU where the ETM is bound.
> Only v8.4+ CPUs implement trace filtering controls.

Ah yes, you are correct - this patch makes sense now.
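
A minimal user-space sketch of the point settled above: trfc is a per-CPU
capability flag, and the save/clear/restore of the filter control is
skipped when it is unset. The fake_trfcr variable stands in for
SYS_TRFCR_EL1, the cpu_has_trf argument for the ID_AA64DFR0 check, and the
bit positions are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the trace enable controls. */
#define TRFCR_E0TRE	(UINT64_C(1) << 0)
#define TRFCR_ExTRE	(UINT64_C(1) << 1)

static uint64_t fake_trfcr;	/* stand-in for SYS_TRFCR_EL1 */

/* Hypothetical stand-in for struct etmv4_drvdata. */
struct drvdata {
	bool trfc;	/* CPU implements trace filter controls */
};

/* Set the capability flag only when the CPU reports the feature. */
static void enable_tracing(struct drvdata *d, bool cpu_has_trf)
{
	if (!cpu_has_trf)
		return;
	d->trfc = true;
	fake_trfcr |= TRFCR_ExTRE | TRFCR_E0TRE;
}

/*
 * Move to the trace prohibited region while the trace unit is being
 * disabled, then restore the saved value. Returns the filter control
 * value that was in effect during the disable sequence.
 */
static uint64_t disable_hw(const struct drvdata *d)
{
	uint64_t saved = 0, during;

	if (d->trfc) {
		saved = fake_trfcr;
		fake_trfcr = saved & ~(TRFCR_ExTRE | TRFCR_E0TRE);
	}
	during = fake_trfcr;
	/* ... the real driver disables the trace unit and waits here ... */
	if (d->trfc)
		fake_trfcr = saved;
	return during;
}
```

On a CPU without the feature, trfc stays false and the register is never
touched, which is why the test in etm4_disable_hw() is needed.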

>
> Cheers
> Suzuki
>
>
> >
> > > /* read the status of the single shot comparators */
> > > for (i = 0; i < drvdata->nr_ss_cmp; i++) {
> > > @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
> > > return false;
> > > }
> > > -static void cpu_enable_tracing(void)
> > > +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
> > > {
> > > u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
> > > u64 trfcr;
> > > @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
> > > if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
> > > return;
> > > + drvdata->trfc = true;
> > > /*
> > > * If the CPU supports v8.4 SelfHosted Tracing, enable
> > > * tracing at the kernel EL and EL0, forcing to use the
> > > @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
> > > /* NUMCNTR, bits[30:28] number of counters available for tracing */
> > > drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
> > > etm4_cs_lock(drvdata, csa);
> > > - cpu_enable_tracing();
> > > + cpu_enable_tracing(drvdata);
> >
> > At least for this patch, the above three hunks aren't needed.
> >
> > > }
> > > static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
> > > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> > > index 0af60571aa23..f6478ef642bf 100644
> > > --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> > > +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> > > @@ -862,6 +862,7 @@ struct etmv4_save_state {
> > > * @nooverflow: Indicate if overflow prevention is supported.
> > > * @atbtrig: If the implementation can support ATB triggers
> > > * @lpoverride: If the implementation can support low-power state over.
> > > + * @trfc: If the implementation supports Arm v8.4 trace filter controls.
> > > * @config: structure holding configuration parameters.
> > > * @save_state: State to be preserved across power loss
> > > * @state_needs_restore: True when there is context to restore after PM exit
> > > @@ -912,6 +913,7 @@ struct etmv4_drvdata {
> > > bool nooverflow;
> > > bool atbtrig;
> > > bool lpoverride;
> > > + bool trfc;
> >
> > Nor is this one.
> >
> > > struct etmv4_config config;
> > > struct etmv4_save_state *save_state;
> > > bool state_needs_restore;
> > > --
> > > 2.24.1
> > >
>

2021-03-18 18:10:55

by Mathieu Poirier

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

Good morning,

On Thu, Feb 25, 2021 at 07:35:42PM +0000, Suzuki K Poulose wrote:
> From: Anshuman Khandual <[email protected]>
>
> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> accessible via the system registers. The TRBE supports different addressing
> modes including CPU virtual address and buffer modes including the circular
> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> access to the trace buffer could be prohibited by a higher exception level
> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> private interrupt (PPI) on address translation errors and when the buffer
> is full. Overall implementation here is inspired from the Arm SPE driver.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
> - Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
> - Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
> - Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
> - Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
> - Dropped the space after type casting before vmap()
> - Return 0 instead of EINVAL in arm_trbe_update_buffer()
> - Add a comment in trbe_handle_overflow()
> - Add a comment in arm_trbe_cpu_startup()
> - Unregister coresight TRBE device when not supported
> - Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
> - Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
> - Replaced and modified trbe_drain_and_disable_local() in IRQ handler
> - Updated arm_trbe_update_buffer() for handling a missing interrupt
> - Dropped kfree() for all devm_xxx() allocated buffer
> - Dropped additional blank line in documentation coresight/coresight-trbe.rst
> - Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> - Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
> - Added comment for irq_work_run()
> - Updated comment for minimum buffer length in arm_trbe_alloc_buffer()
> - Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
> - Fixed indentation in arm_trbe_probe_cpuhp()
> - Added static for arm_trbe_free_buffer()
> - Added comment for trbe_base element in trbe_buf structure
> - Dropped IS_ERR() check from vmap() returned pointer
> - Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
> - Changed TRBE device names from arm_trbeX to just trbeX
> - Dropped unused argument perf_output_handle from trbe_get_fault_act()
> - Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
> - Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
> - Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
> - Changed error return codes from arm_trbe_probe_irq()
> - Changed error return codes from arm_trbe_device_probe()
> - Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
> - Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
> - Changed coresight_register() in arm_trbe_probe_coresight_cpu()
> - Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
> - Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
> - Changed trbe_dbm as trbe_flag including its sysfs interface
> - Handle race between update_buffer & IRQ handler
> - Rework and split the TRBE probe to avoid lockdep due to memory allocation
> from IPI calls (via coresight_register())
> - Fix handle->head update for snapshot mode.
> ---
> .../testing/sysfs-bus-coresight-devices-trbe | 14 +
> .../trace/coresight/coresight-trbe.rst | 38 +
> drivers/hwtracing/coresight/Kconfig | 14 +
> drivers/hwtracing/coresight/Makefile | 1 +
> drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
> drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
> 6 files changed, 1369 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst

Please spin off these two files into a separate patch and CC Jon Corbet and the
linux-doc mailing list.

> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> new file mode 100644
> index 000000000000..ad3bbc6fa751
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> @@ -0,0 +1,14 @@
> +What: /sys/bus/coresight/devices/trbe<cpu>/align
> +Date: March 2021
> +KernelVersion: 5.13
> +Contact: Anshuman Khandual <[email protected]>
> +Description: (Read) Shows the TRBE write pointer alignment. This value
> + is fetched from the TRBIDR register.
> +
> +What: /sys/bus/coresight/devices/trbe<cpu>/flag
> +Date: March 2021
> +KernelVersion: 5.13
> +Contact: Anshuman Khandual <[email protected]>
> +Description: (Read) Shows if TRBE updates in the memory are with access
> + and dirty flag updates as well. This value is fetched from
> + the TRBIDR register.

For this file:

Reviewed-by: Mathieu Poirier <[email protected]>

> diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst
> new file mode 100644
> index 000000000000..b9928ef148da
> --- /dev/null
> +++ b/Documentation/trace/coresight/coresight-trbe.rst
> @@ -0,0 +1,38 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==============================
> +Trace Buffer Extension (TRBE).
> +==============================
> +
> + :Author: Anshuman Khandual <[email protected]>
> + :Date: November 2020
> +
> +Hardware Description
> +--------------------
> +
> +Trace Buffer Extension (TRBE) is a percpu hardware which captures in system
> +memory, CPU traces generated from a corresponding percpu tracing unit. This
> +gets plugged in as a coresight sink device because the corresponding trace
> +generators (ETE), are plugged in as source device.
> +
> +The TRBE is not compliant to CoreSight architecture specifications, but is
> +driven via the CoreSight driver framework to support the ETE (which is
> +CoreSight compliant) integration.
> +
> +Sysfs files and directories
> +---------------------------
> +
> +The TRBE devices appear on the existing coresight bus alongside the other
> +coresight devices::
> +
> + >$ ls /sys/bus/coresight/devices
> + trbe0 trbe1 trbe2 trbe3
> +
> +The ``trbe<N>`` named TRBEs are associated with a CPU.::
> +
> + >$ ls /sys/bus/coresight/devices/trbe0/
> + align flag
> +
> +*Key file items are:-*
> + * ``align``: TRBE write pointer alignment
> + * ``flag``: TRBE updates memory with access and dirty flags

For this file:

Reviewed-by: Mathieu Poirier <[email protected]>

> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> index f154ae7e705d..84530fd80998 100644
> --- a/drivers/hwtracing/coresight/Kconfig
> +++ b/drivers/hwtracing/coresight/Kconfig
> @@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS
> CTI trigger connections between this and other devices.These
> registers are not used in normal operation and can leave devices in
> an inconsistent state.
> +
> +config CORESIGHT_TRBE
> + tristate "Trace Buffer Extension (TRBE) driver"
> + depends on ARM64 && CORESIGHT_SOURCE_ETM4X
> + help
> + This driver provides support for percpu Trace Buffer Extension (TRBE).
> + TRBE always needs to be used along with its corresponding percpu ETE
> + component. ETE generates trace data which is then captured with TRBE.
> + Unlike traditional sink devices, TRBE is a CPU feature accessible via
> + system registers. But its explicit dependency with trace unit (ETE)
> + requires it to be plugged in as a coresight sink device.
> +
> + To compile this driver as a module, choose M here: the module will be
> + called coresight-trbe.
> endif
> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> index f20e357758d1..d60816509755 100644
> --- a/drivers/hwtracing/coresight/Makefile
> +++ b/drivers/hwtracing/coresight/Makefile
> @@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
> obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
> obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o
> obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o
> +obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o
> coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \
> coresight-cti-sysfs.o
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> new file mode 100644
> index 000000000000..41a012b525bb
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -0,0 +1,1149 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight
> + * sink device which could then pair with an appropriate per-cpu coresight source
> + * device (ETE) thus generating required trace data. Trace can be enabled
> + * via the perf framework.

If I remember correctly the last version stated the driver was tailored on
Will's SPE driver.

> + *
> + * Copyright (C) 2020 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#define DRVNAME "arm_trbe"
> +
> +#define pr_fmt(fmt) DRVNAME ": " fmt
> +
> +#include <asm/barrier.h>
> +#include "coresight-trbe.h"
> +
> +#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
> +
> +/*
> + * A padding packet that will help the user space tools
> + * in skipping relevant sections in the captured trace
> + * data which could not be decoded. TRBE doesn't support
> + * formatting the trace data, unlike the legacy CoreSight
> + * sinks and thus we use ETE trace packets to pad the
> + * sections of the buffer.
> + */
> +#define ETE_IGNORE_PACKET 0x70
> +
> +/*
> + * Minimum amount of meaningful trace will contain:
> + * A-Sync, Trace Info, Trace On, Address, Atom.
> + * This is about 44bytes of ETE trace. To be on
> + * the safer side, we assume 64bytes is the minimum
> + * space required for a meaningful session, before
> + * we hit a "WRAP" event.
> + */
> +#define TRBE_TRACE_MIN_BUF_SIZE 64
> +
> +enum trbe_fault_action {
> + TRBE_FAULT_ACT_WRAP,
> + TRBE_FAULT_ACT_SPURIOUS,
> + TRBE_FAULT_ACT_FATAL,
> +};
> +
> +struct trbe_buf {
> + /*
> + * Even though trbe_base represents vmap()
> + * mapped allocated buffer's start address,
> + * it's being kept as unsigned long for various
> + * arithmetic and comparison operations &
> + * also to be consistent with trbe_write &
> + * trbe_limit sibling pointers.
> + */
> + unsigned long trbe_base;
> + unsigned long trbe_limit;
> + unsigned long trbe_write;
> + int nr_pages;
> + void **pages;
> + bool snapshot;
> + struct trbe_cpudata *cpudata;
> +};
> +
> +struct trbe_cpudata {
> + bool trbe_flag;
> + u64 trbe_align;
> + int cpu;
> + enum cs_mode mode;
> + struct trbe_buf *buf;
> + struct trbe_drvdata *drvdata;
> +};
> +
> +struct trbe_drvdata {
> + struct trbe_cpudata __percpu *cpudata;
> + struct perf_output_handle __percpu **handle;
> + struct hlist_node hotplug_node;
> + int irq;
> + cpumask_t supported_cpus;
> + enum cpuhp_state trbe_online;
> + struct platform_device *pdev;
> +};
> +
> +static int trbe_alloc_node(struct perf_event *event)
> +{
> + if (event->cpu == -1)
> + return NUMA_NO_NODE;
> + return cpu_to_node(event->cpu);
> +}
> +
> +static void trbe_drain_buffer(void)
> +{
> + tsb_csync();
> + dsb(nsh);
> +}
> +
> +static void trbe_drain_and_disable_local(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + trbe_drain_buffer();
> +
> + /*
> + * Disable the TRBE without clearing LIMITPTR which
> + * might be required for fetching the buffer limits.
> + */
> + trblimitr &= ~TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> + isb();
> +}
> +
> +static void trbe_reset_local(void)
> +{
> + trbe_drain_and_disable_local();
> + write_sysreg_s(0, SYS_TRBLIMITR_EL1);
> + write_sysreg_s(0, SYS_TRBPTR_EL1);
> + write_sysreg_s(0, SYS_TRBBASER_EL1);
> + write_sysreg_s(0, SYS_TRBSR_EL1);
> +}
> +
> +static void trbe_stop_and_truncate_event(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + /*
> + * We cannot proceed with the buffer collection and we
> + * do not have any data for the current session. The
> + * etm_perf driver expects to close out the aux_buffer
> + * at event_stop(). So disable the TRBE here and leave
> + * the update_buffer() to return a 0 size.
> + */
> + trbe_drain_and_disable_local();
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
> +}
> +
> +/*
> + * TRBE Buffer Management
> + *
> + * The TRBE buffer spans from the base pointer till the limit pointer. When enabled,
> + * it starts writing trace data from the write pointer onward till the limit pointer.
> + * When the write pointer reaches the address just before the limit pointer, it gets
> + * wrapped around again to the base pointer. This is called a TRBE wrap event, which
> + * generates a maintenance interrupt when operated in WRAP or FILL mode. This driver
> + * uses FILL mode, where the TRBE stops the trace collection at wrap event. The IRQ
> + * handler updates the AUX buffer and re-enables the TRBE with updated WRITE and
> + * LIMIT pointers.
> + *
> + * Wrap around with an IRQ
> + * ------ < ------ < ------- < ----- < -----
> + * | |
> + * ------ > ------ > ------- > ----- > -----
> + *
> + * +---------------+-----------------------+
> + * | | |
> + * +---------------+-----------------------+
> + * Base Pointer Write Pointer Limit Pointer
> + *
> + * The base and limit pointers always need to be PAGE_SIZE aligned. But the write
> + * pointer can be aligned to the implementation defined TRBE trace buffer alignment
> + * as captured in trbe_cpudata->trbe_align.
> + *
> + *
> + * head tail wakeup
> + * +---------------------------------------+----- ~ ~ ------
> + * |$$$$$$$|################|$$$$$$$$$$$$$$| |
> + * +---------------------------------------+----- ~ ~ ------
> + * Base Pointer Write Pointer Limit Pointer
> + *
> + * The perf_output_handle indices (head, tail, wakeup) are monotonically increasing
> + * values which track all the driver writes and user reads from the perf auxiliary
> + * buffer. Generally [head..tail] is the area where the driver can write into unless
> + * the wakeup is behind the tail. Enabled TRBE buffer span needs to be adjusted and
> + * configured depending on the perf_output_handle indices, so that the driver does
> + * not overwrite areas in the perf auxiliary buffer which are being or yet to be
> + * consumed from the user space. The enabled TRBE buffer area is a moving subset of
> + * the allocated perf auxiliary buffer.
> + */
> +static void trbe_pad_buf(struct perf_output_handle *handle, int len)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + u64 head = PERF_IDX2OFF(handle->head, buf);
> +
> + memset((void *)buf->trbe_base + head, ETE_IGNORE_PACKET, len);
> + if (!buf->snapshot)
> + perf_aux_output_skip(handle, len);
> +}
> +
> +static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + /*
> + * The ETE trace has alignment synchronization packets allowing
> + * the decoder to reset in case of an overflow or corruption.
> + * So we can use the entire buffer for the snapshot mode.
> + */
> + return buf->nr_pages * PAGE_SIZE;
> +}
> +
> +/*
> + * TRBE Limit Calculation
> + *
> + * The following markers are used to illustrate various TRBE buffer situations.
> + *
> + * $$$$ - Data area, unconsumed captured trace data, not to be overridden
> + * #### - Free area, enabled, trace will be written
> + * %%%% - Free area, disabled, trace will not be written
> + * ==== - Free area, padded with ETE_IGNORE_PACKET, trace will be skipped
> + */
> +static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + struct trbe_cpudata *cpudata = buf->cpudata;
> + const u64 bufsize = buf->nr_pages * PAGE_SIZE;
> + u64 limit = bufsize;
> + u64 head, tail, wakeup;
> +
> + head = PERF_IDX2OFF(handle->head, buf);
> +
> + /*
> + * head
> + * ------->|
> + * |
> + * head TRBE align tail
> + * +----|-------|---------------|-------+
> + * |$$$$|=======|###############|$$$$$$$|
> + * +----|-------|---------------|-------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * Perf aux buffer output head position can be misaligned depending on
> + * various factors including user space reads. In case misaligned, head
> + * needs to be aligned before TRBE can be configured. Pad the alignment
> + * gap with ETE_IGNORE_PACKET bytes that will be ignored by user tools
> + * and skip this section thus advancing the head.
> + */
> + if (!IS_ALIGNED(head, cpudata->trbe_align)) {
> + unsigned long delta = roundup(head, cpudata->trbe_align) - head;
> +
> + delta = min(delta, handle->size);
> + trbe_pad_buf(handle, delta);
> + head = PERF_IDX2OFF(handle->head, buf);
> + }
> +
> + /*
> + * head = tail (size = 0)
> + * +----|-------------------------------+
> + * |$$$$|$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ |
> + * +----|-------------------------------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * Perf aux buffer does not have any space for the driver to write into.
> + * Just communicate trace truncation event to the user space by marking
> + * it with PERF_AUX_FLAG_TRUNCATED.
> + */
> + if (!handle->size) {
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + return 0;
> + }
> +
> + /* Compute the tail and wakeup indices now that we've aligned head */
> + tail = PERF_IDX2OFF(handle->head + handle->size, buf);
> + wakeup = PERF_IDX2OFF(handle->wakeup, buf);
> +
> + /*
> + * Let's calculate the buffer area which TRBE could write into. There
> + * are three possible scenarios here. Limit needs to be aligned with
> + * PAGE_SIZE per the TRBE requirement. Always avoid clobbering the
> + * unconsumed data.
> + *
> + * 1) head < tail
> + *
> + * head tail
> + * +----|-----------------------|-------+
> + * |$$$$|#######################|$$$$$$$|
> + * +----|-----------------------|-------+
> + * trbe_base limit trbe_base + nr_pages
> + *
> + * TRBE could write into [head..tail] area. Unless the tail is right at
> + * the end of the buffer, neither a wrap-around nor an IRQ is expected
> + * while the TRBE is enabled.
> + *
> + * 2) head == tail
> + *
> + * head = tail (size > 0)
> + * +----|-------------------------------+
> + * |%%%%|###############################|
> + * +----|-------------------------------+
> + * trbe_base limit = trbe_base + nr_pages
> + *
> + * TRBE should just write into [head..base + nr_pages] area even though
> + * the entire buffer is empty. Reason being, when the trace reaches the
> + * end of the buffer, it will just wrap around with an IRQ giving an
> + * opportunity to reconfigure the buffer.
> + *
> + * 3) tail < head
> + *
> + * tail head
> + * +----|-----------------------|-------+
> + * |%%%%|$$$$$$$$$$$$$$$$$$$$$$$|#######|
> + * +----|-----------------------|-------+
> + * trbe_base limit = trbe_base + nr_pages
> + *
> + * TRBE should just write into [head..base + nr_pages] area even though
> + * the [trbe_base..tail] is also empty. Reason being, when the trace
> + * reaches the end of the buffer, it will just wrap around with an IRQ
> + * giving an opportunity to reconfigure the buffer.
> + */
> + if (head < tail)
> + limit = round_down(tail, PAGE_SIZE);
> +
> + /*
> + * Wakeup may be arbitrarily far into the future. If it's not in the
> + * current generation, either we'll wrap before hitting it, or it's
> + * in the past and has been handled already.
> + *
> + * If there's a wakeup before we wrap, arrange to be woken up by the
> + * page boundary following it. Keep the tail boundary if that's lower.
> + *
> + * head wakeup tail
> + * +----|---------------|-------|-------+
> + * |$$$$|###############|%%%%%%%|$$$$$$$|
> + * +----|---------------|-------|-------+
> + * trbe_base limit trbe_base + nr_pages
> + */
> + if (handle->wakeup < (handle->head + handle->size) && head <= wakeup)
> + limit = min(limit, round_up(wakeup, PAGE_SIZE));
> +
> + /*
> + * There are two situations in which the limit ends up at or before
> + * the head, so that the TRBE cannot be configured.
> + *
> + * 1) head < tail (aligned down with PAGE_SIZE) and also they are both
> + * within the same PAGE size range.
> + *
> + * PAGE_SIZE
> + * |----------------------|
> + *
> + * limit head tail
> + * +------------|------|--------|-------+
> + * |$$$$$$$$$$$$$$$$$$$|========|$$$$$$$|
> + * +------------|------|--------|-------+
> + * trbe_base trbe_base + nr_pages
> + *
> + * 2) head < wakeup (aligned up with PAGE_SIZE) < tail and also both
> + * head and wakeup are within same PAGE size range.
> + *
> + * PAGE_SIZE
> + * |----------------------|
> + *
> + * limit head wakeup tail
> + * +----|------|-------|--------|-------+
> + * |$$$$$$$$$$$|=======|========|$$$$$$$|
> + * +----|------|-------|--------|-------+
> + * trbe_base trbe_base + nr_pages
> + */
> + if (limit > head)
> + return limit;
> +
> + trbe_pad_buf(handle, handle->size);
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> + return 0;
> +}
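
The three head/tail cases and the wakeup adjustment described in the comments above can be checked in isolation with a small userspace model. This is only a sketch: `model_limit()` and its helpers are hypothetical names, 4K pages are assumed, and the initial limit is assumed to be the full buffer size, as the "limit = trbe_base + nr_pages" diagrams suggest. It works purely on offsets and touches neither the perf handle nor the hardware.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4K pages */

static unsigned long model_round_down(unsigned long v, unsigned long align)
{
	return v - (v % align);
}

static unsigned long model_round_up(unsigned long v, unsigned long align)
{
	return model_round_down(v + align - 1, align);
}

/*
 * head, tail and wakeup are byte offsets into the buffer, already reduced
 * modulo buf_size. wakeup_pending stands in for the
 * "handle->wakeup < handle->head + handle->size" test on the raw indices.
 * Returns the limit offset, or 0 when there is no usable window.
 */
static unsigned long model_limit(unsigned long head, unsigned long tail,
				 unsigned long wakeup, int wakeup_pending,
				 unsigned long buf_size)
{
	unsigned long limit = buf_size;	/* assumed initial value */

	assert(buf_size % MODEL_PAGE_SIZE == 0);

	/* Case 1: stop short of the unconsumed data at tail. */
	if (head < tail)
		limit = model_round_down(tail, MODEL_PAGE_SIZE);

	/* A wakeup before the wrap moves the limit to the next page boundary. */
	if (wakeup_pending && head <= wakeup) {
		unsigned long w = model_round_up(wakeup, MODEL_PAGE_SIZE);

		if (w < limit)
			limit = w;
	}

	/* Limit at or before head: nothing can be written. */
	return limit > head ? limit : 0;
}
```

With a four-page buffer, head at one page and tail a little past three pages, the model yields a limit of three pages. A head and tail that share a page (for example 4200 and 5000) yield 0, which corresponds to the pad-and-truncate path in the patch.
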
> +
> +static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = perf_get_aux(handle);
> + u64 limit = __trbe_normal_offset(handle);
> + u64 head = PERF_IDX2OFF(handle->head, buf);
> +
> + /*
> + * If the head is too close to the limit and we don't
> + * have space for a meaningful run, we would rather pad
> + * it and start fresh.
> + */
> + if (limit && (limit - head < TRBE_TRACE_MIN_BUF_SIZE)) {
> + trbe_pad_buf(handle, limit - head);
> + limit = __trbe_normal_offset(handle);
> + }
> + return limit;
> +}
> +
> +static unsigned long compute_trbe_buffer_limit(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + unsigned long offset;
> +
> + if (buf->snapshot)
> + offset = trbe_snapshot_offset(handle);
> + else
> + offset = trbe_normal_offset(handle);
> + return buf->trbe_base + offset;
> +}
> +
> +static void clr_trbe_status(void)
> +{
> + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> +
> + WARN_ON(is_trbe_enabled());
> + trbsr &= ~TRBSR_IRQ;
> + trbsr &= ~TRBSR_TRG;
> + trbsr &= ~TRBSR_WRAP;
> + trbsr &= ~(TRBSR_EC_MASK << TRBSR_EC_SHIFT);
> + trbsr &= ~(TRBSR_BSC_MASK << TRBSR_BSC_SHIFT);
> + trbsr &= ~TRBSR_STOP;
> + write_sysreg_s(trbsr, SYS_TRBSR_EL1);
> +}
> +
> +static void set_trbe_limit_pointer_enabled(unsigned long addr)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBLIMITR_LIMIT_SHIFT)));
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> +
> + trblimitr &= ~TRBLIMITR_NVM;
> + trblimitr &= ~(TRBLIMITR_FILL_MODE_MASK << TRBLIMITR_FILL_MODE_SHIFT);
> + trblimitr &= ~(TRBLIMITR_TRIG_MODE_MASK << TRBLIMITR_TRIG_MODE_SHIFT);
> + trblimitr &= ~(TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT);
> +
> + /*
> + * Fill trace buffer mode is used here while configuring the
> + * TRBE for trace capture. In this particular mode, the trace
> + * collection is stopped and a maintenance interrupt is raised
> + * when the current write pointer wraps. This pause in trace
> + * collection gives the software an opportunity to capture the
> + * trace data in the interrupt handler, before reconfiguring
> + * the TRBE.
> + */
> + trblimitr |= (TRBE_FILL_MODE_FILL & TRBLIMITR_FILL_MODE_MASK) << TRBLIMITR_FILL_MODE_SHIFT;
> +
> + /*
> + * Trigger mode is not used here while configuring the TRBE for
> + * the trace capture. Hence just keep this in the ignore mode.
> + */
> + trblimitr |= (TRBE_TRIG_MODE_IGNORE & TRBLIMITR_TRIG_MODE_MASK) <<
> + TRBLIMITR_TRIG_MODE_SHIFT;
> + trblimitr |= (addr & PAGE_MASK);
> +
> + trblimitr |= TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +
> + /* Synchronize the TRBE enable event */
> + isb();
> +}
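
The read-modify-write sequence above can be modelled as a pure function on a 64-bit value. This is a sketch under stated assumptions: the field positions (enable at bit 0, fill mode at bits [2:1], trigger mode at bits [4:3], nVM at bit 5, limit at bits [63:12]) are my reading of the register layout and should be checked against the architecture reference and the sysreg definitions elsewhere in this series. All `MODEL_*` names and `model_trblimitr()` are hypothetical. The mode values 0 (fill) and 3 (ignore) are the ones defined in the patch's header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout; verify against the architecture reference. */
#define MODEL_ENABLE		(UINT64_C(1) << 0)
#define MODEL_FILL_SHIFT	1
#define MODEL_FILL_MASK		UINT64_C(0x3)
#define MODEL_TRIG_SHIFT	3
#define MODEL_TRIG_MASK		UINT64_C(0x3)
#define MODEL_NVM		(UINT64_C(1) << 5)
#define MODEL_LIMIT_SHIFT	12
#define MODEL_LIMIT_FIELD	(~UINT64_C(0) << MODEL_LIMIT_SHIFT)

#define MODEL_FILL_MODE_FILL	UINT64_C(0)	/* value from the patch header */
#define MODEL_TRIG_MODE_IGNORE	UINT64_C(3)	/* value from the patch header */

/* Compose the new register value from the old one and a page-aligned limit. */
static uint64_t model_trblimitr(uint64_t old, uint64_t limit_addr)
{
	uint64_t v = old;

	/* The limit must not carry bits below the limit field. */
	assert((limit_addr & ~MODEL_LIMIT_FIELD) == 0);

	/* Clear every field that is about to be programmed. */
	v &= ~MODEL_NVM;
	v &= ~(MODEL_FILL_MASK << MODEL_FILL_SHIFT);
	v &= ~(MODEL_TRIG_MASK << MODEL_TRIG_SHIFT);
	v &= ~MODEL_LIMIT_FIELD;

	/* Fill mode, trigger ignored, new limit, then the enable bit. */
	v |= (MODEL_FILL_MODE_FILL & MODEL_FILL_MASK) << MODEL_FILL_SHIFT;
	v |= (MODEL_TRIG_MODE_IGNORE & MODEL_TRIG_MASK) << MODEL_TRIG_SHIFT;
	v |= limit_addr & MODEL_LIMIT_FIELD;
	v |= MODEL_ENABLE;
	return v;
}
```

Under these assumptions the result depends only on the new limit, whatever mode bits or limit the register held before.
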
> +
> +static void trbe_enable_hw(struct trbe_buf *buf)
> +{
> + WARN_ON(buf->trbe_write < buf->trbe_base);
> + WARN_ON(buf->trbe_write >= buf->trbe_limit);
> + set_trbe_disabled();
> + isb();
> + clr_trbe_status();
> + set_trbe_base_pointer(buf->trbe_base);
> + set_trbe_write_pointer(buf->trbe_write);
> +
> + /*
> + * Synchronize all the register updates
> + * till now before enabling the TRBE.
> + */
> + isb();
> + set_trbe_limit_pointer_enabled(buf->trbe_limit);
> +}
> +
> +static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
> +{
> + int ec = get_trbe_ec(trbsr);
> + int bsc = get_trbe_bsc(trbsr);
> +
> + WARN_ON(is_trbe_running(trbsr));
> + if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> + return TRBE_FAULT_ACT_FATAL;
> +
> + if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> + return TRBE_FAULT_ACT_FATAL;
> +
> + if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
> + if (get_trbe_write_pointer() == get_trbe_base_pointer())
> + return TRBE_FAULT_ACT_WRAP;
> + }
> + return TRBE_FAULT_ACT_SPURIOUS;
> +}
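
The decision order in trbe_get_fault_act() can be exercised without hardware by working on already-decoded status fields. The sketch below assumes a hypothetical `struct model_status` in place of the raw TRBSR value and takes the write and base pointers as arguments. The EC and BSC constants reuse the values from the header in this patch.

```c
#include <assert.h>
#include <stdbool.h>

enum model_fault_action {
	MODEL_ACT_WRAP,
	MODEL_ACT_SPURIOUS,
	MODEL_ACT_FATAL,
};

/* Values taken from the TRBE_EC_* and TRBE_BSC_* defines in the patch. */
#define MODEL_EC_OTHERS		0
#define MODEL_EC_STAGE1_ABORT	36
#define MODEL_EC_STAGE2_ABORT	37
#define MODEL_BSC_FILLED	1

/* Hypothetical decoded view of the status fields the driver inspects. */
struct model_status {
	bool trg;	/* trigger seen */
	bool abort;	/* external abort */
	bool wrap;	/* write pointer wrapped */
	int ec;		/* exception class */
	int bsc;	/* buffer status code */
};

static enum model_fault_action
model_fault_act(const struct model_status *s,
		unsigned long write_ptr, unsigned long base_ptr)
{
	/* Triggers and aborts are never recoverable here. */
	if (s->trg || s->abort)
		return MODEL_ACT_FATAL;

	/* Stage 1 or stage 2 translation faults are fatal as well. */
	if (s->ec == MODEL_EC_STAGE1_ABORT || s->ec == MODEL_EC_STAGE2_ABORT)
		return MODEL_ACT_FATAL;

	/* A genuine "buffer filled" wrap leaves the write pointer at base. */
	if (s->wrap && s->ec == MODEL_EC_OTHERS && s->bsc == MODEL_BSC_FILLED &&
	    write_ptr == base_ptr)
		return MODEL_ACT_WRAP;

	return MODEL_ACT_SPURIOUS;
}
```

Note that a wrap status whose write pointer has not returned to the base is classified as spurious, not as a wrap, which matches the nested check in the quoted function.
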
> +
> +static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
> + struct perf_event *event, void **pages,
> + int nr_pages, bool snapshot)
> +{
> + struct trbe_buf *buf;
> + struct page **pglist;
> + int i;
> +
> + /*
> + * TRBE LIMIT and TRBE WRITE pointers must be page aligned. But with
> + * just a single page, there would not be any room left while writing
> + * into a partially filled TRBE buffer after the page size alignment.
> + * Hence restrict the minimum buffer size to two pages.
> + */
> + if (nr_pages < 2)
> + return NULL;
> +
> + buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event));
> + if (!buf)
> + return ERR_PTR(-ENOMEM);
> +
> + pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL);
> + if (!pglist) {
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + for (i = 0; i < nr_pages; i++)
> + pglist[i] = virt_to_page(pages[i]);
> +
> + buf->trbe_base = (unsigned long)vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
> + if (!buf->trbe_base) {
> + kfree(pglist);
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
> + buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
> + buf->trbe_write = buf->trbe_base;
> + buf->snapshot = snapshot;
> + buf->nr_pages = nr_pages;
> + buf->pages = pages;
> + kfree(pglist);
> + return buf;
> +}
> +
> +static void arm_trbe_free_buffer(void *config)
> +{
> + struct trbe_buf *buf = config;
> +
> + vunmap((void *)buf->trbe_base);
> + kfree(buf);
> +}
> +
> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
> + struct perf_output_handle *handle,
> + void *config)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_buf *buf = config;
> + enum trbe_fault_action act;
> + unsigned long size, offset;
> + unsigned long write, base, status;
> + unsigned long flags;
> +
> + WARN_ON(buf->cpudata != cpudata);
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (cpudata->mode != CS_MODE_PERF)
> + return 0;
> +
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> +
> + /*
> + * We are about to disable the TRBE. And this could in turn
> + * fill up the buffer, triggering an IRQ. This could be consumed
> + * by the PE asynchronously, causing a race here against
> + * the IRQ handler in closing out the handle. So, let us
> + * make sure the IRQ can't trigger while we are collecting
> + * the buffer. We also make sure that a WRAP event is handled
> + * accordingly.
> + */
> + local_irq_save(flags);
> +
> + /*
> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
> + * spurious fault, the driver leaves it disabled, truncating the buffer.
> + * Since the etm_perf driver expects to close out the AUX buffer, the
> + * driver skips it. Thus, just pass in 0 size here to indicate that the
> + * buffer was truncated.
> + */
> + if (!is_trbe_enabled()) {
> + size = 0;
> + goto done;
> + }
> + /*
> + * perf handle structure needs to be shared with the TRBE IRQ handler for
> + * capturing trace data and restarting the handle. There is a possibility
> + * of an undefined reference based crash when the etm event is being stopped
> + * while a TRBE IRQ is also being processed. This happens due to the release
> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
> + * the TRBE here will ensure that no IRQ could be generated when the perf
> + * handle gets freed in etm_event_stop().
> + */
> + trbe_drain_and_disable_local();
> + write = get_trbe_write_pointer();
> + base = get_trbe_base_pointer();
> +
> + /* Check if there is a pending interrupt and handle it here */
> + status = read_sysreg_s(SYS_TRBSR_EL1);
> + if (is_trbe_irq(status)) {
> +
> + /*
> + * Now that we are handling the IRQ here, clear the IRQ
> + * from the status, to let the irq handler know that it
> + * is taken care of.
> + */
> + clr_trbe_irq();
> + isb();
> +
> + act = trbe_get_fault_act(status);
> + /*
> + * If this was not due to a WRAP event, an error has
> + * occurred, so treat the buffer as empty.
> + */
> + if (act != TRBE_FAULT_ACT_WRAP) {
> + size = 0;
> + goto done;
> + }
> +
> + /*
> + * Otherwise, the buffer is full and the write pointer
> + * has reached base. Adjust this back to the Limit pointer
> + * for correct size.
> + */
> + write = get_trbe_limit_pointer();
> + }
> +
> + offset = write - base;
> + if (WARN_ON_ONCE(offset < PERF_IDX2OFF(handle->head, buf)))
> + size = 0;
> + else
> + size = offset - PERF_IDX2OFF(handle->head, buf);
> +
> +done:
> + local_irq_restore(flags);
> +
> + if (buf->snapshot)
> + handle->head += size;
> + return size;
> +}
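
The size reported by arm_trbe_update_buffer() reduces to simple pointer arithmetic once the wrap case is folded in. As a minimal sketch, with the hypothetical helper `model_collected_size()` standing in for the tail of the function and plain integers standing in for the register reads:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * base, write and limit model the TRBE base, write and limit pointers.
 * head_off is the perf head expressed as an offset into the buffer.
 * wrapped is true when a pending WRAP interrupt was found.
 */
static unsigned long model_collected_size(unsigned long base,
					  unsigned long write,
					  unsigned long limit,
					  bool wrapped,
					  unsigned long head_off)
{
	unsigned long offset;

	assert(base <= write && write <= limit);

	/* After a wrap the write pointer is back at base; use the limit. */
	if (wrapped)
		write = limit;

	offset = write - base;

	/* A write offset behind the head is inconsistent: report nothing. */
	if (offset < head_off)
		return 0;

	return offset - head_off;
}
```

For example, with the head one page into the buffer and the write pointer three pages in, two pages of trace are reported. After a wrap with a four-page limit, three pages are reported.
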
> +
> +static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct perf_output_handle *handle = data;
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (mode != CS_MODE_PERF)
> + return -EINVAL;
> +
> + *this_cpu_ptr(drvdata->handle) = handle;
> + cpudata->buf = buf;
> + cpudata->mode = mode;
> + buf->cpudata = cpudata;
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_stop_and_truncate_event(handle);
> + return 0;
> + }
> + trbe_enable_hw(buf);
> + return 0;
> +}
> +
> +static int arm_trbe_disable(struct coresight_device *csdev)
> +{
> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_buf *buf = cpudata->buf;
> +
> + WARN_ON(buf->cpudata != cpudata);
> + WARN_ON(cpudata->cpu != smp_processor_id());
> + WARN_ON(cpudata->drvdata != drvdata);
> + if (cpudata->mode != CS_MODE_PERF)
> + return -EINVAL;
> +
> + trbe_drain_and_disable_local();
> + buf->cpudata = NULL;
> + cpudata->buf = NULL;
> + cpudata->mode = CS_MODE_DISABLED;
> + return 0;
> +}
> +
> +static void trbe_handle_spurious(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> +
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_drain_and_disable_local();
> + return;
> + }
> + trbe_enable_hw(buf);
> +}
> +
> +static void trbe_handle_overflow(struct perf_output_handle *handle)
> +{
> + struct perf_event *event = handle->event;
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + unsigned long offset, size;
> + struct etm_event_data *event_data;
> +
> + offset = get_trbe_limit_pointer() - get_trbe_base_pointer();
> + size = offset - PERF_IDX2OFF(handle->head, buf);
> + if (buf->snapshot)
> + handle->head += size;
> +
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> + perf_aux_output_end(handle, size);
> + event_data = perf_aux_output_begin(handle, event);
> + if (!event_data) {
> + /*
> + * We are unable to restart the trace collection,
> + * thus leave the TRBE disabled. The etm-perf driver
> + * is able to detect this with a disconnected handle
> + * (handle->event = NULL).
> + */
> + trbe_drain_and_disable_local();
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
> + return;
> + }
> + buf->trbe_limit = compute_trbe_buffer_limit(handle);
> + buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> + if (buf->trbe_limit == buf->trbe_base) {
> + trbe_stop_and_truncate_event(handle);
> + return;
> + }
> + *this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
> + trbe_enable_hw(buf);
> +}
> +
> +static bool is_perf_trbe(struct perf_output_handle *handle)
> +{
> + struct trbe_buf *buf = etm_perf_sink_config(handle);
> + struct trbe_cpudata *cpudata = buf->cpudata;
> + struct trbe_drvdata *drvdata = cpudata->drvdata;
> + int cpu = smp_processor_id();
> +
> + WARN_ON(buf->trbe_base != get_trbe_base_pointer());
> + WARN_ON(buf->trbe_limit != get_trbe_limit_pointer());
> +
> + if (cpudata->mode != CS_MODE_PERF)
> + return false;
> +
> + if (cpudata->cpu != cpu)
> + return false;
> +
> + if (!cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + return false;
> +
> + return true;
> +}
> +
> +static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
> +{
> + struct perf_output_handle **handle_ptr = dev;
> + struct perf_output_handle *handle = *handle_ptr;
> + enum trbe_fault_action act;
> + u64 status;
> +
> + /*
> + * Ensure the trace is visible to the CPUs and
> + * any external aborts have been resolved.
> + */
> + trbe_drain_and_disable_local();
> +
> + status = read_sysreg_s(SYS_TRBSR_EL1);
> + /*
> + * If the pending IRQ was handled by update_buffer callback
> + * we have nothing to do here.
> + */
> + if (!is_trbe_irq(status))
> + return IRQ_NONE;
> +
> + clr_trbe_irq();
> + isb();
> +
> + if (WARN_ON_ONCE(!handle) || !perf_get_aux(handle))
> + return IRQ_NONE;
> +
> + if (!is_perf_trbe(handle))
> + return IRQ_NONE;
> +
> + /*
> + * Ensure perf callbacks have completed, which may disable
> + * the trace buffer in response to a TRUNCATION flag.
> + */
> + irq_work_run();
> +
> + act = trbe_get_fault_act(status);
> + switch (act) {
> + case TRBE_FAULT_ACT_WRAP:
> + trbe_handle_overflow(handle);
> + break;
> + case TRBE_FAULT_ACT_SPURIOUS:
> + trbe_handle_spurious(handle);
> + break;
> + case TRBE_FAULT_ACT_FATAL:
> + trbe_stop_and_truncate_event(handle);
> + break;
> + }
> + return IRQ_HANDLED;
> +}
> +
> +static const struct coresight_ops_sink arm_trbe_sink_ops = {
> + .enable = arm_trbe_enable,
> + .disable = arm_trbe_disable,
> + .alloc_buffer = arm_trbe_alloc_buffer,
> + .free_buffer = arm_trbe_free_buffer,
> + .update_buffer = arm_trbe_update_buffer,
> +};
> +
> +static const struct coresight_ops arm_trbe_cs_ops = {
> + .sink_ops = &arm_trbe_sink_ops,
> +};
> +
> +static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%llx\n", cpudata->trbe_align);
> +}
> +static DEVICE_ATTR_RO(align);
> +
> +static ssize_t flag_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%d\n", cpudata->trbe_flag);
> +}
> +static DEVICE_ATTR_RO(flag);
> +
> +static struct attribute *arm_trbe_attrs[] = {
> + &dev_attr_align.attr,
> + &dev_attr_flag.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group arm_trbe_group = {
> + .attrs = arm_trbe_attrs,
> +};
> +
> +static const struct attribute_group *arm_trbe_groups[] = {
> + &arm_trbe_group,
> + NULL,
> +};
> +
> +static void arm_trbe_enable_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> +
> + trbe_reset_local();
> + enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
> +}
> +
> +static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
> +{
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> + struct coresight_desc desc = { 0 };
> + struct device *dev;
> +
> + if (WARN_ON(trbe_csdev))
> + return;
> +
> + dev = &cpudata->drvdata->pdev->dev;
> + desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
> + if (!desc.name)
> + goto cpu_clear;
> +
> + desc.type = CORESIGHT_DEV_TYPE_SINK;
> + desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
> + desc.ops = &arm_trbe_cs_ops;
> + desc.pdata = dev_get_platdata(dev);
> + desc.groups = arm_trbe_groups;
> + desc.dev = dev;
> + trbe_csdev = coresight_register(&desc);
> + if (IS_ERR(trbe_csdev))
> + goto cpu_clear;
> +
> + dev_set_drvdata(&trbe_csdev->dev, cpudata);
> + coresight_set_percpu_sink(cpu, trbe_csdev);
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_probe_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> + int cpu = smp_processor_id();
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + u64 trbidr;
> +
> + if (WARN_ON(!cpudata))
> + goto cpu_clear;
> +
> + if (!is_trbe_available()) {
> + pr_err("TRBE is not implemented on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + trbidr = read_sysreg_s(SYS_TRBIDR_EL1);
> + if (!is_trbe_programmable(trbidr)) {
> + pr_err("TRBE is owned in higher exception level on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
> + if (cpudata->trbe_align > SZ_2K) {
> + pr_err("Unsupported alignment on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> + cpudata->trbe_flag = get_trbe_flag_update(trbidr);
> + cpudata->cpu = cpu;
> + cpudata->drvdata = drvdata;
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_remove_coresight_cpu(void *info)
> +{
> + int cpu = smp_processor_id();
> + struct trbe_drvdata *drvdata = info;
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> +
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + if (trbe_csdev) {
> + coresight_unregister(trbe_csdev);
> + cpudata->drvdata = NULL;
> + coresight_set_percpu_sink(cpu, NULL);
> + }
> +}
> +
> +static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata));
> + if (!drvdata->cpudata)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus) {
> + smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + smp_call_function_single(cpu, arm_trbe_enable_cpu, drvdata, 1);
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus)
> + smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> + free_percpu(drvdata->cpudata);
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> +
> + /*
> + * If this CPU was not probed for TRBE,
> + * initialize it now.
> + */
> + if (!coresight_get_percpu_sink(cpu)) {
> + arm_trbe_probe_cpu(drvdata);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_enable_cpu(drvdata);
> + } else {
> + arm_trbe_enable_cpu(drvdata);
> + }
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + enum cpuhp_state trbe_online;
> + int ret;
> +
> + trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
> + arm_trbe_cpu_startup, arm_trbe_cpu_teardown);
> + if (trbe_online < 0)
> + return trbe_online;
> +
> + ret = cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node);
> + if (ret) {
> + cpuhp_remove_multi_state(trbe_online);
> + return ret;
> + }
> + drvdata->trbe_online = trbe_online;
> + return 0;
> +}
> +
> +static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + cpuhp_remove_multi_state(drvdata->trbe_online);
> +}
> +
> +static int arm_trbe_probe_irq(struct platform_device *pdev,
> + struct trbe_drvdata *drvdata)
> +{
> + int ret;
> +
> + drvdata->irq = platform_get_irq(pdev, 0);
> + if (drvdata->irq < 0) {
> + pr_err("IRQ not found for the platform device\n");
> + return drvdata->irq;
> + }
> +
> + if (!irq_is_percpu(drvdata->irq)) {
> + pr_err("IRQ is not a PPI\n");
> + return -EINVAL;
> + }
> +
> + if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus))
> + return -EINVAL;
> +
> + drvdata->handle = alloc_percpu(typeof(*drvdata->handle));
> + if (!drvdata->handle)
> + return -ENOMEM;
> +
> + ret = request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle);
> + if (ret) {
> + free_percpu(drvdata->handle);
> + return ret;
> + }
> + return 0;
> +}
> +
> +static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
> +{
> + free_percpu_irq(drvdata->irq, drvdata->handle);
> + free_percpu(drvdata->handle);
> +}
> +
> +static int arm_trbe_device_probe(struct platform_device *pdev)
> +{
> + struct coresight_platform_data *pdata;
> + struct trbe_drvdata *drvdata;
> + struct device *dev = &pdev->dev;
> + int ret;
> +
> + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
> + if (!drvdata)
> + return -ENOMEM;
> +
> + pdata = coresight_get_platform_data(dev);
> + if (IS_ERR(pdata))
> + return PTR_ERR(pdata);

Given there are no in or out ports, do we need platform data for this driver?

More comments on this patch tomorrow.

Thanks,
Mathieu

> +
> + dev_set_drvdata(dev, drvdata);
> + dev->platform_data = pdata;
> + drvdata->pdev = pdev;
> + ret = arm_trbe_probe_irq(pdev, drvdata);
> + if (ret)
> + return ret;
> +
> + ret = arm_trbe_probe_coresight(drvdata);
> + if (ret)
> + goto probe_failed;
> +
> + ret = arm_trbe_probe_cpuhp(drvdata);
> + if (ret)
> + goto cpuhp_failed;
> +
> + return 0;
> +cpuhp_failed:
> + arm_trbe_remove_coresight(drvdata);
> +probe_failed:
> + arm_trbe_remove_irq(drvdata);
> + return ret;
> +}
> +
> +static int arm_trbe_device_remove(struct platform_device *pdev)
> +{
> + struct trbe_drvdata *drvdata = platform_get_drvdata(pdev);
> +
> + arm_trbe_remove_cpuhp(drvdata);
> + arm_trbe_remove_coresight(drvdata);
> + arm_trbe_remove_irq(drvdata);
> + return 0;
> +}
> +
> +static const struct of_device_id arm_trbe_of_match[] = {
> + { .compatible = "arm,trace-buffer-extension"},
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, arm_trbe_of_match);
> +
> +static struct platform_driver arm_trbe_driver = {
> + .driver = {
> + .name = DRVNAME,
> + .of_match_table = of_match_ptr(arm_trbe_of_match),
> + .suppress_bind_attrs = true,
> + },
> + .probe = arm_trbe_device_probe,
> + .remove = arm_trbe_device_remove,
> +};
> +
> +static int __init arm_trbe_init(void)
> +{
> + int ret;
> +
> + if (arm64_kernel_unmapped_at_el0()) {
> + pr_err("TRBE wouldn't work if kernel gets unmapped at EL0\n");
> + return -EOPNOTSUPP;
> + }
> +
> + ret = platform_driver_register(&arm_trbe_driver);
> + if (!ret)
> + return 0;
> +
> + pr_err("Error registering %s platform driver\n", DRVNAME);
> + return ret;
> +}
> +
> +static void __exit arm_trbe_exit(void)
> +{
> + platform_driver_unregister(&arm_trbe_driver);
> +}
> +module_init(arm_trbe_init);
> +module_exit(arm_trbe_exit);
> +
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Arm Trace Buffer Extension (TRBE) driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
> new file mode 100644
> index 000000000000..499b846ccfee
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * This contains all required hardware related helper functions for
> + * Trace Buffer Extension (TRBE) driver in the coresight framework.
> + *
> + * Copyright (C) 2020 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#include <linux/coresight.h>
> +#include <linux/device.h>
> +#include <linux/irq.h>
> +#include <linux/kernel.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#include "coresight-etm-perf.h"
> +
> +static inline bool is_trbe_available(void)
> +{
> + u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
> + unsigned int trbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_TRBE_SHIFT);
> +
> + return trbe >= 0b0001;
> +}
> +
> +static inline bool is_trbe_enabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + return trblimitr & TRBLIMITR_ENABLE;
> +}
> +
> +#define TRBE_EC_OTHERS 0
> +#define TRBE_EC_STAGE1_ABORT 36
> +#define TRBE_EC_STAGE2_ABORT 37
> +
> +static inline int get_trbe_ec(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
> +}
> +
> +#define TRBE_BSC_NOT_STOPPED 0
> +#define TRBE_BSC_FILLED 1
> +#define TRBE_BSC_TRIGGERED 2
> +
> +static inline int get_trbe_bsc(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK;
> +}
> +
> +static inline void clr_trbe_irq(void)
> +{
> + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> +
> + trbsr &= ~TRBSR_IRQ;
> + write_sysreg_s(trbsr, SYS_TRBSR_EL1);
> +}
> +
> +static inline bool is_trbe_irq(u64 trbsr)
> +{
> + return trbsr & TRBSR_IRQ;
> +}
> +
> +static inline bool is_trbe_trg(u64 trbsr)
> +{
> + return trbsr & TRBSR_TRG;
> +}
> +
> +static inline bool is_trbe_wrap(u64 trbsr)
> +{
> + return trbsr & TRBSR_WRAP;
> +}
> +
> +static inline bool is_trbe_abort(u64 trbsr)
> +{
> + return trbsr & TRBSR_ABORT;
> +}
> +
> +static inline bool is_trbe_running(u64 trbsr)
> +{
> + return !(trbsr & TRBSR_STOP);
> +}
> +
> +#define TRBE_TRIG_MODE_STOP 0
> +#define TRBE_TRIG_MODE_IRQ 1
> +#define TRBE_TRIG_MODE_IGNORE 3
> +
> +#define TRBE_FILL_MODE_FILL 0
> +#define TRBE_FILL_MODE_WRAP 1
> +#define TRBE_FILL_MODE_CIRCULAR_BUFFER 3
> +
> +static inline void set_trbe_disabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + trblimitr &= ~TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +}
> +
> +static inline bool get_trbe_flag_update(u64 trbidr)
> +{
> + return trbidr & TRBIDR_FLAG;
> +}
> +
> +static inline bool is_trbe_programmable(u64 trbidr)
> +{
> + return !(trbidr & TRBIDR_PROG);
> +}
> +
> +static inline int get_trbe_address_align(u64 trbidr)
> +{
> + return (trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK;
> +}
> +
> +static inline unsigned long get_trbe_write_pointer(void)
> +{
> + return read_sysreg_s(SYS_TRBPTR_EL1);
> +}
> +
> +static inline void set_trbe_write_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + write_sysreg_s(addr, SYS_TRBPTR_EL1);
> +}
> +
> +static inline unsigned long get_trbe_limit_pointer(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline unsigned long get_trbe_base_pointer(void)
> +{
> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline void set_trbe_base_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBBASER_BASE_SHIFT)));
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + write_sysreg_s(addr, SYS_TRBBASER_EL1);
> +}
> --
> 2.24.1
>

2021-03-19 10:32:28

by Suzuki K Poulose

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

Hi Mike

> On 8 Mar 2021, at 17:26, Mike Leach <[email protected]> wrote:
>
> Hi Suzuki,
>
> On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>>
>> From: Anshuman Khandual <[email protected]>
>>
>> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
>> accessible via the system registers. The TRBE supports different addressing
>> modes including CPU virtual address and buffer modes including the circular
>> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
>> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
>> access to the trace buffer could be prohibited by a higher exception level
>> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
>> private interrupt (PPI) on address translation errors and when the buffer
>> is full. Overall implementation here is inspired from the Arm SPE driver.
>>
>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Cc: Suzuki K Poulose <[email protected]>
>> Signed-off-by: Anshuman Khandual <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>
>> +
>> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>> + struct perf_output_handle *handle,
>> + void *config)
>> +{
>> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
>> + struct trbe_buf *buf = config;
>> + enum trbe_fault_action act;
>> + unsigned long size, offset;
>> + unsigned long write, base, status;
>> + unsigned long flags;
>> +
>> + WARN_ON(buf->cpudata != cpudata);
>> + WARN_ON(cpudata->cpu != smp_processor_id());
>> + WARN_ON(cpudata->drvdata != drvdata);
>> + if (cpudata->mode != CS_MODE_PERF)
>> + return 0;
>> +
>> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
>> +
>> + /*
>> + * We are about to disable the TRBE. And this could in turn
>> + * fill up the buffer triggering, an IRQ. This could be consumed
>> + * by the PE asynchronously, causing a race here against
>> + * the IRQ handler in closing out the handle. So, let us
>> + * make sure the IRQ can't trigger while we are collecting
>> + * the buffer. We also make sure that a WRAP event is handled
>> + * accordingly.
>> + */
>> + local_irq_save(flags);
>> +
>> + /*
>> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
>> + * spurious fault, the driver leaves it disabled, truncating the buffer.
>> + * Since the etm_perf driver expects to close out the AUX buffer, the
>> + * driver skips it. Thus, just pass in 0 size here to indicate that the
>> + * buffer was truncated.
>> + */
>> + if (!is_trbe_enabled()) {
>> + size = 0;
>> + goto done;
>> + }
>> + /*
>> + * perf handle structure needs to be shared with the TRBE IRQ handler for
>> + * capturing trace data and restarting the handle. There is a probability
>> + * of an undefined reference based crash when etm event is being stopped
>> + * while a TRBE IRQ also getting processed. This happens due the release
>> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
>> + * the TRBE here will ensure that no IRQ could be generated when the perf
>> + * handle gets freed in etm_event_stop().
>> + */
>> + trbe_drain_and_disable_local();
>> + write = get_trbe_write_pointer();
>> + base = get_trbe_base_pointer();
>> +
>> + /* Check if there is a pending interrupt and handle it here */
>> + status = read_sysreg_s(SYS_TRBSR_EL1);
>> + if (is_trbe_irq(status)) {
>> +
>> + /*
>> + * Now that we are handling the IRQ here, clear the IRQ
>> + * from the status, to let the irq handler know that it
>> + * is taken care of.
>> + */
>> + clr_trbe_irq();
>> + isb();
>> +
>> + act = trbe_get_fault_act(status);
>> + /*
>> + * If this was not due to a WRAP event, we have some
>> + * errors and as such buffer is empty.
>> + */
>> + if (act != TRBE_FAULT_ACT_WRAP) {
>> + size = 0;
>> + goto done;
>> + }
>
> We are using TRBE FILL mode - which halts capture on a full buffer and
> triggers the IRQ, without disabling the source first.
> This means that the mode is inherently lossy (unless by some unlikely
> co-incidence the last byte that caused the wrap was also the last byte
> to be sent from an ETE that was in the process of being disabled.)
> Therefore we must have a perf_aux_output_flag(handle,
> PERF_AUX_FLAG_TRUNCATED) call in here to signal that some trace was
> lost, for consistence of operation with ETR etc, and intelpt.
>

I agree that there is a bit of loss here due to the FILL mode. But it is not comparable to that of the ETR. In this case, the WRAP event is triggered when we flush the ETE, i.e., this is most likely because tracing was enabled for kernel mode and the last few bytes of trace which caused the FILL belong to the code responsible for stopping the components in the CoreSight trace. I personally do not think this data is of any interest to the user.
Otherwise, if the data didn’t belong to the perf event side, it should have triggered the IRQ.

This is true in the case of the buffer overflow interrupt too, with a bit more data lost, i.e., since the interrupt is a PPI, the overflow is triggered when the buffer is full (which includes the data that is cached in the TRBE). But there could be a bit of data that is still cached in the ETE, before it is captured in the trace. And the moment we get a FILL event, we stop executing anything that is relevant for the trace session (as we are in the driver handling the interrupt).
And then we reconfigure the buffer to continue the execution. Now, the interrupt delivery is not necessarily synchronous, and there could be data lost in the interval between the WRAP event and the point at which the IRQ is triggered.

I am OK with indicating that there was some loss of trace data during the session if we hit a WRAP event. But this could cause consumers to worry that they lost too much of the trace data of interest to them, while that is not the case.
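
To make the proposal concrete, here is a rough user-space sketch of what flagging the WRAP case could look like. It is only a model: `mock_handle`, `mock_aux_output_flag()`, `mock_collect_on_irq()` and the `MOCK_FAULT_ACT_*` values are hypothetical stand-ins for the perf handle, perf_aux_output_flag() and the driver's fault classification, and reporting the full buffer size on WRAP is an assumption. Only the value of PERF_AUX_FLAG_TRUNCATED is taken from the perf UAPI header.

```c
#include <assert.h>
#include <stdint.h>

/* Value as in the perf UAPI header; everything else below is a mock. */
#define PERF_AUX_FLAG_TRUNCATED 0x01

/* Hypothetical stand-in for the driver's fault classification. */
enum mock_fault_action {
	MOCK_FAULT_ACT_WRAP,
	MOCK_FAULT_ACT_SPURIOUS,
	MOCK_FAULT_ACT_FATAL,
};

/* Hypothetical stand-in for struct perf_output_handle. */
struct mock_handle {
	uint64_t aux_flags;
};

/* Mock of perf_aux_output_flag(): OR the flag into the handle. */
static void mock_aux_output_flag(struct mock_handle *handle, uint64_t flags)
{
	handle->aux_flags |= flags;
}

/*
 * Model of collecting the buffer when a pending IRQ is found.
 * Returns the number of bytes to report (assumption: the whole
 * buffer on WRAP, nothing on any other fault).
 */
static unsigned long mock_collect_on_irq(struct mock_handle *handle,
					 enum mock_fault_action act,
					 unsigned long buf_size)
{
	/* Not a WRAP: treat the buffer as empty, as in the quoted code. */
	if (act != MOCK_FAULT_ACT_WRAP)
		return 0;

	/* WRAP: the buffer is full, and some trace may have been lost. */
	mock_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
	return buf_size;
}
```

With this shape, a consumer sees the flag on every buffer that hit a WRAP, regardless of how little was actually dropped.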

>> +static inline unsigned long get_trbe_limit_pointer(void)
>> +{
>> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
>> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
>> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
>
> Could this not be:
> unsigned long addr = trblimitr & (TRBLIMITR_LIMIT_MASK <<
> TRBLIMITR_LIMIT_SHIFT);
> like the base ponter below?
>

Sure, it can be consistent.
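
For reference, a small user-space sketch of why the two forms give the same address. The mask and shift values here are assumptions for illustration (a 52-bit LIMIT field starting at bit 12), not copied from the patch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: LIMIT occupies bits [63:12] of TRBLIMITR_EL1. */
#define TRBLIMITR_LIMIT_MASK  ((1ULL << 52) - 1)  /* assumption: 52-bit field */
#define TRBLIMITR_LIMIT_SHIFT 12                   /* assumption */

/* Form used in the patch: extract the field, then shift it back. */
static uint64_t limit_addr_shift_form(uint64_t trblimitr)
{
	uint64_t limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;

	return limit << TRBLIMITR_LIMIT_SHIFT;
}

/* Form suggested in the review: mask the field in place. */
static uint64_t limit_addr_mask_form(uint64_t trblimitr)
{
	return trblimitr & (TRBLIMITR_LIMIT_MASK << TRBLIMITR_LIMIT_SHIFT);
}

/* Returns 1 when both forms agree for the given register value. */
static int limit_forms_agree(uint64_t trblimitr)
{
	return limit_addr_shift_form(trblimitr) == limit_addr_mask_form(trblimitr);
}

/* Simple self-check with all bits set, including the low control bits. */
static void limit_forms_self_check(void)
{
	assert(limit_forms_agree(0xFFFFFFFFFFFFFFFFULL));
}
```

Both forms clear the low 12 bits and keep the rest, so the in-place mask simply saves one shift and matches get_trbe_base_pointer().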


>> +
>> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
>> + return addr;
>> +}
>> +
>> +static inline unsigned long get_trbe_base_pointer(void)
>> +{
>> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
>> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
>> +
>> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
>> + return addr;
>> +}
>> +

Thank you for the review

Kind regards
Suzuki

2021-03-19 10:37:35

by Suzuki K Poulose

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

Hi Mathieu,

> On 18 Mar 2021, at 18:08, Mathieu Poirier <[email protected]> wrote:
>
> Good morning,
>
> On Thu, Feb 25, 2021 at 07:35:42PM +0000, Suzuki K Poulose wrote:
>> From: Anshuman Khandual <[email protected]>
>>
>> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
>> accessible via the system registers. The TRBE supports different addressing
>> modes including CPU virtual address and buffer modes including the circular
>> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
>> an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
>> access to the trace buffer could be prohibited by a higher exception level
>> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
>> private interrupt (PPI) on address translation errors and when the buffer
>> is full. Overall implementation here is inspired from the Arm SPE driver.
>>

There is a mention of the SPE driver here in the commit description.

>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Cc: Suzuki K Poulose <[email protected]>
>> Signed-off-by: Anshuman Khandual <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> Changes:
>> - Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
>> - Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
>> - Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
>> - Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
>> - Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
>> - Dropped the space after type casting before vmap()
>> - Return 0 instead of EINVAL in arm_trbe_update_buffer()
>> - Add a comment in trbe_handle_overflow()
>> - Add a comment in arm_trbe_cpu_startup()
>> - Unregister coresight TRBE device when not supported
>> - Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
>> - Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
>> - Replaced and modified trbe_drain_and_disable_local() in IRQ handler
>> - Updated arm_trbe_update_buffer() for handling a missing interrupt
>> - Dropped kfree() for all devm_xxx() allocated buffer
>> - Dropped additional blank line in documentation coresight/coresight-trbe.rst
>> - Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
>> - Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
>> - Added comment for irq_work_run()
>> - Updated comment for minumum buffer length in arm_trbe_alloc_buffer()
>> - Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
>> - Fixed indentation in arm_trbe_probe_cpuhp()
>> - Added static for arm_trbe_free_buffer()
>> - Added comment for trbe_base element in trbe_buf structure
>> - Dropped IS_ERR() check from vmap() returned pointer
>> - Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
>> - Changed TRBE device names from arm_trbeX to just trbeX
>> - Dropped unused argument perf_output_handle from trbe_get_fault_act()
>> - Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
>> - Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
>> - Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
>> - Changed error return codes from arm_trbe_probe_irq()
>> - Changed error return codes from arm_trbe_device_probe()
>> - Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
>> - Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
>> - Changed coresight_register() in arm_trbe_probe_coresight_cpu()
>> - Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
>> - Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
>> - Changed trbe_dbm as trbe_flag including its sysfs interface
>> - Handle race between update_buffer & IRQ handler
>> - Rework and split the TRBE probe to avoid lockdep due to memory allocation
>> from IPI calls (via coresight_register())
>> - Fix handle->head updat for snapshot mode.
>> ---
>> .../testing/sysfs-bus-coresight-devices-trbe | 14 +
>> .../trace/coresight/coresight-trbe.rst | 38 +
>> drivers/hwtracing/coresight/Kconfig | 14 +
>> drivers/hwtracing/coresight/Makefile | 1 +
>> drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
>> drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
>> 6 files changed, 1369 insertions(+)
>> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
>> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
>
> Please spinoff these two file in a separate patch and CC Jon Corbet and the
> linux-doc mailing list.

Sure, makes sense.

>
>> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
>> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
>> new file mode 100644
>> index 000000000000..ad3bbc6fa751
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
>> @@ -0,0 +1,14 @@
>> +What: /sys/bus/coresight/devices/trbe<cpu>/align
>> +Date: March 2021
>> +KernelVersion: 5.13
>> +Contact: Anshuman Khandual <[email protected]>
>> +Description: (Read) Shows the TRBE write pointer alignment. This value
>> + is fetched from the TRBIDR register.
>> +
>> +What: /sys/bus/coresight/devices/trbe<cpu>/flag
>> +Date: March 2021
>> +KernelVersion: 5.13
>> +Contact: Anshuman Khandual <[email protected]>
>> +Description: (Read) Shows if TRBE updates in the memory are with access
>> + and dirty flag updates as well. This value is fetched from
>> + the TRBIDR register.
>
> For this file:
>
> Reviewed-by: Mathieu Poirier <[email protected]>
>
>> diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst
>> new file mode 100644
>> index 000000000000..b9928ef148da
>> --- /dev/null
>> +++ b/Documentation/trace/coresight/coresight-trbe.rst
>> @@ -0,0 +1,38 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +==============================
>> +Trace Buffer Extension (TRBE).
>> +==============================
>> +
>> + :Author: Anshuman Khandual <[email protected]>
>> + :Date: November 2020
>> +
>> +Hardware Description
>> +--------------------
>> +
>> +Trace Buffer Extension (TRBE) is a percpu hardware which captures in system
>> +memory, CPU traces generated from a corresponding percpu tracing unit. This
>> +gets plugged in as a coresight sink device because the corresponding trace
>> +generators (ETE), are plugged in as source device.
>> +
>> +The TRBE is not compliant to CoreSight architecture specifications, but is
>> +driven via the CoreSight driver framework to support the ETE (which is
>> +CoreSight compliant) integration.
>> +
>> +Sysfs files and directories
>> +---------------------------
>> +
>> +The TRBE devices appear on the existing coresight bus alongside the other
>> +coresight devices::
>> +
>> + >$ ls /sys/bus/coresight/devices
>> + trbe0 trbe1 trbe2 trbe3
>> +
>> +The ``trbe<N>`` named TRBEs are associated with a CPU.::
>> +
>> + >$ ls /sys/bus/coresight/devices/trbe0/
>> + align flag
>> +
>> +*Key file items are:-*
>> + * ``align``: TRBE write pointer alignment
>> + * ``flag``: TRBE updates memory with access and dirty flags
>
> For this file:
>
> Reviewed-by: Mathieu Poirier <[email protected]>
>
>> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
>> index f154ae7e705d..84530fd80998 100644
>> --- a/drivers/hwtracing/coresight/Kconfig
>> +++ b/drivers/hwtracing/coresight/Kconfig
>> @@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS
>> CTI trigger connections between this and other devices.These
>> registers are not used in normal operation and can leave devices in
>> an inconsistent state.
>> +
>> +config CORESIGHT_TRBE
>> + tristate "Trace Buffer Extension (TRBE) driver"
>> + depends on ARM64 && CORESIGHT_SOURCE_ETM4X
>> + help
>> + This driver provides support for percpu Trace Buffer Extension (TRBE).
>> + TRBE always needs to be used along with it's corresponding percpu ETE
>> + component. ETE generates trace data which is then captured with TRBE.
>> + Unlike traditional sink devices, TRBE is a CPU feature accessible via
>> + system registers. But it's explicit dependency with trace unit (ETE)
>> + requires it to be plugged in as a coresight sink device.
>> +
>> + To compile this driver as a module, choose M here: the module will be
>> + called coresight-trbe.
>> endif
>> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
>> index f20e357758d1..d60816509755 100644
>> --- a/drivers/hwtracing/coresight/Makefile
>> +++ b/drivers/hwtracing/coresight/Makefile
>> @@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
>> obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
>> obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o
>> obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o
>> +obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o
>> coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \
>> coresight-cti-sysfs.o
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> new file mode 100644
>> index 000000000000..41a012b525bb
>> --- /dev/null
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -0,0 +1,1149 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight
>> + * sink device could then pair with an appropriate per-cpu coresight source
>> + * device (ETE) thus generating required trace data. Trace can be enabled
>> + * via the perf framework.
>
> If I remember correctly the last version stated the driver was tailored on
> Will's SPE driver.
>

Yes, it was, and it is still there in the commit description. But it is good to mention it here as well.

>>
>> +static int arm_trbe_device_probe(struct platform_device *pdev)
>> +{
>> + struct coresight_platform_data *pdata;
>> + struct trbe_drvdata *drvdata;
>> + struct device *dev = &pdev->dev;
>> + int ret;
>> +
>> + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
>> + if (!drvdata)
>> + return -ENOMEM;
>> +
>> + pdata = coresight_get_platform_data(dev);
>> + if (IS_ERR(pdata))
>> + return PTR_ERR(pdata);
>
> Given there is no in and out ports, do we need a platform data for this driver?
>
> More comments on this patch tomorrow.

I had the same comment in one of the earlier versions.
But it looks like coresight_register() requires this argument in order to scan the connections for a component, and we don’t want to break that assumption everywhere in the generic driver.
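
As a rough illustration of the constraint, here is a user-space model of a generic registration routine that walks a component's connections. All names (`mock_platform_data`, `mock_desc`, `mock_register()`) are hypothetical, and the real coresight_register() may well behave differently, for example it may not check for a missing pdata at all. The point is only that a component with no ports can still hand over an empty pdata so the generic code keeps its assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the CoreSight structures. */
struct mock_platform_data {
	int nr_outport;          /* number of output connections */
};

struct mock_desc {
	const char *name;
	struct mock_platform_data *pdata;
};

/*
 * Mock of a generic registration routine that scans the connections
 * of every component. Returns the number of connections scanned, or
 * -EINVAL when no platform data was supplied.
 */
static int mock_register(const struct mock_desc *desc)
{
	int i, scanned = 0;

	if (!desc->pdata)
		return -EINVAL;

	for (i = 0; i < desc->pdata->nr_outport; i++)
		scanned++;       /* a real driver would fix up each link here */

	return scanned;
}
```

A sink with no in or out ports would pass a pdata with zero connections, and the scan simply does nothing.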

Cheers
Suzuki

2021-03-19 11:59:01

by Mike Leach

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

Hi Suzuki,

On Fri, 19 Mar 2021 at 10:30, Suzuki K Poulose <[email protected]> wrote:
>
> Hi Mike
>
> > On 8 Mar 2021, at 17:26, Mike Leach <[email protected]> wrote:
> >
> > Hi Suzuki,
> >
> > On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
> >>
> >> From: Anshuman Khandual <[email protected]>
> >>
> >> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> >> accessible via the system registers. The TRBE supports different addressing
> >> modes including CPU virtual address and buffer modes including the circular
> >> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> >> an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> >> access to the trace buffer could be prohibited by a higher exception level
> >> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> >> private interrupt (PPI) on address translation errors and when the buffer
> >> is full. Overall implementation here is inspired from the Arm SPE driver.
> >>
> >> Cc: Mathieu Poirier <[email protected]>
> >> Cc: Mike Leach <[email protected]>
> >> Cc: Suzuki K Poulose <[email protected]>
> >> Signed-off-by: Anshuman Khandual <[email protected]>
> >> Signed-off-by: Suzuki K Poulose <[email protected]>
> >>
> >> +
> >> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
> >> + struct perf_output_handle *handle,
> >> + void *config)
> >> +{
> >> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> >> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> >> + struct trbe_buf *buf = config;
> >> + enum trbe_fault_action act;
> >> + unsigned long size, offset;
> >> + unsigned long write, base, status;
> >> + unsigned long flags;
> >> +
> >> + WARN_ON(buf->cpudata != cpudata);
> >> + WARN_ON(cpudata->cpu != smp_processor_id());
> >> + WARN_ON(cpudata->drvdata != drvdata);
> >> + if (cpudata->mode != CS_MODE_PERF)
> >> + return 0;
> >> +
> >> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> >> +
> >> + /*
> >> + * We are about to disable the TRBE. And this could in turn
> >> + * fill up the buffer triggering, an IRQ. This could be consumed
> >> + * by the PE asynchronously, causing a race here against
> >> + * the IRQ handler in closing out the handle. So, let us
> >> + * make sure the IRQ can't trigger while we are collecting
> >> + * the buffer. We also make sure that a WRAP event is handled
> >> + * accordingly.
> >> + */
> >> + local_irq_save(flags);
> >> +
> >> + /*
> >> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
> >> + * spurious fault, the driver leaves it disabled, truncating the buffer.
> >> + * Since the etm_perf driver expects to close out the AUX buffer, the
> >> + * driver skips it. Thus, just pass in 0 size here to indicate that the
> >> + * buffer was truncated.
> >> + */
> >> + if (!is_trbe_enabled()) {
> >> + size = 0;
> >> + goto done;
> >> + }
> >> + /*
> >> + * perf handle structure needs to be shared with the TRBE IRQ handler for
> >> + * capturing trace data and restarting the handle. There is a probability
> >> + * of an undefined reference based crash when etm event is being stopped
> >> + * while a TRBE IRQ also getting processed. This happens due the release
> >> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
> >> + * the TRBE here will ensure that no IRQ could be generated when the perf
> >> + * handle gets freed in etm_event_stop().
> >> + */
> >> + trbe_drain_and_disable_local();
> >> + write = get_trbe_write_pointer();
> >> + base = get_trbe_base_pointer();
> >> +
> >> + /* Check if there is a pending interrupt and handle it here */
> >> + status = read_sysreg_s(SYS_TRBSR_EL1);
> >> + if (is_trbe_irq(status)) {
> >> +
> >> + /*
> >> + * Now that we are handling the IRQ here, clear the IRQ
> >> + * from the status, to let the irq handler know that it
> >> + * is taken care of.
> >> + */
> >> + clr_trbe_irq();
> >> + isb();
> >> +
> >> + act = trbe_get_fault_act(status);
> >> + /*
> >> + * If this was not due to a WRAP event, we have some
> >> + * errors and as such buffer is empty.
> >> + */
> >> + if (act != TRBE_FAULT_ACT_WRAP) {
> >> + size = 0;
> >> + goto done;
> >> + }
> >
> > We are using TRBE FILL mode - which halts capture on a full buffer and
> > triggers the IRQ, without disabling the source first.
> > This means that the mode is inherently lossy (unless by some unlikely
> > co-incidence the last byte that caused the wrap was also the last byte
> > to be sent from an ETE that was in the process of being disabled.)
> > Therefore we must have a perf_aux_output_flag(handle,
> > PERF_AUX_FLAG_TRUNCATED) call in here to signal that some trace was
> > lost, for consistence of operation with ETR etc, and intelpt.
> >
>
> I agree that the there is a bit of loss here due to the FILL mode. But it is not comparable to that of the ETR. In this case, the WRAP event is triggered when we flush the ETE. i.e, this could be mostly due to the fact that the tracing was enabled for the kernel mode and the last few bytes of trace which caused the FILL belong to the code responsible for stopping the components in the CoreSight trace. I personally do not think this data is of any interest to the user.
> Otherwise, if the data didn’t belong to the perf event side, it should have triggered the IRQ.
>
> This is true in case of the buffer overflow interrupt too, with a bit more data lost. i.e, since the interrupt is PPI, the overflow is triggered when the buffer is full (which includes the data that is cached in the TRBE). But there could be a bit of data that is still cached in the ETE, before it is captured in the trace. And the moment we get a FILL event, we stop executing anything that is relevant for the Trace session (as we are in the driver handling the interrupt).
> And then we reconfigure the buffer to continue the execution. Now, the interrupt delivery is not necessarily synchronous and there could be data lost in the interval between WRAP event and the IRQ is triggered.
>
> I am OK with suggesting that there was some loss of trace data during the session, if we hit WRAP event. But this could cause worry to the consumers that they lost too much of trace data of their interest, while that is not the case.
>

We can never know what has been lost. It may be some trace around the
driver of no interest to the user, it may also be an event or
timestamp related to an earlier marker - which could be highly
relevant.
With ETR we do not know how much is lost on wrap - it might be one
byte, it might be much more - but the point is we mark as truncated
for _any_ amount.

It is unfortunate that we will see multiple buffers marked as
truncated - but this is far better than creating the false impression
that no trace has been lost - that there is a continuous record where
there is not.
For some users - such as autofdo where sampling is taking place anyway
- truncated buffers probably do not matter. For others - who are
looking to trace a specific section of code - then they need to be
aware that there could be decode anomalies relating to buffer wrap.

Regards

Mike

> >> +static inline unsigned long get_trbe_limit_pointer(void)
> >> +{
> >> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> >> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
> >> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
> >
> > Could this not be:
> > unsigned long addr = trblimitr & (TRBLIMITR_LIMIT_MASK <<
> > TRBLIMITR_LIMIT_SHIFT);
> > like the base ponter below?
> >
>
> Sure, it can be consistent.
>
>
> >> +
> >> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> >> + return addr;
> >> +}
> >> +
> >> +static inline unsigned long get_trbe_base_pointer(void)
> >> +{
> >> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> >> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
> >> +
> >> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> >> + return addr;
> >> +}
> >> +
>
> Thank you for the review
>
> Kind regards
> Suzuki
>


--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK

2021-03-19 14:50:41

by Mathieu Poirier

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

On Fri, 19 Mar 2021 at 04:34, Suzuki K Poulose <[email protected]> wrote:
>
> Hi Mathieu,
>
> > On 18 Mar 2021, at 18:08, Mathieu Poirier <[email protected]> wrote:
> >
> > Good morning,
> >
> > On Thu, Feb 25, 2021 at 07:35:42PM +0000, Suzuki K Poulose wrote:
> >> From: Anshuman Khandual <[email protected]>
> >>
> >> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> >> accessible via the system registers. The TRBE supports different addressing
> >> modes including CPU virtual address and buffer modes including the circular
> >> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> >> an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> >> access to the trace buffer could be prohibited by a higher exception level
> >> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> >> private interrupt (PPI) on address translation errors and when the buffer
> >> is full. Overall implementation here is inspired from the Arm SPE driver.
> >>
>
> There is a mention of the SPE driver here in the commit description.
>
> >> Cc: Mathieu Poirier <[email protected]>
> >> Cc: Mike Leach <[email protected]>
> >> Cc: Suzuki K Poulose <[email protected]>
> >> Signed-off-by: Anshuman Khandual <[email protected]>
> >> Signed-off-by: Suzuki K Poulose <[email protected]>
> >> ---
> >> Changes:
> >> - Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
> >> - Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
> >> - Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
> >> - Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
> >> - Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
> >> - Dropped the space after type casting before vmap()
> >> - Return 0 instead of EINVAL in arm_trbe_update_buffer()
> >> - Add a comment in trbe_handle_overflow()
> >> - Add a comment in arm_trbe_cpu_startup()
> >> - Unregister coresight TRBE device when not supported
> >> - Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
> >> - Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
> >> - Replaced and modified trbe_drain_and_disable_local() in IRQ handler
> >> - Updated arm_trbe_update_buffer() for handling a missing interrupt
> >> - Dropped kfree() for all devm_xxx() allocated buffer
> >> - Dropped additional blank line in documentation coresight/coresight-trbe.rst
> >> - Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> >> - Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
> >> - Added comment for irq_work_run()
> >> - Updated comment for minumum buffer length in arm_trbe_alloc_buffer()
> >> - Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
> >> - Fixed indentation in arm_trbe_probe_cpuhp()
> >> - Added static for arm_trbe_free_buffer()
> >> - Added comment for trbe_base element in trbe_buf structure
> >> - Dropped IS_ERR() check from vmap() returned pointer
> >> - Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
> >> - Changed TRBE device names from arm_trbeX to just trbeX
> >> - Dropped unused argument perf_output_handle from trbe_get_fault_act()
> >> - Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
> >> - Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
> >> - Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
> >> - Changed error return codes from arm_trbe_probe_irq()
> >> - Changed error return codes from arm_trbe_device_probe()
> >> - Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
> >> - Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
> >> - Changed coresight_register() in arm_trbe_probe_coresight_cpu()
> >> - Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
> >> - Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
> >> - Changed trbe_dbm as trbe_flag including its sysfs interface
> >> - Handle race between update_buffer & IRQ handler
> >> - Rework and split the TRBE probe to avoid lockdep due to memory allocation
> >> from IPI calls (via coresight_register())
> >> - Fix handle->head updat for snapshot mode.
> >> ---
> >> .../testing/sysfs-bus-coresight-devices-trbe | 14 +
> >> .../trace/coresight/coresight-trbe.rst | 38 +
> >> drivers/hwtracing/coresight/Kconfig | 14 +
> >> drivers/hwtracing/coresight/Makefile | 1 +
> >> drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
> >> drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
> >> 6 files changed, 1369 insertions(+)
> >> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> >> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
> >
> > Please spinoff these two file in a separate patch and CC Jon Corbet and the
> > linux-doc mailing list.
>
> Sure, makes sense.
>
> >
> >> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
> >> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> >> new file mode 100644
> >> index 000000000000..ad3bbc6fa751
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> >> @@ -0,0 +1,14 @@
> >> +What: /sys/bus/coresight/devices/trbe<cpu>/align
> >> +Date: March 2021
> >> +KernelVersion: 5.13
> >> +Contact: Anshuman Khandual <[email protected]>
> >> +Description: (Read) Shows the TRBE write pointer alignment. This value
> >> + is fetched from the TRBIDR register.
> >> +
> >> +What: /sys/bus/coresight/devices/trbe<cpu>/flag
> >> +Date: March 2021
> >> +KernelVersion: 5.13
> >> +Contact: Anshuman Khandual <[email protected]>
> >> +Description: (Read) Shows if TRBE updates in the memory are with access
> >> + and dirty flag updates as well. This value is fetched from
> >> + the TRBIDR register.
> >
> > For this file:
> >
> > Reviewed-by: Mathieu Poirier <[email protected]>
> >
> >> diff --git a/Documentation/trace/coresight/coresight-trbe.rst b/Documentation/trace/coresight/coresight-trbe.rst
> >> new file mode 100644
> >> index 000000000000..b9928ef148da
> >> --- /dev/null
> >> +++ b/Documentation/trace/coresight/coresight-trbe.rst
> >> @@ -0,0 +1,38 @@
> >> +.. SPDX-License-Identifier: GPL-2.0
> >> +
> >> +==============================
> >> +Trace Buffer Extension (TRBE).
> >> +==============================
> >> +
> >> + :Author: Anshuman Khandual <[email protected]>
> >> + :Date: November 2020
> >> +
> >> +Hardware Description
> >> +--------------------
> >> +
> >> +Trace Buffer Extension (TRBE) is a percpu hardware which captures in system
> >> +memory, CPU traces generated from a corresponding percpu tracing unit. This
> >> +gets plugged in as a coresight sink device because the corresponding trace
> >> +generators (ETE), are plugged in as source device.
> >> +
> >> +The TRBE is not compliant to CoreSight architecture specifications, but is
> >> +driven via the CoreSight driver framework to support the ETE (which is
> >> +CoreSight compliant) integration.
> >> +
> >> +Sysfs files and directories
> >> +---------------------------
> >> +
> >> +The TRBE devices appear on the existing coresight bus alongside the other
> >> +coresight devices::
> >> +
> >> + >$ ls /sys/bus/coresight/devices
> >> + trbe0 trbe1 trbe2 trbe3
> >> +
> >> +The ``trbe<N>`` named TRBEs are associated with a CPU.::
> >> +
> >> + >$ ls /sys/bus/coresight/devices/trbe0/
> >> + align flag
> >> +
> >> +*Key file items are:-*
> >> + * ``align``: TRBE write pointer alignment
> >> + * ``flag``: TRBE updates memory with access and dirty flags
> >
> > For this file:
> >
> > Reviewed-by: Mathieu Poirier <[email protected]>
> >
> >> diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
> >> index f154ae7e705d..84530fd80998 100644
> >> --- a/drivers/hwtracing/coresight/Kconfig
> >> +++ b/drivers/hwtracing/coresight/Kconfig
> >> @@ -173,4 +173,18 @@ config CORESIGHT_CTI_INTEGRATION_REGS
> >> CTI trigger connections between this and other devices.These
> >> registers are not used in normal operation and can leave devices in
> >> an inconsistent state.
> >> +
> >> +config CORESIGHT_TRBE
> >> + tristate "Trace Buffer Extension (TRBE) driver"
> >> + depends on ARM64 && CORESIGHT_SOURCE_ETM4X
> >> + help
> >> + This driver provides support for percpu Trace Buffer Extension (TRBE).
> >> + TRBE always needs to be used along with its corresponding percpu ETE
> >> + component. ETE generates trace data which is then captured with TRBE.
> >> + Unlike traditional sink devices, TRBE is a CPU feature accessible via
> >> + system registers. But its explicit dependency on the trace unit (ETE)
> >> + requires it to be plugged in as a coresight sink device.
> >> +
> >> + To compile this driver as a module, choose M here: the module will be
> >> + called coresight-trbe.
> >> endif
> >> diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
> >> index f20e357758d1..d60816509755 100644
> >> --- a/drivers/hwtracing/coresight/Makefile
> >> +++ b/drivers/hwtracing/coresight/Makefile
> >> @@ -21,5 +21,6 @@ obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
> >> obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
> >> obj-$(CONFIG_CORESIGHT_CATU) += coresight-catu.o
> >> obj-$(CONFIG_CORESIGHT_CTI) += coresight-cti.o
> >> +obj-$(CONFIG_CORESIGHT_TRBE) += coresight-trbe.o
> >> coresight-cti-y := coresight-cti-core.o coresight-cti-platform.o \
> >> coresight-cti-sysfs.o
> >> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> >> new file mode 100644
> >> index 000000000000..41a012b525bb
> >> --- /dev/null
> >> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> >> @@ -0,0 +1,1149 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * This driver enables Trace Buffer Extension (TRBE) as a per-cpu coresight
> >> + * sink device, which can then pair with an appropriate per-cpu coresight
> >> + * source device (ETE), thus generating the required trace data. Trace can be
> >> + * enabled via the perf framework.
> >
> > If I remember correctly the last version stated the driver was tailored on
> > Will's SPE driver.
> >
>
> Yes, it was and is; that is still there in the description. But it is good to mention it here too.
>
> >>
> >> +static int arm_trbe_device_probe(struct platform_device *pdev)
> >> +{
> >> + struct coresight_platform_data *pdata;
> >> + struct trbe_drvdata *drvdata;
> >> + struct device *dev = &pdev->dev;
> >> + int ret;
> >> +
> >> + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
> >> + if (!drvdata)
> >> + return -ENOMEM;
> >> +
> >> + pdata = coresight_get_platform_data(dev);
> >> + if (IS_ERR(pdata))
> >> + return PTR_ERR(pdata);
> >
> > Given there is no in and out ports, do we need a platform data for this driver?
> >
> > More comments on this patch tomorrow.
>
> I had the same comment in one of the earlier versions.
> But it looks like the coresight_register() requires this argument, to scan the connections for a component. And we don’t want to break that assumption everywhere in the generic driver.
>

Ok, that confirms my suspicions... Using a platform driver here
doesn't provide anything other than compatibility with the core
framework, which never expected to see components that don't have
ports. Fixing that is out of scope and better left for another day.

> Cheers
> Suzuki
>

2021-03-19 18:00:56

by Mathieu Poirier

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

On Thu, Feb 25, 2021 at 07:35:42PM +0000, Suzuki K Poulose wrote:
> From: Anshuman Khandual <[email protected]>
>
> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> accessible via the system registers. The TRBE supports different addressing
> modes including CPU virtual address and buffer modes including the circular
> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> access to the trace buffer could be prohibited by a higher exception level
> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> private interrupt (PPI) on address translation errors and when the buffer
> is full. The overall implementation here is inspired by the Arm SPE driver.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---

[...]

> +
> +static const struct coresight_ops_sink arm_trbe_sink_ops = {
> + .enable = arm_trbe_enable,
> + .disable = arm_trbe_disable,
> + .alloc_buffer = arm_trbe_alloc_buffer,
> + .free_buffer = arm_trbe_free_buffer,
> + .update_buffer = arm_trbe_update_buffer,
> +};
> +
> +static const struct coresight_ops arm_trbe_cs_ops = {
> + .sink_ops = &arm_trbe_sink_ops,
> +};

I have reviewed everything below this point and things look quite good. I will
continue with the above on Monday.

Mathieu

> +
> +static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%llx\n", cpudata->trbe_align);
> +}
> +static DEVICE_ATTR_RO(align);
> +
> +static ssize_t flag_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%d\n", cpudata->trbe_flag);
> +}
> +static DEVICE_ATTR_RO(flag);
> +
> +static struct attribute *arm_trbe_attrs[] = {
> + &dev_attr_align.attr,
> + &dev_attr_flag.attr,
> + NULL,
> +};
> +
> +static const struct attribute_group arm_trbe_group = {
> + .attrs = arm_trbe_attrs,
> +};
> +
> +static const struct attribute_group *arm_trbe_groups[] = {
> + &arm_trbe_group,
> + NULL,
> +};
> +
> +static void arm_trbe_enable_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> +
> + trbe_reset_local();
> + enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
> +}
> +
> +static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
> +{
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> + struct coresight_desc desc = { 0 };
> + struct device *dev;
> +
> + if (WARN_ON(trbe_csdev))
> + return;
> +
> + dev = &cpudata->drvdata->pdev->dev;
> + desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu);
> + if (!desc.name)
> + goto cpu_clear;
> +
> + desc.type = CORESIGHT_DEV_TYPE_SINK;
> + desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
> + desc.ops = &arm_trbe_cs_ops;
> + desc.pdata = dev_get_platdata(dev);
> + desc.groups = arm_trbe_groups;
> + desc.dev = dev;
> + trbe_csdev = coresight_register(&desc);
> + if (IS_ERR(trbe_csdev))
> + goto cpu_clear;
> +
> + dev_set_drvdata(&trbe_csdev->dev, cpudata);
> + coresight_set_percpu_sink(cpu, trbe_csdev);
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_probe_cpu(void *info)
> +{
> + struct trbe_drvdata *drvdata = info;
> + int cpu = smp_processor_id();
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + u64 trbidr;
> +
> + if (WARN_ON(!cpudata))
> + goto cpu_clear;
> +
> + if (!is_trbe_available()) {
> + pr_err("TRBE is not implemented on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + trbidr = read_sysreg_s(SYS_TRBIDR_EL1);
> + if (!is_trbe_programmable(trbidr)) {
> + pr_err("TRBE is owned in higher exception level on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> +
> + cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
> + if (cpudata->trbe_align > SZ_2K) {
> + pr_err("Unsupported alignment on cpu %d\n", cpu);
> + goto cpu_clear;
> + }
> + cpudata->trbe_flag = get_trbe_flag_update(trbidr);
> + cpudata->cpu = cpu;
> + cpudata->drvdata = drvdata;
> + return;
> +cpu_clear:
> + cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> +}
> +
> +static void arm_trbe_remove_coresight_cpu(void *info)
> +{
> + int cpu = smp_processor_id();
> + struct trbe_drvdata *drvdata = info;
> + struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> + struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
> +
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + if (trbe_csdev) {
> + coresight_unregister(trbe_csdev);
> + cpudata->drvdata = NULL;
> + coresight_set_percpu_sink(cpu, NULL);
> + }
> +}
> +
> +static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata));
> + if (!drvdata->cpudata)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus) {
> + smp_call_function_single(cpu, arm_trbe_probe_cpu, drvdata, 1);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + smp_call_function_single(cpu, arm_trbe_enable_cpu, drvdata, 1);
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
> +{
> + int cpu;
> +
> + for_each_cpu(cpu, &drvdata->supported_cpus)
> + smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> + free_percpu(drvdata->cpudata);
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> +
> + /*
> + * If this CPU was not probed for TRBE,
> + * initialize it now.
> + */
> + if (!coresight_get_percpu_sink(cpu)) {
> + arm_trbe_probe_cpu(drvdata);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_register_coresight_cpu(drvdata, cpu);
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> + arm_trbe_enable_cpu(drvdata);
> + } else {
> + arm_trbe_enable_cpu(drvdata);
> + }
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
> +{
> + struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
> +
> + if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> + disable_percpu_irq(drvdata->irq);
> + trbe_reset_local();
> + }
> + return 0;
> +}
> +
> +static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + enum cpuhp_state trbe_online;
> + int ret;
> +
> + trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
> + arm_trbe_cpu_startup, arm_trbe_cpu_teardown);
> + if (trbe_online < 0)
> + return trbe_online;
> +
> + ret = cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node);
> + if (ret) {
> + cpuhp_remove_multi_state(trbe_online);
> + return ret;
> + }
> + drvdata->trbe_online = trbe_online;
> + return 0;
> +}
> +
> +static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata)
> +{
> + cpuhp_remove_multi_state(drvdata->trbe_online);
> +}
> +
> +static int arm_trbe_probe_irq(struct platform_device *pdev,
> + struct trbe_drvdata *drvdata)
> +{
> + int ret;
> +
> + drvdata->irq = platform_get_irq(pdev, 0);
> + if (drvdata->irq < 0) {
> + pr_err("IRQ not found for the platform device\n");
> + return drvdata->irq;
> + }
> +
> + if (!irq_is_percpu(drvdata->irq)) {
> + pr_err("IRQ is not a PPI\n");
> + return -EINVAL;
> + }
> +
> + if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus))
> + return -EINVAL;
> +
> + drvdata->handle = alloc_percpu(typeof(*drvdata->handle));
> + if (!drvdata->handle)
> + return -ENOMEM;
> +
> + ret = request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle);
> + if (ret) {
> + free_percpu(drvdata->handle);
> + return ret;
> + }
> + return 0;
> +}
> +
> +static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata)
> +{
> + free_percpu_irq(drvdata->irq, drvdata->handle);
> + free_percpu(drvdata->handle);
> +}
> +
> +static int arm_trbe_device_probe(struct platform_device *pdev)
> +{
> + struct coresight_platform_data *pdata;
> + struct trbe_drvdata *drvdata;
> + struct device *dev = &pdev->dev;
> + int ret;
> +
> + drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
> + if (!drvdata)
> + return -ENOMEM;
> +
> + pdata = coresight_get_platform_data(dev);
> + if (IS_ERR(pdata))
> + return PTR_ERR(pdata);
> +
> + dev_set_drvdata(dev, drvdata);
> + dev->platform_data = pdata;
> + drvdata->pdev = pdev;
> + ret = arm_trbe_probe_irq(pdev, drvdata);
> + if (ret)
> + return ret;
> +
> + ret = arm_trbe_probe_coresight(drvdata);
> + if (ret)
> + goto probe_failed;
> +
> + ret = arm_trbe_probe_cpuhp(drvdata);
> + if (ret)
> + goto cpuhp_failed;
> +
> + return 0;
> +cpuhp_failed:
> + arm_trbe_remove_coresight(drvdata);
> +probe_failed:
> + arm_trbe_remove_irq(drvdata);
> + return ret;
> +}
> +
> +static int arm_trbe_device_remove(struct platform_device *pdev)
> +{
> + struct trbe_drvdata *drvdata = platform_get_drvdata(pdev);
> +
> + arm_trbe_remove_cpuhp(drvdata);
> + arm_trbe_remove_coresight(drvdata);
> + arm_trbe_remove_irq(drvdata);
> + return 0;
> +}
> +
> +static const struct of_device_id arm_trbe_of_match[] = {
> + { .compatible = "arm,trace-buffer-extension"},
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, arm_trbe_of_match);
> +
> +static struct platform_driver arm_trbe_driver = {
> + .driver = {
> + .name = DRVNAME,
> + .of_match_table = of_match_ptr(arm_trbe_of_match),
> + .suppress_bind_attrs = true,
> + },
> + .probe = arm_trbe_device_probe,
> + .remove = arm_trbe_device_remove,
> +};
> +
> +static int __init arm_trbe_init(void)
> +{
> + int ret;
> +
> + if (arm64_kernel_unmapped_at_el0()) {
> + pr_err("TRBE wouldn't work if kernel gets unmapped at EL0\n");
> + return -EOPNOTSUPP;
> + }
> +
> + ret = platform_driver_register(&arm_trbe_driver);
> + if (!ret)
> + return 0;
> +
> + pr_err("Error registering %s platform driver\n", DRVNAME);
> + return ret;
> +}
> +
> +static void __exit arm_trbe_exit(void)
> +{
> + platform_driver_unregister(&arm_trbe_driver);
> +}
> +module_init(arm_trbe_init);
> +module_exit(arm_trbe_exit);
> +
> +MODULE_AUTHOR("Anshuman Khandual <[email protected]>");
> +MODULE_DESCRIPTION("Arm Trace Buffer Extension (TRBE) driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h
> new file mode 100644
> index 000000000000..499b846ccfee
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-trbe.h
> @@ -0,0 +1,153 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * This contains all required hardware related helper functions for
> + * Trace Buffer Extension (TRBE) driver in the coresight framework.
> + *
> + * Copyright (C) 2020 ARM Ltd.
> + *
> + * Author: Anshuman Khandual <[email protected]>
> + */
> +#include <linux/coresight.h>
> +#include <linux/device.h>
> +#include <linux/irq.h>
> +#include <linux/kernel.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/smp.h>
> +
> +#include "coresight-etm-perf.h"
> +
> +static inline bool is_trbe_available(void)
> +{
> + u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
> + unsigned int trbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_TRBE_SHIFT);
> +
> + return trbe >= 0b0001;
> +}
> +
> +static inline bool is_trbe_enabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + return trblimitr & TRBLIMITR_ENABLE;
> +}
> +
> +#define TRBE_EC_OTHERS 0
> +#define TRBE_EC_STAGE1_ABORT 36
> +#define TRBE_EC_STAGE2_ABORT 37
> +
> +static inline int get_trbe_ec(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_EC_SHIFT) & TRBSR_EC_MASK;
> +}
> +
> +#define TRBE_BSC_NOT_STOPPED 0
> +#define TRBE_BSC_FILLED 1
> +#define TRBE_BSC_TRIGGERED 2
> +
> +static inline int get_trbe_bsc(u64 trbsr)
> +{
> + return (trbsr >> TRBSR_BSC_SHIFT) & TRBSR_BSC_MASK;
> +}
> +
> +static inline void clr_trbe_irq(void)
> +{
> + u64 trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> +
> + trbsr &= ~TRBSR_IRQ;
> + write_sysreg_s(trbsr, SYS_TRBSR_EL1);
> +}
> +
> +static inline bool is_trbe_irq(u64 trbsr)
> +{
> + return trbsr & TRBSR_IRQ;
> +}
> +
> +static inline bool is_trbe_trg(u64 trbsr)
> +{
> + return trbsr & TRBSR_TRG;
> +}
> +
> +static inline bool is_trbe_wrap(u64 trbsr)
> +{
> + return trbsr & TRBSR_WRAP;
> +}
> +
> +static inline bool is_trbe_abort(u64 trbsr)
> +{
> + return trbsr & TRBSR_ABORT;
> +}
> +
> +static inline bool is_trbe_running(u64 trbsr)
> +{
> + return !(trbsr & TRBSR_STOP);
> +}
> +
> +#define TRBE_TRIG_MODE_STOP 0
> +#define TRBE_TRIG_MODE_IRQ 1
> +#define TRBE_TRIG_MODE_IGNORE 3
> +
> +#define TRBE_FILL_MODE_FILL 0
> +#define TRBE_FILL_MODE_WRAP 1
> +#define TRBE_FILL_MODE_CIRCULAR_BUFFER 3
> +
> +static inline void set_trbe_disabled(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> +
> + trblimitr &= ~TRBLIMITR_ENABLE;
> + write_sysreg_s(trblimitr, SYS_TRBLIMITR_EL1);
> +}
> +
> +static inline bool get_trbe_flag_update(u64 trbidr)
> +{
> + return trbidr & TRBIDR_FLAG;
> +}
> +
> +static inline bool is_trbe_programmable(u64 trbidr)
> +{
> + return !(trbidr & TRBIDR_PROG);
> +}
> +
> +static inline int get_trbe_address_align(u64 trbidr)
> +{
> + return (trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK;
> +}
> +
> +static inline unsigned long get_trbe_write_pointer(void)
> +{
> + return read_sysreg_s(SYS_TRBPTR_EL1);
> +}
> +
> +static inline void set_trbe_write_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + write_sysreg_s(addr, SYS_TRBPTR_EL1);
> +}
> +
> +static inline unsigned long get_trbe_limit_pointer(void)
> +{
> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline unsigned long get_trbe_base_pointer(void)
> +{
> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
> +
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + return addr;
> +}
> +
> +static inline void set_trbe_base_pointer(unsigned long addr)
> +{
> + WARN_ON(is_trbe_enabled());
> + WARN_ON(!IS_ALIGNED(addr, (1UL << TRBBASER_BASE_SHIFT)));
> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> + write_sysreg_s(addr, SYS_TRBBASER_EL1);
> +}
> --
> 2.24.1
>
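
The TRBIDR_EL1 helpers in the quoted header, together with the checks in arm_trbe_probe_cpu(), can be modelled outside the kernel as a minimal sketch. The bit positions below (Align in bits [3:0], P in bit 4, F in bit 5) are an assumption about the register layout and are not shown in the quoted hunks; struct trbe_caps and trbe_decode_idr() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed TRBIDR_EL1 layout: Align in bits [3:0], P in bit 4, F in bit 5. */
#define TRBIDR_ALIGN_SHIFT	0
#define TRBIDR_ALIGN_MASK	0xfULL
#define TRBIDR_PROG		(1ULL << 4)
#define TRBIDR_FLAG		(1ULL << 5)
#define SZ_2K			0x800ULL

/* Hypothetical container mirroring the trbe_align/trbe_flag fields. */
struct trbe_caps {
	uint64_t align;	/* write pointer alignment in bytes */
	bool flag;	/* access/dirty flag updates supported */
};

/*
 * Returns true and fills *caps when the TRBE is usable at this exception
 * level. On failure *caps may have been partially written.
 */
static bool trbe_decode_idr(uint64_t trbidr, struct trbe_caps *caps)
{
	/* P set: programming is owned by a higher exception level. */
	if (trbidr & TRBIDR_PROG)
		return false;

	/* The Align field is a log2 value, as in arm_trbe_probe_cpu(). */
	caps->align = 1ULL << ((trbidr >> TRBIDR_ALIGN_SHIFT) & TRBIDR_ALIGN_MASK);
	if (caps->align > SZ_2K)
		return false;

	caps->flag = (trbidr & TRBIDR_FLAG) != 0;
	return true;
}
```

With these assumed positions, a register value of 0x26 would decode to a 64-byte alignment with flag updates enabled, while any value with bit 4 set is rejected before the other fields are looked at.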

2021-03-22 12:32:12

by Suzuki K Poulose

Subject: Re: [PATCH v4 02/19] perf: aux: Add CoreSight PMU buffer formats

On 16/03/2021 17:04, Mathieu Poirier wrote:
> On Thu, Feb 25, 2021 at 07:35:26PM +0000, Suzuki K Poulose wrote:
>> CoreSight PMU supports aux-buffer for the ETM tracing. The trace
>> generated by the ETM (associated with individual CPUs, like Intel PT)
>> is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
>>
>> The TMC-ETR applies formatting of the raw ETM trace data, as it
>> can collect traces from multiple ETMs, with the TraceID to indicate
>> the source of a given trace packet.
>>
>> Arm Trace Buffer Extension is a new "sink" IP, attached to individual
>> CPUs, and thus does not provide additional formatting, unlike TMC-ETR.
>>
>> Additionally, a system could have both TRBE *and* TMC-ETR for
>> the trace collection. e.g, TMC-ETR could be used as a single
>> trace buffer to collect data from multiple ETMs to correlate
>> the traces from different CPUs. It is possible to have a
>> perf session where some events end up collecting the trace
>> in TMC-ETR while the others in TRBE. Thus we need a way
>> to identify the type of the trace for each AUX record.
>>
>
> The gist of this patch is to introduce formatted and raw trace format. To me
> the above paragraph brings confusion to the changelog, especially since we don't
> allow events belonging to the same session to use different types of sinks. I
> would simply remove it.

This is not entirely correct. We could still have differently formatted
trace in a *session*, but not for an *event* in the session. For example,
imagine a system-wide or task-bound (not per-thread) session, where
events are created per-CPU and bound to the CPU. Each of these CPUs
could have a different type of preferred sink, and thus we could have
a single session with an AUX record per CPU event, with different
formats. However, any AUX record is guaranteed to be of the same type.
This is why the flag bit is important: the perf tool can create
a decoder for an AUX record stream by looking at the type.
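
As a rough sketch of what that looks like on the tool side, assuming only the flag values from the hunk below: the format byte is masked out of the AUX record flags and mapped to a decoder. The enum and cs_pick_decoder() are hypothetical names for illustration and are not part of perf or of this series.

```c
#include <stdint.h>

/* Values taken from the quoted include/uapi/linux/perf_event.h hunk. */
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK		0xff00
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT	0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW		0x0100

/* Hypothetical decoder kinds, for illustration only. */
enum cs_decoder_kind {
	CS_DECODER_FRAME,	/* CoreSight frame formatted trace (e.g. TMC-ETR) */
	CS_DECODER_RAW,		/* raw format of the source (e.g. TRBE) */
	CS_DECODER_UNKNOWN,	/* a format type this sketch does not know */
};

/* Pick a decoder from the flags of one AUX record. */
static enum cs_decoder_kind cs_pick_decoder(uint64_t aux_flags)
{
	/* Only the PMU format byte matters; other flag bits are ignored. */
	switch (aux_flags & PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK) {
	case PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT:
		return CS_DECODER_FRAME;
	case PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW:
		return CS_DECODER_RAW;
	default:
		return CS_DECODER_UNKNOWN;
	}
}
```

Because the frame format is encoded as 0, records from existing kernels that never set the format byte would still select the frame decoder, which is the backward compatibility the default value is meant to give.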

>
>> Define the trace formats exported by the CoreSight PMU.
>> We don't define the flags following the "ETM" as this
>> information is available to the user when issuing
>> the session. What is missing is the additional
>> formatting applied by the "sink" which is decided
>> at the runtime and the user may not have a control on.
>>
>> So we define :
>> - CORESIGHT format (indicates the Frame format)
>> - RAW format (indicates the format of the source)
>>
>> The default value is CORESIGHT format for all the records
>> (i,e == 0). Add the RAW format for others that use
>> raw format.
>>
>> Cc: Peter Zijlstra <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Leo Yan <[email protected]>
>> Cc: Anshuman Khandual <[email protected]>
>> Reviewed-by: Mike Leach <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> Changes from previous:
>> - Split from the coresight driver specific code
>> for ease of merging
>> ---
>> include/uapi/linux/perf_event.h | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index f006eeab6f0e..63971eaef127 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -1162,6 +1162,10 @@ enum perf_callchain_context {
>> #define PERF_AUX_FLAG_COLLISION 0x08 /* sample collided with another */
>> #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK 0xff00 /* PMU specific trace format type */
>>
>> +/* CoreSight PMU AUX buffer formats */
>> +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT 0x0000 /* Default for backward compatibility */
>> +#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW 0x0100 /* Raw format of the source */
>> +
>
> Is "CORESIGHT" really a format? We are playing with words and the end result is
> the same but I think PERF_AUX_FLAG_CORESIGHT_FORMAT_FORMATTED would be best, or
> even:

It is really CoreSight FRAME Format. So unless we specify the "actual"
format, which is CoreSight Frame format, simply FORMATTED doesn't
distinguish it from a new format that could be applied in the future.

I would prefer to retain the above names to indicate that the definitions
apply to the CORESIGHT PMU FORMAT flags.

>
> #define PERF_AUX_FLAG_CORESIGHT_TRACE_FORMATTED 0x0000 /* Default for backward compatibility */
> #define PERF_AUX_FLAG_CORESIGHT_TRACE_RAW 0x0100 /* Raw format of the source */
>
> Regardless, for patches 01 and 02:
>
> Reviewed-by: Mathieu Poirier <[email protected]>

Thanks
Suzuki

>
>> #define PERF_FLAG_FD_NO_GROUP (1UL << 0)
>> #define PERF_FLAG_FD_OUTPUT (1UL << 1)
>> #define PERF_FLAG_PID_CGROUP (1UL << 2) /* pid=cgroup id, per-cpu mode only */
>> --
>> 2.24.1
>>

2021-03-22 16:55:14

by Suzuki K Poulose

Subject: Re: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

Hi Rob

On 06/03/2021 21:06, Rob Herring wrote:
> On Thu, Feb 25, 2021 at 07:35:39PM +0000, Suzuki K Poulose wrote:
>> Document the device tree bindings for Embedded Trace Extensions.
>> ETE can be connected to legacy coresight components and thus
>> could optionally contain a connection graph as described by
>> the CoreSight bindings.
>>
>> Cc: [email protected]
>> Cc: Mathieu Poirier <[email protected]>
>> Cc: Mike Leach <[email protected]>
>> Cc: Rob Herring <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>> ---
>> Changes:
>> - Fix out-ports definition
>> ---
>> .../devicetree/bindings/arm/ete.yaml | 71 +++++++++++++++++++
>> 1 file changed, 71 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml
>> new file mode 100644
>> index 000000000000..35a42d92bf97
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/ete.yaml
>> @@ -0,0 +1,71 @@
>> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
>> +# Copyright 2021, Arm Ltd
>> +%YAML 1.2
>> +---
>> +$id: "http://devicetree.org/schemas/arm/ete.yaml#"
>> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
>> +
>> +title: ARM Embedded Trace Extensions
>> +
>> +maintainers:
>> + - Suzuki K Poulose <[email protected]>
>> + - Mathieu Poirier <[email protected]>
>> +
>> +description: |
>> + Arm Embedded Trace Extension(ETE) is a per CPU trace component that
>> + allows tracing the CPU execution. It overlaps with the CoreSight ETMv4
>> + architecture and has extended support for future architecture changes.
>> + The trace generated by the ETE could be stored via legacy CoreSight
>> + components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer
>> + Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to
>> + legacy CoreSight components, a node must be listed per instance, along
>> + with any optional connection graph as per the coresight bindings.
>> + See bindings/arm/coresight.txt.
>> +
>> +properties:
>> + $nodename:
>> + pattern: "^ete([0-9a-f]+)$"
>> + compatible:
>> + items:
>> + - const: arm,embedded-trace-extension
>> +
>> + cpu:
>> + description: |
>> + Handle to the cpu this ETE is bound to.
>> + $ref: /schemas/types.yaml#/definitions/phandle
>> +
>> + out-ports:
>> + description: |
>> + Output connections from the ETE to legacy CoreSight trace bus.
>> + $ref: /schemas/graph.yaml#/properties/port
>
> s/port/ports/

Ok.

>
> And then you need:
>
> properties:
> port:
> description: what this port is
> $ref: /schemas/graph.yaml#/properties/port

Isn't this already covered by the definition of ports ? There are no
fixed connections for ETE. It is optional and could be connected to
any legacy CoreSight component. i.e, a "ports" object can have port
objects inside.

Given we have defined out-ports as an object "conforming to the ports",
do we need to describe the individual port nodes?

Cheers
Suzuki

2021-03-22 17:00:13

by Suzuki K Poulose

Subject: Re: [PATCH v4 17/19] coresight: core: Add support for dedicated percpu sinks

Hi Mike

On 08/03/2021 17:26, Mike Leach wrote:

> Hi,
>
> On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>>
>> From: Anshuman Khandual <[email protected]>
>>
>> Add support for dedicated sinks that are bound to individual CPUs. (e.g,
>> TRBE). To allow quicker access to the sink for a given CPU bound source,
>> keep a percpu array of the sink devices. Also, add support for building
>> a path to the CPU local sink from the ETM.
>>
>> This adds a new percpu sink type CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM.
>> This new sink type is exclusively available and can only work with percpu
>> source type device CORESIGHT_DEV_SUBTYPE_SOURCE_PROC.
>>
>
> Minor nit: FEAT_TRBE architecturally guarantees a compatible
> architectural FEAT_ETE source.
> However _all_ CPU sources have CORESIGHT_DEV_SUBTYPE_SOURCE_PROC set,
> ETMv3.x, PTM, ETM4.x and ETE alike.
> In the code that follows - coresight_is_percpu_source() checks it is
> any type of CPU source, not the FEAT_ETE type, which is fine as we
> then check the cpu and if it has TRBE.

Agreed. But we would like to keep this CoreSight generic code away from
the specifics of underlying "source", which is why we used the generic
notion of a per-CPU source.
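
A minimal userspace sketch of that idea, with every name below hypothetical rather than taken from the CoreSight core: the lookup only asks whether the source is a per-CPU source and whether a sink has been registered for that CPU. Falling back to a caller-supplied sink when neither holds is an assumption of this sketch.

```c
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary size for the sketch */

/* Hypothetical stand-ins for the source subtype and device structures. */
enum source_subtype { SOURCE_PROC, SOURCE_SOFTWARE };

struct sink { const char *name; };
struct source { enum source_subtype subtype; int cpu; };

/* Per-CPU table of dedicated sinks; NULL means none registered. */
static struct sink *percpu_sink[NR_CPUS];

static void set_percpu_sink(int cpu, struct sink *s)
{
	percpu_sink[cpu] = s;
}

/*
 * Prefer the CPU-local sink for any per-CPU source, whatever kind of
 * trace unit it is. Otherwise use the fallback (assumed here to be a
 * shared sink chosen elsewhere).
 */
static struct sink *pick_sink(const struct source *src, struct sink *fallback)
{
	if (src->subtype == SOURCE_PROC &&
	    src->cpu >= 0 && src->cpu < NR_CPUS &&
	    percpu_sink[src->cpu])
		return percpu_sink[src->cpu];
	return fallback;
}
```

Nothing in the lookup depends on whether the source is an ETE, an ETMv4 or an older trace unit, which is the point of keeping the generic code source-agnostic.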

> So the simplifications to the code from the first couple of patch sets
> make this explanation slightly misleading. It could do with adjusting if
> the set is re-spun.
>
> Reviewed-by: Mike Leach <[email protected]>

Thanks
Suzuki

2021-03-22 17:31:29

by Rob Herring (Arm)

Subject: Re: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

On Mon, Mar 22, 2021 at 10:53 AM Suzuki K Poulose
<[email protected]> wrote:
>
> Hi Rob
>
> On 06/03/2021 21:06, Rob Herring wrote:
> > On Thu, Feb 25, 2021 at 07:35:39PM +0000, Suzuki K Poulose wrote:
> >> Document the device tree bindings for Embedded Trace Extensions.
> >> ETE can be connected to legacy coresight components and thus
> >> could optionally contain a connection graph as described by
> >> the CoreSight bindings.
> >>
> >> Cc: [email protected]
> >> Cc: Mathieu Poirier <[email protected]>
> >> Cc: Mike Leach <[email protected]>
> >> Cc: Rob Herring <[email protected]>
> >> Signed-off-by: Suzuki K Poulose <[email protected]>
> >> ---
> >> Changes:
> >> - Fix out-ports definition
> >> ---
> >> .../devicetree/bindings/arm/ete.yaml | 71 +++++++++++++++++++
> >> 1 file changed, 71 insertions(+)
> >> create mode 100644 Documentation/devicetree/bindings/arm/ete.yaml
> >>
> >> diff --git a/Documentation/devicetree/bindings/arm/ete.yaml b/Documentation/devicetree/bindings/arm/ete.yaml
> >> new file mode 100644
> >> index 000000000000..35a42d92bf97
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/bindings/arm/ete.yaml
> >> @@ -0,0 +1,71 @@
> >> +# SPDX-License-Identifier: GPL-2.0-only or BSD-2-Clause
> >> +# Copyright 2021, Arm Ltd
> >> +%YAML 1.2
> >> +---
> >> +$id: "http://devicetree.org/schemas/arm/ete.yaml#"
> >> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> >> +
> >> +title: ARM Embedded Trace Extensions
> >> +
> >> +maintainers:
> >> + - Suzuki K Poulose <[email protected]>
> >> + - Mathieu Poirier <[email protected]>
> >> +
> >> +description: |
> >> + Arm Embedded Trace Extension(ETE) is a per CPU trace component that
> >> + allows tracing the CPU execution. It overlaps with the CoreSight ETMv4
> >> + architecture and has extended support for future architecture changes.
> >> + The trace generated by the ETE could be stored via legacy CoreSight
> >> + components (e.g, TMC-ETR) or other means (e.g, using a per CPU buffer
> >> + Arm Trace Buffer Extension (TRBE)). Since the ETE can be connected to
> >> + legacy CoreSight components, a node must be listed per instance, along
> >> + with any optional connection graph as per the coresight bindings.
> >> + See bindings/arm/coresight.txt.
> >> +
> >> +properties:
> >> + $nodename:
> >> + pattern: "^ete([0-9a-f]+)$"
> >> + compatible:
> >> + items:
> >> + - const: arm,embedded-trace-extension
> >> +
> >> + cpu:
> >> + description: |
> >> + Handle to the cpu this ETE is bound to.
> >> + $ref: /schemas/types.yaml#/definitions/phandle
> >> +
> >> + out-ports:
> >> + description: |
> >> + Output connections from the ETE to legacy CoreSight trace bus.
> >> + $ref: /schemas/graph.yaml#/properties/port
> >
> > s/port/ports/
>
> Ok.
>
> >
> > And then you need:
> >
> > properties:
> > port:
> > description: what this port is
> > $ref: /schemas/graph.yaml#/properties/port
>
> Isn't this already covered by the definition of ports ? There are no
> fixed connections for ETE. It is optional and could be connected to
> any legacy CoreSight component. i.e, a "ports" object can have port
> objects inside.

'properties/ports' only defines that you have 'port' nodes within it.

> Given we have defined out-ports as an object "conforming to the ports"
> do we need to describe the individual port nodes ?

Yes, you have to define what the 'port' nodes are. A port is a data
stream and you should know what your hardware has. What the data
stream is connected to is outside the scope of the binding.

Rob

2021-03-22 21:23:06

by Mathieu Poirier

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

On Thu, Feb 25, 2021 at 07:35:42PM +0000, Suzuki K Poulose wrote:
> From: Anshuman Khandual <[email protected]>
>
> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> accessible via the system registers. The TRBE supports different addressing
> modes including CPU virtual address and buffer modes including the circular
> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> access to the trace buffer could be prohibited by a higher exception level
> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> private interrupt (PPI) on address translation errors and when the buffer
> is full. Overall implementation here is inspired from the Arm SPE driver.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Signed-off-by: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> Changes:
> - Replaced TRBLIMITR_LIMIT_SHIFT with TRBBASER_BASE_SHIFT in set_trbe_base_pointer()
> - Dropped TRBBASER_BASE_MASK and TRBBASER_BASE_SHIFT from get_trbe_base_pointer()
> - Indentation changes for TRBE_BSC_NOT_[STOPPED|FILLED|TRIGGERED] definitions
> - Moved DECLARE_PER_CPU(...., csdev_sink) into coresight-priv.h
> - Moved isb() from trbe_enable_hw() into set_trbe_limit_pointer_enabled()
> - Dropped the space after type casting before vmap()
> - Return 0 instead of EINVAL in arm_trbe_update_buffer()
> - Add a comment in trbe_handle_overflow()
> - Add a comment in arm_trbe_cpu_startup()
> - Unregister coresight TRBE device when not supported
> - Fix potential NULL handle dereference in IRQ handler with a spurious IRQ
> - Read TRBIDR after is_trbe_programmable() in arm_trbe_probe_coresight_cpu()
> - Replaced and modified trbe_drain_and_disable_local() in IRQ handler
> - Updated arm_trbe_update_buffer() for handling a missing interrupt
> - Dropped kfree() for all devm_xxx() allocated buffer
> - Dropped additional blank line in documentation coresight/coresight-trbe.rst
> - Added Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> - Changed CONFIG_CORESIGHT_TRBE options, dependencies and helper write up
> - Added comment for irq_work_run()
> - Updated comment for minimum buffer length in arm_trbe_alloc_buffer()
> - Dropped redundant smp_processor_id() from arm_trbe_probe_coresight_cpu()
> - Fixed indentation in arm_trbe_probe_cpuhp()
> - Added static for arm_trbe_free_buffer()
> - Added comment for trbe_base element in trbe_buf structure
> - Dropped IS_ERR() check from vmap() returned pointer
> - Added WARN_ON(trbe_csdev) in arm_trbe_probe_coresight_cpu()
> - Changed TRBE device names from arm_trbeX to just trbeX
> - Dropped unused argument perf_output_handle from trbe_get_fault_act()
> - Dropped IS_ERR() from kzalloc_node()/kcalloc() buffer in arm_trbe_alloc_buffer()
> - Dropped IS_ERR() and return -ENOMEM in arm_trbe_probe_coresight()
> - Moved TRBE HW disabling before coresight cleanup in arm_trbe_remove_coresight_cpu()
> - Changed error return codes from arm_trbe_probe_irq()
> - Changed error return codes from arm_trbe_device_probe()
> - Changed arm_trbe_remove_coresight() order in arm_trbe_device_remove()
> - Changed TRBE CPU support probe/remove sequence with for_each_cpu() iterator
> - Changed coresight_register() in arm_trbe_probe_coresight_cpu()
> - Changed error return code when cpuhp_setup_state_multi() fails in arm_trbe_probe_cpuhp()
> - Changed error return code when cpuhp_state_add_instance() fails in arm_trbe_probe_cpuhp()
> - Changed trbe_dbm as trbe_flag including its sysfs interface
> - Handle race between update_buffer & IRQ handler
> - Rework and split the TRBE probe to avoid lockdep due to memory allocation
> from IPI calls (via coresight_register())
> - Fix handle->head update for snapshot mode.

All of the above make this driver much easier to read.

> ---
> .../testing/sysfs-bus-coresight-devices-trbe | 14 +
> .../trace/coresight/coresight-trbe.rst | 38 +
> drivers/hwtracing/coresight/Kconfig | 14 +
> drivers/hwtracing/coresight/Makefile | 1 +
> drivers/hwtracing/coresight/coresight-trbe.c | 1149 +++++++++++++++++
> drivers/hwtracing/coresight/coresight-trbe.h | 153 +++
> 6 files changed, 1369 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-trbe
> create mode 100644 Documentation/trace/coresight/coresight-trbe.rst
> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c
> create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
>

[...]

> +
> +static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
> + struct perf_event *event, void **pages,
> + int nr_pages, bool snapshot)
> +{
> + struct trbe_buf *buf;
> + struct page **pglist;
> + int i;
> +
> + /*
> + * TRBE LIMIT and TRBE WRITE pointers must be page aligned. But with
> + * just a single page, there would not be any room left while writing
> + * into a partially filled TRBE buffer after the page size alignment.
> + * Hence restrict the minimum buffer size as two pages.
> + */
> + if (nr_pages < 2)
> + return NULL;
> +
> + buf = kzalloc_node(sizeof(*buf), GFP_KERNEL, trbe_alloc_node(event));
> + if (!buf)
> + return ERR_PTR(-ENOMEM);
> +
> + pglist = kcalloc(nr_pages, sizeof(*pglist), GFP_KERNEL);
> + if (!pglist) {
> + kfree(buf);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> + for (i = 0; i < nr_pages; i++)
> + pglist[i] = virt_to_page(pages[i]);
> +
> + buf->trbe_base = (unsigned long)vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
> + if (!buf->trbe_base) {
> + kfree(pglist);
> + kfree(buf);
> + return ERR_PTR(buf->trbe_base);

return ERR_PTR(-ENOMEM);

> + }
> + buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
> + buf->trbe_write = buf->trbe_base;
> + buf->snapshot = snapshot;
> + buf->nr_pages = nr_pages;
> + buf->pages = pages;
> + kfree(pglist);
> + return buf;
> +}
> +
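
For reference, a minimal userspace sketch of why ERR_PTR(-ENOMEM) is the
right return value on the vmap() failure path. The ERR_PTR(), PTR_ERR() and
IS_ERR() definitions below are stand-ins modelled on <linux/err.h>, and
alloc_fail_as_posted() / alloc_fail_as_suggested() are hypothetical helpers,
not driver code. Since vmap() returns NULL on failure, ERR_PTR(buf->trbe_base)
evaluates to ERR_PTR(0), which is just NULL and is not seen as an error by
IS_ERR(). The posted code also reads buf->trbe_base after kfree(buf).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO	4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Hypothetical helper: models only the vmap() failure branch as posted.
 * trbe_base is 0 here because vmap() returned NULL.
 */
static void *alloc_fail_as_posted(unsigned long trbe_base)
{
	return ERR_PTR((long)trbe_base);	/* ERR_PTR(0) is plain NULL */
}

/* Hypothetical helper: the same branch with the suggested return value. */
static void *alloc_fail_as_suggested(void)
{
	return ERR_PTR(-ENOMEM);
}
```

A caller that checks only IS_ERR() would treat the posted return value as
success, while the suggested one is reported as -ENOMEM.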

2021-03-22 21:27:53

by Mathieu Poirier

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

On Fri, Mar 19, 2021 at 11:55:10AM +0000, Mike Leach wrote:
> HI Suzuki,
>
> On Fri, 19 Mar 2021 at 10:30, Suzuki K Poulose <[email protected]> wrote:
> >
> > Hi Mike
> >
> > > On 8 Mar 2021, at 17:26, Mike Leach <[email protected]> wrote:
> > >
> > > Hi Suzuki,
> > >
> > > On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
> > >>
> > >> From: Anshuman Khandual <[email protected]>
> > >>
> > >> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
> > >> accessible via the system registers. The TRBE supports different addressing
> > >> modes including CPU virtual address and buffer modes including the circular
> > >> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
> > >> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
> > >> access to the trace buffer could be prohibited by a higher exception level
> > >> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
> > >> private interrupt (PPI) on address translation errors and when the buffer
> > >> is full. Overall implementation here is inspired from the Arm SPE driver.
> > >>
> > >> Cc: Mathieu Poirier <[email protected]>
> > >> Cc: Mike Leach <[email protected]>
> > >> Cc: Suzuki K Poulose <[email protected]>
> > >> Signed-off-by: Anshuman Khandual <[email protected]>
> > >> Signed-off-by: Suzuki K Poulose <[email protected]>
> > >>
> > >> +
> > >> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
> > >> + struct perf_output_handle *handle,
> > >> + void *config)
> > >> +{
> > >> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> > >> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> > >> + struct trbe_buf *buf = config;
> > >> + enum trbe_fault_action act;
> > >> + unsigned long size, offset;
> > >> + unsigned long write, base, status;
> > >> + unsigned long flags;
> > >> +
> > >> + WARN_ON(buf->cpudata != cpudata);
> > >> + WARN_ON(cpudata->cpu != smp_processor_id());
> > >> + WARN_ON(cpudata->drvdata != drvdata);
> > >> + if (cpudata->mode != CS_MODE_PERF)
> > >> + return 0;
> > >> +
> > >> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
> > >> +
> > >> + /*
> > >> + * We are about to disable the TRBE. And this could in turn
> > >> + * fill up the buffer triggering, an IRQ. This could be consumed
> > >> + * by the PE asynchronously, causing a race here against
> > >> + * the IRQ handler in closing out the handle. So, let us
> > >> + * make sure the IRQ can't trigger while we are collecting
> > >> + * the buffer. We also make sure that a WRAP event is handled
> > >> + * accordingly.
> > >> + */
> > >> + local_irq_save(flags);
> > >> +
> > >> + /*
> > >> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
> > >> + * spurious fault, the driver leaves it disabled, truncating the buffer.
> > >> + * Since the etm_perf driver expects to close out the AUX buffer, the
> > >> + * driver skips it. Thus, just pass in 0 size here to indicate that the
> > >> + * buffer was truncated.
> > >> + */
> > >> + if (!is_trbe_enabled()) {
> > >> + size = 0;
> > >> + goto done;
> > >> + }
> > >> + /*
> > >> + * perf handle structure needs to be shared with the TRBE IRQ handler for
> > >> + * capturing trace data and restarting the handle. There is a probability
> > >> + * of an undefined reference based crash when etm event is being stopped
> > >> + * while a TRBE IRQ also getting processed. This happens due the release
> > >> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
> > >> + * the TRBE here will ensure that no IRQ could be generated when the perf
> > >> + * handle gets freed in etm_event_stop().
> > >> + */
> > >> + trbe_drain_and_disable_local();
> > >> + write = get_trbe_write_pointer();
> > >> + base = get_trbe_base_pointer();
> > >> +
> > >> + /* Check if there is a pending interrupt and handle it here */
> > >> + status = read_sysreg_s(SYS_TRBSR_EL1);
> > >> + if (is_trbe_irq(status)) {
> > >> +
> > >> + /*
> > >> + * Now that we are handling the IRQ here, clear the IRQ
> > >> + * from the status, to let the irq handler know that it
> > >> + * is taken care of.
> > >> + */
> > >> + clr_trbe_irq();
> > >> + isb();
> > >> +
> > >> + act = trbe_get_fault_act(status);
> > >> + /*
> > >> + * If this was not due to a WRAP event, we have some
> > >> + * errors and as such buffer is empty.
> > >> + */
> > >> + if (act != TRBE_FAULT_ACT_WRAP) {
> > >> + size = 0;
> > >> + goto done;
> > >> + }
> > >
> > > We are using TRBE FILL mode - which halts capture on a full buffer and
> > > triggers the IRQ, without disabling the source first.
> > > This means that the mode is inherently lossy (unless by some unlikely
> > > co-incidence the last byte that caused the wrap was also the last byte
> > > to be sent from an ETE that was in the process of being disabled.)
> > > Therefore we must have a perf_aux_output_flag(handle,
> > > PERF_AUX_FLAG_TRUNCATED) call in here to signal that some trace was
> > > lost, for consistency of operation with ETR etc., and Intel PT.
> > >
> >
> > > I agree that there is a bit of loss here due to the FILL mode. But it is not comparable to that of the ETR. In this case, the WRAP event is triggered when we flush the ETE. i.e, this could be mostly due to the fact that the tracing was enabled for the kernel mode and the last few bytes of trace which caused the FILL belong to the code responsible for stopping the components in the CoreSight trace. I personally do not think this data is of any interest to the user.
> > Otherwise, if the data didn’t belong to the perf event side, it should have triggered the IRQ.
> >
> > This is true in case of the buffer overflow interrupt too, with a bit more data lost. i.e, since the interrupt is PPI, the overflow is triggered when the buffer is full (which includes the data that is cached in the TRBE). But there could be a bit of data that is still cached in the ETE, before it is captured in the trace. And the moment we get a FILL event, we stop executing anything that is relevant for the Trace session (as we are in the driver handling the interrupt).
> > > And then we reconfigure the buffer to continue the execution. Now, the interrupt delivery is not necessarily synchronous and there could be data lost in the interval between the WRAP event and the IRQ being triggered.
> >
> > I am OK with suggesting that there was some loss of trace data during the session, if we hit WRAP event. But this could cause worry to the consumers that they lost too much of trace data of their interest, while that is not the case.
> >
>
> We can never know what has been lost. It may be some trace around the
> driver of no interest to the user, it may also be an event or
> timestamp related to an earlier marker - which could be highly
> relevant.
> With ETR we do not know how much is lost on wrap - it might be one
> byte, it might be much more - but the point is we mark as truncated
> for _any_ amount.
>
> It is unfortunate that we will see multiple buffers marked as
> truncated - but this is far better than creating the false impression
> that no trace has been lost - that there is a continuous record where
> there is not.
> For some users - such as autofdo where sampling is taking place anyway
> - truncated buffers probably do not matter. For others - who are
> looking to trace a specific section of code - then they need to be
> aware that there could be decode anomalies relating to buffer wrap.
>

I think Mike has a point here - we should report it to users when data gets
lost, no matter how small that loss is. If that is a problem they always have
the choice of dedicating more pages to the AUX buffer.
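
A rough userspace sketch of what that could look like in the pending-IRQ
branch, assuming every WRAP is reported as truncated. The flag values,
struct aux_handle, aux_output_flag() and collect_on_pending_irq() are all
hypothetical stand-ins for the perf AUX handle and
perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); the size computation
is simplified to "report the whole buffer".

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag values, assumed for illustration only. */
#define AUX_FLAG_TRUNCATED	0x01u
#define AUX_FLAG_FORMAT_RAW	0x0100u

/* Hypothetical reduced version of the driver's fault actions. */
enum fault_action {
	FAULT_ACT_WRAP,
	FAULT_ACT_SPURIOUS,
	FAULT_ACT_FATAL,
};

/* Hypothetical, much reduced stand-in for struct perf_output_handle. */
struct aux_handle {
	uint64_t flags;
};

static void aux_output_flag(struct aux_handle *handle, uint64_t flags)
{
	handle->flags |= flags;
}

/*
 * Models the pending-IRQ branch of arm_trbe_update_buffer() and returns
 * the number of bytes to report. Anything other than WRAP is an error and
 * the buffer is treated as empty. A WRAP means trace was dropped after the
 * buffer filled, so the data is reported and marked as truncated.
 */
static unsigned long collect_on_pending_irq(struct aux_handle *handle,
					    enum fault_action act,
					    unsigned long buf_size)
{
	if (act != FAULT_ACT_WRAP)
		return 0;

	aux_output_flag(handle, AUX_FLAG_TRUNCATED);
	return buf_size;
}
```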

Thanks,
Mathieu

> Regards
>
> Mike
>
> > >> +static inline unsigned long get_trbe_limit_pointer(void)
> > >> +{
> > >> + u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> > >> + unsigned long limit = (trblimitr >> TRBLIMITR_LIMIT_SHIFT) & TRBLIMITR_LIMIT_MASK;
> > >> + unsigned long addr = limit << TRBLIMITR_LIMIT_SHIFT;
> > >
> > > Could this not be:
> > > unsigned long addr = trblimitr & (TRBLIMITR_LIMIT_MASK <<
> > > TRBLIMITR_LIMIT_SHIFT);
> > > like the base pointer below?
> > >
> >
> > Sure, it can be consistent.
> >
> >
> > >> +
> > >> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> > >> + return addr;
> > >> +}
> > >> +
> > >> +static inline unsigned long get_trbe_base_pointer(void)
> > >> +{
> > >> + u64 trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> > >> + unsigned long addr = trbbaser & (TRBBASER_BASE_MASK << TRBBASER_BASE_SHIFT);
> > >> +
> > >> + WARN_ON(!IS_ALIGNED(addr, PAGE_SIZE));
> > >> + return addr;
> > >> +}
> > >> +
> >
> > Thank you for the review
> >
> > Kind regards
> > Suzuki
> >
>
>
> --
> Mike Leach
> Principal Engineer, ARM Ltd.
> Manchester Design Centre. UK

2021-03-22 21:30:13

by Mathieu Poirier

Subject: Re: [PATCH v4 09/19] coresight: etm4x: Move ETM to prohibited region for disable

On Thu, Feb 25, 2021 at 07:35:33PM +0000, Suzuki K Poulose wrote:
> If the CPU implements Arm v8.4 Trace filter controls (FEAT_TRF),
> move the ETM to trace prohibited region using TRFCR, while disabling.
>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Mike Leach <[email protected]>
> Cc: Anshuman Khandual <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> New patch
> ---
> .../coresight/coresight-etm4x-core.c | 21 +++++++++++++++++--
> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
> 2 files changed, 21 insertions(+), 2 deletions(-)
>

Reviewed-by: Mathieu Poirier <[email protected]>

I am done reviewing this set.

Thanks,
Mathieu

> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 15016f757828..00297906669c 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -31,6 +31,7 @@
> #include <linux/pm_runtime.h>
> #include <linux/property.h>
>
> +#include <asm/barrier.h>
> #include <asm/sections.h>
> #include <asm/sysreg.h>
> #include <asm/local.h>
> @@ -654,6 +655,7 @@ static int etm4_enable(struct coresight_device *csdev,
> static void etm4_disable_hw(void *info)
> {
> u32 control;
> + u64 trfcr;
> struct etmv4_drvdata *drvdata = info;
> struct etmv4_config *config = &drvdata->config;
> struct coresight_device *csdev = drvdata->csdev;
> @@ -676,6 +678,16 @@ static void etm4_disable_hw(void *info)
> /* EN, bit[0] Trace unit enable bit */
> control &= ~0x1;
>
> + /*
> + * If the CPU supports v8.4 Trace filter Control,
> + * set the ETM to trace prohibited region.
> + */
> + if (drvdata->trfc) {
> + trfcr = read_sysreg_s(SYS_TRFCR_EL1);
> + write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
> + SYS_TRFCR_EL1);
> + isb();
> + }
> /*
> * Make sure everything completes before disabling, as recommended
> * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
> @@ -683,12 +695,16 @@ static void etm4_disable_hw(void *info)
> */
> dsb(sy);
> isb();
> + /* Trace synchronization barrier, is a nop if not supported */
> + tsb_csync();
> etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
>
> /* wait for TRCSTATR.PMSTABLE to go to '1' */
> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
> dev_err(etm_dev,
> "timeout while waiting for PM stable Trace Status\n");
> + if (drvdata->trfc)
> + write_sysreg_s(trfcr, SYS_TRFCR_EL1);
>
> /* read the status of the single shot comparators */
> for (i = 0; i < drvdata->nr_ss_cmp; i++) {
> @@ -873,7 +889,7 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
> return false;
> }
>
> -static void cpu_enable_tracing(void)
> +static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
> {
> u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
> u64 trfcr;
> @@ -881,6 +897,7 @@ static void cpu_enable_tracing(void)
> if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
> return;
>
> + drvdata->trfc = true;
> /*
> * If the CPU supports v8.4 SelfHosted Tracing, enable
> * tracing at the kernel EL and EL0, forcing to use the
> @@ -1082,7 +1099,7 @@ static void etm4_init_arch_data(void *info)
> /* NUMCNTR, bits[30:28] number of counters available for tracing */
> drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
> etm4_cs_lock(drvdata, csa);
> - cpu_enable_tracing();
> + cpu_enable_tracing(drvdata);
> }
>
> static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index 0af60571aa23..f6478ef642bf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -862,6 +862,7 @@ struct etmv4_save_state {
> * @nooverflow: Indicate if overflow prevention is supported.
> * @atbtrig: If the implementation can support ATB triggers
> * @lpoverride: If the implementation can support low-power state over.
> + * @trfc: If the implementation supports Arm v8.4 trace filter controls.
> * @config: structure holding configuration parameters.
> * @save_state: State to be preserved across power loss
> * @state_needs_restore: True when there is context to restore after PM exit
> @@ -912,6 +913,7 @@ struct etmv4_drvdata {
> bool nooverflow;
> bool atbtrig;
> bool lpoverride;
> + bool trfc;
> struct etmv4_config config;
> struct etmv4_save_state *save_state;
> bool state_needs_restore;
> --
> 2.24.1
>
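
The disable sequence in the quoted patch boils down to "save TRFCR, clear the
trace enable bits, disable the trace unit, restore TRFCR". A minimal userspace
sketch of that pattern follows; fake_trfcr is a plain variable standing in
for the TRFCR_EL1 system register, the bit positions are assumptions for
illustration, and trace_prohibit() / trace_restore() are hypothetical names.
Barriers and the actual trace unit programming are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TRFCR_E0TRE	(UINT64_C(1) << 0)
#define TRFCR_ExTRE	(UINT64_C(1) << 1)

/* Hypothetical stand-in for the TRFCR_EL1 system register. */
static uint64_t fake_trfcr;

/*
 * Move the trace unit into the prohibited region by clearing both trace
 * enable bits. The previous value is returned so it can be restored later.
 */
static uint64_t trace_prohibit(void)
{
	uint64_t saved = fake_trfcr;

	fake_trfcr = saved & ~(TRFCR_ExTRE | TRFCR_E0TRE);
	return saved;
}

/* Put back the value that was in place before trace_prohibit(). */
static void trace_restore(uint64_t saved)
{
	fake_trfcr = saved;
}
```

Restoring the saved value, instead of setting the enable bits
unconditionally, keeps any other fields of the register as they were.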

2021-03-22 22:24:19

by Suzuki K Poulose

Subject: Re: [PATCH v4 03/19] kvm: arm64: Hide system instruction access to Trace registers

Will, Catalin,

On 25/02/2021 19:35, Suzuki K Poulose wrote:
> Currently we advertise the ID_AA64DFR0_EL1.TRACEVER for the guest,
> when the trace register accesses are trapped (CPTR_EL2.TTA == 1).
> So, the guest will get an undefined instruction, if it trusts the
> ID registers and accesses one of the trace registers.
> Let's be nice to the guest and hide the feature to avoid
> unexpected behavior.
>
> Even though this can be done at KVM sysreg emulation layer,
> we do this by removing the TRACEVER from the sanitised feature
> register field. This is fine as long as the ETM drivers
> can handle the individual trace units separately, even
> when there are differences among the CPUs.
>
> Cc: Marc Zyngier <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>
> ---
> New patch
> ---
> arch/arm64/kernel/cpufeature.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 066030717a4c..a4698f09bf32 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -383,7 +383,6 @@ static const struct arm64_ftr_bits ftr_id_aa64dfr0[] = {
> * of support.
> */
> S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_AA64DFR0_PMUVER_SHIFT, 4, 0),
> - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_TRACEVER_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64DFR0_DEBUGVER_SHIFT, 4, 0x6),
> ARM64_FTR_END,
> };
>

Are you happy to pick this patch for 5.12 as a fix ?

Suzuki

2021-03-22 22:28:17

by Suzuki K Poulose

Subject: Re: [PATCH v4 05/19] kvm: arm64: Disable guest access to trace filter controls

Hi Marc,

On 25/02/2021 19:35, Suzuki K Poulose wrote:
> Disable guest access to the Trace Filter control registers.
> We do not advertise the Trace filter feature to the guest
> (ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest
> can still access the TRFCR_EL1 unless we trap it.
>
> This will also make sure that the guest cannot fiddle with
> the filtering controls set by a nvhe host.
>
> Cc: Marc Zyngier <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Signed-off-by: Suzuki K Poulose <[email protected]>

We already have the v8.4 self-hosted tracing support in 5.12-rcX.
Do you think you can pick this up for 5.12?

Cheers
Suzuki

> ---
> New patch
> ---
> arch/arm64/include/asm/kvm_arm.h | 1 +
> arch/arm64/kvm/debug.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 4e90c2debf70..94d4025acc0b 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -278,6 +278,7 @@
> #define CPTR_EL2_DEFAULT CPTR_EL2_RES1
>
> /* Hyp Debug Configuration Register bits */
> +#define MDCR_EL2_TTRF (1 << 19)
> #define MDCR_EL2_TPMS (1 << 14)
> #define MDCR_EL2_E2PB_MASK (UL(0x3))
> #define MDCR_EL2_E2PB_SHIFT (UL(12))
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 7a7e425616b5..dbc890511631 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -89,6 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
> * - Debug ROM Address (MDCR_EL2_TDRA)
> * - OS related registers (MDCR_EL2_TDOSA)
> * - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
> + * - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
> *
> * Additionally, KVM only traps guest accesses to the debug registers if
> * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
> @@ -112,6 +113,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
> vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
> vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
> MDCR_EL2_TPMS |
> + MDCR_EL2_TTRF |
> MDCR_EL2_TPMCR |
> MDCR_EL2_TDRA |
> MDCR_EL2_TDOSA);
>
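
As an illustration of the trap setup in the quoted hunk, here is a small
userspace sketch of how the guest MDCR_EL2 value is composed once TTRF is
added. MDCR_EL2_TTRF and MDCR_EL2_TPMS use the bit positions from the hunk;
the remaining bit positions and the HPMN mask are assumptions for
illustration, and guest_mdcr_el2() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* TTRF and TPMS as in the quoted hunk; the rest are assumed values. */
#define MDCR_EL2_TTRF		(UINT32_C(1) << 19)
#define MDCR_EL2_TPMS		(UINT32_C(1) << 14)
#define MDCR_EL2_TDRA		(UINT32_C(1) << 11)
#define MDCR_EL2_TDOSA		(UINT32_C(1) << 10)
#define MDCR_EL2_TPM		(UINT32_C(1) << 6)
#define MDCR_EL2_TPMCR		(UINT32_C(1) << 5)
#define MDCR_EL2_HPMN_MASK	UINT32_C(0x1f)

/*
 * Hypothetical helper mirroring the trap setup in kvm_arm_setup_debug():
 * keep only HPMN from the host value, then set every trap bit, including
 * the trace filter control trap (TTRF).
 */
static uint32_t guest_mdcr_el2(uint32_t host_mdcr_el2)
{
	uint32_t mdcr = host_mdcr_el2 & MDCR_EL2_HPMN_MASK;

	mdcr |= MDCR_EL2_TPM | MDCR_EL2_TPMS | MDCR_EL2_TTRF |
		MDCR_EL2_TPMCR | MDCR_EL2_TDRA | MDCR_EL2_TDOSA;
	return mdcr;
}
```

With TTRF set, a guest access to TRFCR_EL1 traps to EL2 instead of reaching
the filter controls set by the host.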

2021-03-22 22:51:07

by Suzuki K Poulose

Subject: Re: [PATCH v4 15/19] dts: bindings: Document device tree bindings for ETE

On 22/03/2021 17:28, Rob Herring wrote:
> On Mon, Mar 22, 2021 at 10:53 AM Suzuki K Poulose
> <[email protected]> wrote:
>>
>> Hi Rob
>>
>> On 06/03/2021 21:06, Rob Herring wrote:
>>> On Thu, Feb 25, 2021 at 07:35:39PM +0000, Suzuki K Poulose wrote:
>>>> Document the device tree bindings for Embedded Trace Extensions.
>>>> ETE can be connected to legacy coresight components and thus
>>>> could optionally contain a connection graph as described by
>>>> the CoreSight bindings.
>>>>
>>>> Cc: [email protected]
>>>> Cc: Mathieu Poirier <[email protected]>
>>>> Cc: Mike Leach <[email protected]>
>>>> Cc: Rob Herring <[email protected]>
>>>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>>> ---

>>>> + out-ports:
>>>> + description: |
>>>> + Output connections from the ETE to legacy CoreSight trace bus.
>>>> + $ref: /schemas/graph.yaml#/properties/port
>>>
>>> s/port/ports/
>>
>> Ok.
>>
>>>
>>> And then you need:
>>>
>>> properties:
>>> port:
>>> description: what this port is
>>> $ref: /schemas/graph.yaml#/properties/port
>>
>> Isn't this already covered by the definition of ports ? There are no
>> fixed connections for ETE. It is optional and could be connected to
>> any legacy CoreSight component. i.e, a "ports" object can have port
>> objects inside.
>
> 'properties/ports' only defines that you have 'port' nodes within it.
>
>> Given we have defined out-ports as an object "conforming to the ports"
>> do we need to describe the individual port nodes ?
>
> Yes, you have to define what the 'port' nodes are. A port is a data
> stream and you should know what your hardware has. What the data
> stream is connected to is outside the scope of the binding.

Ok, I have included the above changes for the next version.

Thanks
Suzuki

2021-03-22 23:03:40

by Suzuki K Poulose

Subject: Re: [PATCH v4 18/19] coresight: sink: Add TRBE driver

On 22/03/2021 21:24, Mathieu Poirier wrote:
> On Fri, Mar 19, 2021 at 11:55:10AM +0000, Mike Leach wrote:
>> HI Suzuki,
>>
>> On Fri, 19 Mar 2021 at 10:30, Suzuki K Poulose <[email protected]> wrote:
>>>
>>> Hi Mike
>>>
>>>> On 8 Mar 2021, at 17:26, Mike Leach <[email protected]> wrote:
>>>>
>>>> Hi Suzuki,
>>>>
>>>> On Thu, 25 Feb 2021 at 19:36, Suzuki K Poulose <[email protected]> wrote:
>>>>>
>>>>> From: Anshuman Khandual <[email protected]>
>>>>>
>>>>> Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is
>>>>> accessible via the system registers. The TRBE supports different addressing
>>>>> modes including CPU virtual address and buffer modes including the circular
>>>>> buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1),
>>>>> a write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the
>>>>> access to the trace buffer could be prohibited by a higher exception level
>>>>> (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU
>>>>> private interrupt (PPI) on address translation errors and when the buffer
>>>>> is full. Overall implementation here is inspired from the Arm SPE driver.
>>>>>
>>>>> Cc: Mathieu Poirier <[email protected]>
>>>>> Cc: Mike Leach <[email protected]>
>>>>> Cc: Suzuki K Poulose <[email protected]>
>>>>> Signed-off-by: Anshuman Khandual <[email protected]>
>>>>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>>>>
>>>>> +
>>>>> +static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>>>>> + struct perf_output_handle *handle,
>>>>> + void *config)
>>>>> +{
>>>>> + struct trbe_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>>>> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
>>>>> + struct trbe_buf *buf = config;
>>>>> + enum trbe_fault_action act;
>>>>> + unsigned long size, offset;
>>>>> + unsigned long write, base, status;
>>>>> + unsigned long flags;
>>>>> +
>>>>> + WARN_ON(buf->cpudata != cpudata);
>>>>> + WARN_ON(cpudata->cpu != smp_processor_id());
>>>>> + WARN_ON(cpudata->drvdata != drvdata);
>>>>> + if (cpudata->mode != CS_MODE_PERF)
>>>>> + return 0;
>>>>> +
>>>>> + perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
>>>>> +
>>>>> + /*
>>>>> + * We are about to disable the TRBE. And this could in turn
>>>>> + * fill up the buffer triggering, an IRQ. This could be consumed
>>>>> + * by the PE asynchronously, causing a race here against
>>>>> + * the IRQ handler in closing out the handle. So, let us
>>>>> + * make sure the IRQ can't trigger while we are collecting
>>>>> + * the buffer. We also make sure that a WRAP event is handled
>>>>> + * accordingly.
>>>>> + */
>>>>> + local_irq_save(flags);
>>>>> +
>>>>> + /*
>>>>> + * If the TRBE was disabled due to lack of space in the AUX buffer or a
>>>>> + * spurious fault, the driver leaves it disabled, truncating the buffer.
>>>>> + * Since the etm_perf driver expects to close out the AUX buffer, the
>>>>> + * driver skips it. Thus, just pass in 0 size here to indicate that the
>>>>> + * buffer was truncated.
>>>>> + */
>>>>> + if (!is_trbe_enabled()) {
>>>>> + size = 0;
>>>>> + goto done;
>>>>> + }
>>>>> + /*
>>>>> + * perf handle structure needs to be shared with the TRBE IRQ handler for
>>>>> + * capturing trace data and restarting the handle. There is a probability
>>>>> + * of an undefined reference based crash when etm event is being stopped
>>>>> + * while a TRBE IRQ also getting processed. This happens due the release
>>>>> + * of perf handle via perf_aux_output_end() in etm_event_stop(). Stopping
>>>>> + * the TRBE here will ensure that no IRQ could be generated when the perf
>>>>> + * handle gets freed in etm_event_stop().
>>>>> + */
>>>>> + trbe_drain_and_disable_local();
>>>>> + write = get_trbe_write_pointer();
>>>>> + base = get_trbe_base_pointer();
>>>>> +
>>>>> + /* Check if there is a pending interrupt and handle it here */
>>>>> + status = read_sysreg_s(SYS_TRBSR_EL1);
>>>>> + if (is_trbe_irq(status)) {
>>>>> +
>>>>> + /*
>>>>> + * Now that we are handling the IRQ here, clear the IRQ
>>>>> + * from the status, to let the irq handler know that it
>>>>> + * is taken care of.
>>>>> + */
>>>>> + clr_trbe_irq();
>>>>> + isb();
>>>>> +
>>>>> + act = trbe_get_fault_act(status);
>>>>> + /*
>>>>> + * If this was not due to a WRAP event, we have some
>>>>> + * errors and as such buffer is empty.
>>>>> + */
>>>>> + if (act != TRBE_FAULT_ACT_WRAP) {
>>>>> + size = 0;
>>>>> + goto done;
>>>>> + }
>>>>
>>>> We are using TRBE FILL mode - which halts capture on a full buffer and
>>>> triggers the IRQ, without disabling the source first.
>>>> This means that the mode is inherently lossy (unless by some unlikely
>>>> co-incidence the last byte that caused the wrap was also the last byte
>>>> to be sent from an ETE that was in the process of being disabled.)
>>>> Therefore we must have a perf_aux_output_flag(handle,
>>>> PERF_AUX_FLAG_TRUNCATED) call in here to signal that some trace was
>>>> lost, for consistency of operation with ETR etc., and Intel PT.
>>>>
>>>
>>> I agree that there is a bit of loss here due to the FILL mode. But it is not comparable to that of the ETR. In this case, the WRAP event is triggered when we flush the ETE. i.e, this could be mostly due to the fact that the tracing was enabled for the kernel mode and the last few bytes of trace which caused the FILL belong to the code responsible for stopping the components in the CoreSight trace. I personally do not think this data is of any interest to the user.
>>> Otherwise, if the data didn’t belong to the perf event side, it should have triggered the IRQ.
>>>
>>> This is true in case of the buffer overflow interrupt too, with a bit more data lost. i.e, since the interrupt is PPI, the overflow is triggered when the buffer is full (which includes the data that is cached in the TRBE). But there could be a bit of data that is still cached in the ETE, before it is captured in the trace. And the moment we get a FILL event, we stop executing anything that is relevant for the Trace session (as we are in the driver handling the interrupt).
>>> And then we reconfigure the buffer to continue the execution. Now, the interrupt delivery is not necessarily synchronous and there could be data lost in the interval between WRAP event and the IRQ is triggered.
>>>
>>> I am OK with suggesting that there was some loss of trace data during the session, if we hit WRAP event. But this could cause worry to the consumers that they lost too much of trace data of their interest, while that is not the case.
>>>
>>
>> We can never know what has been lost. It may be some trace around the
>> driver of no interest to the user, it may also be an event or
>> timestamp related to an earlier marker - which could be highly
>> relevant.
>> With ETR we do not know how much is lost on wrap - it might be one
>> byte, it might be much more - but the point is we mark as truncated
>> for _any_ amount.
>>
>> It is unfortunate that we will see multiple buffers marked as
>> truncated - but this is far better than creating the false impression
>> that no trace has been lost - that there is a continuous record where
>> there is not.
>> For some users - such as autofdo where sampling is taking place anyway
>> - truncated buffers probably do not matter. Others - who are
>> looking to trace a specific section of code - need to be
>> aware that there could be decode anomalies relating to buffer wrap.
>>
>
> I think Mike has a point here - we should report it to users when data gets
> lost, no matter how small that loss is. If that is a problem they always have
> the choice of dedicating more pages to the AUX buffer.

Agreed, I have included this in the next version.

Thanks
Suzuki
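
For readers following the thread, the behaviour agreed above can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual follow-up patch: the mock_* and MOCK_* names are hypothetical stand-ins for the kernel's perf handle, perf_aux_output_flag() and the TRBE fault-action helpers, and reporting the whole buffer on a wrap is an assumption made only to keep the example small. Only the value of PERF_AUX_FLAG_TRUNCATED is taken from the perf UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as PERF_AUX_FLAG_TRUNCATED in the perf UAPI header. */
#define PERF_AUX_FLAG_TRUNCATED 0x01

/* Hypothetical stand-in for struct perf_output_handle. */
struct mock_handle {
	uint64_t aux_flags;
};

/* Hypothetical stand-in for perf_aux_output_flag(): accumulate flags. */
static void mock_aux_output_flag(struct mock_handle *handle, uint64_t flags)
{
	handle->aux_flags |= flags;
}

/* Hypothetical stand-ins for the TRBE fault actions named in the patch. */
enum mock_fault_act {
	MOCK_FAULT_ACT_WRAP,
	MOCK_FAULT_ACT_SPURIOUS,
	MOCK_FAULT_ACT_FATAL,
};

/*
 * Decide how many bytes to report when the session is stopped.
 *
 * - No pending IRQ: report what was written so far, no flag.
 * - Pending IRQ that is not a WRAP: treat the buffer as empty.
 * - Pending WRAP: the buffer filled up while the source was still
 *   running, so some trace may be lost. Flag the record as truncated,
 *   however small the loss might be.
 */
static unsigned long mock_update_buffer(struct mock_handle *handle,
					bool irq_pending,
					enum mock_fault_act act,
					unsigned long write_off,
					unsigned long buf_size)
{
	if (!irq_pending)
		return write_off;

	if (act != MOCK_FAULT_ACT_WRAP)
		return 0;

	mock_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);

	/* Assumption for this sketch: on wrap, the whole buffer is valid. */
	return buf_size;
}
```

The point of the sketch is only that the truncated flag is raised for any WRAP, so a consumer never sees a buffer that looks continuous when it is not.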

2021-03-23 09:19:09

by Marc Zyngier

Subject: Re: [PATCH v4 05/19] kvm: arm64: Disable guest access to trace filter controls

Hi Suzuki,

On 2021-03-22 22:24, Suzuki K Poulose wrote:
> Hi Marc,
>
> On 25/02/2021 19:35, Suzuki K Poulose wrote:
>> Disable guest access to the Trace Filter control registers.
>> We do not advertise the Trace filter feature to the guest
>> (ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest
>> can still access the TRFCR_EL1 unless we trap it.
>>
>> This will also make sure that the guest cannot fiddle with
>> the filtering controls set by an nVHE host.
>>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Mark Rutland <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Signed-off-by: Suzuki K Poulose <[email protected]>
>
> We already have the v8.4 self-hosted tracing support in 5.12-rcX.
> Do you think you can pick this up for 5.12?

Sure, no problem. Shall I pick patch #3 at the same time?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2021-03-23 09:47:04

by Suzuki K Poulose

Subject: Re: [PATCH v4 05/19] kvm: arm64: Disable guest access to trace filter controls

On 23/03/2021 09:16, Marc Zyngier wrote:
> Hi Suzuki,
>
> On 2021-03-22 22:24, Suzuki K Poulose wrote:
>> Hi Marc,
>>
>> On 25/02/2021 19:35, Suzuki K Poulose wrote:
>>> Disable guest access to the Trace Filter control registers.
>>> We do not advertise the Trace filter feature to the guest
>>> (ID_AA64DFR0_EL1: TRACE_FILT is cleared) already, but the guest
>>> can still access the TRFCR_EL1 unless we trap it.
>>>
>>> This will also make sure that the guest cannot fiddle with
>>> the filtering controls set by an nVHE host.
>>>
>>> Cc: Marc Zyngier <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Mark Rutland <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Signed-off-by: Suzuki K Poulose <[email protected]>
>>
>> We already have the v8.4 self-hosted tracing support in 5.12-rcX.
>> Do you think you can pick this up for 5.12?
>
> Sure, no problem. Shall I pick patch #3 at the same time?

Yes please.

Thanks !
Suzuki