This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
as long as there are fewer than that many ETMs connected to each sink.
Each sink owns its own trace ID map, and any Perf session connecting to
that sink will allocate from it, even if the sink is currently in use by
other users. This is similar to the existing behavior where the dynamic
trace IDs are constant as long as there is any concurrent Perf session
active. It's not completely optimal because slightly more IDs will be
used than necessary, but the optimal solution involves tracking the PIDs
of each session and allocating ID maps based on the session owner. This
is difficult to do with the combination of per-thread and per-cpu modes
and some scheduling issues. The complexity of this isn't likely to worth
it because even with multiple users they'd just see a difference in the
ordering of ID allocations rather than hitting any limits (unless the
hardware does have too many ETMs connected to one sink).
Per-thread mode works but only until there are any overlapping IDs, at
which point Perf will error out. Both per-thread mode and sysfs mode are
left to future changes, but both can be added on top of this initial
implementation and only sysfs mode requires further driver changes.
The HW_ID version field hasn't been bumped in order to not break Perf
which already has an error condition for other values of that field.
Instead a new minor version has been added which signifies that there
are new fields but the old fields are backwards compatible.
James Clark (17):
perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
perf auxtrace: Allow number of queues to be specified
perf: cs-etm: Create decoders after both AUX and HW_ID search passes
perf: cs-etm: Allocate queues for all CPUs
perf: cs-etm: Move traceid_list to each queue
perf: cs-etm: Create decoders based on the trace ID mappings
perf: cs-etm: Support version 0.1 of HW_ID packets
coresight: Remove unused stubs
coresight: Clarify comments around the PID of the sink owner
coresight: Move struct coresight_trace_id_map to common header
coresight: Expose map argument in trace ID API
coresight: Make CPU id map a property of a trace ID map
coresight: Pass trace ID map into source enable
coresight: Use per-sink trace ID maps for Perf sessions
coresight: Remove pending trace ID release mechanism
coresight: Re-emit trace IDs when the sink changes in per-thread mode
coresight: Emit HW_IDs for all ETMs that are using the sink
drivers/hwtracing/coresight/coresight-core.c | 10 +
drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
.../hwtracing/coresight/coresight-etm-perf.c | 82 ++-
.../hwtracing/coresight/coresight-etm-perf.h | 20 +-
.../coresight/coresight-etm3x-core.c | 14 +-
.../coresight/coresight-etm4x-core.c | 14 +-
drivers/hwtracing/coresight/coresight-stm.c | 3 +-
drivers/hwtracing/coresight/coresight-sysfs.c | 3 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 5 +-
drivers/hwtracing/coresight/coresight-tmc.h | 5 +-
drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
.../hwtracing/coresight/coresight-trace-id.c | 107 +--
.../hwtracing/coresight/coresight-trace-id.h | 57 +-
include/linux/coresight-pmu.h | 17 +-
include/linux/coresight.h | 20 +-
tools/include/linux/coresight-pmu.h | 17 +-
tools/perf/util/auxtrace.c | 9 +-
tools/perf/util/auxtrace.h | 1 +
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 +-
tools/perf/util/cs-etm.c | 617 ++++++++++++------
tools/perf/util/cs-etm.h | 2 +-
21 files changed, 633 insertions(+), 404 deletions(-)
--
2.34.1
Now that each queue has a unique set of trace ID mappings, use this
list to create the decoders. This also works the same way for
unformatted where a single dummy entry is added into the trace ID list.
Previously each queue would have a decoder created for each traced CPU
on the system but this won't work anymore because CPUs can have
overlapping trace IDs.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm.c | 143 +++++++++++++++++++++++----------------
1 file changed, 85 insertions(+), 58 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index be858aed26c4..73fc0ab2fb09 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -268,6 +268,10 @@ static int cs_etm__map_trace_id_v0(struct cs_etm_auxtrace *etm, u8 trace_chan_id
int ret;
struct cs_etm_queue *etmq = etm->queues.queue_array[i].priv;
+ /* Ignore HW_IDs on unformatted queues */
+ if (etmq->formatted_set && !etmq->formatted)
+ continue;
+
ret = cs_etm__insert_trace_id_node(etmq, trace_chan_id,
cpu_metadata);
if (ret)
@@ -673,80 +677,58 @@ static void cs_etm__packet_dump(const char *pkt_string)
}
static void cs_etm__set_trace_param_etmv3(struct cs_etm_trace_params *t_params,
- struct cs_etm_auxtrace *etm, int t_idx,
- int m_idx, u32 etmidr)
+ u64 *metadata, u32 etmidr)
{
- u64 **metadata = etm->metadata;
-
- t_params[t_idx].protocol = cs_etm__get_v7_protocol_version(etmidr);
- t_params[t_idx].etmv3.reg_ctrl = metadata[m_idx][CS_ETM_ETMCR];
- t_params[t_idx].etmv3.reg_trc_id = metadata[m_idx][CS_ETM_ETMTRACEIDR];
+ t_params->protocol = cs_etm__get_v7_protocol_version(etmidr);
+ t_params->etmv3.reg_ctrl = metadata[CS_ETM_ETMCR];
+ t_params->etmv3.reg_trc_id = metadata[CS_ETM_ETMTRACEIDR];
}
static void cs_etm__set_trace_param_etmv4(struct cs_etm_trace_params *t_params,
- struct cs_etm_auxtrace *etm, int t_idx,
- int m_idx)
+ u64 *metadata)
{
- u64 **metadata = etm->metadata;
-
- t_params[t_idx].protocol = CS_ETM_PROTO_ETMV4i;
- t_params[t_idx].etmv4.reg_idr0 = metadata[m_idx][CS_ETMV4_TRCIDR0];
- t_params[t_idx].etmv4.reg_idr1 = metadata[m_idx][CS_ETMV4_TRCIDR1];
- t_params[t_idx].etmv4.reg_idr2 = metadata[m_idx][CS_ETMV4_TRCIDR2];
- t_params[t_idx].etmv4.reg_idr8 = metadata[m_idx][CS_ETMV4_TRCIDR8];
- t_params[t_idx].etmv4.reg_configr = metadata[m_idx][CS_ETMV4_TRCCONFIGR];
- t_params[t_idx].etmv4.reg_traceidr = metadata[m_idx][CS_ETMV4_TRCTRACEIDR];
+ t_params->protocol = CS_ETM_PROTO_ETMV4i;
+ t_params->etmv4.reg_idr0 = metadata[CS_ETMV4_TRCIDR0];
+ t_params->etmv4.reg_idr1 = metadata[CS_ETMV4_TRCIDR1];
+ t_params->etmv4.reg_idr2 = metadata[CS_ETMV4_TRCIDR2];
+ t_params->etmv4.reg_idr8 = metadata[CS_ETMV4_TRCIDR8];
+ t_params->etmv4.reg_configr = metadata[CS_ETMV4_TRCCONFIGR];
+ t_params->etmv4.reg_traceidr = metadata[CS_ETMV4_TRCTRACEIDR];
}
static void cs_etm__set_trace_param_ete(struct cs_etm_trace_params *t_params,
- struct cs_etm_auxtrace *etm, int t_idx,
- int m_idx)
+ u64 *metadata)
{
- u64 **metadata = etm->metadata;
-
- t_params[t_idx].protocol = CS_ETM_PROTO_ETE;
- t_params[t_idx].ete.reg_idr0 = metadata[m_idx][CS_ETE_TRCIDR0];
- t_params[t_idx].ete.reg_idr1 = metadata[m_idx][CS_ETE_TRCIDR1];
- t_params[t_idx].ete.reg_idr2 = metadata[m_idx][CS_ETE_TRCIDR2];
- t_params[t_idx].ete.reg_idr8 = metadata[m_idx][CS_ETE_TRCIDR8];
- t_params[t_idx].ete.reg_configr = metadata[m_idx][CS_ETE_TRCCONFIGR];
- t_params[t_idx].ete.reg_traceidr = metadata[m_idx][CS_ETE_TRCTRACEIDR];
- t_params[t_idx].ete.reg_devarch = metadata[m_idx][CS_ETE_TRCDEVARCH];
+ t_params->protocol = CS_ETM_PROTO_ETE;
+ t_params->ete.reg_idr0 = metadata[CS_ETE_TRCIDR0];
+ t_params->ete.reg_idr1 = metadata[CS_ETE_TRCIDR1];
+ t_params->ete.reg_idr2 = metadata[CS_ETE_TRCIDR2];
+ t_params->ete.reg_idr8 = metadata[CS_ETE_TRCIDR8];
+ t_params->ete.reg_configr = metadata[CS_ETE_TRCCONFIGR];
+ t_params->ete.reg_traceidr = metadata[CS_ETE_TRCTRACEIDR];
+ t_params->ete.reg_devarch = metadata[CS_ETE_TRCDEVARCH];
}
static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params,
- struct cs_etm_auxtrace *etm,
- bool formatted,
- int sample_cpu,
- int decoders)
-{
- int t_idx, m_idx;
- u32 etmidr;
- u64 architecture;
-
- for (t_idx = 0; t_idx < decoders; t_idx++) {
- if (formatted)
- m_idx = t_idx;
- else {
- m_idx = get_cpu_data_idx(etm, sample_cpu);
- if (m_idx == -1) {
- pr_warning("CS_ETM: unknown CPU, falling back to first metadata\n");
- m_idx = 0;
- }
- }
+ struct cs_etm_queue *etmq)
+{
+ struct int_node *inode;
- architecture = etm->metadata[m_idx][CS_ETM_MAGIC];
+ intlist__for_each_entry(inode, etmq->traceid_list) {
+ u64 *metadata = inode->priv;
+ u64 architecture = metadata[CS_ETM_MAGIC];
+ u32 etmidr;
switch (architecture) {
case __perf_cs_etmv3_magic:
- etmidr = etm->metadata[m_idx][CS_ETM_ETMIDR];
- cs_etm__set_trace_param_etmv3(t_params, etm, t_idx, m_idx, etmidr);
+ etmidr = metadata[CS_ETM_ETMIDR];
+ cs_etm__set_trace_param_etmv3(t_params++, metadata, etmidr);
break;
case __perf_cs_etmv4_magic:
- cs_etm__set_trace_param_etmv4(t_params, etm, t_idx, m_idx);
+ cs_etm__set_trace_param_etmv4(t_params++, metadata);
break;
case __perf_cs_ete_magic:
- cs_etm__set_trace_param_ete(t_params, etm, t_idx, m_idx);
+ cs_etm__set_trace_param_ete(t_params++, metadata);
break;
default:
return -EINVAL;
@@ -2918,6 +2900,42 @@ static u64 *cs_etm__create_meta_blk(u64 *buff_in, int *buff_in_offset,
return metadata;
}
+/*
+ * traceid_list is used to create decoders and give them the trace ID
+ * mappings. In unformatted mode just insert one entry for the sample
+ * CPU so the decoder knows which settings to use.
+ */
+static int cs_etm__map_trace_ids_unformatted(struct cs_etm_auxtrace *etm)
+{
+ for (unsigned int i = 0; i < etm->queues.nr_queues; ++i) {
+ int ret;
+ struct cs_etm_queue *etmq;
+ u8 trace_chan_id;
+ u64 *cpu_data;
+
+ etmq = etm->queues.queue_array[i].priv;
+ if (!etmq->formatted_set || etmq->formatted)
+ continue;
+
+ /* Giving it a real ID doesn't do much but can help with debugging */
+ trace_chan_id = CORESIGHT_LEGACY_CPU_TRACE_ID(i);
+ cpu_data = get_cpu_data(etm, i);
+ if (cpu_data == NULL) {
+ pr_warning("CS_ETM: unknown CPU, falling back to first metadata\n");
+ cpu_data = etm->metadata[0];
+ }
+
+ ret = cs_etm__insert_trace_id_node(etmq, trace_chan_id, cpu_data);
+ if (ret)
+ return ret;
+
+ ret = cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
/**
* Puts a fragment of an auxtrace buffer into the auxtrace queues based
* on the bounds of aux_event, if it matches with the buffer that's at
@@ -3220,21 +3238,26 @@ static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata)
static int cs_etm__create_queue_decoders(struct cs_etm_queue *etmq)
{
struct cs_etm_decoder_params d_params;
+ struct cs_etm_trace_params *t_params;
+ int decoders = intlist__nr_entries(etmq->traceid_list);
/*
* Each queue can only contain data from one CPU when unformatted, so only one decoder is
* needed.
*/
- int decoders = etmq->formatted ? etmq->etm->num_cpu : 1;
+ if (etmq->formatted_set && !etmq->formatted)
+ assert(decoders == 1);
+
+ if (decoders == 0)
+ return 0;
/* Use metadata to fill in trace parameters for trace decoder */
- struct cs_etm_trace_params *t_params = zalloc(sizeof(*t_params) * decoders);
+ t_params = zalloc(sizeof(*t_params) * decoders);
if (!t_params)
goto out_free;
- if (cs_etm__init_trace_params(t_params, etmq->etm, etmq->formatted,
- etmq->queue_nr, decoders))
+ if (cs_etm__init_trace_params(t_params, etmq))
goto out_free;
/* Set decoder parameters to decode trace packets */
@@ -3497,6 +3520,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
if (err)
goto err_free_queues;
+ err = cs_etm__map_trace_ids_unformatted(etm);
+ if (err)
+ goto err_free_queues;
+
err = cs_etm__create_decoders(etm);
if (err)
goto err_free_queues;
--
2.34.1
v0.1 HW_ID packets have two new fields so that it's possible to describe
mappings that may be for a different CPU, but are valid in that queue.
Use the existing CPU field from the sample to decide which queue to
write the new mapping into, but use both of the new fields to create the
mapping.
When it becomes apparent that two queues share a sink by looking at
which CPUs are writing into that sink, make both queues share the same
traceid_list.
Also update the error message to show that overlapping IDs aren't an
error in per-thread mode, just not supported. In the future we can
use the CPU ID from the AUX record to place per-thread trace into the
correct queue which will solve the problem.
Signed-off-by: James Clark <[email protected]>
---
tools/include/linux/coresight-pmu.h | 17 +++-
tools/perf/util/cs-etm.c | 122 +++++++++++++++++++++++++---
2 files changed, 125 insertions(+), 14 deletions(-)
diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
index 51ac441a37c3..4a7fac6f66b9 100644
--- a/tools/include/linux/coresight-pmu.h
+++ b/tools/include/linux/coresight-pmu.h
@@ -49,12 +49,21 @@
* Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload.
* Used to associate a CPU with the CoreSight Trace ID.
* [07:00] - Trace ID - uses 8 bits to make value easy to read in file.
- * [59:08] - Unused (SBZ)
- * [63:60] - Version
+ * [15:08] - V2 Trace ID - ID for the ETM/CPU referenced by V2 CPU
+ * [31:16] - V2 CPU ID - CPU that corresponds to the trace ID in V2 trace ID
+ * [55:32] - Unused (SBZ)
+ * [59:56] - Minor Version - previously existing fields are compatible with
+ * all minor versions.
+ * [63:60] - Major Version - previously existing fields mean different things
+ * in new major versions.
*/
#define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0)
-#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60)
+#define CS_AUX_HW_ID_V01_TRACE_ID_MASK GENMASK_ULL(15, 8)
+#define CS_AUX_HW_ID_V01_CPU_MASK GENMASK_ULL(31, 16)
+#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56)
+#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60)
-#define CS_AUX_HW_ID_CURR_VERSION 0
+#define CS_AUX_HW_ID_MAJOR_VERSION 0
+#define CS_AUX_HW_ID_MINOR_VERSION 1
#endif
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 73fc0ab2fb09..db459ce71b98 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -113,6 +113,8 @@ struct cs_etm_queue {
struct cs_etm_traceid_queue **traceid_queues;
/* Conversion between traceID and metadata pointers */
struct intlist *traceid_list;
+ /* Same as traceid_list, but traceid_list may be a reference to another queue's */
+ struct intlist *own_traceid_list;
};
static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm);
@@ -236,7 +238,16 @@ static int cs_etm__insert_trace_id_node(struct cs_etm_queue *etmq,
int err;
if (curr_cpu_data[CS_ETM_CPU] != cpu_metadata[CS_ETM_CPU]) {
- pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n");
+ /*
+ * With > CORESIGHT_TRACE_IDS_MAX ETMs, overlapping IDs
+ * are expected (but not supported) in per-thread mode,
+ * rather than signifying an error.
+ */
+ if (etmq->etm->per_thread_decoding)
+ pr_err("CS_ETM: overlapping Trace IDs aren't currently supported in per-thread mode\n");
+ else
+ pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n");
+
return -EINVAL;
}
@@ -303,6 +314,92 @@ static int cs_etm__process_trace_id_v0(struct cs_etm_auxtrace *etm, int cpu,
return cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data);
}
+static int cs_etm__process_trace_id_v0_1(struct cs_etm_auxtrace *etm, int this_cpu,
+ u64 hw_id)
+{
+ struct auxtrace_queue *this_aq, *aq;
+ struct cs_etm_queue *this_etmq, *etmq;
+ int ret, cpu;
+ u8 trace_id;
+ u64 *cpu_data;
+ struct int_node *inode, *tmp;
+ struct intlist *old_list;
+
+ cpu = FIELD_GET(CS_AUX_HW_ID_V01_CPU_MASK, hw_id);
+ trace_id = FIELD_GET(CS_AUX_HW_ID_V01_TRACE_ID_MASK, hw_id);
+
+ /* In per thread mode (cpu == -1) there is only one queue and it's index 0 */
+ if (etm->queues.queue_array[0].cpu == -1)
+ this_aq = aq = &etm->queues.queue_array[0];
+ else {
+ aq = &etm->queues.queue_array[cpu];
+ this_aq = &etm->queues.queue_array[this_cpu];
+ }
+
+ this_etmq = this_aq->priv;
+ etmq = aq->priv;
+
+ /* Ignore HW_IDs on unformatted queues */
+ if (this_etmq->formatted_set && !this_etmq->formatted)
+ return 0;
+
+ cpu_data = get_cpu_data(etm, cpu);
+ ret = cs_etm__insert_trace_id_node(this_etmq, trace_id, cpu_data);
+ if (ret)
+ return ret;
+
+ ret = cs_etm__metadata_set_trace_id(trace_id, cpu_data);
+ if (ret)
+ return ret;
+
+ /*
+ * With each HW_ID we can also deduce valid mappings that were not
+ * published by the kernel because later events were not yet created.
+ * For example below we see CPU 0 has valid mappings for buffer 0. Once
+ * we receive a mapping linking CPU 1 to buffers 0 and 1, we know that
+ * CPU 0's mappings also apply to buffer 1, even though we never
+ * received an explicit mapping for it. Below we can also see HW_IDs
+ * can be received incrementally, and the earlier ones don't contain all
+ * 3 mappings:
+ *
+ * CPU 0 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000001010
+ *
+ * CPU 1 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000001012
+ * CPU 1 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000011212
+ * - implied: CPU 0 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id:
+ * 0x100000000011212
+ *
+ * CPU 2 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000001014
+ * CPU 2 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000011214
+ * CPU 2 PERF_RECORD_AUX_OUTPUT_HW_ID hw_id: 0x100000000021414
+ *
+ */
+
+ /* Mapping was for this queue, or queues are already linked */
+ if (this_cpu == cpu || this_etmq->traceid_list == etmq->traceid_list)
+ return 0;
+
+ /*
+ * Received a mapping for someone elses ETM signifying that the
+ * sink is shared. Copy its entries into this queue, and then make
+ * both queues use the same traceID map.
+ */
+ intlist__for_each_entry(inode, etmq->traceid_list) {
+ ret = cs_etm__insert_trace_id_node(this_etmq, inode->i, inode->priv);
+ if (ret)
+ return ret;
+ }
+
+ old_list = etmq->traceid_list;
+ etmq->traceid_list = this_etmq->traceid_list;
+ etmq->own_traceid_list = NULL;
+ intlist__for_each_entry_safe(inode, tmp, old_list)
+ intlist__remove(old_list, inode);
+ intlist__delete(old_list);
+
+ return 0;
+}
+
static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata)
{
u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC];
@@ -392,10 +489,10 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
/* extract and parse the HW ID */
hw_id = event->aux_output_hw_id.hw_id;
- version = FIELD_GET(CS_AUX_HW_ID_VERSION_MASK, hw_id);
+ version = FIELD_GET(CS_AUX_HW_ID_MAJOR_VERSION_MASK, hw_id);
/* check that we can handle this version */
- if (version > CS_AUX_HW_ID_CURR_VERSION) {
+ if (version > CS_AUX_HW_ID_MAJOR_VERSION) {
pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
version);
return -EINVAL;
@@ -420,7 +517,10 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
return -EINVAL;
}
- return cs_etm__process_trace_id_v0(etm, cpu, hw_id);
+ if (FIELD_GET(CS_AUX_HW_ID_MINOR_VERSION_MASK, hw_id) == 0)
+ return cs_etm__process_trace_id_v0(etm, cpu, hw_id);
+ else
+ return cs_etm__process_trace_id_v0_1(etm, cpu, hw_id);
}
void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq,
@@ -860,12 +960,14 @@ static void cs_etm__free_queue(void *priv)
cs_etm_decoder__free(etmq->decoder);
cs_etm__free_traceid_queues(etmq);
- /* First remove all traceID/metadata nodes for the RB tree */
- intlist__for_each_entry_safe(inode, tmp, etmq->traceid_list)
- intlist__remove(etmq->traceid_list, inode);
+ if (etmq->own_traceid_list) {
+ /* First remove all traceID/metadata nodes for the RB tree */
+ intlist__for_each_entry_safe(inode, tmp, etmq->own_traceid_list)
+ intlist__remove(etmq->own_traceid_list, inode);
- /* Then the RB tree itself */
- intlist__delete(etmq->traceid_list);
+ /* Then the RB tree itself */
+ intlist__delete(etmq->own_traceid_list);
+ }
free(etmq);
}
@@ -1059,7 +1161,7 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void)
* has to be made for each packet that gets decoded, optimizing access
* in anything other than a sequential array is worth doing.
*/
- etmq->traceid_list = intlist__new(NULL);
+ etmq->traceid_list = etmq->own_traceid_list = intlist__new(NULL);
if (!etmq->traceid_list)
goto out_free;
--
2.34.1
The trace ID API is currently hard coded to always use the global map.
The functions that take the map as an argument aren't currently public.
Make them public so that Perf mode can pass in its own maps. At the
moment all usages are still hard coded to use the global map, but now
on the caller side.
System ID functions are unchanged because they will always use the
default map.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.c | 5 +++--
.../coresight/coresight-etm3x-core.c | 5 +++--
.../coresight/coresight-etm4x-core.c | 5 +++--
.../hwtracing/coresight/coresight-trace-id.c | 22 +++++++------------
.../hwtracing/coresight/coresight-trace-id.h | 9 +++++---
5 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index c0c60e6a1703..4afb9d29f355 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -232,7 +232,7 @@ static void free_event_data(struct work_struct *work)
if (!(IS_ERR_OR_NULL(*ppath)))
coresight_release_path(*ppath);
*ppath = NULL;
- coresight_trace_id_put_cpu_id(cpu);
+ coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
}
/* mark perf event as done for trace id allocator */
@@ -401,7 +401,8 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
}
/* ensure we can allocate a trace ID for this CPU */
- trace_id = coresight_trace_id_get_cpu_id(cpu);
+ trace_id = coresight_trace_id_get_cpu_id(cpu,
+ coresight_trace_id_map_default());
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
cpumask_clear_cpu(cpu, mask);
coresight_release_path(path);
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
index 9d5c1391ffb1..4149e7675ceb 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
@@ -465,7 +465,8 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
*
* trace id function has its own lock
*/
- trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (IS_VALID_CS_TRACE_ID(trace_id))
drvdata->traceid = (u8)trace_id;
else
@@ -477,7 +478,7 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
void etm_release_trace_id(struct etm_drvdata *drvdata)
{
- coresight_trace_id_put_cpu_id(drvdata->cpu);
+ coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
}
static int etm_enable_perf(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index a0bdfabddbc6..f32c8cd7742d 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -241,7 +241,8 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
* or return the one currently allocated.
* The trace id function has its own lock
*/
- trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (IS_VALID_CS_TRACE_ID(trace_id))
drvdata->trcid = (u8)trace_id;
else
@@ -253,7 +254,7 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
void etm4_release_trace_id(struct etmv4_drvdata *drvdata)
{
- coresight_trace_id_put_cpu_id(drvdata->cpu);
+ coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
}
struct etm4_enable_arg {
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
index 19005b5b4dc4..45ddd50d09a6 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.c
+++ b/drivers/hwtracing/coresight/coresight-trace-id.c
@@ -12,7 +12,7 @@
#include "coresight-trace-id.h"
-/* Default trace ID map. Used on systems that don't require per sink mappings */
+/* Default trace ID map. Used in sysfs mode and for system sources */
static struct coresight_trace_id_map id_map_default;
/* maintain a record of the mapping of IDs and pending releases per cpu */
@@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void)
DUMP_ID_MAP(id_map);
}
-static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
+int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
{
unsigned long flags;
int id;
@@ -195,8 +195,9 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_
DUMP_ID_MAP(id_map);
return id;
}
+EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
-static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
+void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
{
unsigned long flags;
int id;
@@ -222,6 +223,7 @@ static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id
DUMP_ID_CPU(cpu, id);
DUMP_ID_MAP(id_map);
}
+EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *id_map)
{
@@ -250,19 +252,11 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map *
DUMP_ID_MAP(id_map);
}
-/* API functions */
-
-int coresight_trace_id_get_cpu_id(int cpu)
-{
- return coresight_trace_id_map_get_cpu_id(cpu, &id_map_default);
-}
-EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
-
-void coresight_trace_id_put_cpu_id(int cpu)
+struct coresight_trace_id_map *coresight_trace_id_map_default(void)
{
- coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
+ return &id_map_default;
}
-EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
+EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
int coresight_trace_id_read_cpu_id(int cpu)
{
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
index 49438a96fcc6..54b9d8ed903b 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.h
+++ b/drivers/hwtracing/coresight/coresight-trace-id.h
@@ -42,7 +42,10 @@
#define IS_VALID_CS_TRACE_ID(id) \
((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
-/* Allocate and release IDs for a single default trace ID map */
+/**
+ * Get the global map that's used by sysfs
+ */
+struct coresight_trace_id_map *coresight_trace_id_map_default(void);
/**
* Read and optionally allocate a CoreSight trace ID and associate with a CPU.
@@ -57,7 +60,7 @@
*
* return: CoreSight trace ID or -EINVAL if allocation impossible.
*/
-int coresight_trace_id_get_cpu_id(int cpu);
+int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
/**
* Release an allocated trace ID associated with the CPU.
@@ -70,7 +73,7 @@ int coresight_trace_id_get_cpu_id(int cpu);
*
* @cpu: The CPU index to release the associated trace ID.
*/
-void coresight_trace_id_put_cpu_id(int cpu);
+void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
/**
* Read the current allocated CoreSight Trace ID value for the CPU.
--
2.34.1
..
System ID functions are unchanged because they will always use the
default map.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.c | 5 +++--
.../coresight/coresight-etm3x-core.c | 5 +++--
.../coresight/coresight-etm4x-core.c | 5 +++--
.../hwtracing/coresight/coresight-trace-id.c | 22 +++++++------------
.../hwtracing/coresight/coresight-trace-id.h | 9 +++++---
5 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index c0c60e6a1703..4afb9d29f355 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -232,7 +232,7 @@ static void free_event_data(struct work_struct *work)
if (!(IS_ERR_OR_NULL(*ppath)))
coresight_release_path(*ppath);
*ppath = NULL;
- coresight_trace_id_put_cpu_id(cpu);
+ coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
}
/* mark perf event as done for trace id allocator */
@@ -401,7 +401,8 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
}
/* ensure we can allocate a trace ID for this CPU */
- trace_id = coresight_trace_id_get_cpu_id(cpu);
+ trace_id = coresight_trace_id_get_cpu_id(cpu,
+ coresight_trace_id_map_default());
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
cpumask_clear_cpu(cpu, mask);
coresight_release_path(path);
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
index 9d5c1391ffb1..4149e7675ceb 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
@@ -465,7 +465,8 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
*
* trace id function has its own lock
*/
- trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (IS_VALID_CS_TRACE_ID(trace_id))
drvdata->traceid = (u8)trace_id;
else
@@ -477,7 +478,7 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
void etm_release_trace_id(struct etm_drvdata *drvdata)
{
- coresight_trace_id_put_cpu_id(drvdata->cpu);
+ coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
}
static int etm_enable_perf(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index c2ca4a02dfce..562ef6cb72d8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -241,7 +241,8 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
* or return the one currently allocated.
* The trace id function has its own lock
*/
- trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (IS_VALID_CS_TRACE_ID(trace_id))
drvdata->trcid = (u8)trace_id;
else
@@ -253,7 +254,7 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
void etm4_release_trace_id(struct etmv4_drvdata *drvdata)
{
- coresight_trace_id_put_cpu_id(drvdata->cpu);
+ coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
}
struct etm4_enable_arg {
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
index 19005b5b4dc4..45ddd50d09a6 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.c
+++ b/drivers/hwtracing/coresight/coresight-trace-id.c
@@ -12,7 +12,7 @@
#include "coresight-trace-id.h"
-/* Default trace ID map. Used on systems that don't require per sink mappings */
+/* Default trace ID map. Used in sysfs mode and for system sources */
static struct coresight_trace_id_map id_map_default;
/* maintain a record of the mapping of IDs and pending releases per cpu */
@@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void)
DUMP_ID_MAP(id_map);
}
-static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
+int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
{
unsigned long flags;
int id;
@@ -195,8 +195,9 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_
DUMP_ID_MAP(id_map);
return id;
}
+EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
-static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
+void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
{
unsigned long flags;
int id;
@@ -222,6 +223,7 @@ static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id
DUMP_ID_CPU(cpu, id);
DUMP_ID_MAP(id_map);
}
+EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *id_map)
{
@@ -250,19 +252,11 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map *
DUMP_ID_MAP(id_map);
}
-/* API functions */
-
-int coresight_trace_id_get_cpu_id(int cpu)
-{
- return coresight_trace_id_map_get_cpu_id(cpu, &id_map_default);
-}
-EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
-
-void coresight_trace_id_put_cpu_id(int cpu)
+struct coresight_trace_id_map *coresight_trace_id_map_default(void)
{
- coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
+ return &id_map_default;
}
-EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
+EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
int coresight_trace_id_read_cpu_id(int cpu)
{
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
index 49438a96fcc6..54b9d8ed903b 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.h
+++ b/drivers/hwtracing/coresight/coresight-trace-id.h
@@ -42,7 +42,10 @@
#define IS_VALID_CS_TRACE_ID(id) \
((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
-/* Allocate and release IDs for a single default trace ID map */
+/**
+ * Get the global map that's used by sysfs
+ */
+struct coresight_trace_id_map *coresight_trace_id_map_default(void);
/**
* Read and optionally allocate a CoreSight trace ID and associate with a CPU.
@@ -57,7 +60,7 @@
*
* return: CoreSight trace ID or -EINVAL if allocation impossible.
*/
-int coresight_trace_id_get_cpu_id(int cpu);
+int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
/**
* Release an allocated trace ID associated with the CPU.
@@ -70,7 +73,7 @@ int coresight_trace_id_get_cpu_id(int cpu);
*
* @cpu: The CPU index to release the associated trace ID.
*/
-void coresight_trace_id_put_cpu_id(int cpu);
+void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
/**
* Read the current allocated CoreSight Trace ID value for the CPU.
--
2.34.1
The global CPU ID mappings won't work for per-sink ID maps so move it to
the ID map struct. coresight_trace_id_release_all_pending() is hard
coded to operate on the default map, but once Perf sessions use their
own maps the pending release mechanism will be deleted. So it doesn't
need to be extended to accept a trace ID map argument at this point.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.c | 3 +-
.../coresight/coresight-etm3x-core.c | 3 +-
.../coresight/coresight-etm4x-core.c | 3 +-
.../hwtracing/coresight/coresight-trace-id.c | 28 ++++++++-----------
.../hwtracing/coresight/coresight-trace-id.h | 2 +-
include/linux/coresight.h | 1 +
6 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 4afb9d29f355..25f1f87c90d1 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -508,7 +508,8 @@ static void etm_event_start(struct perf_event *event, int flags)
hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
CS_AUX_HW_ID_CURR_VERSION);
hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
- coresight_trace_id_read_cpu_id(cpu));
+ coresight_trace_id_read_cpu_id(cpu,
+ coresight_trace_id_map_default()));
perf_report_aux_output_id(event, hw_id);
}
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
index 4149e7675ceb..b21f5ad94e63 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
@@ -501,7 +501,8 @@ static int etm_enable_perf(struct coresight_device *csdev,
* with perf locks - we know the ID cannot change until perf shuts down
* the session
*/
- trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
dev_name(&drvdata->csdev->dev), drvdata->cpu);
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index f32c8cd7742d..d16d6efb26fa 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -776,7 +776,8 @@ static int etm4_enable_perf(struct coresight_device *csdev,
* with perf locks - we know the ID cannot change until perf shuts down
* the session
*/
- trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
+ trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
+ coresight_trace_id_map_default());
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
dev_name(&drvdata->csdev->dev), drvdata->cpu);
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
index 45ddd50d09a6..b393603dd713 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.c
+++ b/drivers/hwtracing/coresight/coresight-trace-id.c
@@ -13,10 +13,12 @@
#include "coresight-trace-id.h"
/* Default trace ID map. Used in sysfs mode and for system sources */
-static struct coresight_trace_id_map id_map_default;
+static DEFINE_PER_CPU(atomic_t, id_map_default_cpu_ids) = ATOMIC_INIT(0);
+static struct coresight_trace_id_map id_map_default = {
+ .cpu_map = &id_map_default_cpu_ids
+};
-/* maintain a record of the mapping of IDs and pending releases per cpu */
-static DEFINE_PER_CPU(atomic_t, cpu_id) = ATOMIC_INIT(0);
+/* maintain a record of the pending releases per cpu */
static cpumask_t cpu_id_release_pending;
/* perf session active counter */
@@ -46,12 +48,6 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map,
#define PERF_SESSION(n)
#endif
-/* unlocked read of current trace ID value for given CPU */
-static int _coresight_trace_id_read_cpu_id(int cpu)
-{
- return atomic_read(&per_cpu(cpu_id, cpu));
-}
-
/* look for next available odd ID, return 0 if none found */
static int coresight_trace_id_find_odd_id(struct coresight_trace_id_map *id_map)
{
@@ -145,7 +141,7 @@ static void coresight_trace_id_release_all_pending(void)
clear_bit(bit, id_map->pend_rel_ids);
}
for_each_cpu(cpu, &cpu_id_release_pending) {
- atomic_set(&per_cpu(cpu_id, cpu), 0);
+ atomic_set(per_cpu_ptr(id_map_default.cpu_map, cpu), 0);
cpumask_clear_cpu(cpu, &cpu_id_release_pending);
}
spin_unlock_irqrestore(&id_map_lock, flags);
@@ -160,7 +156,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
spin_lock_irqsave(&id_map_lock, flags);
/* check for existing allocation for this CPU */
- id = _coresight_trace_id_read_cpu_id(cpu);
+ id = coresight_trace_id_read_cpu_id(cpu, id_map);
if (id)
goto get_cpu_id_clr_pend;
@@ -181,7 +177,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
goto get_cpu_id_out_unlock;
/* allocate the new id to the cpu */
- atomic_set(&per_cpu(cpu_id, cpu), id);
+ atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id);
get_cpu_id_clr_pend:
/* we are (re)using this ID - so ensure it is not marked for release */
@@ -203,7 +199,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
int id;
/* check for existing allocation for this CPU */
- id = _coresight_trace_id_read_cpu_id(cpu);
+ id = coresight_trace_id_read_cpu_id(cpu, id_map);
if (!id)
return;
@@ -216,7 +212,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
} else {
/* otherwise clear id */
coresight_trace_id_free(id, id_map);
- atomic_set(&per_cpu(cpu_id, cpu), 0);
+ atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0);
}
spin_unlock_irqrestore(&id_map_lock, flags);
@@ -258,9 +254,9 @@ struct coresight_trace_id_map *coresight_trace_id_map_default(void)
}
EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
-int coresight_trace_id_read_cpu_id(int cpu)
+int coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
{
- return _coresight_trace_id_read_cpu_id(cpu);
+ return atomic_read(per_cpu_ptr(id_map->cpu_map, cpu));
}
EXPORT_SYMBOL_GPL(coresight_trace_id_read_cpu_id);
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
index 54b9d8ed903b..ed2bc4b3ad2a 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.h
+++ b/drivers/hwtracing/coresight/coresight-trace-id.h
@@ -93,7 +93,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
*
* return: current value, will be 0 if unallocated.
*/
-int coresight_trace_id_read_cpu_id(int cpu);
+int coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
/**
* Allocate a CoreSight trace ID for a system component.
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index c16c61a8411d..7d62b88bfb5c 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -234,6 +234,7 @@ struct coresight_sysfs_link {
struct coresight_trace_id_map {
DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
+ atomic_t __percpu *cpu_map;
};
/**
--
2.34.1
This will allow Perf mode to pass in per-sink maps. System sources
allocate IDs on probe so they don't use this and it's __maybe_unused.
Sysfs mode also has the global map hard coded in various places, so pass
in NULL when enabling for sysfs. We could bubble the global map all the
way down to where it's used, but it wouldn't have any functional
difference, so it's probably not worth the code churn.
Signed-off-by: James Clark <[email protected]>
---
drivers/hwtracing/coresight/coresight-dummy.c | 3 ++-
drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
drivers/hwtracing/coresight/coresight-etm3x-core.c | 10 +++++-----
drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 +++++-----
drivers/hwtracing/coresight/coresight-stm.c | 3 ++-
drivers/hwtracing/coresight/coresight-sysfs.c | 3 ++-
drivers/hwtracing/coresight/coresight-tpdm.c | 3 ++-
include/linux/coresight.h | 2 +-
8 files changed, 21 insertions(+), 16 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-dummy.c b/drivers/hwtracing/coresight/coresight-dummy.c
index ac70c0b491be..1f1b9ad160f6 100644
--- a/drivers/hwtracing/coresight/coresight-dummy.c
+++ b/drivers/hwtracing/coresight/coresight-dummy.c
@@ -21,7 +21,8 @@ DEFINE_CORESIGHT_DEVLIST(source_devs, "dummy_source");
DEFINE_CORESIGHT_DEVLIST(sink_devs, "dummy_sink");
static int dummy_source_enable(struct coresight_device *csdev,
- struct perf_event *event, enum cs_mode mode)
+ struct perf_event *event, enum cs_mode mode,
+ __maybe_unused struct coresight_trace_id_map *id_map)
{
dev_dbg(csdev->dev.parent, "Dummy source enabled\n");
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 25f1f87c90d1..177cecae38d9 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -496,7 +496,8 @@ static void etm_event_start(struct perf_event *event, int flags)
goto fail_end_stop;
/* Finally enable the tracer */
- if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
+ if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
+ coresight_trace_id_map_default()))
goto fail_disable_path;
/*
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
index b21f5ad94e63..b310bdf19038 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
@@ -482,7 +482,8 @@ void etm_release_trace_id(struct etm_drvdata *drvdata)
}
static int etm_enable_perf(struct coresight_device *csdev,
- struct perf_event *event)
+ struct perf_event *event,
+ struct coresight_trace_id_map *id_map)
{
struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
int trace_id;
@@ -501,8 +502,7 @@ static int etm_enable_perf(struct coresight_device *csdev,
* with perf locks - we know the ID cannot change until perf shuts down
* the session
*/
- trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
- coresight_trace_id_map_default());
+ trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
dev_name(&drvdata->csdev->dev), drvdata->cpu);
@@ -555,7 +555,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev)
}
static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
- enum cs_mode mode)
+ enum cs_mode mode, struct coresight_trace_id_map *id_map)
{
int ret;
struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
@@ -570,7 +570,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
ret = etm_enable_sysfs(csdev);
break;
case CS_MODE_PERF:
- ret = etm_enable_perf(csdev, event);
+ ret = etm_enable_perf(csdev, event, id_map);
break;
default:
ret = -EINVAL;
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index d16d6efb26fa..02dbb6c4daf5 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -753,7 +753,8 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
}
static int etm4_enable_perf(struct coresight_device *csdev,
- struct perf_event *event)
+ struct perf_event *event,
+ struct coresight_trace_id_map *id_map)
{
int ret = 0, trace_id;
struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
@@ -776,8 +777,7 @@ static int etm4_enable_perf(struct coresight_device *csdev,
* with perf locks - we know the ID cannot change until perf shuts down
* the session
*/
- trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
- coresight_trace_id_map_default());
+ trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
dev_name(&drvdata->csdev->dev), drvdata->cpu);
@@ -839,7 +839,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev)
}
static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
- enum cs_mode mode)
+ enum cs_mode mode, struct coresight_trace_id_map *id_map)
{
int ret;
@@ -853,7 +853,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
ret = etm4_enable_sysfs(csdev);
break;
case CS_MODE_PERF:
- ret = etm4_enable_perf(csdev, event);
+ ret = etm4_enable_perf(csdev, event, id_map);
break;
default:
ret = -EINVAL;
diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
index e1c62820dfda..a80ad1de4c23 100644
--- a/drivers/hwtracing/coresight/coresight-stm.c
+++ b/drivers/hwtracing/coresight/coresight-stm.c
@@ -194,7 +194,8 @@ static void stm_enable_hw(struct stm_drvdata *drvdata)
}
static int stm_enable(struct coresight_device *csdev, struct perf_event *event,
- enum cs_mode mode)
+ enum cs_mode mode,
+ __maybe_unused struct coresight_trace_id_map *trace_id)
{
struct stm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
index 1e67cc7758d7..a01c9e54e2ed 100644
--- a/drivers/hwtracing/coresight/coresight-sysfs.c
+++ b/drivers/hwtracing/coresight/coresight-sysfs.c
@@ -9,6 +9,7 @@
#include <linux/kernel.h>
#include "coresight-priv.h"
+#include "coresight-trace-id.h"
/*
* Use IDR to map the hash of the source's device name
@@ -63,7 +64,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev,
*/
lockdep_assert_held(&coresight_mutex);
if (coresight_get_mode(csdev) != CS_MODE_SYSFS) {
- ret = source_ops(csdev)->enable(csdev, data, mode);
+ ret = source_ops(csdev)->enable(csdev, data, mode, NULL);
if (ret)
return ret;
}
diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c
index a9708ab0d488..0376ad326a2f 100644
--- a/drivers/hwtracing/coresight/coresight-tpdm.c
+++ b/drivers/hwtracing/coresight/coresight-tpdm.c
@@ -439,7 +439,8 @@ static void __tpdm_enable(struct tpdm_drvdata *drvdata)
}
static int tpdm_enable(struct coresight_device *csdev, struct perf_event *event,
- enum cs_mode mode)
+ enum cs_mode mode,
+ __maybe_unused struct coresight_trace_id_map *id_map)
{
struct tpdm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 7d62b88bfb5c..3a678e5425dc 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -384,7 +384,7 @@ struct coresight_ops_link {
struct coresight_ops_source {
int (*cpu_id)(struct coresight_device *csdev);
int (*enable)(struct coresight_device *csdev, struct perf_event *event,
- enum cs_mode mode);
+ enum cs_mode mode, struct coresight_trace_id_map *id_map);
void (*disable)(struct coresight_device *csdev,
struct perf_event *event);
};
--
2.34.1
Pending the release of IDs was a way of managing concurrent sysfs and
Perf sessions in a single global ID map. Perf may have finished while
sysfs hadn't, and Perf shouldn't release the IDs in use by sysfs and
vice versa.
Now that Perf uses its own exclusive ID maps, pending release doesn't
result in any different behavior than just releasing all IDs when the
last Perf session finishes. As part of the per-sink trace ID change, we
would have still had to make the pending mechanism work on a per-sink
basis, due to the overlapping ID allocations, so instead of making that
more complicated, just remove it.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.c | 11 ++--
.../hwtracing/coresight/coresight-trace-id.c | 62 +++++--------------
.../hwtracing/coresight/coresight-trace-id.h | 31 +++++-----
include/linux/coresight.h | 6 +-
4 files changed, 34 insertions(+), 76 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 86ca1a9d09a7..f07173aa4d66 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -232,15 +232,14 @@ static void free_event_data(struct work_struct *work)
if (!(IS_ERR_OR_NULL(*ppath))) {
struct coresight_device *sink = coresight_get_sink(*ppath);
- coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
+ /* mark perf event as done for trace id allocator */
+ coresight_trace_id_perf_stop(&sink->perf_id_map);
+
coresight_release_path(*ppath);
}
*ppath = NULL;
}
- /* mark perf event as done for trace id allocator */
- coresight_trace_id_perf_stop();
-
free_percpu(event_data->path);
kfree(event_data);
}
@@ -328,9 +327,6 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
sink = user_sink = coresight_get_sink_by_id(id);
}
- /* tell the trace ID allocator that a perf event is starting up */
- coresight_trace_id_perf_start();
-
/* check if user wants a coresight configuration selected */
cfg_hash = (u32)((event->attr.config2 & GENMASK_ULL(63, 32)) >> 32);
if (cfg_hash) {
@@ -404,6 +400,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
}
/* ensure we can allocate a trace ID for this CPU */
+ coresight_trace_id_perf_start(&sink->perf_id_map);
trace_id = coresight_trace_id_get_cpu_id(cpu, &sink->perf_id_map);
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
cpumask_clear_cpu(cpu, mask);
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
index b393603dd713..84721706b4a4 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.c
+++ b/drivers/hwtracing/coresight/coresight-trace-id.c
@@ -18,12 +18,6 @@ static struct coresight_trace_id_map id_map_default = {
.cpu_map = &id_map_default_cpu_ids
};
-/* maintain a record of the pending releases per cpu */
-static cpumask_t cpu_id_release_pending;
-
-/* perf session active counter */
-static atomic_t perf_cs_etm_session_active = ATOMIC_INIT(0);
-
/* lock to protect id_map and cpu data */
static DEFINE_SPINLOCK(id_map_lock);
@@ -116,34 +110,18 @@ static void coresight_trace_id_free(int id, struct coresight_trace_id_map *id_ma
clear_bit(id, id_map->used_ids);
}
-static void coresight_trace_id_set_pend_rel(int id, struct coresight_trace_id_map *id_map)
-{
- if (WARN(!IS_VALID_CS_TRACE_ID(id), "Invalid Trace ID %d\n", id))
- return;
- set_bit(id, id_map->pend_rel_ids);
-}
-
/*
- * release all pending IDs for all current maps & clear CPU associations
- *
- * This currently operates on the default id map, but may be extended to
- * operate on all registered id maps if per sink id maps are used.
+ * release all IDs and clear CPU associations
*/
-static void coresight_trace_id_release_all_pending(void)
+static void coresight_trace_id_release_all(struct coresight_trace_id_map *id_map)
{
- struct coresight_trace_id_map *id_map = &id_map_default;
unsigned long flags;
- int cpu, bit;
+ int cpu;
spin_lock_irqsave(&id_map_lock, flags);
- for_each_set_bit(bit, id_map->pend_rel_ids, CORESIGHT_TRACE_ID_RES_TOP) {
- clear_bit(bit, id_map->used_ids);
- clear_bit(bit, id_map->pend_rel_ids);
- }
- for_each_cpu(cpu, &cpu_id_release_pending) {
- atomic_set(per_cpu_ptr(id_map_default.cpu_map, cpu), 0);
- cpumask_clear_cpu(cpu, &cpu_id_release_pending);
- }
+ bitmap_zero(id_map->used_ids, CORESIGHT_TRACE_IDS_MAX);
+ for_each_possible_cpu(cpu)
+ atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0);
spin_unlock_irqrestore(&id_map_lock, flags);
DUMP_ID_MAP(id_map);
}
@@ -158,7 +136,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
/* check for existing allocation for this CPU */
id = coresight_trace_id_read_cpu_id(cpu, id_map);
if (id)
- goto get_cpu_id_clr_pend;
+ goto get_cpu_id_out_unlock;
/*
* Find a new ID.
@@ -179,11 +157,6 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
/* allocate the new id to the cpu */
atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id);
-get_cpu_id_clr_pend:
- /* we are (re)using this ID - so ensure it is not marked for release */
- cpumask_clear_cpu(cpu, &cpu_id_release_pending);
- clear_bit(id, id_map->pend_rel_ids);
-
get_cpu_id_out_unlock:
spin_unlock_irqrestore(&id_map_lock, flags);
@@ -205,15 +178,8 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
spin_lock_irqsave(&id_map_lock, flags);
- if (atomic_read(&perf_cs_etm_session_active)) {
- /* set release at pending if perf still active */
- coresight_trace_id_set_pend_rel(id, id_map);
- cpumask_set_cpu(cpu, &cpu_id_release_pending);
- } else {
- /* otherwise clear id */
- coresight_trace_id_free(id, id_map);
- atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0);
- }
+ coresight_trace_id_free(id, id_map);
+ atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0);
spin_unlock_irqrestore(&id_map_lock, flags);
DUMP_ID_CPU(cpu, id);
@@ -272,17 +238,17 @@ void coresight_trace_id_put_system_id(int id)
}
EXPORT_SYMBOL_GPL(coresight_trace_id_put_system_id);
-void coresight_trace_id_perf_start(void)
+void coresight_trace_id_perf_start(struct coresight_trace_id_map *id_map)
{
- atomic_inc(&perf_cs_etm_session_active);
+ atomic_inc(&id_map->perf_cs_etm_session_active);
PERF_SESSION(atomic_read(&perf_cs_etm_session_active));
}
EXPORT_SYMBOL_GPL(coresight_trace_id_perf_start);
-void coresight_trace_id_perf_stop(void)
+void coresight_trace_id_perf_stop(struct coresight_trace_id_map *id_map)
{
- if (!atomic_dec_return(&perf_cs_etm_session_active))
- coresight_trace_id_release_all_pending();
+ if (!atomic_dec_return(&id_map->perf_cs_etm_session_active))
+ coresight_trace_id_release_all(id_map);
PERF_SESSION(atomic_read(&perf_cs_etm_session_active));
}
EXPORT_SYMBOL_GPL(coresight_trace_id_perf_stop);
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
index ed2bc4b3ad2a..409713901f21 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.h
+++ b/drivers/hwtracing/coresight/coresight-trace-id.h
@@ -17,9 +17,10 @@
* released when done.
*
* In order to ensure that a consistent cpu / ID matching is maintained
- * throughout a perf cs_etm event session - a session in progress flag will
- * be maintained, and released IDs not cleared until the perf session is
- * complete. This allows the same CPU to be re-allocated its prior ID.
+ * throughout a perf cs_etm event session - a session in progress flag will be
+ * maintained for each sink, and IDs are cleared when all the perf sessions
+ * complete. This allows the same CPU to be re-allocated its prior ID when
+ * events are scheduled in and out.
*
*
* Trace ID maps will be created and initialised to prevent architecturally
@@ -65,11 +66,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
/**
* Release an allocated trace ID associated with the CPU.
*
- * This will release the CoreSight trace ID associated with the CPU,
- * unless a perf session is in operation.
- *
- * If a perf session is in operation then the ID will be marked as pending
- * release.
+ * This will release the CoreSight trace ID associated with the CPU.
*
* @cpu: The CPU index to release the associated trace ID.
*/
@@ -120,21 +117,21 @@ void coresight_trace_id_put_system_id(int id);
/**
* Notify the Trace ID allocator that a perf session is starting.
*
- * Increase the perf session reference count - called by perf when setting up
- * a trace event.
+ * Increase the perf session reference count - called by perf when setting up a
+ * trace event.
*
- * This reference count is used by the ID allocator to ensure that trace IDs
- * associated with a CPU cannot change or be released during a perf session.
+ * Perf sessions never free trace IDs to ensure that the ID associated with a
+ * CPU cannot change during their and other's concurrent sessions. Instead,
+ * this refcount is used so that the last event to finish always frees all IDs.
*/
-void coresight_trace_id_perf_start(void);
+void coresight_trace_id_perf_start(struct coresight_trace_id_map *id_map);
/**
* Notify the ID allocator that a perf session is stopping.
*
- * Decrease the perf session reference count.
- * if this causes the count to go to zero, then all Trace IDs marked as pending
- * release, will be released.
+ * Decrease the perf session reference count. If this causes the count to go to
+ * zero, then all Trace IDs will be released.
*/
-void coresight_trace_id_perf_stop(void);
+void coresight_trace_id_perf_stop(struct coresight_trace_id_map *id_map);
#endif /* _CORESIGHT_TRACE_ID_H */
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 8c4c1860c76b..4b7719b061f9 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -227,14 +227,12 @@ struct coresight_sysfs_link {
* @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
* Initialised so that the reserved IDs are permanently marked as
* in use.
- * @pend_rel_ids: CPU IDs that have been released by the trace source but not
- * yet marked as available, to allow re-allocation to the same
- * CPU during a perf session.
+ * @perf_cs_etm_session_active: Number of Perf sessions using this ID map.
*/
struct coresight_trace_id_map {
DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
- DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
atomic_t __percpu *cpu_map;
+ atomic_t perf_cs_etm_session_active;
};
/**
--
2.34.1
For Perf to be able to decode when per-sink trace IDs are used, emit all
the mappings for each sink.
Perf currently errors out if it sees a newer packet version so instead
of bumping it, add a new minor version field. This can be used to
signify new versions that have backwards compatible fields. Considering
this change is only for high core count machines, it doesn't make sense
to make a breaking change for everyone.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.c | 47 ++++++++++++++++---
include/linux/coresight-pmu.h | 17 +++++--
2 files changed, 54 insertions(+), 10 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 08f3958f9367..3bb1ae1e5264 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -444,6 +444,46 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
goto out;
}
+static void etm_output_hw_ids(struct perf_event *event,
+ struct coresight_trace_id_map *id_map,
+ int this_events_cpu)
+{
+ int cpu;
+ u8 this_events_trace_id = coresight_trace_id_read_cpu_id(this_events_cpu, id_map);
+
+ /*
+ * This isn't optimal because we likely only have a couple of IDs
+ * allocated per-sink, but we only currently track the used trace IDs as
+ * a bitmask, rather than the used CPUs in each ID map. It would also
+ * require some extra locking to iterate a used CPUs bitmask and then
+ * output the ID from a different structure. So at the moment just
+ * iterate all CPUs.
+ */
+ for_each_possible_cpu(cpu) {
+ u64 hw_id;
+ u8 trace_id = coresight_trace_id_read_cpu_id(cpu, id_map);
+
+ if (!IS_VALID_CS_TRACE_ID(trace_id))
+ continue;
+
+ hw_id = FIELD_PREP(CS_AUX_HW_ID_MAJOR_VERSION_MASK,
+ CS_AUX_HW_ID_MAJOR_VERSION);
+ hw_id |= FIELD_PREP(CS_AUX_HW_ID_MINOR_VERSION_MASK,
+ CS_AUX_HW_ID_MINOR_VERSION);
+
+ /* Repeat sending the ID for this event so that it's backwards compatible */
+ hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK, this_events_trace_id);
+
+ /*
+ * Output the V0.1 HW_ID info that shows which other ID mappings
+ * are valid on this sink.
+ */
+ hw_id |= FIELD_PREP(CS_AUX_HW_ID_V01_CPU_MASK, cpu);
+ hw_id |= FIELD_PREP(CS_AUX_HW_ID_V01_TRACE_ID_MASK, trace_id);
+ perf_report_aux_output_id(event, hw_id);
+ }
+}
+
static void etm_event_start(struct perf_event *event, int flags)
{
int cpu = smp_processor_id();
@@ -452,7 +492,6 @@ static void etm_event_start(struct perf_event *event, int flags)
struct perf_output_handle *handle = &ctxt->handle;
struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu);
struct list_head *path;
- u64 hw_id;
if (!csdev)
goto fail;
@@ -519,11 +558,7 @@ static void etm_event_start(struct perf_event *event, int flags)
*/
if (!cpumask_test_cpu(cpu, &event_data->aux_hwid_done)) {
cpumask_set_cpu(cpu, &event_data->aux_hwid_done);
- hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
- CS_AUX_HW_ID_CURR_VERSION);
- hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
- coresight_trace_id_read_cpu_id(cpu, &sink->perf_id_map));
- perf_report_aux_output_id(event, hw_id);
+ etm_output_hw_ids(event, &sink->perf_id_map, cpu);
}
out:
diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
index 51ac441a37c3..4a7fac6f66b9 100644
--- a/include/linux/coresight-pmu.h
+++ b/include/linux/coresight-pmu.h
@@ -49,12 +49,21 @@
* Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload.
* Used to associate a CPU with the CoreSight Trace ID.
* [07:00] - Trace ID - uses 8 bits to make value easy to read in file.
- * [59:08] - Unused (SBZ)
- * [63:60] - Version
+ * [15:08] - V2 Trace ID - ID for the ETM/CPU referenced by V2 CPU
+ * [31:16] - V2 CPU ID - CPU that corresponds to the trace ID in V2 trace ID
+ * [55:32] - Unused (SBZ)
+ * [59:56] - Minor Version - previously existing fields are compatible with
+ * all minor versions.
+ * [63:60] - Major Version - previously existing fields mean different things
+ * in new major versions.
*/
#define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0)
-#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60)
+#define CS_AUX_HW_ID_V01_TRACE_ID_MASK GENMASK_ULL(15, 8)
+#define CS_AUX_HW_ID_V01_CPU_MASK GENMASK_ULL(31, 16)
+#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56)
+#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60)
-#define CS_AUX_HW_ID_CURR_VERSION 0
+#define CS_AUX_HW_ID_MAJOR_VERSION 0
+#define CS_AUX_HW_ID_MINOR_VERSION 1
#endif
--
2.34.1
"Process being monitored" and "pid of the process to monitor" imply that
this would be the same PID if there were two sessions targeting the same
process. But this is actually the PID of the process that did the Perf
event open call, rather than the target of the session. So update the
comments to make this clearer.
Signed-off-by: James Clark <[email protected]>
---
drivers/hwtracing/coresight/coresight-tmc-etr.c | 5 +++--
drivers/hwtracing/coresight/coresight-tmc.h | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index e75428fa1592..8962fc27d04f 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -36,7 +36,8 @@ struct etr_buf_hw {
* etr_perf_buffer - Perf buffer used for ETR
* @drvdata - The ETR drvdaga this buffer has been allocated for.
* @etr_buf - Actual buffer used by the ETR
- * @pid - The PID this etr_perf_buffer belongs to.
+ * @pid - The PID of the session owner that etr_perf_buffer
+ * belongs to.
* @snaphost - Perf session mode
* @nr_pages - Number of pages in the ring buffer.
* @pages - Array of Pages in the ring buffer.
@@ -1662,7 +1663,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
goto unlock_out;
}
- /* Get a handle on the pid of the process to monitor */
+ /* Get a handle on the pid of the session owner */
pid = etr_perf->pid;
/* Do not proceed if this device is associated with another session */
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index c77763b49de0..2671926be62a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -171,8 +171,9 @@ struct etr_buf {
* @csdev: component vitals needed by the framework.
* @miscdev: specifics to handle "/dev/xyz.tmc" entry.
* @spinlock: only one at a time pls.
- * @pid: Process ID of the process being monitored by the session
- * that is using this component.
+ * @pid: Process ID of the process that owns the session that is using
+ * this component. For example this would be the pid of the Perf
+ * process.
* @buf: Snapshot of the trace data for ETF/ETB.
* @etr_buf: details of buffer used in TMC-ETR
* @len: size of the available trace for ETF/ETB.
--
2.34.1
The global list won't work for per-sink trace ID allocations, so put a
list in each queue where the IDs will be unique to that queue.
To keep the same behavior as before, for version 0 of the HW_ID packets,
copy all the HW_ID mappings into all queues.
This change doesn't effect the decoders, only trace ID lookups on the
Perf side. The decoders are still created with global mappings which
will be fixed in a later commit.
Signed-off-by: James Clark <[email protected]>
---
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 +--
tools/perf/util/cs-etm.c | 195 ++++++++++--------
tools/perf/util/cs-etm.h | 2 +-
3 files changed, 126 insertions(+), 99 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index e917985bbbe6..0c9c48cedbf1 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -388,7 +388,8 @@ cs_etm_decoder__reset_timestamp(struct cs_etm_packet_queue *packet_queue)
}
static ocsd_datapath_resp_t
-cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue,
+cs_etm_decoder__buffer_packet(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *packet_queue,
const u8 trace_chan_id,
enum cs_etm_sample_type sample_type)
{
@@ -398,7 +399,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue,
if (packet_queue->packet_count >= CS_ETM_PACKET_MAX_BUFFER - 1)
return OCSD_RESP_FATAL_SYS_ERR;
- if (cs_etm__get_cpu(trace_chan_id, &cpu) < 0)
+ if (cs_etm__get_cpu(etmq, trace_chan_id, &cpu) < 0)
return OCSD_RESP_FATAL_SYS_ERR;
et = packet_queue->tail;
@@ -436,7 +437,7 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq,
int ret = 0;
struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(packet_queue, trace_chan_id,
+ ret = cs_etm_decoder__buffer_packet(etmq, packet_queue, trace_chan_id,
CS_ETM_RANGE);
if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
@@ -496,7 +497,8 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq,
}
static ocsd_datapath_resp_t
-cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue,
+cs_etm_decoder__buffer_discontinuity(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *queue,
const uint8_t trace_chan_id)
{
/*
@@ -504,18 +506,19 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue,
* reset time statistics.
*/
cs_etm_decoder__reset_timestamp(queue);
- return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
+ return cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
}
static ocsd_datapath_resp_t
-cs_etm_decoder__buffer_exception(struct cs_etm_packet_queue *queue,
+cs_etm_decoder__buffer_exception(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *queue,
const ocsd_generic_trace_elem *elem,
const uint8_t trace_chan_id)
{ int ret = 0;
struct cs_etm_packet *packet;
- ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id,
+ ret = cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id,
CS_ETM_EXCEPTION);
if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT)
return ret;
@@ -527,10 +530,11 @@ cs_etm_decoder__buffer_exception(struct cs_etm_packet_queue *queue,
}
static ocsd_datapath_resp_t
-cs_etm_decoder__buffer_exception_ret(struct cs_etm_packet_queue *queue,
+cs_etm_decoder__buffer_exception_ret(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *queue,
const uint8_t trace_chan_id)
{
- return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
+ return cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id,
CS_ETM_EXCEPTION_RET);
}
@@ -599,7 +603,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
case OCSD_GEN_TRC_ELEM_EO_TRACE:
case OCSD_GEN_TRC_ELEM_NO_SYNC:
case OCSD_GEN_TRC_ELEM_TRACE_ON:
- resp = cs_etm_decoder__buffer_discontinuity(packet_queue,
+ resp = cs_etm_decoder__buffer_discontinuity(etmq, packet_queue,
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_INSTR_RANGE:
@@ -607,11 +611,11 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION:
- resp = cs_etm_decoder__buffer_exception(packet_queue, elem,
+ resp = cs_etm_decoder__buffer_exception(etmq, packet_queue, elem,
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
- resp = cs_etm_decoder__buffer_exception_ret(packet_queue,
+ resp = cs_etm_decoder__buffer_exception_ret(etmq, packet_queue,
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_TIMESTAMP:
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index e40dfd09d3ed..be858aed26c4 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -111,16 +111,18 @@ struct cs_etm_queue {
/* Conversion between traceID and index in traceid_queues array */
struct intlist *traceid_queues_list;
struct cs_etm_traceid_queue **traceid_queues;
+ /* Conversion between traceID and metadata pointers */
+ struct intlist *traceid_list;
};
-/* RB tree for quick conversion between traceID and metadata pointers */
-static struct intlist *traceid_list;
-
static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm);
static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
pid_t tid);
static int cs_etm__get_data_block(struct cs_etm_queue *etmq);
static int cs_etm__decode_data_block(struct cs_etm_queue *etmq);
+static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata);
+static u64 *get_cpu_data(struct cs_etm_auxtrace *etm, int cpu);
+static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata);
/* PTMs ETMIDR [11:8] set to b0011 */
#define ETMIDR_PTM_VERSION 0x00000300
@@ -146,12 +148,12 @@ static u32 cs_etm__get_v7_protocol_version(u32 etmidr)
return CS_ETM_PROTO_ETMV3;
}
-static int cs_etm__get_magic(u8 trace_chan_id, u64 *magic)
+static int cs_etm__get_magic(struct cs_etm_queue *etmq, u8 trace_chan_id, u64 *magic)
{
struct int_node *inode;
u64 *metadata;
- inode = intlist__find(traceid_list, trace_chan_id);
+ inode = intlist__find(etmq->traceid_list, trace_chan_id);
if (!inode)
return -EINVAL;
@@ -160,12 +162,12 @@ static int cs_etm__get_magic(u8 trace_chan_id, u64 *magic)
return 0;
}
-int cs_etm__get_cpu(u8 trace_chan_id, int *cpu)
+int cs_etm__get_cpu(struct cs_etm_queue *etmq, u8 trace_chan_id, int *cpu)
{
struct int_node *inode;
u64 *metadata;
- inode = intlist__find(traceid_list, trace_chan_id);
+ inode = intlist__find(etmq->traceid_list, trace_chan_id);
if (!inode)
return -EINVAL;
@@ -217,30 +219,86 @@ enum cs_etm_pid_fmt cs_etm__get_pid_fmt(struct cs_etm_queue *etmq)
return etmq->etm->pid_fmt;
}
-static int cs_etm__map_trace_id(u8 trace_chan_id, u64 *cpu_metadata)
+static int cs_etm__insert_trace_id_node(struct cs_etm_queue *etmq,
+ u8 trace_chan_id, u64 *cpu_metadata)
{
- struct int_node *inode;
-
/* Get an RB node for this CPU */
- inode = intlist__findnew(traceid_list, trace_chan_id);
+ struct int_node *inode = intlist__findnew(etmq->traceid_list, trace_chan_id);
/* Something went wrong, no need to continue */
if (!inode)
return -ENOMEM;
- /*
- * The node for that CPU should not be taken.
- * Back out if that's the case.
- */
- if (inode->priv)
- return -EINVAL;
+ /* Disallow re-mapping a different traceID to metadata pair. */
+ if (inode->priv) {
+ u64 *curr_cpu_data = inode->priv;
+ u8 curr_chan_id;
+ int err;
+
+ if (curr_cpu_data[CS_ETM_CPU] != cpu_metadata[CS_ETM_CPU]) {
+ pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n");
+ return -EINVAL;
+ }
+
+ /* check that the mapped ID matches */
+ err = cs_etm__metadata_get_trace_id(&curr_chan_id, curr_cpu_data);
+ if (err)
+ return err;
- /* All good, associate the traceID with the metadata pointer */
+ if (curr_chan_id != trace_chan_id) {
+ pr_err("CS_ETM: mismatch between CPU trace ID and HW_ID packet ID\n");
+ return -EINVAL;
+ }
+
+ /* Skip re-adding the same mappings if everything matched */
+ return 0;
+ }
+
+ /* Not one we've seen before, associate the traceID with the metadata pointer */
inode->priv = cpu_metadata;
return 0;
}
+static int cs_etm__map_trace_id_v0(struct cs_etm_auxtrace *etm, u8 trace_chan_id,
+ u64 *cpu_metadata)
+{
+ /* Version 0 trace IDs are global so save them into every queue */
+ for (unsigned int i = 0; i < etm->queues.nr_queues; ++i) {
+ int ret;
+ struct cs_etm_queue *etmq = etm->queues.queue_array[i].priv;
+
+ ret = cs_etm__insert_trace_id_node(etmq, trace_chan_id,
+ cpu_metadata);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int cs_etm__process_trace_id_v0(struct cs_etm_auxtrace *etm, int cpu,
+ u64 hw_id)
+{
+ int err;
+ u64 *cpu_data;
+ u8 trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
+
+ cpu_data = get_cpu_data(etm, cpu);
+ if (cpu_data == NULL)
+ return -EINVAL;
+
+ err = cs_etm__map_trace_id_v0(etm, trace_chan_id, cpu_data);
+ if (err)
+ return err;
+
+ /*
+ * if we are picking up the association from the packet, need to plug
+ * the correct trace ID into the metadata for setting up decoders later.
+ */
+ return cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data);
+}
+
static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata)
{
u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC];
@@ -324,17 +382,13 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
{
struct cs_etm_auxtrace *etm;
struct perf_sample sample;
- struct int_node *inode;
struct evsel *evsel;
- u64 *cpu_data;
u64 hw_id;
int cpu, version, err;
- u8 trace_chan_id, curr_chan_id;
/* extract and parse the HW ID */
hw_id = event->aux_output_hw_id.hw_id;
version = FIELD_GET(CS_AUX_HW_ID_VERSION_MASK, hw_id);
- trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
/* check that we can handle this version */
if (version > CS_AUX_HW_ID_CURR_VERSION) {
@@ -362,43 +416,7 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
return -EINVAL;
}
- /* See if the ID is mapped to a CPU, and it matches the current CPU */
- inode = intlist__find(traceid_list, trace_chan_id);
- if (inode) {
- cpu_data = inode->priv;
- if ((int)cpu_data[CS_ETM_CPU] != cpu) {
- pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n");
- return -EINVAL;
- }
-
- /* check that the mapped ID matches */
- err = cs_etm__metadata_get_trace_id(&curr_chan_id, cpu_data);
- if (err)
- return err;
- if (curr_chan_id != trace_chan_id) {
- pr_err("CS_ETM: mismatch between CPU trace ID and HW_ID packet ID\n");
- return -EINVAL;
- }
-
- /* mapped and matched - return OK */
- return 0;
- }
-
- cpu_data = get_cpu_data(etm, cpu);
- if (cpu_data == NULL)
- return err;
-
- /* not one we've seen before - lets map it */
- err = cs_etm__map_trace_id(trace_chan_id, cpu_data);
- if (err)
- return err;
-
- /*
- * if we are picking up the association from the packet, need to plug
- * the correct trace ID into the metadata for setting up decoders later.
- */
- err = cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data);
- return err;
+ return cs_etm__process_trace_id_v0(etm, cpu, hw_id);
}
void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq,
@@ -851,6 +869,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
static void cs_etm__free_queue(void *priv)
{
+ struct int_node *inode, *tmp;
struct cs_etm_queue *etmq = priv;
if (!etmq)
@@ -858,6 +877,14 @@ static void cs_etm__free_queue(void *priv)
cs_etm_decoder__free(etmq->decoder);
cs_etm__free_traceid_queues(etmq);
+
+ /* First remove all traceID/metadata nodes for the RB tree */
+ intlist__for_each_entry_safe(inode, tmp, etmq->traceid_list)
+ intlist__remove(etmq->traceid_list, inode);
+
+ /* Then the RB tree itself */
+ intlist__delete(etmq->traceid_list);
+
free(etmq);
}
@@ -880,19 +907,12 @@ static void cs_etm__free_events(struct perf_session *session)
static void cs_etm__free(struct perf_session *session)
{
int i;
- struct int_node *inode, *tmp;
struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
struct cs_etm_auxtrace,
auxtrace);
cs_etm__free_events(session);
session->auxtrace = NULL;
- /* First remove all traceID/metadata nodes for the RB tree */
- intlist__for_each_entry_safe(inode, tmp, traceid_list)
- intlist__remove(traceid_list, inode);
- /* Then the RB tree itself */
- intlist__delete(traceid_list);
-
for (i = 0; i < aux->num_cpu; i++)
zfree(&aux->metadata[i]);
@@ -1050,9 +1070,24 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void)
etmq->traceid_queues_list = intlist__new(NULL);
if (!etmq->traceid_queues_list)
- free(etmq);
+ goto out_free;
+
+ /*
+ * Create an RB tree for traceID-metadata tuple. Since the conversion
+ * has to be made for each packet that gets decoded, optimizing access
+ * in anything other than a sequential array is worth doing.
+ */
+ etmq->traceid_list = intlist__new(NULL);
+ if (!etmq->traceid_list)
+ goto out_free;
return etmq;
+
+out_free:
+ intlist__delete(etmq->traceid_queues_list);
+ free(etmq);
+
+ return NULL;
}
static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
@@ -2202,7 +2237,7 @@ static int cs_etm__set_sample_flags(struct cs_etm_queue *etmq,
PERF_IP_FLAG_TRACE_END;
break;
case CS_ETM_EXCEPTION:
- ret = cs_etm__get_magic(packet->trace_chan_id, &magic);
+ ret = cs_etm__get_magic(etmq, packet->trace_chan_id, &magic);
if (ret)
return ret;
@@ -3119,7 +3154,8 @@ static bool cs_etm__has_virtual_ts(u64 **metadata, int num_cpu)
}
/* map trace ids to correct metadata block, from information in metadata */
-static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata)
+static int cs_etm__map_trace_ids_metadata(struct cs_etm_auxtrace *etm, int num_cpu,
+ u64 **metadata)
{
u64 cs_etm_magic;
u8 trace_chan_id;
@@ -3141,7 +3177,7 @@ static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata)
/* unknown magic number */
return -EINVAL;
}
- err = cs_etm__map_trace_id(trace_chan_id, metadata[i]);
+ err = cs_etm__map_trace_id_v0(etm, trace_chan_id, metadata[i]);
if (err)
return err;
}
@@ -3272,23 +3308,12 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
u64 *ptr = NULL;
u64 **metadata = NULL;
- /*
- * Create an RB tree for traceID-metadata tuple. Since the conversion
- * has to be made for each packet that gets decoded, optimizing access
- * in anything other than a sequential array is worth doing.
- */
- traceid_list = intlist__new(NULL);
- if (!traceid_list)
- return -ENOMEM;
-
/* First the global part */
ptr = (u64 *) auxtrace_info->priv;
num_cpu = ptr[CS_PMU_TYPE_CPUS] & 0xffffffff;
metadata = zalloc(sizeof(*metadata) * num_cpu);
- if (!metadata) {
- err = -ENOMEM;
- goto err_free_traceid_list;
- }
+ if (!metadata)
+ return -ENOMEM;
/* Start parsing after the common part of the header */
i = CS_HEADER_VERSION_MAX;
@@ -3467,7 +3492,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
err = cs_etm__clear_unused_trace_ids_metadata(num_cpu, metadata);
/* otherwise, this is a file with metadata values only, map from metadata */
else
- err = cs_etm__map_trace_ids_metadata(num_cpu, metadata);
+ err = cs_etm__map_trace_ids_metadata(etm, num_cpu, metadata);
if (err)
goto err_free_queues;
@@ -3489,7 +3514,5 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
for (int j = 0; j < num_cpu; j++)
zfree(&metadata[j]);
zfree(&metadata);
-err_free_traceid_list:
- intlist__delete(traceid_list);
return err;
}
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 4696267a32f0..f4f69f7cc0f3 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -252,7 +252,7 @@ enum cs_etm_pid_fmt {
#ifdef HAVE_CSTRACE_SUPPORT
#include <opencsd/ocsd_if_types.h>
-int cs_etm__get_cpu(u8 trace_chan_id, int *cpu);
+int cs_etm__get_cpu(struct cs_etm_queue *etmq, u8 trace_chan_id, int *cpu);
enum cs_etm_pid_fmt cs_etm__get_pid_fmt(struct cs_etm_queue *etmq);
int cs_etm__etmq_set_tid_el(struct cs_etm_queue *etmq, pid_t tid,
u8 trace_chan_id, ocsd_ex_level el);
--
2.34.1
Currently it's only possible to initialize with the default number of
queues and then use auxtrace_queues__add_event() to grow the array. But
that's problematic if you don't have a real event to pass into that
function yet.
The queues hold a void *priv member to store custom state, and for
Coresight we want to create decoders upfront before receiving data, so
add a new function that allows pre-allocating queues. One reason to do
this is because we might need to store metadata (HW_ID events) that
effects other queues, but never actually receive auxtrace data on that
queue.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/auxtrace.c | 9 +++++++--
tools/perf/util/auxtrace.h | 1 +
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 3684e6009b63..563b6c4fca31 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -218,15 +218,20 @@ static struct auxtrace_queue *auxtrace_alloc_queue_array(unsigned int nr_queues)
return queue_array;
}
-int auxtrace_queues__init(struct auxtrace_queues *queues)
+int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues)
{
- queues->nr_queues = AUXTRACE_INIT_NR_QUEUES;
+ queues->nr_queues = nr_queues;
queues->queue_array = auxtrace_alloc_queue_array(queues->nr_queues);
if (!queues->queue_array)
return -ENOMEM;
return 0;
}
+int auxtrace_queues__init(struct auxtrace_queues *queues)
+{
+ return auxtrace_queues__init_nr(queues, AUXTRACE_INIT_NR_QUEUES);
+}
+
static int auxtrace_queues__grow(struct auxtrace_queues *queues,
unsigned int new_nr_queues)
{
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 55702215a82d..8a6ec9565835 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -521,6 +521,7 @@ int auxtrace_mmap__read_snapshot(struct mmap *map,
struct perf_tool *tool, process_auxtrace_t fn,
size_t snapshot_size);
+int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues);
int auxtrace_queues__init(struct auxtrace_queues *queues);
int auxtrace_queues__add_event(struct auxtrace_queues *queues,
struct perf_session *session,
--
2.34.1
The trace ID maps will need to be created and stored by the core and
Perf code so move the definition up to the common header.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-trace-id.c | 1 +
.../hwtracing/coresight/coresight-trace-id.h | 19 -------------------
include/linux/coresight.h | 18 ++++++++++++++++++
3 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
index af5b4ef59cea..19005b5b4dc4 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.c
+++ b/drivers/hwtracing/coresight/coresight-trace-id.c
@@ -3,6 +3,7 @@
* Copyright (c) 2022, Linaro Limited, All rights reserved.
* Author: Mike Leach <[email protected]>
*/
+#include <linux/coresight.h>
#include <linux/coresight-pmu.h>
#include <linux/cpumask.h>
#include <linux/kernel.h>
diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
index 3797777d367e..49438a96fcc6 100644
--- a/drivers/hwtracing/coresight/coresight-trace-id.h
+++ b/drivers/hwtracing/coresight/coresight-trace-id.h
@@ -32,10 +32,6 @@
#include <linux/bitops.h>
#include <linux/types.h>
-
-/* architecturally we have 128 IDs some of which are reserved */
-#define CORESIGHT_TRACE_IDS_MAX 128
-
/* ID 0 is reserved */
#define CORESIGHT_TRACE_ID_RES_0 0
@@ -46,21 +42,6 @@
#define IS_VALID_CS_TRACE_ID(id) \
((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
-/**
- * Trace ID map.
- *
- * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
- * Initialised so that the reserved IDs are permanently marked as
- * in use.
- * @pend_rel_ids: CPU IDs that have been released by the trace source but not
- * yet marked as available, to allow re-allocation to the same
- * CPU during a perf session.
- */
-struct coresight_trace_id_map {
- DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
- DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
-};
-
/* Allocate and release IDs for a single default trace ID map */
/**
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index f09ace92176e..c16c61a8411d 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -218,6 +218,24 @@ struct coresight_sysfs_link {
const char *target_name;
};
+/* architecturally we have 128 IDs some of which are reserved */
+#define CORESIGHT_TRACE_IDS_MAX 128
+
+/**
+ * Trace ID map.
+ *
+ * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
+ * Initialised so that the reserved IDs are permanently marked as
+ * in use.
+ * @pend_rel_ids: CPU IDs that have been released by the trace source but not
+ * yet marked as available, to allow re-allocation to the same
+ * CPU during a perf session.
+ */
+struct coresight_trace_id_map {
+ DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
+ DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
+};
+
/**
* struct coresight_device - representation of a device as used by the framework
* @pdata: Platform data with device connections associated to this device.
--
2.34.1
This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
as long as there are fewer than that many ETMs connected to each sink.
Each sink owns its own trace ID map, and any Perf session connecting to
that sink will allocate from it, even if the sink is currently in use by
other users. This is similar to the existing behavior where the dynamic
trace IDs are constant as long as there is any concurrent Perf session
active. It's not completely optimal because slightly more IDs will be
used than necessary, but the optimal solution involves tracking the PIDs
of each session and allocating ID maps based on the session owner. This
is difficult to do with the combination of per-thread and per-cpu modes
and some scheduling issues. The complexity of this isn't likely to worth
it because even with multiple users they'd just see a difference in the
ordering of ID allocations rather than hitting any limits (unless the
hardware does have too many ETMs connected to one sink).
Signed-off-by: James Clark <[email protected]>
---
drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++
drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++-------
include/linux/coresight.h | 1 +
3 files changed, 19 insertions(+), 7 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 9fc6f6b863e0..d1adff467670 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -902,6 +902,7 @@ static void coresight_device_release(struct device *dev)
struct coresight_device *csdev = to_coresight_device(dev);
fwnode_handle_put(csdev->dev.fwnode);
+ free_percpu(csdev->perf_id_map.cpu_map);
kfree(csdev);
}
@@ -1159,6 +1160,14 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
dev_set_name(&csdev->dev, "%s", desc->name);
+ if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
+ csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
+ csdev->perf_id_map.cpu_map = alloc_percpu(atomic_t);
+ if (!csdev->perf_id_map.cpu_map) {
+ ret = -ENOMEM;
+ goto err_out;
+ }
+ }
/*
* Make sure the device registration and the connection fixup
* are synchronised, so that we don't see uninitialised devices
@@ -1216,6 +1225,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
err_out:
/* Cleanup the connection information */
coresight_release_platform_data(NULL, desc->dev, desc->pdata);
+ kfree(csdev);
return ERR_PTR(ret);
}
EXPORT_SYMBOL_GPL(coresight_register);
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 177cecae38d9..86ca1a9d09a7 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -229,10 +229,13 @@ static void free_event_data(struct work_struct *work)
struct list_head **ppath;
ppath = etm_event_cpu_path_ptr(event_data, cpu);
- if (!(IS_ERR_OR_NULL(*ppath)))
+ if (!(IS_ERR_OR_NULL(*ppath))) {
+ struct coresight_device *sink = coresight_get_sink(*ppath);
+
+ coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
coresight_release_path(*ppath);
+ }
*ppath = NULL;
- coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
}
/* mark perf event as done for trace id allocator */
@@ -401,8 +404,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
}
/* ensure we can allocate a trace ID for this CPU */
- trace_id = coresight_trace_id_get_cpu_id(cpu,
- coresight_trace_id_map_default());
+ trace_id = coresight_trace_id_get_cpu_id(cpu, &sink->perf_id_map);
if (!IS_VALID_CS_TRACE_ID(trace_id)) {
cpumask_clear_cpu(cpu, mask);
coresight_release_path(path);
@@ -497,7 +499,7 @@ static void etm_event_start(struct perf_event *event, int flags)
/* Finally enable the tracer */
if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
- coresight_trace_id_map_default()))
+ &sink->perf_id_map))
goto fail_disable_path;
/*
@@ -509,8 +511,7 @@ static void etm_event_start(struct perf_event *event, int flags)
hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
CS_AUX_HW_ID_CURR_VERSION);
hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
- coresight_trace_id_read_cpu_id(cpu,
- coresight_trace_id_map_default()));
+ coresight_trace_id_read_cpu_id(cpu, &sink->perf_id_map));
perf_report_aux_output_id(event, hw_id);
}
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 3a678e5425dc..8c4c1860c76b 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -290,6 +290,7 @@ struct coresight_device {
bool sysfs_sink_activated;
struct dev_ext_attribute *ea;
struct coresight_device *def_sink;
+ struct coresight_trace_id_map perf_id_map;
/* sysfs links between components */
int nr_links;
bool has_conns_grp;
--
2.34.1
On 29/04/2024 16:21, James Clark wrote:
> ...
>
> System ID functions are unchanged because they will always use the
> default map.
>
> Signed-off-by: James Clark <[email protected]>
Ignore this patch, the other 11/17 is the correct one.
Make cs_etm__setup_queue() setup a queue even if it's empty, and
pre-allocate queues based on the max CPU that was recorded. In per-CPU
mode aux queues are indexed based on CPU ID even if all CPUs aren't
recorded, sparse queue arrays aren't used.
This will allow HW_IDs to be saved even if no aux data was received in
that queue without having to call cs_etm__setup_queue() from two
different places.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm.c | 55 ++++++++++++++++++++--------------------
1 file changed, 27 insertions(+), 28 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index f09004c4ba44..e40dfd09d3ed 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -104,6 +104,7 @@ struct cs_etm_queue {
unsigned int queue_nr;
u8 pending_timestamp_chan_id;
bool formatted;
+ bool formatted_set;
u64 offset;
const unsigned char *buf;
size_t buf_len, buf_used;
@@ -1056,16 +1057,11 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void)
static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
struct auxtrace_queue *queue,
- unsigned int queue_nr, bool formatted)
+ unsigned int queue_nr)
{
struct cs_etm_queue *etmq = queue->priv;
- if (etmq && formatted != etmq->formatted) {
- pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n");
- return -EINVAL;
- }
-
- if (list_empty(&queue->head) || etmq)
+ if (etmq)
return 0;
etmq = cs_etm__alloc_queue();
@@ -1078,7 +1074,6 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
etmq->queue_nr = queue_nr;
queue->cpu = queue_nr; /* Placeholder, may be reset to -1 in per-thread mode */
etmq->offset = 0;
- etmq->formatted = formatted;
return 0;
}
@@ -2791,17 +2786,6 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
if (err)
return err;
- /*
- * Knowing if the trace is formatted or not requires a lookup of
- * the aux record so only works in non-piped mode where data is
- * queued in cs_etm__queue_aux_records(). Always assume
- * formatted in piped mode (true).
- */
- err = cs_etm__setup_queue(etm, &etm->queues.queue_array[idx],
- idx, true);
- if (err)
- return err;
-
if (dump_trace)
if (auxtrace_buffer__get_data(buffer, fd)) {
cs_etm__dump_event(etm->queues.queue_array[idx].priv, buffer);
@@ -2918,7 +2902,6 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
struct perf_record_auxtrace *auxtrace_event;
union perf_event auxtrace_fragment;
__u64 aux_offset, aux_size;
- __u32 idx;
bool formatted;
struct cs_etm_auxtrace *etm = container_of(session->auxtrace,
@@ -2985,6 +2968,8 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
if (aux_offset >= auxtrace_event->offset &&
aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
+ struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
+
/*
* If this AUX event was inside this buffer somewhere, create a new auxtrace event
* based on the sizes of the aux event, and queue that fragment.
@@ -3001,10 +2986,14 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
if (err)
return err;
- idx = auxtrace_event->idx;
formatted = !(aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
-
- return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], idx, formatted);
+ if (etmq->formatted_set && formatted != etmq->formatted) {
+ pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n");
+ return -EINVAL;
+ }
+ etmq->formatted = formatted;
+ etmq->formatted_set = true;
+ return 0;
}
/* Wasn't inside this buffer, but there were no parse errors. 1 == 'not found' */
@@ -3256,6 +3245,7 @@ static int cs_etm__create_decoders(struct cs_etm_auxtrace *etm)
* Don't create decoders for empty queues, mainly because
* etmq->formatted is unknown for empty queues.
*/
+ assert(empty == !etmq->formatted_set);
if (empty)
continue;
@@ -3275,10 +3265,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
int event_header_size = sizeof(struct perf_event_header);
int total_size = auxtrace_info->header.size;
int priv_size = 0;
- int num_cpu;
+ int num_cpu, max_cpu = 0;
int err = 0;
int aux_hw_id_found;
- int i, j;
+ int i;
u64 *ptr = NULL;
u64 **metadata = NULL;
@@ -3309,7 +3299,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
* required by the trace decoder to properly decode the trace due
* to its highly compressed nature.
*/
- for (j = 0; j < num_cpu; j++) {
+ for (int j = 0; j < num_cpu; j++) {
if (ptr[i] == __perf_cs_etmv3_magic) {
metadata[j] =
cs_etm__create_meta_blk(ptr, &i,
@@ -3333,6 +3323,9 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
err = -ENOMEM;
goto err_free_metadata;
}
+
+ if ((int) metadata[j][CS_ETM_CPU] > max_cpu)
+ max_cpu = metadata[j][CS_ETM_CPU];
}
/*
@@ -3362,10 +3355,16 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
*/
etm->pid_fmt = cs_etm__init_pid_fmt(metadata[0]);
- err = auxtrace_queues__init(&etm->queues);
+ err = auxtrace_queues__init_nr(&etm->queues, max_cpu + 1);
if (err)
goto err_free_etm;
+ for (unsigned int j = 0; j < etm->queues.nr_queues; ++j) {
+ err = cs_etm__setup_queue(etm, &etm->queues.queue_array[j], j);
+ if (err)
+ goto err_free_queues;
+ }
+
if (session->itrace_synth_opts->set) {
etm->synth_opts = *session->itrace_synth_opts;
} else {
@@ -3487,7 +3486,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
zfree(&etm);
err_free_metadata:
/* No need to check @metadata[j], free(NULL) is supported */
- for (j = 0; j < num_cpu; j++)
+ for (int j = 0; j < num_cpu; j++)
zfree(&metadata[j]);
zfree(&metadata);
err_free_traceid_list:
--
2.34.1
The likely fix for this is to update Perf so print a helpful message.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index d65d7485886c..32818bd7cd17 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -335,8 +335,11 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
/* check that we can handle this version */
- if (version > CS_AUX_HW_ID_CURR_VERSION)
+ if (version > CS_AUX_HW_ID_CURR_VERSION) {
+ pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
+ version);
return -EINVAL;
+ }
/* get access to the etm metadata */
etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace);
--
2.34.1
This file is never included anywhere if CONFIG_CORESIGHT is not set so
they are unused and aren't currently compile tested with any config so
remove them.
Signed-off-by: James Clark <[email protected]>
---
.../hwtracing/coresight/coresight-etm-perf.h | 18 ------------------
1 file changed, 18 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index bebbadee2ceb..744531158d6b 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -62,7 +62,6 @@ struct etm_event_data {
struct list_head * __percpu *path;
};
-#if IS_ENABLED(CONFIG_CORESIGHT)
int etm_perf_symlink(struct coresight_device *csdev, bool link);
int etm_perf_add_symlink_sink(struct coresight_device *csdev);
void etm_perf_del_symlink_sink(struct coresight_device *csdev);
@@ -77,23 +76,6 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
int etm_perf_add_symlink_cscfg(struct device *dev,
struct cscfg_config_desc *config_desc);
void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc);
-#else
-static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
-{ return -EINVAL; }
-int etm_perf_add_symlink_sink(struct coresight_device *csdev)
-{ return -EINVAL; }
-void etm_perf_del_symlink_sink(struct coresight_device *csdev) {}
-static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
-{
- return NULL;
-}
-int etm_perf_add_symlink_cscfg(struct device *dev,
- struct cscfg_config_desc *config_desc)
-{ return -EINVAL; }
-void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc) {}
-
-#endif /* CONFIG_CORESIGHT */
-
int __init etm_perf_init(void);
void etm_perf_exit(void);
--
2.34.1
Both of these passes gather information about how to create the
decoders. AUX records determine formatted/unformatted, and the HW_IDs
determine the traceID/metadata mappings. Therefore it makes sense to
cache the information and wait until both passes are over until creating
the decoders, rather than creating them at the first HW_ID found. This
will allow a simplification of the creation process where
cs_etm_queue->traceid_list will exclusively used to create the decoders,
rather than the current two methods depending on whether the trace is
formatted or not.
Previously the sample CPU from the AUX record was used to initialize
the decoder CPU, but actually sample CPU == AUX queue index in per-CPU
mode, so saving the sample CPU isn't required. Similarly
formatted/unformatted was used upfront to create the decoders, but now
it's cached until later.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm.c | 167 ++++++++++++++++++++++++---------------
1 file changed, 102 insertions(+), 65 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 32818bd7cd17..f09004c4ba44 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -103,6 +103,7 @@ struct cs_etm_queue {
struct auxtrace_buffer *buffer;
unsigned int queue_nr;
u8 pending_timestamp_chan_id;
+ bool formatted;
u64 offset;
const unsigned char *buf;
size_t buf_len, buf_used;
@@ -738,8 +739,7 @@ static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params,
static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params,
struct cs_etm_queue *etmq,
- enum cs_etm_decoder_operation mode,
- bool formatted)
+ enum cs_etm_decoder_operation mode)
{
int ret = -EINVAL;
@@ -749,7 +749,7 @@ static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params,
d_params->packet_printer = cs_etm__packet_dump;
d_params->operation = mode;
d_params->data = etmq;
- d_params->formatted = formatted;
+ d_params->formatted = etmq->formatted;
d_params->fsyncs = false;
d_params->hsyncs = false;
d_params->frame_aligned = true;
@@ -1041,81 +1041,34 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u8 trace_chan_id,
return ret;
}
-static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
- bool formatted, int sample_cpu)
+static struct cs_etm_queue *cs_etm__alloc_queue(void)
{
- struct cs_etm_decoder_params d_params;
- struct cs_etm_trace_params *t_params = NULL;
- struct cs_etm_queue *etmq;
- /*
- * Each queue can only contain data from one CPU when unformatted, so only one decoder is
- * needed.
- */
- int decoders = formatted ? etm->num_cpu : 1;
-
- etmq = zalloc(sizeof(*etmq));
+ struct cs_etm_queue *etmq = zalloc(sizeof(*etmq));
if (!etmq)
return NULL;
etmq->traceid_queues_list = intlist__new(NULL);
if (!etmq->traceid_queues_list)
- goto out_free;
-
- /* Use metadata to fill in trace parameters for trace decoder */
- t_params = zalloc(sizeof(*t_params) * decoders);
+ free(etmq);
- if (!t_params)
- goto out_free;
-
- if (cs_etm__init_trace_params(t_params, etm, formatted, sample_cpu, decoders))
- goto out_free;
-
- /* Set decoder parameters to decode trace packets */
- if (cs_etm__init_decoder_params(&d_params, etmq,
- dump_trace ? CS_ETM_OPERATION_PRINT :
- CS_ETM_OPERATION_DECODE,
- formatted))
- goto out_free;
-
- etmq->decoder = cs_etm_decoder__new(decoders, &d_params,
- t_params);
-
- if (!etmq->decoder)
- goto out_free;
-
- /*
- * Register a function to handle all memory accesses required by
- * the trace decoder library.
- */
- if (cs_etm_decoder__add_mem_access_cb(etmq->decoder,
- 0x0L, ((u64) -1L),
- cs_etm__mem_access))
- goto out_free_decoder;
-
- zfree(&t_params);
return etmq;
-
-out_free_decoder:
- cs_etm_decoder__free(etmq->decoder);
-out_free:
- intlist__delete(etmq->traceid_queues_list);
- free(etmq);
-
- return NULL;
}
static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
struct auxtrace_queue *queue,
- unsigned int queue_nr,
- bool formatted,
- int sample_cpu)
+ unsigned int queue_nr, bool formatted)
{
struct cs_etm_queue *etmq = queue->priv;
+ if (etmq && formatted != etmq->formatted) {
+ pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n");
+ return -EINVAL;
+ }
+
if (list_empty(&queue->head) || etmq)
return 0;
- etmq = cs_etm__alloc_queue(etm, formatted, sample_cpu);
+ etmq = cs_etm__alloc_queue();
if (!etmq)
return -ENOMEM;
@@ -1123,7 +1076,9 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
queue->priv = etmq;
etmq->etm = etm;
etmq->queue_nr = queue_nr;
+ queue->cpu = queue_nr; /* Placeholder, may be reset to -1 in per-thread mode */
etmq->offset = 0;
+ etmq->formatted = formatted;
return 0;
}
@@ -2843,7 +2798,7 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
* formatted in piped mode (true).
*/
err = cs_etm__setup_queue(etm, &etm->queues.queue_array[idx],
- idx, true, -1);
+ idx, true);
if (err)
return err;
@@ -3048,8 +3003,8 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
idx = auxtrace_event->idx;
formatted = !(aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
- return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx],
- idx, formatted, sample->cpu);
+
+ return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], idx, formatted);
}
/* Wasn't inside this buffer, but there were no parse errors. 1 == 'not found' */
@@ -3233,6 +3188,84 @@ static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata)
return 0;
}
+/*
+ * Use the data gathered by the peeks for HW_ID (trace ID mappings) and AUX
+ * (formatted or not) packets to create the decoders.
+ */
+static int cs_etm__create_queue_decoders(struct cs_etm_queue *etmq)
+{
+ struct cs_etm_decoder_params d_params;
+
+ /*
+ * Each queue can only contain data from one CPU when unformatted, so only one decoder is
+ * needed.
+ */
+ int decoders = etmq->formatted ? etmq->etm->num_cpu : 1;
+
+ /* Use metadata to fill in trace parameters for trace decoder */
+ struct cs_etm_trace_params *t_params = zalloc(sizeof(*t_params) * decoders);
+
+ if (!t_params)
+ goto out_free;
+
+ if (cs_etm__init_trace_params(t_params, etmq->etm, etmq->formatted,
+ etmq->queue_nr, decoders))
+ goto out_free;
+
+ /* Set decoder parameters to decode trace packets */
+ if (cs_etm__init_decoder_params(&d_params, etmq,
+ dump_trace ? CS_ETM_OPERATION_PRINT :
+ CS_ETM_OPERATION_DECODE))
+ goto out_free;
+
+ etmq->decoder = cs_etm_decoder__new(decoders, &d_params,
+ t_params);
+
+ if (!etmq->decoder)
+ goto out_free;
+
+ /*
+ * Register a function to handle all memory accesses required by
+ * the trace decoder library.
+ */
+ if (cs_etm_decoder__add_mem_access_cb(etmq->decoder,
+ 0x0L, ((u64) -1L),
+ cs_etm__mem_access))
+ goto out_free_decoder;
+
+ zfree(&t_params);
+ return 0;
+
+out_free_decoder:
+ cs_etm_decoder__free(etmq->decoder);
+out_free:
+ zfree(&t_params);
+ return -EINVAL;
+}
+
+static int cs_etm__create_decoders(struct cs_etm_auxtrace *etm)
+{
+ struct auxtrace_queues *queues = &etm->queues;
+
+ for (unsigned int i = 0; i < queues->nr_queues; i++) {
+ bool empty = list_empty(&queues->queue_array[i].head);
+ struct cs_etm_queue *etmq = queues->queue_array[i].priv;
+ int ret;
+
+ /*
+ * Don't create decoders for empty queues, mainly because
+ * etmq->formatted is unknown for empty queues.
+ */
+ if (empty)
+ continue;
+
+ ret = cs_etm__create_queue_decoders(etmq);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
int cs_etm__process_auxtrace_info_full(union perf_event *event,
struct perf_session *session)
{
@@ -3396,6 +3429,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
if (err)
goto err_free_queues;
+ err = cs_etm__queue_aux_records(session);
+ if (err)
+ goto err_free_queues;
+
/*
* Map Trace ID values to CPU metadata.
*
@@ -3418,7 +3455,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
* flags if present.
*/
- /* first scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */
+ /* Scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */
aux_hw_id_found = 0;
err = perf_session__peek_events(session, session->header.data_offset,
session->header.data_size,
@@ -3436,7 +3473,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
if (err)
goto err_free_queues;
- err = cs_etm__queue_aux_records(session);
+ err = cs_etm__create_decoders(etm);
if (err)
goto err_free_queues;
--
2.34.1
In per-cpu mode there are multiple aux buffers and each one has a
fixed sink, so the hw ID mappings which only need to be emitted once
for each buffer, even with the new per-sink trace ID pools.
But in per-thread mode there is only a single buffer which can be
written to from any sink with now potentially overlapping trace IDs, so
hw ID mappings need to be re-emitted every time the sink changes.
This will require a change in Perf to track this so it knows which
decode tree to use for each segment of the buffer. In theory it's also
possible to look at the CPU ID on the AUX records, but this is more
consistent with the existing system, and allows for correct decode using
either mechanism.
Signed-off-by: James Clark <[email protected]>
---
drivers/hwtracing/coresight/coresight-etm-perf.c | 14 ++++++++++++++
drivers/hwtracing/coresight/coresight-etm-perf.h | 2 ++
2 files changed, 16 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index f07173aa4d66..08f3958f9367 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -499,6 +499,20 @@ static void etm_event_start(struct perf_event *event, int flags)
&sink->perf_id_map))
goto fail_disable_path;
+ /*
+ * In per-cpu mode there are multiple aux buffers and each one has a
+ * fixed sink, so the hw ID mappings which only need to be emitted once
+ * for each buffer.
+ *
+ * But in per-thread mode there is only a single buffer which can be
+ * written to from any sink with potentially overlapping trace IDs, so
+ * hw ID mappings need to be re-emitted every time the sink changes.
+ */
+ if (event->cpu == -1 && event_data->last_sink_hwid != sink) {
+ cpumask_clear(&event_data->aux_hwid_done);
+ event_data->last_sink_hwid = sink;
+ }
+
/*
* output cpu / trace ID in perf record, once for the lifetime
* of the event.
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index 744531158d6b..bd4553b2a1ec 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -52,6 +52,7 @@ struct etm_filters {
* @snk_config: The sink configuration.
* @cfg_hash: The hash id of any coresight config selected.
* @path: An array of path, each slot for one CPU.
+ * @last_sink_hwid: Last sink that a hwid was emitted for.
*/
struct etm_event_data {
struct work_struct work;
@@ -60,6 +61,7 @@ struct etm_event_data {
void *snk_config;
u32 cfg_hash;
struct list_head * __percpu *path;
+ struct coresight_device *last_sink_hwid;
};
int etm_perf_symlink(struct coresight_device *csdev, bool link);
--
2.34.1
On 29/04/24 18:21, James Clark wrote:
> Currently it's only possible to initialize with the default number of
> queues and then use auxtrace_queues__add_event() to grow the array. But
> that's problematic if you don't have a real event to pass into that
> function yet.
>
> The queues hold a void *priv member to store custom state, and for
> Coresight we want to create decoders upfront before receiving data, so
> add a new function that allows pre-allocating queues. One reason to do
> this is because we might need to store metadata (HW_ID events) that
> effects other queues, but never actually receive auxtrace data on that
> queue.
>
> Signed-off-by: James Clark <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/auxtrace.c | 9 +++++++--
> tools/perf/util/auxtrace.h | 1 +
> 2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 3684e6009b63..563b6c4fca31 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -218,15 +218,20 @@ static struct auxtrace_queue *auxtrace_alloc_queue_array(unsigned int nr_queues)
> return queue_array;
> }
>
> -int auxtrace_queues__init(struct auxtrace_queues *queues)
> +int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues)
> {
> - queues->nr_queues = AUXTRACE_INIT_NR_QUEUES;
> + queues->nr_queues = nr_queues;
> queues->queue_array = auxtrace_alloc_queue_array(queues->nr_queues);
> if (!queues->queue_array)
> return -ENOMEM;
> return 0;
> }
>
> +int auxtrace_queues__init(struct auxtrace_queues *queues)
> +{
> + return auxtrace_queues__init_nr(queues, AUXTRACE_INIT_NR_QUEUES);
> +}
> +
> static int auxtrace_queues__grow(struct auxtrace_queues *queues,
> unsigned int new_nr_queues)
> {
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index 55702215a82d..8a6ec9565835 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -521,6 +521,7 @@ int auxtrace_mmap__read_snapshot(struct mmap *map,
> struct perf_tool *tool, process_auxtrace_t fn,
> size_t snapshot_size);
>
> +int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues);
> int auxtrace_queues__init(struct auxtrace_queues *queues);
> int auxtrace_queues__add_event(struct auxtrace_queues *queues,
> struct perf_session *session,
Hi James,
On Mon, 29 Apr 2024 at 16:25, James Clark <[email protected]> wrote:
>
> The trace ID API is currently hard coded to always use the global map.
> The functions that take the map as an argument aren't currently public.
> Make them public so that Perf mode can pass in its own maps. At the
> moment all usages are still hard coded to use the global map, but now
> on the caller side.
>
> System ID functions are unchanged because they will always use the
> default map.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> .../hwtracing/coresight/coresight-etm-perf.c | 5 +++--
> .../coresight/coresight-etm3x-core.c | 5 +++--
> .../coresight/coresight-etm4x-core.c | 5 +++--
> .../hwtracing/coresight/coresight-trace-id.c | 22 +++++++------------
> .../hwtracing/coresight/coresight-trace-id.h | 9 +++++---
> 5 files changed, 23 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index c0c60e6a1703..4afb9d29f355 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -232,7 +232,7 @@ static void free_event_data(struct work_struct *work)
> if (!(IS_ERR_OR_NULL(*ppath)))
> coresight_release_path(*ppath);
> *ppath = NULL;
> - coresight_trace_id_put_cpu_id(cpu);
> + coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
> }
>
> /* mark perf event as done for trace id allocator */
> @@ -401,7 +401,8 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /* ensure we can allocate a trace ID for this CPU */
> - trace_id = coresight_trace_id_get_cpu_id(cpu);
> + trace_id = coresight_trace_id_get_cpu_id(cpu,
> + coresight_trace_id_map_default());
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> cpumask_clear_cpu(cpu, mask);
> coresight_release_path(path);
> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> index 9d5c1391ffb1..4149e7675ceb 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> @@ -465,7 +465,8 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
> *
> * trace id function has its own lock
> */
> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (IS_VALID_CS_TRACE_ID(trace_id))
> drvdata->traceid = (u8)trace_id;
> else
> @@ -477,7 +478,7 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
>
> void etm_release_trace_id(struct etm_drvdata *drvdata)
> {
> - coresight_trace_id_put_cpu_id(drvdata->cpu);
> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
> }
>
> static int etm_enable_perf(struct coresight_device *csdev,
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index a0bdfabddbc6..f32c8cd7742d 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -241,7 +241,8 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
> * or return the one currently allocated.
> * The trace id function has its own lock
> */
> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (IS_VALID_CS_TRACE_ID(trace_id))
> drvdata->trcid = (u8)trace_id;
> else
> @@ -253,7 +254,7 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
>
> void etm4_release_trace_id(struct etmv4_drvdata *drvdata)
> {
> - coresight_trace_id_put_cpu_id(drvdata->cpu);
> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
> }
>
> struct etm4_enable_arg {
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
> index 19005b5b4dc4..45ddd50d09a6 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
> @@ -12,7 +12,7 @@
>
> #include "coresight-trace-id.h"
>
> -/* Default trace ID map. Used on systems that don't require per sink mappings */
> +/* Default trace ID map. Used in sysfs mode and for system sources */
> static struct coresight_trace_id_map id_map_default;
>
> /* maintain a record of the mapping of IDs and pending releases per cpu */
> @@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void)
> DUMP_ID_MAP(id_map);
> }
>
> -static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> +int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> {
> unsigned long flags;
> int id;
> @@ -195,8 +195,9 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_
> DUMP_ID_MAP(id_map);
> return id;
> }
> +EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
>
> -static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> +void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> {
> unsigned long flags;
> int id;
> @@ -222,6 +223,7 @@ static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id
> DUMP_ID_CPU(cpu, id);
> DUMP_ID_MAP(id_map);
> }
> +EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
>
> static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *id_map)
> {
> @@ -250,19 +252,11 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map *
> DUMP_ID_MAP(id_map);
> }
>
> -/* API functions */
> -
Rather than remove the existing default trace ID functions, simply add
a few new ones...
e.g. given the existing...
void coresight_trace_id_put_cpu_id(int cpu)
{
coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
}
EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
add:-
void coresight_trace_id_put_cpu_id_map(int cpu, struct
coresight_trace_id_map *id_map)
{
coresight_trace_id_map_put_cpu_id(cpu, id_map);
}
EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id_map);
This avoids a whole lot of churn in exposing the default map to
external functions, putting it in there and then removing it later.
When any location that was using coresight_trace_id_put_cpu_id() needs
to supply its own map, change the function name at that point to
coresight_trace_id_put_cpu_id_map()
Mike
> -int coresight_trace_id_get_cpu_id(int cpu)
> -{
> - return coresight_trace_id_map_get_cpu_id(cpu, &id_map_default);
> -}
> -EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
> -
> -void coresight_trace_id_put_cpu_id(int cpu)
> +struct coresight_trace_id_map *coresight_trace_id_map_default(void)
> {
> - coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
> + return &id_map_default;
> }
> -EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
> +EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
>
> int coresight_trace_id_read_cpu_id(int cpu)
> {
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
> index 49438a96fcc6..54b9d8ed903b 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.h
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.h
> @@ -42,7 +42,10 @@
> #define IS_VALID_CS_TRACE_ID(id) \
> ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
>
> -/* Allocate and release IDs for a single default trace ID map */
> +/**
> + * Get the global map that's used by sysfs
> + */
> +struct coresight_trace_id_map *coresight_trace_id_map_default(void);
>
> /**
> * Read and optionally allocate a CoreSight trace ID and associate with a CPU.
> @@ -57,7 +60,7 @@
> *
> * return: CoreSight trace ID or -EINVAL if allocation impossible.
> */
> -int coresight_trace_id_get_cpu_id(int cpu);
> +int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
>
> /**
> * Release an allocated trace ID associated with the CPU.
> @@ -70,7 +73,7 @@ int coresight_trace_id_get_cpu_id(int cpu);
> *
> * @cpu: The CPU index to release the associated trace ID.
> */
> -void coresight_trace_id_put_cpu_id(int cpu);
> +void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
>
> /**
> * Read the current allocated CoreSight Trace ID value for the CPU.
> --
> 2.34.1
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On Mon, 29 Apr 2024 at 16:24, James Clark <[email protected]> wrote:
>
> This file is never included anywhere if CONFIG_CORESIGHT is not set so
> they are unused and aren't currently compile tested with any config so
> remove them.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> .../hwtracing/coresight/coresight-etm-perf.h | 18 ------------------
> 1 file changed, 18 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
> index bebbadee2ceb..744531158d6b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.h
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
> @@ -62,7 +62,6 @@ struct etm_event_data {
> struct list_head * __percpu *path;
> };
>
> -#if IS_ENABLED(CONFIG_CORESIGHT)
> int etm_perf_symlink(struct coresight_device *csdev, bool link);
> int etm_perf_add_symlink_sink(struct coresight_device *csdev);
> void etm_perf_del_symlink_sink(struct coresight_device *csdev);
> @@ -77,23 +76,6 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
> int etm_perf_add_symlink_cscfg(struct device *dev,
> struct cscfg_config_desc *config_desc);
> void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc);
> -#else
> -static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
> -{ return -EINVAL; }
> -int etm_perf_add_symlink_sink(struct coresight_device *csdev)
> -{ return -EINVAL; }
> -void etm_perf_del_symlink_sink(struct coresight_device *csdev) {}
> -static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
> -{
> - return NULL;
> -}
> -int etm_perf_add_symlink_cscfg(struct device *dev,
> - struct cscfg_config_desc *config_desc)
> -{ return -EINVAL; }
> -void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc) {}
> -
> -#endif /* CONFIG_CORESIGHT */
> -
> int __init etm_perf_init(void);
> void etm_perf_exit(void);
>
> --
> 2.34.1
>
Reviewed-by: Mike Leach <[email protected]>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On Mon, 29 Apr 2024 at 16:24, James Clark <[email protected]> wrote:
>
> "Process being monitored" and "pid of the process to monitor" imply that
> this would be the same PID if there were two sessions targeting the same
> process. But this is actually the PID of the process that did the Perf
> event open call, rather than the target of the session. So update the
> comments to make this clearer.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-tmc-etr.c | 5 +++--
> drivers/hwtracing/coresight/coresight-tmc.h | 5 +++--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index e75428fa1592..8962fc27d04f 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -36,7 +36,8 @@ struct etr_buf_hw {
> * etr_perf_buffer - Perf buffer used for ETR
> * @drvdata - The ETR drvdaga this buffer has been allocated for.
> * @etr_buf - Actual buffer used by the ETR
> - * @pid - The PID this etr_perf_buffer belongs to.
> + * @pid - The PID of the session owner that etr_perf_buffer
> + * belongs to.
> * @snaphost - Perf session mode
> * @nr_pages - Number of pages in the ring buffer.
> * @pages - Array of Pages in the ring buffer.
> @@ -1662,7 +1663,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
> goto unlock_out;
> }
>
> - /* Get a handle on the pid of the process to monitor */
> + /* Get a handle on the pid of the session owner */
> pid = etr_perf->pid;
>
> /* Do not proceed if this device is associated with another session */
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index c77763b49de0..2671926be62a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -171,8 +171,9 @@ struct etr_buf {
> * @csdev: component vitals needed by the framework.
> * @miscdev: specifics to handle "/dev/xyz.tmc" entry.
> * @spinlock: only one at a time pls.
> - * @pid: Process ID of the process being monitored by the session
> - * that is using this component.
> + * @pid: Process ID of the process that owns the session that is using
> + * this component. For example this would be the pid of the Perf
> + * process.
> * @buf: Snapshot of the trace data for ETF/ETB.
> * @etr_buf: details of buffer used in TMC-ETR
> * @len: size of the available trace for ETF/ETB.
> --
> 2.34.1
>
Reviewed-by: Mike Leach <[email protected]>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On Mon, 29 Apr 2024 at 16:24, James Clark <[email protected]> wrote:
>
> The trace ID maps will need to be created and stored by the core and
> Perf code so move the definition up to the common header.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> .../hwtracing/coresight/coresight-trace-id.c | 1 +
> .../hwtracing/coresight/coresight-trace-id.h | 19 -------------------
> include/linux/coresight.h | 18 ++++++++++++++++++
> 3 files changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
> index af5b4ef59cea..19005b5b4dc4 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
> @@ -3,6 +3,7 @@
> * Copyright (c) 2022, Linaro Limited, All rights reserved.
> * Author: Mike Leach <[email protected]>
> */
> +#include <linux/coresight.h>
> #include <linux/coresight-pmu.h>
> #include <linux/cpumask.h>
> #include <linux/kernel.h>
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
> index 3797777d367e..49438a96fcc6 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.h
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.h
> @@ -32,10 +32,6 @@
> #include <linux/bitops.h>
> #include <linux/types.h>
>
> -
> -/* architecturally we have 128 IDs some of which are reserved */
> -#define CORESIGHT_TRACE_IDS_MAX 128
> -
> /* ID 0 is reserved */
> #define CORESIGHT_TRACE_ID_RES_0 0
>
> @@ -46,21 +42,6 @@
> #define IS_VALID_CS_TRACE_ID(id) \
> ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
>
> -/**
> - * Trace ID map.
> - *
> - * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
> - * Initialised so that the reserved IDs are permanently marked as
> - * in use.
> - * @pend_rel_ids: CPU IDs that have been released by the trace source but not
> - * yet marked as available, to allow re-allocation to the same
> - * CPU during a perf session.
> - */
> -struct coresight_trace_id_map {
> - DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
> - DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
> -};
> -
> /* Allocate and release IDs for a single default trace ID map */
>
> /**
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index f09ace92176e..c16c61a8411d 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -218,6 +218,24 @@ struct coresight_sysfs_link {
> const char *target_name;
> };
>
> +/* architecturally we have 128 IDs some of which are reserved */
> +#define CORESIGHT_TRACE_IDS_MAX 128
> +
> +/**
> + * Trace ID map.
> + *
> + * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
> + * Initialised so that the reserved IDs are permanently marked as
> + * in use.
> + * @pend_rel_ids: CPU IDs that have been released by the trace source but not
> + * yet marked as available, to allow re-allocation to the same
> + * CPU during a perf session.
> + */
> +struct coresight_trace_id_map {
> + DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
> + DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
> +};
> +
> /**
> * struct coresight_device - representation of a device as used by the framework
> * @pdata: Platform data with device connections associated to this device.
> --
> 2.34.1
>
Reviewed-by: Mike Leach <[email protected]>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi James
On Mon, 29 Apr 2024 at 16:25, James Clark <[email protected]> wrote:
>
> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
> as long as there are fewer than that many ETMs connected to each sink.
>
> Each sink owns its own trace ID map, and any Perf session connecting to
> that sink will allocate from it, even if the sink is currently in use by
> other users. This is similar to the existing behavior where the dynamic
> trace IDs are constant as long as there is any concurrent Perf session
> active. It's not completely optimal because slightly more IDs will be
> used than necessary, but the optimal solution involves tracking the PIDs
> of each session and allocating ID maps based on the session owner. This
> is difficult to do with the combination of per-thread and per-cpu modes
> and some scheduling issues. The complexity of this isn't likely to worth
> it because even with multiple users they'd just see a difference in the
> ordering of ID allocations rather than hitting any limits (unless the
> hardware does have too many ETMs connected to one sink).
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++
> drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++-------
> include/linux/coresight.h | 1 +
> 3 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 9fc6f6b863e0..d1adff467670 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -902,6 +902,7 @@ static void coresight_device_release(struct device *dev)
> struct coresight_device *csdev = to_coresight_device(dev);
>
> fwnode_handle_put(csdev->dev.fwnode);
> + free_percpu(csdev->perf_id_map.cpu_map);
> kfree(csdev);
> }
>
> @@ -1159,6 +1160,14 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
> dev_set_name(&csdev->dev, "%s", desc->name);
>
> + if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
> + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
> + csdev->perf_id_map.cpu_map = alloc_percpu(atomic_t);
> + if (!csdev->perf_id_map.cpu_map) {
> + ret = -ENOMEM;
> + goto err_out;
> + }
> + }
> /*
> * Make sure the device registration and the connection fixup
> * are synchronised, so that we don't see uninitialised devices
> @@ -1216,6 +1225,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> err_out:
> /* Cleanup the connection information */
> coresight_release_platform_data(NULL, desc->dev, desc->pdata);
> + kfree(csdev);
> return ERR_PTR(ret);
> }
> EXPORT_SYMBOL_GPL(coresight_register);
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 177cecae38d9..86ca1a9d09a7 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -229,10 +229,13 @@ static void free_event_data(struct work_struct *work)
> struct list_head **ppath;
>
> ppath = etm_event_cpu_path_ptr(event_data, cpu);
> - if (!(IS_ERR_OR_NULL(*ppath)))
> + if (!(IS_ERR_OR_NULL(*ppath))) {
> + struct coresight_device *sink = coresight_get_sink(*ppath);
> +
> + coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
> coresight_release_path(*ppath);
> + }
> *ppath = NULL;
> - coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
> }
>
> /* mark perf event as done for trace id allocator */
> @@ -401,8 +404,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /* ensure we can allocate a trace ID for this CPU */
> - trace_id = coresight_trace_id_get_cpu_id(cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_get_cpu_id(cpu, &sink->perf_id_map);
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> cpumask_clear_cpu(cpu, mask);
> coresight_release_path(path);
> @@ -497,7 +499,7 @@ static void etm_event_start(struct perf_event *event, int flags)
>
> /* Finally enable the tracer */
> if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
> - coresight_trace_id_map_default()))
> + &sink->perf_id_map))
> goto fail_disable_path;
>
> /*
> @@ -509,8 +511,7 @@ static void etm_event_start(struct perf_event *event, int flags)
> hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
> CS_AUX_HW_ID_CURR_VERSION);
> hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
> - coresight_trace_id_read_cpu_id(cpu,
> - coresight_trace_id_map_default()));
> + coresight_trace_id_read_cpu_id(cpu, &sink->perf_id_map));
> perf_report_aux_output_id(event, hw_id);
> }
>
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 3a678e5425dc..8c4c1860c76b 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -290,6 +290,7 @@ struct coresight_device {
> bool sysfs_sink_activated;
> struct dev_ext_attribute *ea;
> struct coresight_device *def_sink;
> + struct coresight_trace_id_map perf_id_map;
perhaps this should be sink_id_map? At some point sysfs may use is and
naming is sink... is more consistent with other sink attributes in the
structure.
> /* sysfs links between components */
> int nr_links;
> bool has_conns_grp;
> --
> 2.34.1
>
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi James
On Mon, 29 Apr 2024 at 16:23, James Clark <[email protected]> wrote:
>
> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
> as long as there are fewer than that many ETMs connected to each sink.
>
> Each sink owns its own trace ID map, and any Perf session connecting to
> that sink will allocate from it, even if the sink is currently in use by
> other users. This is similar to the existing behavior where the dynamic
> trace IDs are constant as long as there is any concurrent Perf session
> active. It's not completely optimal because slightly more IDs will be
> used than necessary, but the optimal solution involves tracking the PIDs
> of each session and allocating ID maps based on the session owner. This
> is difficult to do with the combination of per-thread and per-cpu modes
> and some scheduling issues. The complexity of this isn't likely to worth
> it because even with multiple users they'd just see a difference in the
> ordering of ID allocations rather than hitting any limits (unless the
> hardware does have too many ETMs connected to one sink).
>
> Per-thread mode works but only until there are any overlapping IDs, at
> which point Perf will error out. Both per-thread mode and sysfs mode are
> left to future changes, but both can be added on top of this initial
> implementation and only sysfs mode requires further driver changes.
>
> The HW_ID version field hasn't been bumped in order to not break Perf
> which already has an error condition for other values of that field.
> Instead a new minor version has been added which signifies that there
> are new fields but the old fields are backwards compatible.
>
Looking at this overall - would it not be better to introduce the
concept of a "sink ID" to allow the detection of multiple sources into
the single sink that is now done by emitting multiple AUX_HWID packets
with the CPU+ID extra data?
This sink ID could be part of the sink csdev struct - or even the
id_map struct - a simple count of sinks as the per sink maps are
created would be sufficient. If this sink ID replaced the CPU+ID extra
data in the HWID packets, then each packet could be emitted just once,
and perf can then collate based on the sink id.
Moreover, once we are ready to address the per-thread issues - then
the overlap would not matter. Generate OpenCSD decode trees per sink
ID, add docoders to the tree per Trace ID. Thus if a buffer has data
from sink 1 trace id 5, ans sink 2, trace ID 5, then pick the right
decoder for the combo.
Finally in systems with ETE+TRBE were there is no use of trace IDs, a
sink ID of 0x0 could potentially indicate that 1:1 relationship.
Regards
Mike
>
> James Clark (17):
> perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
> perf auxtrace: Allow number of queues to be specified
> perf: cs-etm: Create decoders after both AUX and HW_ID search passes
> perf: cs-etm: Allocate queues for all CPUs
> perf: cs-etm: Move traceid_list to each queue
> perf: cs-etm: Create decoders based on the trace ID mappings
> perf: cs-etm: Support version 0.1 of HW_ID packets
> coresight: Remove unused stubs
> coresight: Clarify comments around the PID of the sink owner
> coresight: Move struct coresight_trace_id_map to common header
> coresight: Expose map argument in trace ID API
> coresight: Make CPU id map a property of a trace ID map
> coresight: Pass trace ID map into source enable
> coresight: Use per-sink trace ID maps for Perf sessions
> coresight: Remove pending trace ID release mechanism
> coresight: Re-emit trace IDs when the sink changes in per-thread mode
> coresight: Emit HW_IDs for all ETMs that are using the sink
>
> drivers/hwtracing/coresight/coresight-core.c | 10 +
> drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
> .../hwtracing/coresight/coresight-etm-perf.c | 82 ++-
> .../hwtracing/coresight/coresight-etm-perf.h | 20 +-
> .../coresight/coresight-etm3x-core.c | 14 +-
> .../coresight/coresight-etm4x-core.c | 14 +-
> drivers/hwtracing/coresight/coresight-stm.c | 3 +-
> drivers/hwtracing/coresight/coresight-sysfs.c | 3 +-
> .../hwtracing/coresight/coresight-tmc-etr.c | 5 +-
> drivers/hwtracing/coresight/coresight-tmc.h | 5 +-
> drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
> .../hwtracing/coresight/coresight-trace-id.c | 107 +--
> .../hwtracing/coresight/coresight-trace-id.h | 57 +-
> include/linux/coresight-pmu.h | 17 +-
> include/linux/coresight.h | 20 +-
> tools/include/linux/coresight-pmu.h | 17 +-
> tools/perf/util/auxtrace.c | 9 +-
> tools/perf/util/auxtrace.h | 1 +
> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 +-
> tools/perf/util/cs-etm.c | 617 ++++++++++++------
> tools/perf/util/cs-etm.h | 2 +-
> 21 files changed, 633 insertions(+), 404 deletions(-)
>
> --
> 2.34.1
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On 03/05/2024 10:43, Mike Leach wrote:
> Hi James
>
> On Mon, 29 Apr 2024 at 16:25, James Clark <[email protected]> wrote:
>>
>> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
>> as long as there are fewer than that many ETMs connected to each sink.
>>
>> Each sink owns its own trace ID map, and any Perf session connecting to
>> that sink will allocate from it, even if the sink is currently in use by
>> other users. This is similar to the existing behavior where the dynamic
>> trace IDs are constant as long as there is any concurrent Perf session
>> active. It's not completely optimal because slightly more IDs will be
>> used than necessary, but the optimal solution involves tracking the PIDs
>> of each session and allocating ID maps based on the session owner. This
>> is difficult to do with the combination of per-thread and per-cpu modes
>> and some scheduling issues. The complexity of this isn't likely to worth
>> it because even with multiple users they'd just see a difference in the
>> ordering of ID allocations rather than hitting any limits (unless the
>> hardware does have too many ETMs connected to one sink).
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++
>> drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++-------
>> include/linux/coresight.h | 1 +
>> 3 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
>> index 9fc6f6b863e0..d1adff467670 100644
>> --- a/drivers/hwtracing/coresight/coresight-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>> @@ -902,6 +902,7 @@ static void coresight_device_release(struct device *dev)
>> struct coresight_device *csdev = to_coresight_device(dev);
>>
>> fwnode_handle_put(csdev->dev.fwnode);
>> + free_percpu(csdev->perf_id_map.cpu_map);
>> kfree(csdev);
>> }
>>
>> @@ -1159,6 +1160,14 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
>> csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
>> dev_set_name(&csdev->dev, "%s", desc->name);
>>
>> + if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
>> + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
>> + csdev->perf_id_map.cpu_map = alloc_percpu(atomic_t);
>> + if (!csdev->perf_id_map.cpu_map) {
>> + ret = -ENOMEM;
>> + goto err_out;
>> + }
>> + }
>> /*
>> * Make sure the device registration and the connection fixup
>> * are synchronised, so that we don't see uninitialised devices
>> @@ -1216,6 +1225,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
>> err_out:
>> /* Cleanup the connection information */
>> coresight_release_platform_data(NULL, desc->dev, desc->pdata);
>> + kfree(csdev);
>> return ERR_PTR(ret);
>> }
>> EXPORT_SYMBOL_GPL(coresight_register);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index 177cecae38d9..86ca1a9d09a7 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -229,10 +229,13 @@ static void free_event_data(struct work_struct *work)
>> struct list_head **ppath;
>>
>> ppath = etm_event_cpu_path_ptr(event_data, cpu);
>> - if (!(IS_ERR_OR_NULL(*ppath)))
>> + if (!(IS_ERR_OR_NULL(*ppath))) {
>> + struct coresight_device *sink = coresight_get_sink(*ppath);
>> +
>> + coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
>> coresight_release_path(*ppath);
>> + }
>> *ppath = NULL;
>> - coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
>> }
>>
>> /* mark perf event as done for trace id allocator */
>> @@ -401,8 +404,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>> }
>>
>> /* ensure we can allocate a trace ID for this CPU */
>> - trace_id = coresight_trace_id_get_cpu_id(cpu,
>> - coresight_trace_id_map_default());
>> + trace_id = coresight_trace_id_get_cpu_id(cpu, &sink->perf_id_map);
>> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
>> cpumask_clear_cpu(cpu, mask);
>> coresight_release_path(path);
>> @@ -497,7 +499,7 @@ static void etm_event_start(struct perf_event *event, int flags)
>>
>> /* Finally enable the tracer */
>> if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
>> - coresight_trace_id_map_default()))
>> + &sink->perf_id_map))
>> goto fail_disable_path;
>>
>> /*
>> @@ -509,8 +511,7 @@ static void etm_event_start(struct perf_event *event, int flags)
>> hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
>> CS_AUX_HW_ID_CURR_VERSION);
>> hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
>> - coresight_trace_id_read_cpu_id(cpu,
>> - coresight_trace_id_map_default()));
>> + coresight_trace_id_read_cpu_id(cpu, &sink->perf_id_map));
>> perf_report_aux_output_id(event, hw_id);
>> }
>>
>> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
>> index 3a678e5425dc..8c4c1860c76b 100644
>> --- a/include/linux/coresight.h
>> +++ b/include/linux/coresight.h
>> @@ -290,6 +290,7 @@ struct coresight_device {
>> bool sysfs_sink_activated;
>> struct dev_ext_attribute *ea;
>> struct coresight_device *def_sink;
>> + struct coresight_trace_id_map perf_id_map;
>
> perhaps this should be sink_id_map? At some point sysfs may use is and
> naming is sink... is more consistent with other sink attributes in the
> structure.
>
When we implement this for sysfs I was thinking sysfs would use its own
per-sink map. One reason is that the global map might be used if the
sink is in use by a system source (which allocate their IDs on probe).
Resulting in something like this:
struct coresight_trace_id_map perf_id_map;
struct coresight_trace_id_map sysfs_id_map;
struct coresight_trace_id_map *sysfs_id_map_in_use;
syfs_id_map_in_use could either be &sysfs_id_map or
coresight_trace_id_map_default()
If sysfs did share the same map as Perf, the sink would have to have
some kind of mechanism of saving and restoring the mappings depending on
who was currently using the sink. At that point just storing two
versions of the mappings is easier. I believe there are some scenarios
for concurrent sysfs and Perf sessions that don't fail immediately, but
that one will get -EBUSY when trying to use the sink, until the other
session ends, at which point the other mappings need to be ready to use
immediately.
Also sysfs might either use the per-sink map or the global map depending
on the ordering of when the IDs were allocated or if a system source is
used at the same time, which means sysfs needs to use a reference to a map.
We could also change system sources so that they only allocate IDs when
the ID is read, rather than on probe. That might remove some of the
above issues, but maybe not all of them because you can always read the
ID before enabling the sink, at which point you have no choice but to
use the global map.
>> /* sysfs links between components */
>> int nr_links;
>> bool has_conns_grp;
>> --
>> 2.34.1
>>
>
> Mike
>
On Mon, Apr 29, 2024 at 04:21:45PM +0100, James Clark wrote:
> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
> as long as there are fewer than that many ETMs connected to each sink.
>
> Each sink owns its own trace ID map, and any Perf session connecting to
> that sink will allocate from it, even if the sink is currently in use by
> other users. This is similar to the existing behavior where the dynamic
> trace IDs are constant as long as there is any concurrent Perf session
> active. It's not completely optimal because slightly more IDs will be
> used than necessary, but the optimal solution involves tracking the PIDs
> of each session and allocating ID maps based on the session owner. This
> is difficult to do with the combination of per-thread and per-cpu modes
> and some scheduling issues. The complexity of this isn't likely to worth
> it because even with multiple users they'd just see a difference in the
> ordering of ID allocations rather than hitting any limits (unless the
> hardware does have too many ETMs connected to one sink).
>
> Per-thread mode works but only until there are any overlapping IDs, at
> which point Perf will error out. Both per-thread mode and sysfs mode are
> left to future changes, but both can be added on top of this initial
> implementation and only sysfs mode requires further driver changes.
>
> The HW_ID version field hasn't been bumped in order to not break Perf
> which already has an error condition for other values of that field.
> Instead a new minor version has been added which signifies that there
> are new fields but the old fields are backwards compatible.
I guess I can pick the tooling part now, right? Further reviewing would
be nice tho.
- Arnaldo
On 4/29/24 20:51, James Clark wrote:
> The likely fix for this is to update Perf so print a helpful message.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/util/cs-etm.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index d65d7485886c..32818bd7cd17 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -335,8 +335,11 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
> trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
>
> /* check that we can handle this version */
> - if (version > CS_AUX_HW_ID_CURR_VERSION)
> + if (version > CS_AUX_HW_ID_CURR_VERSION) {
> + pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
Is not this bit misleading ? PERF_RECORD_AUX_OUTPUT_HW_ID is just the perf record
format identifier. The record version here, is derived from the platform specific
hardware ID information embedded in this perf record.
Should not this be just s/PERF_RECORD_AUX_OUTPUT_HW_ID/hardware ID/ instead ?
> + version);
> return -EINVAL;
> + }
>
> /* get access to the etm metadata */
> etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace);
Minor nit: Should the subject line be "coresight: Remove unused etm perf stubs" ?
On 4/29/24 20:51, James Clark wrote:
> This file is never included anywhere if CONFIG_CORESIGHT is not set so
> they are unused and aren't currently compile tested with any config so
> remove them.
Searching for this header's inclusion throws up the following source files,
all of which needs CONFIG_CORESIGHT to be enabled to be compiled.
git grep "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-core.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-etb10.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-etm-perf.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-etm3x-core.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-etm4x-core.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-syscfg.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-tmc-etf.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-tmc-etr.c:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/coresight-trbe.h:#include "coresight-etm-perf.h"
drivers/hwtracing/coresight/ultrasoc-smb.c:#include "coresight-etm-perf.h"
>
> Signed-off-by: James Clark <[email protected]>
LGTM, with or without the subject line change.
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> .../hwtracing/coresight/coresight-etm-perf.h | 18 ------------------
> 1 file changed, 18 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
> index bebbadee2ceb..744531158d6b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.h
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
> @@ -62,7 +62,6 @@ struct etm_event_data {
> struct list_head * __percpu *path;
> };
>
> -#if IS_ENABLED(CONFIG_CORESIGHT)
> int etm_perf_symlink(struct coresight_device *csdev, bool link);
> int etm_perf_add_symlink_sink(struct coresight_device *csdev);
> void etm_perf_del_symlink_sink(struct coresight_device *csdev);
> @@ -77,23 +76,6 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
> int etm_perf_add_symlink_cscfg(struct device *dev,
> struct cscfg_config_desc *config_desc);
> void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc);
> -#else
> -static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
> -{ return -EINVAL; }
> -int etm_perf_add_symlink_sink(struct coresight_device *csdev)
> -{ return -EINVAL; }
> -void etm_perf_del_symlink_sink(struct coresight_device *csdev) {}
> -static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
> -{
> - return NULL;
> -}
> -int etm_perf_add_symlink_cscfg(struct device *dev,
> - struct cscfg_config_desc *config_desc)
> -{ return -EINVAL; }
> -void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc) {}
> -
> -#endif /* CONFIG_CORESIGHT */
> -
> int __init etm_perf_init(void);
> void etm_perf_exit(void);
>
On 4/29/24 20:51, James Clark wrote:
> "Process being monitored" and "pid of the process to monitor" imply that
> this would be the same PID if there were two sessions targeting the same
> process. But this is actually the PID of the process that did the Perf
> event open call, rather than the target of the session. So update the
> comments to make this clearer.
>
> Signed-off-by: James Clark <[email protected]>
This indeed removes the ambiguity that the PID belongs to the perf session
owner rather than the monitored or target process.
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-tmc-etr.c | 5 +++--
> drivers/hwtracing/coresight/coresight-tmc.h | 5 +++--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index e75428fa1592..8962fc27d04f 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -36,7 +36,8 @@ struct etr_buf_hw {
> * etr_perf_buffer - Perf buffer used for ETR
> * @drvdata - The ETR drvdaga this buffer has been allocated for.
> * @etr_buf - Actual buffer used by the ETR
> - * @pid - The PID this etr_perf_buffer belongs to.
> + * @pid - The PID of the session owner that etr_perf_buffer
> + * belongs to.
> * @snaphost - Perf session mode
> * @nr_pages - Number of pages in the ring buffer.
> * @pages - Array of Pages in the ring buffer.
> @@ -1662,7 +1663,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data)
> goto unlock_out;
> }
>
> - /* Get a handle on the pid of the process to monitor */
> + /* Get a handle on the pid of the session owner */
> pid = etr_perf->pid;
>
> /* Do not proceed if this device is associated with another session */
> diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
> index c77763b49de0..2671926be62a 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc.h
> +++ b/drivers/hwtracing/coresight/coresight-tmc.h
> @@ -171,8 +171,9 @@ struct etr_buf {
> * @csdev: component vitals needed by the framework.
> * @miscdev: specifics to handle "/dev/xyz.tmc" entry.
> * @spinlock: only one at a time pls.
> - * @pid: Process ID of the process being monitored by the session
> - * that is using this component.
> + * @pid: Process ID of the process that owns the session that is using
> + * this component. For example this would be the pid of the Perf
> + * process.
> * @buf: Snapshot of the trace data for ETF/ETB.
> * @etr_buf: details of buffer used in TMC-ETR
> * @len: size of the available trace for ETF/ETB.
On 4/29/24 20:51, James Clark wrote:
> Currently it's only possible to initialize with the default number of
> queues and then use auxtrace_queues__add_event() to grow the array. But
> that's problematic if you don't have a real event to pass into that
> function yet.
>
> The queues hold a void *priv member to store custom state, and for
> Coresight we want to create decoders upfront before receiving data, so
> add a new function that allows pre-allocating queues. One reason to do
> this is because we might need to store metadata (HW_ID events) that
> effects other queues, but never actually receive auxtrace data on that
> queue.
>
> Signed-off-by: James Clark <[email protected]>
LGTM
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> tools/perf/util/auxtrace.c | 9 +++++++--
> tools/perf/util/auxtrace.h | 1 +
> 2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 3684e6009b63..563b6c4fca31 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -218,15 +218,20 @@ static struct auxtrace_queue *auxtrace_alloc_queue_array(unsigned int nr_queues)
> return queue_array;
> }
>
> -int auxtrace_queues__init(struct auxtrace_queues *queues)
> +int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues)
> {
> - queues->nr_queues = AUXTRACE_INIT_NR_QUEUES;
> + queues->nr_queues = nr_queues;
> queues->queue_array = auxtrace_alloc_queue_array(queues->nr_queues);
> if (!queues->queue_array)
> return -ENOMEM;
> return 0;
> }
>
> +int auxtrace_queues__init(struct auxtrace_queues *queues)
> +{
> + return auxtrace_queues__init_nr(queues, AUXTRACE_INIT_NR_QUEUES);
> +}
> +
> static int auxtrace_queues__grow(struct auxtrace_queues *queues,
> unsigned int new_nr_queues)
> {
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index 55702215a82d..8a6ec9565835 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -521,6 +521,7 @@ int auxtrace_mmap__read_snapshot(struct mmap *map,
> struct perf_tool *tool, process_auxtrace_t fn,
> size_t snapshot_size);
>
> +int auxtrace_queues__init_nr(struct auxtrace_queues *queues, int nr_queues);
> int auxtrace_queues__init(struct auxtrace_queues *queues);
> int auxtrace_queues__add_event(struct auxtrace_queues *queues,
> struct perf_session *session,
On 4/29/24 20:51, James Clark wrote:
> The trace ID maps will need to be created and stored by the core and
> Perf code so move the definition up to the common header.
>
> Signed-off-by: James Clark <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> .../hwtracing/coresight/coresight-trace-id.c | 1 +
> .../hwtracing/coresight/coresight-trace-id.h | 19 -------------------
> include/linux/coresight.h | 18 ++++++++++++++++++
> 3 files changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
> index af5b4ef59cea..19005b5b4dc4 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
> @@ -3,6 +3,7 @@
> * Copyright (c) 2022, Linaro Limited, All rights reserved.
> * Author: Mike Leach <[email protected]>
> */
> +#include <linux/coresight.h>
> #include <linux/coresight-pmu.h>
> #include <linux/cpumask.h>
> #include <linux/kernel.h>
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
> index 3797777d367e..49438a96fcc6 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.h
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.h
> @@ -32,10 +32,6 @@
> #include <linux/bitops.h>
> #include <linux/types.h>
>
> -
> -/* architecturally we have 128 IDs some of which are reserved */
> -#define CORESIGHT_TRACE_IDS_MAX 128
> -
> /* ID 0 is reserved */
> #define CORESIGHT_TRACE_ID_RES_0 0
>
> @@ -46,21 +42,6 @@
> #define IS_VALID_CS_TRACE_ID(id) \
> ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
>
> -/**
> - * Trace ID map.
> - *
> - * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
> - * Initialised so that the reserved IDs are permanently marked as
> - * in use.
> - * @pend_rel_ids: CPU IDs that have been released by the trace source but not
> - * yet marked as available, to allow re-allocation to the same
> - * CPU during a perf session.
> - */
> -struct coresight_trace_id_map {
> - DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
> - DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
> -};
> -
> /* Allocate and release IDs for a single default trace ID map */
>
> /**
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index f09ace92176e..c16c61a8411d 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -218,6 +218,24 @@ struct coresight_sysfs_link {
> const char *target_name;
> };
>
> +/* architecturally we have 128 IDs some of which are reserved */
> +#define CORESIGHT_TRACE_IDS_MAX 128
> +
> +/**
> + * Trace ID map.
> + *
> + * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs.
> + * Initialised so that the reserved IDs are permanently marked as
> + * in use.
> + * @pend_rel_ids: CPU IDs that have been released by the trace source but not
> + * yet marked as available, to allow re-allocation to the same
> + * CPU during a perf session.
> + */
> +struct coresight_trace_id_map {
> + DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
> + DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
> +};
> +
> /**
> * struct coresight_device - representation of a device as used by the framework
> * @pdata: Platform data with device connections associated to this device.
On 4/29/24 20:51, James Clark wrote:
> The trace ID API is currently hard coded to always use the global map.
> The functions that take the map as an argument aren't currently public.
> Make them public so that Perf mode can pass in its own maps. At the
> moment all usages are still hard coded to use the global map, but now
> on the caller side.
>
> System ID functions are unchanged because they will always use the
> default map.
>
> Signed-off-by: James Clark <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> .../hwtracing/coresight/coresight-etm-perf.c | 5 +++--
> .../coresight/coresight-etm3x-core.c | 5 +++--
> .../coresight/coresight-etm4x-core.c | 5 +++--
> .../hwtracing/coresight/coresight-trace-id.c | 22 +++++++------------
> .../hwtracing/coresight/coresight-trace-id.h | 9 +++++---
> 5 files changed, 23 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index c0c60e6a1703..4afb9d29f355 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -232,7 +232,7 @@ static void free_event_data(struct work_struct *work)
> if (!(IS_ERR_OR_NULL(*ppath)))
> coresight_release_path(*ppath);
> *ppath = NULL;
> - coresight_trace_id_put_cpu_id(cpu);
> + coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
> }
>
> /* mark perf event as done for trace id allocator */
> @@ -401,7 +401,8 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /* ensure we can allocate a trace ID for this CPU */
> - trace_id = coresight_trace_id_get_cpu_id(cpu);
> + trace_id = coresight_trace_id_get_cpu_id(cpu,
> + coresight_trace_id_map_default());
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> cpumask_clear_cpu(cpu, mask);
> coresight_release_path(path);
> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> index 9d5c1391ffb1..4149e7675ceb 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> @@ -465,7 +465,8 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
> *
> * trace id function has its own lock
> */
> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (IS_VALID_CS_TRACE_ID(trace_id))
> drvdata->traceid = (u8)trace_id;
> else
> @@ -477,7 +478,7 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
>
> void etm_release_trace_id(struct etm_drvdata *drvdata)
> {
> - coresight_trace_id_put_cpu_id(drvdata->cpu);
> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
> }
>
> static int etm_enable_perf(struct coresight_device *csdev,
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index a0bdfabddbc6..f32c8cd7742d 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -241,7 +241,8 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
> * or return the one currently allocated.
> * The trace id function has its own lock
> */
> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (IS_VALID_CS_TRACE_ID(trace_id))
> drvdata->trcid = (u8)trace_id;
> else
> @@ -253,7 +254,7 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
>
> void etm4_release_trace_id(struct etmv4_drvdata *drvdata)
> {
> - coresight_trace_id_put_cpu_id(drvdata->cpu);
> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
> }
>
> struct etm4_enable_arg {
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
> index 19005b5b4dc4..45ddd50d09a6 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
> @@ -12,7 +12,7 @@
>
> #include "coresight-trace-id.h"
>
> -/* Default trace ID map. Used on systems that don't require per sink mappings */
> +/* Default trace ID map. Used in sysfs mode and for system sources */
> static struct coresight_trace_id_map id_map_default;
>
> /* maintain a record of the mapping of IDs and pending releases per cpu */
> @@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void)
> DUMP_ID_MAP(id_map);
> }
>
> -static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> +int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> {
> unsigned long flags;
> int id;
> @@ -195,8 +195,9 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_
> DUMP_ID_MAP(id_map);
> return id;
> }
> +EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
>
> -static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> +void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> {
> unsigned long flags;
> int id;
> @@ -222,6 +223,7 @@ static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id
> DUMP_ID_CPU(cpu, id);
> DUMP_ID_MAP(id_map);
> }
> +EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
>
> static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *id_map)
> {
> @@ -250,19 +252,11 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map *
> DUMP_ID_MAP(id_map);
> }
>
> -/* API functions */
> -
> -int coresight_trace_id_get_cpu_id(int cpu)
> -{
> - return coresight_trace_id_map_get_cpu_id(cpu, &id_map_default);
> -}
> -EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
> -
> -void coresight_trace_id_put_cpu_id(int cpu)
> +struct coresight_trace_id_map *coresight_trace_id_map_default(void)
> {
> - coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
> + return &id_map_default;
> }
> -EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
> +EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
>
> int coresight_trace_id_read_cpu_id(int cpu)
> {
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
> index 49438a96fcc6..54b9d8ed903b 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.h
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.h
> @@ -42,7 +42,10 @@
> #define IS_VALID_CS_TRACE_ID(id) \
> ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP))
>
> -/* Allocate and release IDs for a single default trace ID map */
> +/**
> + * Get the global map that's used by sysfs
> + */
> +struct coresight_trace_id_map *coresight_trace_id_map_default(void);
>
> /**
> * Read and optionally allocate a CoreSight trace ID and associate with a CPU.
> @@ -57,7 +60,7 @@
> *
> * return: CoreSight trace ID or -EINVAL if allocation impossible.
> */
> -int coresight_trace_id_get_cpu_id(int cpu);
> +int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
>
> /**
> * Release an allocated trace ID associated with the CPU.
> @@ -70,7 +73,7 @@ int coresight_trace_id_get_cpu_id(int cpu);
> *
> * @cpu: The CPU index to release the associated trace ID.
> */
> -void coresight_trace_id_put_cpu_id(int cpu);
> +void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
>
> /**
> * Read the current allocated CoreSight Trace ID value for the CPU.
On 4/29/24 20:51, James Clark wrote:
> The global CPU ID mappings won't work for per-sink ID maps so move it to
> the ID map struct. coresight_trace_id_release_all_pending() is hard
> coded to operate on the default map, but once Perf sessions use their
> own maps the pending release mechanism will be deleted. So it doesn't
> need to be extended to accept a trace ID map argument at this point.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> .../hwtracing/coresight/coresight-etm-perf.c | 3 +-
> .../coresight/coresight-etm3x-core.c | 3 +-
> .../coresight/coresight-etm4x-core.c | 3 +-
> .../hwtracing/coresight/coresight-trace-id.c | 28 ++++++++-----------
> .../hwtracing/coresight/coresight-trace-id.h | 2 +-
> include/linux/coresight.h | 1 +
> 6 files changed, 20 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 4afb9d29f355..25f1f87c90d1 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -508,7 +508,8 @@ static void etm_event_start(struct perf_event *event, int flags)
> hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
> CS_AUX_HW_ID_CURR_VERSION);
> hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
> - coresight_trace_id_read_cpu_id(cpu));
> + coresight_trace_id_read_cpu_id(cpu,
> + coresight_trace_id_map_default()));
> perf_report_aux_output_id(event, hw_id);
> }
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> index 4149e7675ceb..b21f5ad94e63 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> @@ -501,7 +501,8 @@ static int etm_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index f32c8cd7742d..d16d6efb26fa 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -776,7 +776,8 @@ static int etm4_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> + coresight_trace_id_map_default());
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
> index 45ddd50d09a6..b393603dd713 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
> @@ -13,10 +13,12 @@
> #include "coresight-trace-id.h"
>
> /* Default trace ID map. Used in sysfs mode and for system sources */
> -static struct coresight_trace_id_map id_map_default;
> +static DEFINE_PER_CPU(atomic_t, id_map_default_cpu_ids) = ATOMIC_INIT(0);
> +static struct coresight_trace_id_map id_map_default = {
> + .cpu_map = &id_map_default_cpu_ids
> +};
>
> -/* maintain a record of the mapping of IDs and pending releases per cpu */
> -static DEFINE_PER_CPU(atomic_t, cpu_id) = ATOMIC_INIT(0);
> +/* maintain a record of the pending releases per cpu */
> static cpumask_t cpu_id_release_pending;
>
> /* perf session active counter */
> @@ -46,12 +48,6 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map,
> #define PERF_SESSION(n)
> #endif
>
> -/* unlocked read of current trace ID value for given CPU */
> -static int _coresight_trace_id_read_cpu_id(int cpu)
> -{
> - return atomic_read(&per_cpu(cpu_id, cpu));
> -}
Just wondering where this per cpu cpu_id ^^ is being dropped off as well
OR is it still getting used ?
> -
> /* look for next available odd ID, return 0 if none found */
> static int coresight_trace_id_find_odd_id(struct coresight_trace_id_map *id_map)
> {
> @@ -145,7 +141,7 @@ static void coresight_trace_id_release_all_pending(void)
> clear_bit(bit, id_map->pend_rel_ids);
> }
> for_each_cpu(cpu, &cpu_id_release_pending) {
> - atomic_set(&per_cpu(cpu_id, cpu), 0);
> + atomic_set(per_cpu_ptr(id_map_default.cpu_map, cpu), 0);
> cpumask_clear_cpu(cpu, &cpu_id_release_pending);
> }
> spin_unlock_irqrestore(&id_map_lock, flags);
> @@ -160,7 +156,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
> spin_lock_irqsave(&id_map_lock, flags);
>
> /* check for existing allocation for this CPU */
> - id = _coresight_trace_id_read_cpu_id(cpu);
> + id = coresight_trace_id_read_cpu_id(cpu, id_map);
> if (id)
> goto get_cpu_id_clr_pend;
>
> @@ -181,7 +177,7 @@ int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map
> goto get_cpu_id_out_unlock;
>
> /* allocate the new id to the cpu */
> - atomic_set(&per_cpu(cpu_id, cpu), id);
> + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id);
>
> get_cpu_id_clr_pend:
> /* we are (re)using this ID - so ensure it is not marked for release */
> @@ -203,7 +199,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
> int id;
>
> /* check for existing allocation for this CPU */
> - id = _coresight_trace_id_read_cpu_id(cpu);
> + id = coresight_trace_id_read_cpu_id(cpu, id_map);
> if (!id)
> return;
>
> @@ -216,7 +212,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
> } else {
> /* otherwise clear id */
> coresight_trace_id_free(id, id_map);
> - atomic_set(&per_cpu(cpu_id, cpu), 0);
> + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0);
> }
>
> spin_unlock_irqrestore(&id_map_lock, flags);
> @@ -258,9 +254,9 @@ struct coresight_trace_id_map *coresight_trace_id_map_default(void)
> }
> EXPORT_SYMBOL_GPL(coresight_trace_id_map_default);
>
> -int coresight_trace_id_read_cpu_id(int cpu)
> +int coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
> {
> - return _coresight_trace_id_read_cpu_id(cpu);
> + return atomic_read(per_cpu_ptr(id_map->cpu_map, cpu));
> }
> EXPORT_SYMBOL_GPL(coresight_trace_id_read_cpu_id);
>
> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h
> index 54b9d8ed903b..ed2bc4b3ad2a 100644
> --- a/drivers/hwtracing/coresight/coresight-trace-id.h
> +++ b/drivers/hwtracing/coresight/coresight-trace-id.h
> @@ -93,7 +93,7 @@ void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_ma
> *
> * return: current value, will be 0 if unallocated.
> */
> -int coresight_trace_id_read_cpu_id(int cpu);
> +int coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map);
>
> /**
> * Allocate a CoreSight trace ID for a system component.
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index c16c61a8411d..7d62b88bfb5c 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -234,6 +234,7 @@ struct coresight_sysfs_link {
> struct coresight_trace_id_map {
> DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX);
> DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX);
> + atomic_t __percpu *cpu_map;
> };
>
> /**
On 4/29/24 20:51, James Clark wrote:
> This will allow Perf mode to pass in per-sink maps. System sources
> allocate IDs on probe so they don't use this and it's __maybe_unused.
Right, makes sense.
>
> Sysfs mode also has the global map hard coded in various places, so pass
> in NULL when enabling for sysfs. We could bubble the global map all the
> way down to where it's used, but it wouldn't have any functional
> difference, so it's probably not worth the code churn.
Agreed.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-dummy.c | 3 ++-
> drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
> drivers/hwtracing/coresight/coresight-etm3x-core.c | 10 +++++-----
> drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 +++++-----
> drivers/hwtracing/coresight/coresight-stm.c | 3 ++-
> drivers/hwtracing/coresight/coresight-sysfs.c | 3 ++-
> drivers/hwtracing/coresight/coresight-tpdm.c | 3 ++-
> include/linux/coresight.h | 2 +-
> 8 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-dummy.c b/drivers/hwtracing/coresight/coresight-dummy.c
> index ac70c0b491be..1f1b9ad160f6 100644
> --- a/drivers/hwtracing/coresight/coresight-dummy.c
> +++ b/drivers/hwtracing/coresight/coresight-dummy.c
> @@ -21,7 +21,8 @@ DEFINE_CORESIGHT_DEVLIST(source_devs, "dummy_source");
> DEFINE_CORESIGHT_DEVLIST(sink_devs, "dummy_sink");
>
> static int dummy_source_enable(struct coresight_device *csdev,
> - struct perf_event *event, enum cs_mode mode)
> + struct perf_event *event, enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *id_map)
> {
> dev_dbg(csdev->dev.parent, "Dummy source enabled\n");
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 25f1f87c90d1..177cecae38d9 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -496,7 +496,8 @@ static void etm_event_start(struct perf_event *event, int flags)
> goto fail_end_stop;
>
> /* Finally enable the tracer */
> - if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
> + if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
> + coresight_trace_id_map_default()))
> goto fail_disable_path;
>
> /*
> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> index b21f5ad94e63..b310bdf19038 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> @@ -482,7 +482,8 @@ void etm_release_trace_id(struct etm_drvdata *drvdata)
> }
>
> static int etm_enable_perf(struct coresight_device *csdev,
> - struct perf_event *event)
> + struct perf_event *event,
> + struct coresight_trace_id_map *id_map)
> {
> struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> int trace_id;
> @@ -501,8 +502,7 @@ static int etm_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> @@ -555,7 +555,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev)
> }
>
> static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode, struct coresight_trace_id_map *id_map)
> {
> int ret;
> struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> @@ -570,7 +570,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
> ret = etm_enable_sysfs(csdev);
> break;
> case CS_MODE_PERF:
> - ret = etm_enable_perf(csdev, event);
> + ret = etm_enable_perf(csdev, event, id_map);
> break;
> default:
> ret = -EINVAL;
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index d16d6efb26fa..02dbb6c4daf5 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -753,7 +753,8 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
> }
>
> static int etm4_enable_perf(struct coresight_device *csdev,
> - struct perf_event *event)
> + struct perf_event *event,
> + struct coresight_trace_id_map *id_map)
> {
> int ret = 0, trace_id;
> struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> @@ -776,8 +777,7 @@ static int etm4_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> @@ -839,7 +839,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev)
> }
>
> static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode, struct coresight_trace_id_map *id_map)
> {
> int ret;
>
> @@ -853,7 +853,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
> ret = etm4_enable_sysfs(csdev);
> break;
> case CS_MODE_PERF:
> - ret = etm4_enable_perf(csdev, event);
> + ret = etm4_enable_perf(csdev, event, id_map);
> break;
> default:
> ret = -EINVAL;
> diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
> index e1c62820dfda..a80ad1de4c23 100644
> --- a/drivers/hwtracing/coresight/coresight-stm.c
> +++ b/drivers/hwtracing/coresight/coresight-stm.c
> @@ -194,7 +194,8 @@ static void stm_enable_hw(struct stm_drvdata *drvdata)
> }
>
> static int stm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *trace_id)
> {
> struct stm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 1e67cc7758d7..a01c9e54e2ed 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -9,6 +9,7 @@
> #include <linux/kernel.h>
>
> #include "coresight-priv.h"
> +#include "coresight-trace-id.h"
>
> /*
> * Use IDR to map the hash of the source's device name
> @@ -63,7 +64,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev,
> */
> lockdep_assert_held(&coresight_mutex);
> if (coresight_get_mode(csdev) != CS_MODE_SYSFS) {
> - ret = source_ops(csdev)->enable(csdev, data, mode);
> + ret = source_ops(csdev)->enable(csdev, data, mode, NULL);
> if (ret)
> return ret;
> }
> diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c
> index a9708ab0d488..0376ad326a2f 100644
> --- a/drivers/hwtracing/coresight/coresight-tpdm.c
> +++ b/drivers/hwtracing/coresight/coresight-tpdm.c
> @@ -439,7 +439,8 @@ static void __tpdm_enable(struct tpdm_drvdata *drvdata)
> }
>
> static int tpdm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *id_map)
> {
> struct tpdm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 7d62b88bfb5c..3a678e5425dc 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -384,7 +384,7 @@ struct coresight_ops_link {
> struct coresight_ops_source {
> int (*cpu_id)(struct coresight_device *csdev);
> int (*enable)(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode);
> + enum cs_mode mode, struct coresight_trace_id_map *id_map);
> void (*disable)(struct coresight_device *csdev,
> struct perf_event *event);
> };
LGTM
Reviewed-by: Anshuman Khandual <[email protected]>
On 07/05/2024 07:22, Anshuman Khandual wrote:
>
>
> On 4/29/24 20:51, James Clark wrote:
>> The global CPU ID mappings won't work for per-sink ID maps so move it to
>> the ID map struct. coresight_trace_id_release_all_pending() is hard
>> coded to operate on the default map, but once Perf sessions use their
>> own maps the pending release mechanism will be deleted. So it doesn't
>> need to be extended to accept a trace ID map argument at this point.
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> .../hwtracing/coresight/coresight-etm-perf.c | 3 +-
>> .../coresight/coresight-etm3x-core.c | 3 +-
>> .../coresight/coresight-etm4x-core.c | 3 +-
>> .../hwtracing/coresight/coresight-trace-id.c | 28 ++++++++-----------
>> .../hwtracing/coresight/coresight-trace-id.h | 2 +-
>> include/linux/coresight.h | 1 +
>> 6 files changed, 20 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index 4afb9d29f355..25f1f87c90d1 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -508,7 +508,8 @@ static void etm_event_start(struct perf_event *event, int flags)
>> hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK,
>> CS_AUX_HW_ID_CURR_VERSION);
>> hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK,
>> - coresight_trace_id_read_cpu_id(cpu));
>> + coresight_trace_id_read_cpu_id(cpu,
>> + coresight_trace_id_map_default()));
>> perf_report_aux_output_id(event, hw_id);
>> }
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> index 4149e7675ceb..b21f5ad94e63 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> @@ -501,7 +501,8 @@ static int etm_enable_perf(struct coresight_device *csdev,
>> * with perf locks - we know the ID cannot change until perf shuts down
>> * the session
>> */
>> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
>> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
>> + coresight_trace_id_map_default());
>> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
>> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
>> dev_name(&drvdata->csdev->dev), drvdata->cpu);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index f32c8cd7742d..d16d6efb26fa 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -776,7 +776,8 @@ static int etm4_enable_perf(struct coresight_device *csdev,
>> * with perf locks - we know the ID cannot change until perf shuts down
>> * the session
>> */
>> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu);
>> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
>> + coresight_trace_id_map_default());
>> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
>> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
>> dev_name(&drvdata->csdev->dev), drvdata->cpu);
>> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
>> index 45ddd50d09a6..b393603dd713 100644
>> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
>> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
>> @@ -13,10 +13,12 @@
>> #include "coresight-trace-id.h"
>>
>> /* Default trace ID map. Used in sysfs mode and for system sources */
>> -static struct coresight_trace_id_map id_map_default;
>> +static DEFINE_PER_CPU(atomic_t, id_map_default_cpu_ids) = ATOMIC_INIT(0);
>> +static struct coresight_trace_id_map id_map_default = {
>> + .cpu_map = &id_map_default_cpu_ids
>> +};
>>
>> -/* maintain a record of the mapping of IDs and pending releases per cpu */
>> -static DEFINE_PER_CPU(atomic_t, cpu_id) = ATOMIC_INIT(0);
>> +/* maintain a record of the pending releases per cpu */
>> static cpumask_t cpu_id_release_pending;
>>
>> /* perf session active counter */
>> @@ -46,12 +48,6 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map,
>> #define PERF_SESSION(n)
>> #endif
>>
>> -/* unlocked read of current trace ID value for given CPU */
>> -static int _coresight_trace_id_read_cpu_id(int cpu)
>> -{
>> - return atomic_read(&per_cpu(cpu_id, cpu));
>> -}
>
> Just wondering where this per cpu cpu_id ^^ is being dropped off as well
> OR is it still getting used ?
>
No it's still needed. It's not dropped in this set.
I just moved the implementation into coresight_trace_id_read_cpu_id()
rather than add a new argument to _coresight_trace_id_read_cpu_id().
Although I might change that because of Mike's comment about keeping
both the map and non map versions of the functions to reduce some of the
churn changing the callers.
On 03/05/2024 21:23, Arnaldo Carvalho de Melo wrote:
> On Mon, Apr 29, 2024 at 04:21:45PM +0100, James Clark wrote:
>> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
>> as long as there are fewer than that many ETMs connected to each sink.
>>
>> Each sink owns its own trace ID map, and any Perf session connecting to
>> that sink will allocate from it, even if the sink is currently in use by
>> other users. This is similar to the existing behavior where the dynamic
>> trace IDs are constant as long as there is any concurrent Perf session
>> active. It's not completely optimal because slightly more IDs will be
>> used than necessary, but the optimal solution involves tracking the PIDs
>> of each session and allocating ID maps based on the session owner. This
>> is difficult to do with the combination of per-thread and per-cpu modes
>> and some scheduling issues. The complexity of this isn't likely to worth
>> it because even with multiple users they'd just see a difference in the
>> ordering of ID allocations rather than hitting any limits (unless the
>> hardware does have too many ETMs connected to one sink).
>>
>> Per-thread mode works but only until there are any overlapping IDs, at
>> which point Perf will error out. Both per-thread mode and sysfs mode are
>> left to future changes, but both can be added on top of this initial
>> implementation and only sysfs mode requires further driver changes.
>>
>> The HW_ID version field hasn't been bumped in order to not break Perf
>> which already has an error condition for other values of that field.
>> Instead a new minor version has been added which signifies that there
>> are new fields but the old fields are backwards compatible.
>
> I guess I can pick the tooling part now, right? Further reviewing would
> be nice tho.
>
> - Arnaldo
Is it ok if we wait for the driver changes to be merged first? There
might some review comments which need a format change to the packets and
then a re-write of the tool changes.
You could take 1 and 2 though because they're unrelated.
Thanks
James
On 07/05/2024 04:47, Anshuman Khandual wrote:
>
>
> On 4/29/24 20:51, James Clark wrote:
>> The likely fix for this is to update Perf so print a helpful message.
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> tools/perf/util/cs-etm.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>> index d65d7485886c..32818bd7cd17 100644
>> --- a/tools/perf/util/cs-etm.c
>> +++ b/tools/perf/util/cs-etm.c
>> @@ -335,8 +335,11 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
>> trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
>>
>> /* check that we can handle this version */
>> - if (version > CS_AUX_HW_ID_CURR_VERSION)
>> + if (version > CS_AUX_HW_ID_CURR_VERSION) {
>> + pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
>
> Is not this bit misleading ? PERF_RECORD_AUX_OUTPUT_HW_ID is just the perf record
> format identifier. The record version here, is derived from the platform specific
> hardware ID information embedded in this perf record.
Not sure I follow what you mean here. 'version' is something that's
output by the kernel. It's saved into a perf.data file, and if Perf
can't handle version 2 for example, you need to update Perf.
>
> Should not this be just s/PERF_RECORD_AUX_OUTPUT_HW_ID/hardware ID/ instead ?
>
It's just a way to go from the error message to the part of the code or
docs that you need to look at. "hardware ID" wouldn't lead you anywhere
so I don't think it would be useful.
>> + version);
>> return -EINVAL;
>> + }
>>
>> /* get access to the etm metadata */
>> etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace);
>
Hi James
On 29/04/2024 16:21, James Clark wrote:
> This will allow Perf mode to pass in per-sink maps. System sources
> allocate IDs on probe so they don't use this and it's __maybe_unused.
I am wondering if we need this change ? We have event_data associated
with each event and we allocate the traceid from the sink map at
setup_aux. So, why not set the traceid or even link the id_map in the
event_data ?
something like:
struct etm_event_data {
struct work_struct work;
cpumask_t mask;
cpumask_t aux_hwid_done;
void *snk_config;
u32 cfg_hash;
struct list_head * __percpu *path;
+ u8 __percpu *traceid;
};
See my comment in the other patch.
Suzuki
>
> Sysfs mode also has the global map hard coded in various places, so pass
> in NULL when enabling for sysfs. We could bubble the global map all the
> way down to where it's used, but it wouldn't have any functional
> difference, so it's probably not worth the code churn.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-dummy.c | 3 ++-
> drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
> drivers/hwtracing/coresight/coresight-etm3x-core.c | 10 +++++-----
> drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 +++++-----
> drivers/hwtracing/coresight/coresight-stm.c | 3 ++-
> drivers/hwtracing/coresight/coresight-sysfs.c | 3 ++-
> drivers/hwtracing/coresight/coresight-tpdm.c | 3 ++-
> include/linux/coresight.h | 2 +-
> 8 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-dummy.c b/drivers/hwtracing/coresight/coresight-dummy.c
> index ac70c0b491be..1f1b9ad160f6 100644
> --- a/drivers/hwtracing/coresight/coresight-dummy.c
> +++ b/drivers/hwtracing/coresight/coresight-dummy.c
> @@ -21,7 +21,8 @@ DEFINE_CORESIGHT_DEVLIST(source_devs, "dummy_source");
> DEFINE_CORESIGHT_DEVLIST(sink_devs, "dummy_sink");
>
> static int dummy_source_enable(struct coresight_device *csdev,
> - struct perf_event *event, enum cs_mode mode)
> + struct perf_event *event, enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *id_map)
> {
> dev_dbg(csdev->dev.parent, "Dummy source enabled\n");
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 25f1f87c90d1..177cecae38d9 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -496,7 +496,8 @@ static void etm_event_start(struct perf_event *event, int flags)
> goto fail_end_stop;
>
> /* Finally enable the tracer */
> - if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
> + if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF,
> + coresight_trace_id_map_default()))
> goto fail_disable_path;
>
> /*
> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> index b21f5ad94e63..b310bdf19038 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
> @@ -482,7 +482,8 @@ void etm_release_trace_id(struct etm_drvdata *drvdata)
> }
>
> static int etm_enable_perf(struct coresight_device *csdev,
> - struct perf_event *event)
> + struct perf_event *event,
> + struct coresight_trace_id_map *id_map)
> {
> struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> int trace_id;
> @@ -501,8 +502,7 @@ static int etm_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> @@ -555,7 +555,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev)
> }
>
> static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode, struct coresight_trace_id_map *id_map)
> {
> int ret;
> struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> @@ -570,7 +570,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event,
> ret = etm_enable_sysfs(csdev);
> break;
> case CS_MODE_PERF:
> - ret = etm_enable_perf(csdev, event);
> + ret = etm_enable_perf(csdev, event, id_map);
> break;
> default:
> ret = -EINVAL;
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index d16d6efb26fa..02dbb6c4daf5 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -753,7 +753,8 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
> }
>
> static int etm4_enable_perf(struct coresight_device *csdev,
> - struct perf_event *event)
> + struct perf_event *event,
> + struct coresight_trace_id_map *id_map)
> {
> int ret = 0, trace_id;
> struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> @@ -776,8 +777,7 @@ static int etm4_enable_perf(struct coresight_device *csdev,
> * with perf locks - we know the ID cannot change until perf shuts down
> * the session
> */
> - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu, id_map);
> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
> dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n",
> dev_name(&drvdata->csdev->dev), drvdata->cpu);
> @@ -839,7 +839,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev)
> }
>
> static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode, struct coresight_trace_id_map *id_map)
> {
> int ret;
>
> @@ -853,7 +853,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
> ret = etm4_enable_sysfs(csdev);
> break;
> case CS_MODE_PERF:
> - ret = etm4_enable_perf(csdev, event);
> + ret = etm4_enable_perf(csdev, event, id_map);
> break;
> default:
> ret = -EINVAL;
> diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
> index e1c62820dfda..a80ad1de4c23 100644
> --- a/drivers/hwtracing/coresight/coresight-stm.c
> +++ b/drivers/hwtracing/coresight/coresight-stm.c
> @@ -194,7 +194,8 @@ static void stm_enable_hw(struct stm_drvdata *drvdata)
> }
>
> static int stm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *trace_id)
> {
> struct stm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 1e67cc7758d7..a01c9e54e2ed 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -9,6 +9,7 @@
> #include <linux/kernel.h>
>
> #include "coresight-priv.h"
> +#include "coresight-trace-id.h"
>
> /*
> * Use IDR to map the hash of the source's device name
> @@ -63,7 +64,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev,
> */
> lockdep_assert_held(&coresight_mutex);
> if (coresight_get_mode(csdev) != CS_MODE_SYSFS) {
> - ret = source_ops(csdev)->enable(csdev, data, mode);
> + ret = source_ops(csdev)->enable(csdev, data, mode, NULL);
> if (ret)
> return ret;
> }
> diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c
> index a9708ab0d488..0376ad326a2f 100644
> --- a/drivers/hwtracing/coresight/coresight-tpdm.c
> +++ b/drivers/hwtracing/coresight/coresight-tpdm.c
> @@ -439,7 +439,8 @@ static void __tpdm_enable(struct tpdm_drvdata *drvdata)
> }
>
> static int tpdm_enable(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode)
> + enum cs_mode mode,
> + __maybe_unused struct coresight_trace_id_map *id_map)
> {
> struct tpdm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index 7d62b88bfb5c..3a678e5425dc 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -384,7 +384,7 @@ struct coresight_ops_link {
> struct coresight_ops_source {
> int (*cpu_id)(struct coresight_device *csdev);
> int (*enable)(struct coresight_device *csdev, struct perf_event *event,
> - enum cs_mode mode);
> + enum cs_mode mode, struct coresight_trace_id_map *id_map);
> void (*disable)(struct coresight_device *csdev,
> struct perf_event *event);
> };
On 29/04/2024 16:22, James Clark wrote:
> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
> as long as there are fewer than that many ETMs connected to each sink.
>
> Each sink owns its own trace ID map, and any Perf session connecting to
> that sink will allocate from it, even if the sink is currently in use by
> other users. This is similar to the existing behavior where the dynamic
> trace IDs are constant as long as there is any concurrent Perf session
> active. It's not completely optimal because slightly more IDs will be
> used than necessary, but the optimal solution involves tracking the PIDs
> of each session and allocating ID maps based on the session owner. This
> is difficult to do with the combination of per-thread and per-cpu modes
> and some scheduling issues. The complexity of this isn't likely to worth
> it because even with multiple users they'd just see a difference in the
> ordering of ID allocations rather than hitting any limits (unless the
> hardware does have too many ETMs connected to one sink).
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++
> drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++-------
> include/linux/coresight.h | 1 +
> 3 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 9fc6f6b863e0..d1adff467670 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -902,6 +902,7 @@ static void coresight_device_release(struct device *dev)
> struct coresight_device *csdev = to_coresight_device(dev);
>
> fwnode_handle_put(csdev->dev.fwnode);
> + free_percpu(csdev->perf_id_map.cpu_map);
> kfree(csdev);
> }
>
> @@ -1159,6 +1160,14 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
> dev_set_name(&csdev->dev, "%s", desc->name);
>
> + if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
> + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
> + csdev->perf_id_map.cpu_map = alloc_percpu(atomic_t);
> + if (!csdev->perf_id_map.cpu_map) {
> + ret = -ENOMEM;
> + goto err_out;
> + }
> + }
> /*
> * Make sure the device registration and the connection fixup
> * are synchronised, so that we don't see uninitialised devices
> @@ -1216,6 +1225,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> err_out:
> /* Cleanup the connection information */
> coresight_release_platform_data(NULL, desc->dev, desc->pdata);
> + kfree(csdev);
> return ERR_PTR(ret);
> }
> EXPORT_SYMBOL_GPL(coresight_register);
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index 177cecae38d9..86ca1a9d09a7 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -229,10 +229,13 @@ static void free_event_data(struct work_struct *work)
> struct list_head **ppath;
>
> ppath = etm_event_cpu_path_ptr(event_data, cpu);
> - if (!(IS_ERR_OR_NULL(*ppath)))
> + if (!(IS_ERR_OR_NULL(*ppath))) {
> + struct coresight_device *sink = coresight_get_sink(*ppath);
> +
> + coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
> coresight_release_path(*ppath);
> + }
> *ppath = NULL;
> - coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
> }
>
> /* mark perf event as done for trace id allocator */
> @@ -401,8 +404,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
> }
>
> /* ensure we can allocate a trace ID for this CPU */
> - trace_id = coresight_trace_id_get_cpu_id(cpu,
> - coresight_trace_id_map_default());
> + trace_id = coresight_trace_id_get_cpu_id(cpu, &sink->perf_id_map);
We could either store the perf_id_map or the traceid itself in the
event_data isn't it ? Rather than passing the idmap to enable_source ?
Suzuki
On 29/04/2024 16:22, James Clark wrote:
> In per-cpu mode there are multiple aux buffers and each one has a
> fixed sink, so the hw ID mappings which only need to be emitted once
> for each buffer, even with the new per-sink trace ID pools.
>
> But in per-thread mode there is only a single buffer which can be
> written to from any sink with now potentially overlapping trace IDs, so
> hw ID mappings need to be re-emitted every time the sink changes.
>
> This will require a change in Perf to track this so it knows which
> decode tree to use for each segment of the buffer. In theory it's also
> possible to look at the CPU ID on the AUX records, but this is more
> consistent with the existing system, and allows for correct decode using
> either mechanism.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> drivers/hwtracing/coresight/coresight-etm-perf.c | 14 ++++++++++++++
> drivers/hwtracing/coresight/coresight-etm-perf.h | 2 ++
> 2 files changed, 16 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index f07173aa4d66..08f3958f9367 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -499,6 +499,20 @@ static void etm_event_start(struct perf_event *event, int flags)
> &sink->perf_id_map))
> goto fail_disable_path;
>
> + /*
> + * In per-cpu mode there are multiple aux buffers and each one has a
> + * fixed sink, so the hw ID mappings which only need to be emitted once
> + * for each buffer.
> + *
> + * But in per-thread mode there is only a single buffer which can be
> + * written to from any sink with potentially overlapping trace IDs, so
> + * hw ID mappings need to be re-emitted every time the sink changes.
> + */
> + if (event->cpu == -1 && event_data->last_sink_hwid != sink) {
> + cpumask_clear(&event_data->aux_hwid_done);
> + event_data->last_sink_hwid = sink;
> + }
> +
With the traceid caching in the event_data per-cpu , we could avoid this
step ?
Suzuki
On 5/7/24 15:36, James Clark wrote:
>
>
> On 07/05/2024 04:47, Anshuman Khandual wrote:
>>
>>
>> On 4/29/24 20:51, James Clark wrote:
>>> The likely fix for this is to update Perf so print a helpful message.
>>>
>>> Signed-off-by: James Clark <[email protected]>
>>> ---
>>> tools/perf/util/cs-etm.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>>> index d65d7485886c..32818bd7cd17 100644
>>> --- a/tools/perf/util/cs-etm.c
>>> +++ b/tools/perf/util/cs-etm.c
>>> @@ -335,8 +335,11 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
>>> trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
>>>
>>> /* check that we can handle this version */
>>> - if (version > CS_AUX_HW_ID_CURR_VERSION)
>>> + if (version > CS_AUX_HW_ID_CURR_VERSION) {
>>> + pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
>>
>> Is not this bit misleading ? PERF_RECORD_AUX_OUTPUT_HW_ID is just the perf record
>> format identifier. The record version here, is derived from the platform specific
>> hardware ID information embedded in this perf record.
>
> Not sure I follow what you mean here. 'version' is something that's
> output by the kernel. It's saved into a perf.data file, and if Perf
> can't handle version 2 for example, you need to update Perf.
Got it.
>
>>
>> Should not this be just s/PERF_RECORD_AUX_OUTPUT_HW_ID/hardware ID/ instead ?
>>
>
> It's just a way to go from the error message to the part of the code or
> docs that you need to look at. "hardware ID" wouldn't lead you anywhere
> so I don't think it would be useful.
Sure, fair enough.
>
>>> + version);
>>> return -EINVAL;
>>> + }
>>>
>>> /* get access to the etm metadata */
>>> etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace);
>>
> _______________________________________________
> CoreSight mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
Hi James,
On 29-04-2024 08:51 pm, James Clark wrote:
> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
> as long as there are fewer than that many ETMs connected to each sink.
>
> Each sink owns its own trace ID map, and any Perf session connecting to
> that sink will allocate from it, even if the sink is currently in use by
> other users. This is similar to the existing behavior where the dynamic
> trace IDs are constant as long as there is any concurrent Perf session
> active. It's not completely optimal because slightly more IDs will be
> used than necessary, but the optimal solution involves tracking the PIDs
> of each session and allocating ID maps based on the session owner. This
> is difficult to do with the combination of per-thread and per-cpu modes
> and some scheduling issues. The complexity of this isn't likely to worth
> it because even with multiple users they'd just see a difference in the
> ordering of ID allocations rather than hitting any limits (unless the
> hardware does have too many ETMs connected to one sink).
>
> Per-thread mode works but only until there are any overlapping IDs, at
> which point Perf will error out. Both per-thread mode and sysfs mode are
> left to future changes, but both can be added on top of this initial
> implementation and only sysfs mode requires further driver changes.
>
> The HW_ID version field hasn't been bumped in order to not break Perf
> which already has an error condition for other values of that field.
> Instead a new minor version has been added which signifies that there
> are new fields but the old fields are backwards compatible.
>
>
> James Clark (17):
> perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
> perf auxtrace: Allow number of queues to be specified
> perf: cs-etm: Create decoders after both AUX and HW_ID search passes
> perf: cs-etm: Allocate queues for all CPUs
> perf: cs-etm: Move traceid_list to each queue
> perf: cs-etm: Create decoders based on the trace ID mappings
> perf: cs-etm: Support version 0.1 of HW_ID packets
> coresight: Remove unused stubs
> coresight: Clarify comments around the PID of the sink owner
> coresight: Move struct coresight_trace_id_map to common header
> coresight: Expose map argument in trace ID API
> coresight: Make CPU id map a property of a trace ID map
> coresight: Pass trace ID map into source enable
> coresight: Use per-sink trace ID maps for Perf sessions
> coresight: Remove pending trace ID release mechanism
> coresight: Re-emit trace IDs when the sink changes in per-thread mode
> coresight: Emit HW_IDs for all ETMs that are using the sink
>
> drivers/hwtracing/coresight/coresight-core.c | 10 +
> drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
> .../hwtracing/coresight/coresight-etm-perf.c | 82 ++-
> .../hwtracing/coresight/coresight-etm-perf.h | 20 +-
> .../coresight/coresight-etm3x-core.c | 14 +-
> .../coresight/coresight-etm4x-core.c | 14 +-
> drivers/hwtracing/coresight/coresight-stm.c | 3 +-
> drivers/hwtracing/coresight/coresight-sysfs.c | 3 +-
> .../hwtracing/coresight/coresight-tmc-etr.c | 5 +-
> drivers/hwtracing/coresight/coresight-tmc.h | 5 +-
> drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
> .../hwtracing/coresight/coresight-trace-id.c | 107 +--
> .../hwtracing/coresight/coresight-trace-id.h | 57 +-
> include/linux/coresight-pmu.h | 17 +-
> include/linux/coresight.h | 20 +-
> tools/include/linux/coresight-pmu.h | 17 +-
> tools/perf/util/auxtrace.c | 9 +-
> tools/perf/util/auxtrace.h | 1 +
> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 +-
> tools/perf/util/cs-etm.c | 617 ++++++++++++------
> tools/perf/util/cs-etm.h | 2 +-
> 21 files changed, 633 insertions(+), 404 deletions(-)
>
Thanks for implementing trace-id mapping per sink, with that we could
try perf (coresight tracing) for more than 100+ CPUs.
Please feel free to add,
Tested-by: Ganapatrao Kulkarni <[email protected]>
Thanks,
Ganapat
On Tue, May 07, 2024 at 04:27:25PM +0530, Anshuman Khandual wrote:
> On 5/7/24 15:36, James Clark wrote:
> > On 07/05/2024 04:47, Anshuman Khandual wrote:
> >> On 4/29/24 20:51, James Clark wrote:
> >>> The likely fix for this is to update Perf so print a helpful message.
> >>>
> >>> Signed-off-by: James Clark <[email protected]>
> >>> ---
> >>> tools/perf/util/cs-etm.c | 5 ++++-
> >>> 1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> >>> index d65d7485886c..32818bd7cd17 100644
> >>> --- a/tools/perf/util/cs-etm.c
> >>> +++ b/tools/perf/util/cs-etm.c
> >>> @@ -335,8 +335,11 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session,
> >>> trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id);
> >>>
> >>> /* check that we can handle this version */
> >>> - if (version > CS_AUX_HW_ID_CURR_VERSION)
> >>> + if (version > CS_AUX_HW_ID_CURR_VERSION) {
> >>> + pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n",
> >>
> >> Is not this bit misleading ? PERF_RECORD_AUX_OUTPUT_HW_ID is just the perf record
> >> format identifier. The record version here, is derived from the platform specific
> >> hardware ID information embedded in this perf record.
> >
> > Not sure I follow what you mean here. 'version' is something that's
> > output by the kernel. It's saved into a perf.data file, and if Perf
> > can't handle version 2 for example, you need to update Perf.
> Got it.
> >> Should not this be just s/PERF_RECORD_AUX_OUTPUT_HW_ID/hardware ID/ instead ?
> >>
> >
> > It's just a way to go from the error message to the part of the code or
> > docs that you need to look at. "hardware ID" wouldn't lead you anywhere
> > so I don't think it would be useful.
>
> Sure, fair enough.
I'm taking this as an Acked-by, ok?
- Arnaldo
On Tue, May 07, 2024 at 11:01:07AM +0100, James Clark wrote:
> On 03/05/2024 21:23, Arnaldo Carvalho de Melo wrote:
> > I guess I can pick the tooling part now, right? Further reviewing would
> > be nice tho.
> Is it ok if we wait for the driver changes to be merged first? There
> might some review comments which need a format change to the packets and
> then a re-write of the tool changes.
Ok
> You could take 1 and 2 though because they're unrelated.
Done.
- Arnaldo
On 07/05/2024 13:05, Suzuki K Poulose wrote:
> On 29/04/2024 16:22, James Clark wrote:
>> In per-cpu mode there are multiple aux buffers and each one has a
>> fixed sink, so the hw ID mappings which only need to be emitted once
>> for each buffer, even with the new per-sink trace ID pools.
>>
>> But in per-thread mode there is only a single buffer which can be
>> written to from any sink with now potentially overlapping trace IDs, so
>> hw ID mappings need to be re-emitted every time the sink changes.
>>
>> This will require a change in Perf to track this so it knows which
>> decode tree to use for each segment of the buffer. In theory it's also
>> possible to look at the CPU ID on the AUX records, but this is more
>> consistent with the existing system, and allows for correct decode using
>> either mechanism.
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> drivers/hwtracing/coresight/coresight-etm-perf.c | 14 ++++++++++++++
>> drivers/hwtracing/coresight/coresight-etm-perf.h | 2 ++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index f07173aa4d66..08f3958f9367 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -499,6 +499,20 @@ static void etm_event_start(struct perf_event
>> *event, int flags)
>> &sink->perf_id_map))
>> goto fail_disable_path;
>> + /*
>> + * In per-cpu mode there are multiple aux buffers and each one has a
>> + * fixed sink, so the hw ID mappings which only need to be
>> emitted once
>> + * for each buffer.
>> + *
>> + * But in per-thread mode there is only a single buffer which can be
>> + * written to from any sink with potentially overlapping trace
>> IDs, so
>> + * hw ID mappings need to be re-emitted every time the sink changes.
>> + */
>> + if (event->cpu == -1 && event_data->last_sink_hwid != sink) {
>> + cpumask_clear(&event_data->aux_hwid_done);
>> + event_data->last_sink_hwid = sink;
>> + }
>> +
>
> With the traceid caching in the event_data per-cpu , we could avoid this
> step ?
>
> Suzuki
>
>
I don't think so, this is to inform the tool that the mappings have
changed if the tool doesn't want to follow switch events.
Unless I'm missing something, moving where the trace ids are stored
doesn't mean that they will be re-sent when the mappings change?
On 07/05/2024 12:52, Suzuki K Poulose wrote:
> On 29/04/2024 16:22, James Clark wrote:
>> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
>> as long as there are fewer than that many ETMs connected to each sink.
>>
>> Each sink owns its own trace ID map, and any Perf session connecting to
>> that sink will allocate from it, even if the sink is currently in use by
>> other users. This is similar to the existing behavior where the dynamic
>> trace IDs are constant as long as there is any concurrent Perf session
>> active. It's not completely optimal because slightly more IDs will be
>> used than necessary, but the optimal solution involves tracking the PIDs
>> of each session and allocating ID maps based on the session owner. This
>> is difficult to do with the combination of per-thread and per-cpu modes
>> and some scheduling issues. The complexity of this isn't likely to worth
>> it because even with multiple users they'd just see a difference in the
>> ordering of ID allocations rather than hitting any limits (unless the
>> hardware does have too many ETMs connected to one sink).
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++
>> drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++-------
>> include/linux/coresight.h | 1 +
>> 3 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-core.c
>> b/drivers/hwtracing/coresight/coresight-core.c
>> index 9fc6f6b863e0..d1adff467670 100644
>> --- a/drivers/hwtracing/coresight/coresight-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>> @@ -902,6 +902,7 @@ static void coresight_device_release(struct device
>> *dev)
>> struct coresight_device *csdev = to_coresight_device(dev);
>> fwnode_handle_put(csdev->dev.fwnode);
>> + free_percpu(csdev->perf_id_map.cpu_map);
>> kfree(csdev);
>> }
>> @@ -1159,6 +1160,14 @@ struct coresight_device
>> *coresight_register(struct coresight_desc *desc)
>> csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
>> dev_set_name(&csdev->dev, "%s", desc->name);
>> + if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
>> + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) {
>> + csdev->perf_id_map.cpu_map = alloc_percpu(atomic_t);
>> + if (!csdev->perf_id_map.cpu_map) {
>> + ret = -ENOMEM;
>> + goto err_out;
>> + }
>> + }
>> /*
>> * Make sure the device registration and the connection fixup
>> * are synchronised, so that we don't see uninitialised devices
>> @@ -1216,6 +1225,7 @@ struct coresight_device
>> *coresight_register(struct coresight_desc *desc)
>> err_out:
>> /* Cleanup the connection information */
>> coresight_release_platform_data(NULL, desc->dev, desc->pdata);
>> + kfree(csdev);
>> return ERR_PTR(ret);
>> }
>> EXPORT_SYMBOL_GPL(coresight_register);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index 177cecae38d9..86ca1a9d09a7 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -229,10 +229,13 @@ static void free_event_data(struct work_struct
>> *work)
>> struct list_head **ppath;
>> ppath = etm_event_cpu_path_ptr(event_data, cpu);
>> - if (!(IS_ERR_OR_NULL(*ppath)))
>> + if (!(IS_ERR_OR_NULL(*ppath))) {
>> + struct coresight_device *sink = coresight_get_sink(*ppath);
>> +
>> + coresight_trace_id_put_cpu_id(cpu, &sink->perf_id_map);
>> coresight_release_path(*ppath);
>> + }
>> *ppath = NULL;
>> - coresight_trace_id_put_cpu_id(cpu,
>> coresight_trace_id_map_default());
>> }
>> /* mark perf event as done for trace id allocator */
>> @@ -401,8 +404,7 @@ static void *etm_setup_aux(struct perf_event
>> *event, void **pages,
>> }
>> /* ensure we can allocate a trace ID for this CPU */
>> - trace_id = coresight_trace_id_get_cpu_id(cpu,
>> - coresight_trace_id_map_default());
>> + trace_id = coresight_trace_id_get_cpu_id(cpu,
>> &sink->perf_id_map);
>
> We could either store the perf_id_map or the traceid itself in the
> event_data isn't it ? Rather than passing the idmap to enable_source ?
>
> Suzuki
>
Yes the end result would be the same. By doing it this way I was keeping
in mind the potential change for sysfs mode in the future. This way
there is common path between the two modes.
IMO an argument is easier to understand, rather than having to know
where/how/at what point the ID is initialised before calling
enable_source().
James
On 01/05/2024 12:31, Mike Leach wrote:
> Hi James,
>
> On Mon, 29 Apr 2024 at 16:25, James Clark <[email protected]> wrote:
>>
>> The trace ID API is currently hard coded to always use the global map.
>> The functions that take the map as an argument aren't currently public.
>> Make them public so that Perf mode can pass in its own maps. At the
>> moment all usages are still hard coded to use the global map, but now
>> on the caller side.
>>
>> System ID functions are unchanged because they will always use the
>> default map.
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> .../hwtracing/coresight/coresight-etm-perf.c | 5 +++--
>> .../coresight/coresight-etm3x-core.c | 5 +++--
>> .../coresight/coresight-etm4x-core.c | 5 +++--
>> .../hwtracing/coresight/coresight-trace-id.c | 22 +++++++------------
>> .../hwtracing/coresight/coresight-trace-id.h | 9 +++++---
>> 5 files changed, 23 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> index c0c60e6a1703..4afb9d29f355 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>> @@ -232,7 +232,7 @@ static void free_event_data(struct work_struct *work)
>> if (!(IS_ERR_OR_NULL(*ppath)))
>> coresight_release_path(*ppath);
>> *ppath = NULL;
>> - coresight_trace_id_put_cpu_id(cpu);
>> + coresight_trace_id_put_cpu_id(cpu, coresight_trace_id_map_default());
>> }
>>
>> /* mark perf event as done for trace id allocator */
>> @@ -401,7 +401,8 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
>> }
>>
>> /* ensure we can allocate a trace ID for this CPU */
>> - trace_id = coresight_trace_id_get_cpu_id(cpu);
>> + trace_id = coresight_trace_id_get_cpu_id(cpu,
>> + coresight_trace_id_map_default());
>> if (!IS_VALID_CS_TRACE_ID(trace_id)) {
>> cpumask_clear_cpu(cpu, mask);
>> coresight_release_path(path);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> index 9d5c1391ffb1..4149e7675ceb 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
>> @@ -465,7 +465,8 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
>> *
>> * trace id function has its own lock
>> */
>> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
>> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
>> + coresight_trace_id_map_default());
>> if (IS_VALID_CS_TRACE_ID(trace_id))
>> drvdata->traceid = (u8)trace_id;
>> else
>> @@ -477,7 +478,7 @@ int etm_read_alloc_trace_id(struct etm_drvdata *drvdata)
>>
>> void etm_release_trace_id(struct etm_drvdata *drvdata)
>> {
>> - coresight_trace_id_put_cpu_id(drvdata->cpu);
>> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
>> }
>>
>> static int etm_enable_perf(struct coresight_device *csdev,
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index a0bdfabddbc6..f32c8cd7742d 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -241,7 +241,8 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
>> * or return the one currently allocated.
>> * The trace id function has its own lock
>> */
>> - trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu);
>> + trace_id = coresight_trace_id_get_cpu_id(drvdata->cpu,
>> + coresight_trace_id_map_default());
>> if (IS_VALID_CS_TRACE_ID(trace_id))
>> drvdata->trcid = (u8)trace_id;
>> else
>> @@ -253,7 +254,7 @@ int etm4_read_alloc_trace_id(struct etmv4_drvdata *drvdata)
>>
>> void etm4_release_trace_id(struct etmv4_drvdata *drvdata)
>> {
>> - coresight_trace_id_put_cpu_id(drvdata->cpu);
>> + coresight_trace_id_put_cpu_id(drvdata->cpu, coresight_trace_id_map_default());
>> }
>>
>> struct etm4_enable_arg {
>> diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c
>> index 19005b5b4dc4..45ddd50d09a6 100644
>> --- a/drivers/hwtracing/coresight/coresight-trace-id.c
>> +++ b/drivers/hwtracing/coresight/coresight-trace-id.c
>> @@ -12,7 +12,7 @@
>>
>> #include "coresight-trace-id.h"
>>
>> -/* Default trace ID map. Used on systems that don't require per sink mappings */
>> +/* Default trace ID map. Used in sysfs mode and for system sources */
>> static struct coresight_trace_id_map id_map_default;
>>
>> /* maintain a record of the mapping of IDs and pending releases per cpu */
>> @@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void)
>> DUMP_ID_MAP(id_map);
>> }
>>
>> -static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
>> +int coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
>> {
>> unsigned long flags;
>> int id;
>> @@ -195,8 +195,9 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_
>> DUMP_ID_MAP(id_map);
>> return id;
>> }
>> +EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id);
>>
>> -static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
>> +void coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map)
>> {
>> unsigned long flags;
>> int id;
>> @@ -222,6 +223,7 @@ static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id
>> DUMP_ID_CPU(cpu, id);
>> DUMP_ID_MAP(id_map);
>> }
>> +EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
>>
>> static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *id_map)
>> {
>> @@ -250,19 +252,11 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map *
>> DUMP_ID_MAP(id_map);
>> }
>>
>> -/* API functions */
>> -
>
> Rather than remove the existing default trace ID functions, simply add
> a few new ones...
>
> e.g. given the existing...
>
> void coresight_trace_id_put_cpu_id(int cpu)
> {
> coresight_trace_id_map_put_cpu_id(cpu, &id_map_default);
> }
> EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id);
>
> add:-
>
> void coresight_trace_id_put_cpu_id_map(int cpu, struct
> coresight_trace_id_map *id_map)
> {
> coresight_trace_id_map_put_cpu_id(cpu, id_map);
> }
> EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id_map);
>
> This avoids a whole lot of churn in exposing the default map to
> external functions, putting it in there and then removing it later.
> When any location that was using coresight_trace_id_put_cpu_id() needs
> to supply its own map, change the function name at that point to
> coresight_trace_id_put_cpu_id_map()
>
> Mike
>
>
Yeah I was on the fence about this one when writing it, but I agree I
can make this change.
James
On 03/05/2024 14:40, Mike Leach wrote:
> Hi James
>
> On Mon, 29 Apr 2024 at 16:23, James Clark <[email protected]> wrote:
>>
>> This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs
>> as long as there are fewer than that many ETMs connected to each sink.
>>
>> Each sink owns its own trace ID map, and any Perf session connecting to
>> that sink will allocate from it, even if the sink is currently in use by
>> other users. This is similar to the existing behavior where the dynamic
>> trace IDs are constant as long as there is any concurrent Perf session
>> active. It's not completely optimal because slightly more IDs will be
>> used than necessary, but the optimal solution involves tracking the PIDs
>> of each session and allocating ID maps based on the session owner. This
>> is difficult to do with the combination of per-thread and per-cpu modes
>> and some scheduling issues. The complexity of this isn't likely to worth
>> it because even with multiple users they'd just see a difference in the
>> ordering of ID allocations rather than hitting any limits (unless the
>> hardware does have too many ETMs connected to one sink).
>>
>> Per-thread mode works but only until there are any overlapping IDs, at
>> which point Perf will error out. Both per-thread mode and sysfs mode are
>> left to future changes, but both can be added on top of this initial
>> implementation and only sysfs mode requires further driver changes.
>>
>> The HW_ID version field hasn't been bumped in order to not break Perf
>> which already has an error condition for other values of that field.
>> Instead a new minor version has been added which signifies that there
>> are new fields but the old fields are backwards compatible.
>>
>
> Looking at this overall - would it not be better to introduce the
> concept of a "sink ID" to allow the detection of multiple sources into
> the single sink that is now done by emitting multiple AUX_HWID packets
> with the CPU+ID extra data?
I think it's worth trying so I'll give it a go and see if it ends up
being cleaner. One of the reasons for emitting multiple packets was to
tie events and sinks together that may not have been created when the
first set of HW_IDs were output. It seems like static sink IDs could
solve this with less repetition.
> This sink ID could be part of the sink csdev struct - or even the
> id_map struct - a simple count of sinks as the per sink maps are
> created would be sufficient. If this sink ID replaced the CPU+ID extra
We could also use the existing sink IDs that are used to select the sink
on the perf commandline: coresight_get_sink_by_id().
> data in the HWID packets, then each packet could be emitted just once,
> and perf can then collate based on the sink id.
>
> Moreover, once we are ready to address the per-thread issues - then
> the overlap would not matter. Generate OpenCSD decode trees per sink
> ID, add docoders to the tree per Trace ID. Thus if a buffer has data
> from sink 1 trace id 5, ans sink 2, trace ID 5, then pick the right
> decoder for the combo.
The issue is more about trace in a single buffer with overlapping IDs.
Whether we use a sink ID or the CPU+ID mappings, Perf still needs to
know when to change the tree. In either case we'd have to re-emit the
mappings to prompt the decoder change.
So yes with either solution there would be multiple decode trees, it's
just knowing which fragments to send to which tree that's the issue.
>
> Finally in systems with ETE+TRBE were there is no use of trace IDs, a
> sink ID of 0x0 could potentially indicate that 1:1 relationship.
>
This is already indicated with the PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW
flag on the AUX record. But I suppose we could add sink ID 0x0 as well
for consistency with the new HW_ID packets. For backwards compatibility
it probably makes sense to continue using the AUX flag in Perf, but I'll
see how it looks with the sink ID implementation.
James
> Regards
>
> Mike
>
>>
>> James Clark (17):
>> perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
>> perf auxtrace: Allow number of queues to be specified
>> perf: cs-etm: Create decoders after both AUX and HW_ID search passes
>> perf: cs-etm: Allocate queues for all CPUs
>> perf: cs-etm: Move traceid_list to each queue
>> perf: cs-etm: Create decoders based on the trace ID mappings
>> perf: cs-etm: Support version 0.1 of HW_ID packets
>> coresight: Remove unused stubs
>> coresight: Clarify comments around the PID of the sink owner
>> coresight: Move struct coresight_trace_id_map to common header
>> coresight: Expose map argument in trace ID API
>> coresight: Make CPU id map a property of a trace ID map
>> coresight: Pass trace ID map into source enable
>> coresight: Use per-sink trace ID maps for Perf sessions
>> coresight: Remove pending trace ID release mechanism
>> coresight: Re-emit trace IDs when the sink changes in per-thread mode
>> coresight: Emit HW_IDs for all ETMs that are using the sink
>>
>> drivers/hwtracing/coresight/coresight-core.c | 10 +
>> drivers/hwtracing/coresight/coresight-dummy.c | 3 +-
>> .../hwtracing/coresight/coresight-etm-perf.c | 82 ++-
>> .../hwtracing/coresight/coresight-etm-perf.h | 20 +-
>> .../coresight/coresight-etm3x-core.c | 14 +-
>> .../coresight/coresight-etm4x-core.c | 14 +-
>> drivers/hwtracing/coresight/coresight-stm.c | 3 +-
>> drivers/hwtracing/coresight/coresight-sysfs.c | 3 +-
>> .../hwtracing/coresight/coresight-tmc-etr.c | 5 +-
>> drivers/hwtracing/coresight/coresight-tmc.h | 5 +-
>> drivers/hwtracing/coresight/coresight-tpdm.c | 3 +-
>> .../hwtracing/coresight/coresight-trace-id.c | 107 +--
>> .../hwtracing/coresight/coresight-trace-id.h | 57 +-
>> include/linux/coresight-pmu.h | 17 +-
>> include/linux/coresight.h | 20 +-
>> tools/include/linux/coresight-pmu.h | 17 +-
>> tools/perf/util/auxtrace.c | 9 +-
>> tools/perf/util/auxtrace.h | 1 +
>> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 +-
>> tools/perf/util/cs-etm.c | 617 ++++++++++++------
>> tools/perf/util/cs-etm.h | 2 +-
>> 21 files changed, 633 insertions(+), 404 deletions(-)
>>
>> --
>> 2.34.1
>>
>
>