2023-04-24 13:55:29

by James Clark

[permalink] [raw]
Subject: [PATCH 0/7] perf: cs-etm: Fixes around timestamped and timeless decoding

The first commit contains a fix for a recently introduced regression,
but was always a shortcoming in the Coresight code anyway.

The following commits are a tidyup in preparation for the last commit,
which is a fairly major change to the decode logic that's also
indirectly related to the regression so I thought it would be good time
to fix that now.

Applies to perf/core (9be6ab181b7b)

James Clark (7):
perf: cs-etm: Fix timeless decode mode detection
perf tools: Add util function for overriding user set config values
perf: cs-etm: Don't test full_auxtrace because it's always set
perf: cs-etm: Validate options after applying them
perf: cs-etm: Allow user to override timestamp and contextid settings
perf: cs-etm: Use bool type for boolean values
perf: cs-etm: Add separate decode paths for timeless and per-thread
modes

tools/perf/arch/arm/util/cs-etm.c | 223 +++++++++---------
tools/perf/arch/arm/util/pmu.c | 2 +
tools/perf/arch/arm64/util/arm-spe.c | 26 +-
tools/perf/arch/x86/util/intel-pt.c | 22 +-
tools/perf/tests/shell/test_arm_coresight.sh | 24 ++
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 +-
tools/perf/util/cs-etm.c | 200 +++++++++++-----
tools/perf/util/cs-etm.h | 6 +-
tools/perf/util/evsel.c | 29 +++
tools/perf/util/evsel.h | 3 +
10 files changed, 325 insertions(+), 218 deletions(-)

--
2.34.1


2023-04-24 13:55:32

by James Clark

[permalink] [raw]
Subject: [PATCH 1/7] perf: cs-etm: Fix timeless decode mode detection

In this context, timeless refers to the trace data rather than the perf
event data. But when detecting whether there are timestamps in the trace
data or not, the presence of a timestamp flag on any perf event is used.

Since commit f42c0ce573df ("perf record: Always get text_poke events
with --kcore option") timestamps were added to a tracking event when
--kcore is used which breaks this detection mechanism. Fix it by
detecting if trace timestamps exist by looking at the ETM config flags.
This would have always been a more accurate way of doing it anyway.

This fixes the following error message when using --kcore with
Coresight:

$ perf record --kcore -e cs_etm// --per-thread
$ perf report
The perf.data/data data has no samples!

Fixes: f42c0ce573df ("perf record: Always get text_poke events with --kcore option")
Reported-by: Yang Shi <[email protected]>
Link: https://lore.kernel.org/lkml/CAHbLzkrJQTrYBtPkf=jf3OpQ-yBcJe7XkvQstX9j2frz4WF-SQ@mail.gmail.com/
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 8dd81ddd9e4e..50593289d53c 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -2684,26 +2684,29 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
return 0;
}

-static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
+static int cs_etm__setup_timeless_decoding(struct cs_etm_auxtrace *etm)
{
struct evsel *evsel;
struct evlist *evlist = etm->session->evlist;
- bool timeless_decoding = true;

/* Override timeless mode with user input from --itrace=Z */
- if (etm->synth_opts.timeless_decoding)
- return true;
+ if (etm->synth_opts.timeless_decoding) {
+ etm->timeless_decoding = true;
+ return 0;
+ }

/*
- * Circle through the list of event and complain if we find one
- * with the time bit set.
+ * Find the cs_etm evsel and look at what its timestamp setting was
*/
- evlist__for_each_entry(evlist, evsel) {
- if ((evsel->core.attr.sample_type & PERF_SAMPLE_TIME))
- timeless_decoding = false;
- }
+ evlist__for_each_entry(evlist, evsel)
+ if (cs_etm__evsel_is_auxtrace(etm->session, evsel)) {
+ etm->timeless_decoding =
+ !(evsel->core.attr.config & BIT(ETM_OPT_TS));
+ return 0;
+ }

- return timeless_decoding;
+ pr_err("CS ETM: Couldn't find ETM evsel\n");
+ return -EINVAL;
}

/*
@@ -3155,7 +3158,6 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
etm->snapshot_mode = (ptr[CS_ETM_SNAPSHOT] != 0);
etm->metadata = metadata;
etm->auxtrace_type = auxtrace_info->type;
- etm->timeless_decoding = cs_etm__is_timeless_decoding(etm);

/* Use virtual timestamps if all ETMs report ts_source = 1 */
etm->has_virtual_ts = cs_etm__has_virtual_ts(metadata, num_cpu);
@@ -3172,6 +3174,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
etm->auxtrace.evsel_is_auxtrace = cs_etm__evsel_is_auxtrace;
session->auxtrace = &etm->auxtrace;

+ err = cs_etm__setup_timeless_decoding(etm);
+ if (err)
+ return err;
+
etm->unknown_thread = thread__new(999999999, 999999999);
if (!etm->unknown_thread) {
err = -ENOMEM;
--
2.34.1

2023-04-24 13:55:39

by James Clark

[permalink] [raw]
Subject: [PATCH 2/7] perf tools: Add util function for overriding user set config values

There is some duplicated code to only override config values if they
haven't already been set by the user so make a util function for this.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm64/util/arm-spe.c | 26 ++-----------------------
tools/perf/arch/x86/util/intel-pt.c | 22 ++-------------------
tools/perf/util/evsel.c | 29 ++++++++++++++++++++++++++++
tools/perf/util/evsel.h | 3 +++
4 files changed, 36 insertions(+), 44 deletions(-)

diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index ef497a29e814..3b1676ff03f9 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -36,29 +36,6 @@ struct arm_spe_recording {
bool *wrapped;
};

-static void arm_spe_set_timestamp(struct auxtrace_record *itr,
- struct evsel *evsel)
-{
- struct arm_spe_recording *ptr;
- struct perf_pmu *arm_spe_pmu;
- struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
- u64 user_bits = 0, bit;
-
- ptr = container_of(itr, struct arm_spe_recording, itr);
- arm_spe_pmu = ptr->arm_spe_pmu;
-
- if (term)
- user_bits = term->val.cfg_chg;
-
- bit = perf_pmu__format_bits(&arm_spe_pmu->format, "ts_enable");
-
- /* Skip if user has set it */
- if (bit & user_bits)
- return;
-
- evsel->core.attr.config |= bit;
-}
-
static size_t
arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct evlist *evlist __maybe_unused)
@@ -238,7 +215,8 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
*/
if (!perf_cpu_map__empty(cpus)) {
evsel__set_sample_bit(arm_spe_evsel, CPU);
- arm_spe_set_timestamp(itr, arm_spe_evsel);
+ evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
+ "ts_enable", 1);
}

/*
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 2cff11de9d8a..17336da08b58 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -576,25 +576,6 @@ static int intel_pt_validate_config(struct perf_pmu *intel_pt_pmu,
return err;
}

-static void intel_pt_config_sample_mode(struct perf_pmu *intel_pt_pmu,
- struct evsel *evsel)
-{
- u64 user_bits = 0, bits;
- struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
-
- if (term)
- user_bits = term->val.cfg_chg;
-
- bits = perf_pmu__format_bits(&intel_pt_pmu->format, "psb_period");
-
- /* Did user change psb_period */
- if (bits & user_bits)
- return;
-
- /* Set psb_period to 0 */
- evsel->core.attr.config &= ~bits;
-}
-
static void intel_pt_min_max_sample_sz(struct evlist *evlist,
size_t *min_sz, size_t *max_sz)
{
@@ -686,7 +667,8 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
return 0;

if (opts->auxtrace_sample_mode)
- intel_pt_config_sample_mode(intel_pt_pmu, intel_pt_evsel);
+ evsel__set_config_if_unset(intel_pt_pmu, intel_pt_evsel,
+ "psb_period", 0);

err = intel_pt_validate_config(intel_pt_pmu, intel_pt_evsel);
if (err)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a85a987128aa..cdf1445136ad 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3216,3 +3216,32 @@ void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader)
leader->core.nr_members--;
}
}
+
+/*
+ * Set @config_name to @val as long as the user hasn't already set or cleared it
+ * by passing a config term on the command line.
+ *
+ * @val is the value to put into the bits specified by @config_name rather than
+ * the bit pattern. It is shifted into position by this function, so to set
+ * something to true, pass 1 for val rather than a pre shifted value.
+ */
+#define field_prep(_mask, _val) (((_val) << (ffsll(_mask) - 1)) & (_mask))
+void evsel__set_config_if_unset(struct perf_pmu *pmu, struct evsel *evsel,
+ const char *config_name, u64 val)
+{
+ u64 user_bits = 0, bits;
+ struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
+
+ if (term)
+ user_bits = term->val.cfg_chg;
+
+ bits = perf_pmu__format_bits(&pmu->format, config_name);
+
+ /* Do nothing if the user changed the value */
+ if (bits & user_bits)
+ return;
+
+ /* Otherwise replace it */
+ evsel->core.attr.config &= ~bits;
+ evsel->core.attr.config |= field_prep(bits, val);
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 68072ec655ce..4120f5ff673d 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -529,4 +529,7 @@ bool arch_evsel__must_be_in_group(const struct evsel *evsel);
((((src) >> (pos)) & ((1ull << (size)) - 1)) << (63 - ((pos) + (size) - 1)))

u64 evsel__bitfield_swap_branch_flags(u64 value);
+void evsel__set_config_if_unset(struct perf_pmu *pmu, struct evsel *evsel,
+ const char *config_name, u64 val);
+
#endif /* __PERF_EVSEL_H */
--
2.34.1

2023-04-24 13:55:46

by James Clark

[permalink] [raw]
Subject: [PATCH 4/7] perf: cs-etm: Validate options after applying them

Currently the cs_etm_set_option() function both validates and applies
the config options. Because it's only called when they are added
automatically, there are some paths where the user can apply the option
on the command line and skip the validation. By moving it to the end it
covers both cases.

Also, options don't need to be re-applied anyway, Perf handles parsing
and applying the config terms automatically.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 152 +++++++++++++-----------------
1 file changed, 68 insertions(+), 84 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index f9b9ebf7fffc..af0a2400c655 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -69,21 +69,29 @@ static const char * const metadata_ete_ro[] = {
static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu);
static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu);

-static int cs_etm_set_context_id(struct auxtrace_record *itr,
- struct evsel *evsel, int cpu)
+static int cs_etm_validate_context_id(struct auxtrace_record *itr,
+ struct evsel *evsel, int cpu)
{
- struct cs_etm_recording *ptr;
- struct perf_pmu *cs_etm_pmu;
+ struct cs_etm_recording *ptr =
+ container_of(itr, struct cs_etm_recording, itr);
+ struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
char path[PATH_MAX];
- int err = -EINVAL;
+ int err;
u32 val;
- u64 contextid;
+ u64 contextid =
+ evsel->core.attr.config &
+ (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
+ perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));

- ptr = container_of(itr, struct cs_etm_recording, itr);
- cs_etm_pmu = ptr->cs_etm_pmu;
+ if (!contextid)
+ return 0;

- if (!cs_etm_is_etmv4(itr, cpu))
- goto out;
+ /* Not supported in etmv3 */
+ if (!cs_etm_is_etmv4(itr, cpu)) {
+ pr_err("%s: contextid not supported in ETMv3, disable with %s/contextid=0/\n",
+ CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
+ return -EINVAL;
+ }

/* Get a handle on TRCIDR2 */
snprintf(path, PATH_MAX, "cpu%d/%s",
@@ -92,27 +100,13 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,

/* There was a problem reading the file, bailing out */
if (err != 1) {
- pr_err("%s: can't read file %s\n",
- CORESIGHT_ETM_PMU_NAME, path);
- goto out;
+ pr_err("%s: can't read file %s\n", CORESIGHT_ETM_PMU_NAME,
+ path);
+ return err;
}

- /* User has configured for PID tracing, respects it. */
- contextid = evsel->core.attr.config &
- (BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_CTXTID2));
-
- /*
- * If user doesn't configure the contextid format, parse PMU format and
- * enable PID tracing according to the "contextid" format bits:
- *
- * If bit ETM_OPT_CTXTID is set, trace CONTEXTIDR_EL1;
- * If bit ETM_OPT_CTXTID2 is set, trace CONTEXTIDR_EL2.
- */
- if (!contextid)
- contextid = perf_pmu__format_bits(&cs_etm_pmu->format,
- "contextid");
-
- if (contextid & BIT(ETM_OPT_CTXTID)) {
+ if (contextid &
+ perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1")) {
/*
* TRCIDR2.CIDSIZE, bit [9-5], indicates whether contextID
* tracing is supported:
@@ -122,14 +116,14 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
*/
val = BMVAL(val, 5, 9);
if (!val || val != 0x4) {
- pr_err("%s: CONTEXTIDR_EL1 isn't supported\n",
- CORESIGHT_ETM_PMU_NAME);
- err = -EINVAL;
- goto out;
+ pr_err("%s: CONTEXTIDR_EL1 isn't supported, disable with %s/contextid1=0/\n",
+ CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
+ return -EINVAL;
}
}

- if (contextid & BIT(ETM_OPT_CTXTID2)) {
+ if (contextid &
+ perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2")) {
/*
* TRCIDR2.VMIDOPT[30:29] != 0 and
* TRCIDR2.VMIDSIZE[14:10] == 0b00100 (32bit virtual contextid)
@@ -138,35 +132,34 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
* Any value of VMIDSIZE >= 4 (i.e, > 32bit) is fine for us.
*/
if (!BMVAL(val, 29, 30) || BMVAL(val, 10, 14) < 4) {
- pr_err("%s: CONTEXTIDR_EL2 isn't supported\n",
- CORESIGHT_ETM_PMU_NAME);
- err = -EINVAL;
- goto out;
+ pr_err("%s: CONTEXTIDR_EL2 isn't supported, disable with %s/contextid2=0/\n",
+ CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
+ return -EINVAL;
}
}

- /* All good, let the kernel know */
- evsel->core.attr.config |= contextid;
- err = 0;
-
-out:
- return err;
+ return 0;
}

-static int cs_etm_set_timestamp(struct auxtrace_record *itr,
- struct evsel *evsel, int cpu)
+static int cs_etm_validate_timestamp(struct auxtrace_record *itr,
+ struct evsel *evsel, int cpu)
{
- struct cs_etm_recording *ptr;
- struct perf_pmu *cs_etm_pmu;
+ struct cs_etm_recording *ptr =
+ container_of(itr, struct cs_etm_recording, itr);
+ struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
char path[PATH_MAX];
- int err = -EINVAL;
+ int err;
u32 val;

- ptr = container_of(itr, struct cs_etm_recording, itr);
- cs_etm_pmu = ptr->cs_etm_pmu;
+ if (!(evsel->core.attr.config &
+ perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp")))
+ return 0;

- if (!cs_etm_is_etmv4(itr, cpu))
- goto out;
+ if (!cs_etm_is_etmv4(itr, cpu)) {
+ pr_err("%s: timestamp not supported in ETMv3, disable with %s/timestamp=0/\n",
+ CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
+ return -EINVAL;
+ }

/* Get a handle on TRCIRD0 */
snprintf(path, PATH_MAX, "cpu%d/%s",
@@ -177,7 +170,7 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
if (err != 1) {
pr_err("%s: can't read file %s\n",
CORESIGHT_ETM_PMU_NAME, path);
- goto out;
+ return err;
}

/*
@@ -189,24 +182,21 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
*/
val &= GENMASK(28, 24);
if (!val) {
- err = -EINVAL;
- goto out;
+ return -EINVAL;
}

- /* All good, let the kernel know */
- evsel->core.attr.config |= (1 << ETM_OPT_TS);
- err = 0;
-
-out:
- return err;
+ return 0;
}

-#define ETM_SET_OPT_CTXTID (1 << 0)
-#define ETM_SET_OPT_TS (1 << 1)
-#define ETM_SET_OPT_MASK (ETM_SET_OPT_CTXTID | ETM_SET_OPT_TS)
-
-static int cs_etm_set_option(struct auxtrace_record *itr,
- struct evsel *evsel, u32 option)
+/*
+ * Check whether the requested timestamp and contextid options should be
+ * available on all requested CPUs and if not, tell the user how to override.
+ * The kernel will silently disable any unavailable options so a warning here
+ * first is better. In theory the kernel could still disable the option for
+ * some other reason so this is best effort only.
+ */
+static int cs_etm_validate_config(struct auxtrace_record *itr,
+ struct evsel *evsel)
{
int i, err = -EINVAL;
struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
@@ -220,18 +210,11 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
!perf_cpu_map__has(online_cpus, cpu))
continue;

- if (option & BIT(ETM_OPT_CTXTID)) {
- err = cs_etm_set_context_id(itr, evsel, i);
- if (err)
- goto out;
- }
- if (option & BIT(ETM_OPT_TS)) {
- err = cs_etm_set_timestamp(itr, evsel, i);
- if (err)
- goto out;
- }
- if (option & ~(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS)))
- /* Nothing else is currently supported */
+ err = cs_etm_validate_context_id(itr, evsel, i);
+ if (err)
+ goto out;
+ err = cs_etm_validate_timestamp(itr, evsel, i);
+ if (err)
goto out;
}

@@ -447,10 +430,10 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
* when a context switch happened.
*/
if (!perf_cpu_map__empty(cpus)) {
- err = cs_etm_set_option(itr, cs_etm_evsel,
- BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
- if (err)
- goto out;
+ cs_etm_evsel->core.attr.config |=
+ perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp");
+ cs_etm_evsel->core.attr.config |=
+ perf_pmu__format_bits(&cs_etm_pmu->format, "contextid");
}

/* Add dummy event to keep tracking */
@@ -466,6 +449,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
if (!perf_cpu_map__empty(cpus))
evsel__set_sample_bit(evsel, TIME);

+ err = cs_etm_validate_config(itr, cs_etm_evsel);
out:
return err;
}
--
2.34.1

2023-04-24 13:56:02

by James Clark

[permalink] [raw]
Subject: [PATCH 7/7] perf: cs-etm: Add separate decode paths for timeless and per-thread modes

Timeless and per-thread are orthogonal concepts that are currently
treated as if they are the same (per-thread == timeless). This breaks
when you modify the command line or itrace options to something that the
current logic doesn't expect.

For example:

# Force timeless with Z
--itrace=Zi10i

# Or inconsistent record options
-e cs_etm/timestamp=1/ --per-thread

Adding Z for decoding in per-cpu mode is particularly bad because in
per-thread mode trace channel IDs are discarded and all assumed to be 0,
which would mix trace from different CPUs in per-cpu mode.

Although the results might not be perfect in all scenarios, if the user
requests no timestamps, it should still be possible to decode in either
mode. Especially if the relative times of samples in different processes
aren't interesting, quite a bit of space can be saved by turning off
timestamps in per-cpu mode.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/tests/shell/test_arm_coresight.sh | 24 +++
tools/perf/util/cs-etm.c | 162 ++++++++++++++-----
2 files changed, 148 insertions(+), 38 deletions(-)

diff --git a/tools/perf/tests/shell/test_arm_coresight.sh b/tools/perf/tests/shell/test_arm_coresight.sh
index 565ce525c40b..482009e17bda 100755
--- a/tools/perf/tests/shell/test_arm_coresight.sh
+++ b/tools/perf/tests/shell/test_arm_coresight.sh
@@ -150,6 +150,8 @@ arm_cs_etm_system_wide_test() {
echo "Recording trace with system wide mode"
perf record -o ${perfdata} -e cs_etm// -a -- ls > /dev/null 2>&1

+ # System-wide mode should include perf samples so test for that
+ # instead of ls
perf_script_branch_samples perf &&
perf_report_branch_samples perf &&
perf_report_instruction_samples perf
@@ -182,7 +184,29 @@ arm_cs_etm_snapshot_test() {
arm_cs_report "CoreSight snapshot testing" $err
}

+arm_cs_etm_basic_test() {
+ echo "Recording trace with '$*'"
+ perf record -o ${perfdata} "$@" -- ls > /dev/null 2>&1
+
+ perf_script_branch_samples ls &&
+ perf_report_branch_samples ls &&
+ perf_report_instruction_samples ls
+
+ err=$?
+ arm_cs_report "CoreSight basic testing with '$*'" $err
+}
+
arm_cs_etm_traverse_path_test
arm_cs_etm_system_wide_test
arm_cs_etm_snapshot_test
+
+# Test all combinations of per-thread, system-wide and normal mode with
+# and without timestamps
+arm_cs_etm_basic_test -e cs_etm/timestamp=0/ --per-thread
+arm_cs_etm_basic_test -e cs_etm/timestamp=1/ --per-thread
+arm_cs_etm_basic_test -e cs_etm/timestamp=0/ -a
+arm_cs_etm_basic_test -e cs_etm/timestamp=1/ -a
+arm_cs_etm_basic_test -e cs_etm/timestamp=0/
+arm_cs_etm_basic_test -e cs_etm/timestamp=1/
+
exit $glb_err
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index e048949bf655..456994564d6e 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -50,7 +50,22 @@ struct cs_etm_auxtrace {
struct thread *unknown_thread;
struct perf_tsc_conversion tc;

+ /*
+ * Timeless has no timestamps in the trace so overlapping mmap lookups
+ * are less accurate but produces smaller trace data. We use context IDs
+ * in the trace instead of matching timestamps with fork records so
+ * they're not really needed in the general case. Overlapping mmaps
+ * happen in cases like between a fork and an exec.
+ */
bool timeless_decoding;
+
+ /*
+ * Per-thread ignores the trace channel ID and instead assumes that
+ * everything in a buffer comes from the same process regardless of
+ * which CPU it ran on. It also implies no context IDs so the TID is
+ * taken from the auxtrace buffer.
+ */
+ bool per_thread_decoding;
bool snapshot_mode;
bool data_queued;
bool has_virtual_ts; /* Virtual/Kernel timestamps in the trace. */
@@ -98,7 +113,7 @@ struct cs_etm_queue {
/* RB tree for quick conversion between traceID and metadata pointers */
static struct intlist *traceid_list;

-static int cs_etm__process_queues(struct cs_etm_auxtrace *etm);
+static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm);
static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
pid_t tid);
static int cs_etm__get_data_block(struct cs_etm_queue *etmq);
@@ -492,7 +507,7 @@ static struct cs_etm_traceid_queue
struct cs_etm_traceid_queue *tidq, **traceid_queues;
struct cs_etm_auxtrace *etm = etmq->etm;

- if (etm->timeless_decoding)
+ if (etm->per_thread_decoding)
trace_chan_id = CS_ETM_PER_THREAD_TRACEID;

traceid_queues_list = etmq->traceid_queues_list;
@@ -731,10 +746,15 @@ static int cs_etm__flush_events(struct perf_session *session,
if (!tool->ordered_events)
return -EINVAL;

- if (etm->timeless_decoding)
+ if (etm->timeless_decoding) {
+ /*
+ * Pass tid = -1 to process all queues. But likely they will have
+ * already been processed on PERF_RECORD_EXIT anyway.
+ */
return cs_etm__process_timeless_queues(etm, -1);
+ }

- return cs_etm__process_queues(etm);
+ return cs_etm__process_timestamped_queues(etm);
}

static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
@@ -1066,7 +1086,7 @@ static int cs_etm__queue_first_cs_timestamp(struct cs_etm_auxtrace *etm,
* chronological order.
*
* Note that packets decoded above are still in the traceID's packet
- * queue and will be processed in cs_etm__process_queues().
+ * queue and will be processed in cs_etm__process_timestamped_queues().
*/
cs_queue_nr = TO_CS_QUEUE_NR(queue_nr, trace_chan_id);
ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp);
@@ -1347,9 +1367,7 @@ static inline u64 cs_etm__resolve_sample_time(struct cs_etm_queue *etmq,
struct cs_etm_auxtrace *etm = etmq->etm;
struct cs_etm_packet_queue *packet_queue = &tidq->packet_queue;

- if (etm->timeless_decoding)
- return 0;
- else if (etm->has_virtual_ts)
+ if (!etm->timeless_decoding && etm->has_virtual_ts)
return packet_queue->cs_timestamp;
else
return etm->latest_kernel_timestamp;
@@ -2329,7 +2347,7 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
}
}

-static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
+static int cs_etm__run_per_thread_timeless_decoder(struct cs_etm_queue *etmq)
{
int err = 0;
struct cs_etm_traceid_queue *tidq;
@@ -2367,6 +2385,51 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
return err;
}

+static int cs_etm__run_per_cpu_timeless_decoder(struct cs_etm_queue *etmq)
+{
+ int idx, err = 0;
+ struct cs_etm_traceid_queue *tidq;
+ struct int_node *inode;
+
+ /* Go through each buffer in the queue and decode them one by one */
+ while (1) {
+ err = cs_etm__get_data_block(etmq);
+ if (err <= 0)
+ return err;
+
+ /* Run trace decoder until buffer consumed or end of trace */
+ do {
+ err = cs_etm__decode_data_block(etmq);
+ if (err)
+ return err;
+
+ /*
+ * cs_etm__run_per_thread_timeless_decoder() runs on a
+ * single traceID queue because each TID has a separate
+ * buffer. But here in per-cpu mode we need to iterate
+ * over each channel instead.
+ */
+ intlist__for_each_entry(inode,
+ etmq->traceid_queues_list) {
+ idx = (int)(intptr_t)inode->priv;
+ tidq = etmq->traceid_queues[idx];
+ cs_etm__process_traceid_queue(etmq, tidq);
+ }
+ } while (etmq->buf_len);
+
+ intlist__for_each_entry(inode, etmq->traceid_queues_list) {
+ idx = (int)(intptr_t)inode->priv;
+ tidq = etmq->traceid_queues[idx];
+ /* Flush any remaining branch stack entries */
+ err = cs_etm__end_block(etmq, tidq);
+ if (err)
+ return err;
+ }
+ }
+
+ return err;
+}
+
static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
pid_t tid)
{
@@ -2381,22 +2444,30 @@ static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
if (!etmq)
continue;

- tidq = cs_etm__etmq_get_traceid_queue(etmq,
- CS_ETM_PER_THREAD_TRACEID);
+ /*
+ * Per-cpu mode has contextIDs in the trace and the decoder
+ * calls cs_etm__set_pid_tid_cpu() automatically so no need
+ * to do this here
+ */
+ if (etm->per_thread_decoding) {
+ tidq = cs_etm__etmq_get_traceid_queue(
+ etmq, CS_ETM_PER_THREAD_TRACEID);

- if (!tidq)
- continue;
+ if (!tidq)
+ continue;

- if ((tid == -1) || (tidq->tid == tid)) {
- cs_etm__set_pid_tid_cpu(etm, tidq);
- cs_etm__run_decoder(etmq);
- }
+ if ((tid == -1) || (tidq->tid == tid)) {
+ cs_etm__set_pid_tid_cpu(etm, tidq);
+ cs_etm__run_per_thread_timeless_decoder(etmq);
+ }
+ } else
+ cs_etm__run_per_cpu_timeless_decoder(etmq);
}

return 0;
}

-static int cs_etm__process_queues(struct cs_etm_auxtrace *etm)
+static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
{
int ret = 0;
unsigned int cs_queue_nr, queue_nr, i;
@@ -2573,7 +2644,6 @@ static int cs_etm__process_event(struct perf_session *session,
struct perf_sample *sample,
struct perf_tool *tool)
{
- u64 sample_kernel_timestamp;
struct cs_etm_auxtrace *etm = container_of(session->auxtrace,
struct cs_etm_auxtrace,
auxtrace);
@@ -2586,33 +2656,39 @@ static int cs_etm__process_event(struct perf_session *session,
return -EINVAL;
}

- if (sample->time && (sample->time != (u64) -1))
- sample_kernel_timestamp = sample->time;
- else
- sample_kernel_timestamp = 0;
-
- /*
- * Don't wait for cs_etm__flush_events() in per-thread/timeless mode to start the decode. We
- * need the tid of the PERF_RECORD_EXIT event to assign to the synthesised samples because
- * ETM_OPT_CTXTID is not enabled.
- */
- if (etm->timeless_decoding &&
- event->header.type == PERF_RECORD_EXIT)
- return cs_etm__process_timeless_queues(etm,
- event->fork.tid);
+ switch (event->header.type) {
+ case PERF_RECORD_EXIT:
+ /*
+ * Don't need to wait for cs_etm__flush_events() in per-thread mode to
+ * start the decode because we know there will be no more trace from
+ * this thread. All this does is emit samples earlier than waiting for
+ * the flush in other modes, but with timestamps it makes sense to wait
+ * for flush so that events from different threads are interleaved
+ * properly.
+ */
+ if (etm->per_thread_decoding && etm->timeless_decoding)
+ return cs_etm__process_timeless_queues(etm,
+ event->fork.tid);
+ break;

- if (event->header.type == PERF_RECORD_ITRACE_START)
+ case PERF_RECORD_ITRACE_START:
return cs_etm__process_itrace_start(etm, event);
- else if (event->header.type == PERF_RECORD_SWITCH_CPU_WIDE)
+
+ case PERF_RECORD_SWITCH_CPU_WIDE:
return cs_etm__process_switch_cpu_wide(etm, event);

- if (!etm->timeless_decoding && event->header.type == PERF_RECORD_AUX) {
+ case PERF_RECORD_AUX:
/*
* Record the latest kernel timestamp available in the header
* for samples so that synthesised samples occur from this point
* onwards.
*/
- etm->latest_kernel_timestamp = sample_kernel_timestamp;
+ if (sample->time && (sample->time != (u64)-1))
+ etm->latest_kernel_timestamp = sample->time;
+ break;
+
+ default:
+ break;
}

return 0;
@@ -2821,10 +2897,20 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
* Return 'not found' if mismatch.
*/
if (auxtrace_event->cpu == (__u32) -1) {
+ etm->per_thread_decoding = true;
if (auxtrace_event->tid != sample->tid)
return 1;
- } else if (auxtrace_event->cpu != sample->cpu)
+ } else if (auxtrace_event->cpu != sample->cpu) {
+ if (etm->per_thread_decoding) {
+ /*
+ * Found a per-cpu buffer after a per-thread one was
+ * already found
+ */
+ pr_err("CS ETM: Inconsistent per-thread/per-cpu mode.\n");
+ return -EINVAL;
+ }
return 1;
+ }

if (aux_event->flags & PERF_AUX_FLAG_OVERWRITE) {
/*
--
2.34.1

2023-04-24 13:56:20

by James Clark

[permalink] [raw]
Subject: [PATCH 3/7] perf: cs-etm: Don't test full_auxtrace because it's always set

There is no path in cs-etm where this isn't true so it doesn't need to
be tested. Also re-order the beginning of cs_etm_recording_options() so
that nothing is done until the early exit is passed.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 56 ++++++++++++++-----------------
1 file changed, 25 insertions(+), 31 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index e02a9bfc3d42..f9b9ebf7fffc 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -319,13 +319,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
bool privileged = perf_event_paranoid_check(-1);
int err = 0;

- ptr->evlist = evlist;
- ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
-
- if (!record_opts__no_switch_events(opts) &&
- perf_can_record_switch_events())
- opts->record_switch_events = true;
-
evlist__for_each_entry(evlist, evsel) {
if (evsel->core.attr.type == cs_etm_pmu->type) {
if (cs_etm_evsel) {
@@ -333,11 +326,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
- evsel->core.attr.freq = 0;
- evsel->core.attr.sample_period = 1;
- evsel->needs_auxtrace_mmap = true;
cs_etm_evsel = evsel;
- opts->full_auxtrace = true;
}
}

@@ -345,6 +334,18 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
if (!cs_etm_evsel)
return 0;

+ ptr->evlist = evlist;
+ ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
+
+ if (!record_opts__no_switch_events(opts) &&
+ perf_can_record_switch_events())
+ opts->record_switch_events = true;
+
+ cs_etm_evsel->core.attr.freq = 0;
+ cs_etm_evsel->core.attr.sample_period = 1;
+ cs_etm_evsel->needs_auxtrace_mmap = true;
+ opts->full_auxtrace = true;
+
ret = cs_etm_set_sink_attr(cs_etm_pmu, cs_etm_evsel);
if (ret)
return ret;
@@ -414,8 +415,8 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
}
}

- /* We are in full trace mode but '-m,xyz' wasn't specified */
- if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
+ /* Buffer sizes weren't specified with '-m,xyz' so give some defaults */
+ if (!opts->auxtrace_mmap_pages) {
if (privileged) {
opts->auxtrace_mmap_pages = MiB(4) / page_size;
} else {
@@ -423,7 +424,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
if (opts->mmap_pages == UINT_MAX)
opts->mmap_pages = KiB(256) / page_size;
}
-
}

if (opts->auxtrace_snapshot_mode)
@@ -454,23 +454,17 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
}

/* Add dummy event to keep tracking */
- if (opts->full_auxtrace) {
- struct evsel *tracking_evsel;
-
- err = parse_event(evlist, "dummy:u");
- if (err)
- goto out;
-
- tracking_evsel = evlist__last(evlist);
- evlist__set_tracking_event(evlist, tracking_evsel);
-
- tracking_evsel->core.attr.freq = 0;
- tracking_evsel->core.attr.sample_period = 1;
-
- /* In per-cpu case, always need the time of mmap events etc */
- if (!perf_cpu_map__empty(cpus))
- evsel__set_sample_bit(tracking_evsel, TIME);
- }
+ err = parse_event(evlist, "dummy:u");
+ if (err)
+ goto out;
+ evsel = evlist__last(evlist);
+ evlist__set_tracking_event(evlist, evsel);
+ evsel->core.attr.freq = 0;
+ evsel->core.attr.sample_period = 1;
+
+ /* In per-cpu case, always need the time of mmap events etc */
+ if (!perf_cpu_map__empty(cpus))
+ evsel__set_sample_bit(evsel, TIME);

out:
return err;
--
2.34.1

2023-04-24 13:56:42

by James Clark

[permalink] [raw]
Subject: [PATCH 6/7] perf: cs-etm: Use bool type for boolean values

Using u8 for boolean values makes the code a bit more difficult to read
so be more explicit by using bool.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 ++++----
tools/perf/util/cs-etm.c | 8 ++++----
tools/perf/util/cs-etm.h | 4 ++--
3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 92a855fbe5b8..21d403f55d96 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -60,10 +60,10 @@ struct cs_etm_decoder_params {
int operation;
void (*packet_printer)(const char *msg);
cs_etm_mem_cb_type mem_acc_cb;
- u8 formatted;
- u8 fsyncs;
- u8 hsyncs;
- u8 frame_aligned;
+ bool formatted;
+ bool fsyncs;
+ bool hsyncs;
+ bool frame_aligned;
void *data;
};

diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 50593289d53c..e048949bf655 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -50,10 +50,10 @@ struct cs_etm_auxtrace {
struct thread *unknown_thread;
struct perf_tsc_conversion tc;

- u8 timeless_decoding;
- u8 snapshot_mode;
- u8 data_queued;
- u8 has_virtual_ts; /* Virtual/Kernel timestamps in the trace. */
+ bool timeless_decoding;
+ bool snapshot_mode;
+ bool data_queued;
+ bool has_virtual_ts; /* Virtual/Kernel timestamps in the trace. */

int num_cpu;
u64 latest_kernel_timestamp;
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 16333d35bed4..70cac0375b34 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -179,8 +179,8 @@ struct cs_etm_packet {
u32 last_instr_subtype;
u32 flags;
u32 exception_number;
- u8 last_instr_cond;
- u8 last_instr_taken_branch;
+ bool last_instr_cond;
+ bool last_instr_taken_branch;
u8 last_instr_size;
u8 trace_chan_id;
int cpu;
--
2.34.1

2023-04-24 13:56:48

by James Clark

[permalink] [raw]
Subject: [PATCH 5/7] perf: cs-etm: Allow user to override timestamp and contextid settings

Timestamps and context tracking are automatically enabled in per-core
mode and it's impossible to override this. Use the new utility function
to set them conditionally.

Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 29 +++++++++++++++++++++++------
tools/perf/arch/arm/util/pmu.c | 2 ++
tools/perf/util/cs-etm.h | 2 ++
3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index af0a2400c655..77cb03e6ff87 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -324,8 +324,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
perf_can_record_switch_events())
opts->record_switch_events = true;

- cs_etm_evsel->core.attr.freq = 0;
- cs_etm_evsel->core.attr.sample_period = 1;
cs_etm_evsel->needs_auxtrace_mmap = true;
opts->full_auxtrace = true;

@@ -430,10 +428,10 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
* when a context switch happened.
*/
if (!perf_cpu_map__empty(cpus)) {
- cs_etm_evsel->core.attr.config |=
- perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp");
- cs_etm_evsel->core.attr.config |=
- perf_pmu__format_bits(&cs_etm_pmu->format, "contextid");
+ evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
+ "timestamp", 1);
+ evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
+ "contextid", 1);
}

/* Add dummy event to keep tracking */
@@ -914,3 +912,22 @@ struct auxtrace_record *cs_etm_record_init(int *err)
out:
return NULL;
}
+
+/*
+ * Set a default config to enable the user changed config tracking mechanism
+ * (CFG_CHG and evsel__set_config_if_unset()). If no default is set then user
+ * changes aren't tracked.
+ */
+struct perf_event_attr *
+cs_etm_get_default_config(struct perf_pmu *pmu __maybe_unused)
+{
+ struct perf_event_attr *attr;
+
+ attr = zalloc(sizeof(struct perf_event_attr));
+ if (!attr)
+ return NULL;
+
+ attr->sample_period = 1;
+
+ return attr;
+}
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index 887c8addc491..860a8b42b4b5 100644
--- a/tools/perf/arch/arm/util/pmu.c
+++ b/tools/perf/arch/arm/util/pmu.c
@@ -12,6 +12,7 @@
#include "arm-spe.h"
#include "hisi-ptt.h"
#include "../../../util/pmu.h"
+#include "../cs-etm.h"

struct perf_event_attr
*perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
@@ -20,6 +21,7 @@ struct perf_event_attr
if (!strcmp(pmu->name, CORESIGHT_ETM_PMU_NAME)) {
/* add ETM default config here */
pmu->selectable = true;
+ return cs_etm_get_default_config(pmu);
#if defined(__aarch64__)
} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
return arm_spe_pmu_default_config(pmu);
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 661f029322e4..16333d35bed4 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -12,6 +12,7 @@
#include <linux/bits.h>

struct perf_session;
+struct perf_pmu;

/*
* Versioning header in case things need to change in the future. That way
@@ -228,6 +229,7 @@ struct cs_etm_packet_queue {

int cs_etm__process_auxtrace_info(union perf_event *event,
struct perf_session *session);
+struct perf_event_attr *cs_etm_get_default_config(struct perf_pmu *pmu);

#ifdef HAVE_CSTRACE_SUPPORT
int cs_etm__get_cpu(u8 trace_chan_id, int *cpu);
--
2.34.1

2023-04-24 15:19:56

by Suzuki K Poulose

[permalink] [raw]
Subject: Re: [PATCH 1/7] perf: cs-etm: Fix timeless decode mode detection

On 24/04/2023 14:47, James Clark wrote:
> In this context, timeless refers to the trace data rather than the perf
> event data. But when detecting whether there are timestamps in the trace
> data or not, the presence of a timestamp flag on any perf event is used.
>
> Since commit f42c0ce573df ("perf record: Always get text_poke events
> with --kcore option") timestamps were added to a tracking event when
> --kcore is used which breaks this detection mechanism. Fix it by
> detecting if trace timestamps exist by looking at the ETM config flags.
> This would have always been a more accurate way of doing it anyway.
>
> This fixes the following error message when using --kcore with
> Coresight:
>
> $ perf record --kcore -e cs_etm// --per-thread
> $ perf report
> The perf.data/data data has no samples!
>
> Fixes: f42c0ce573df ("perf record: Always get text_poke events with --kcore option")
> Reported-by: Yang Shi <[email protected]>
> Link: https://lore.kernel.org/lkml/CAHbLzkrJQTrYBtPkf=jf3OpQ-yBcJe7XkvQstX9j2frz4WF-SQ@mail.gmail.com/
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/util/cs-etm.c | 30 ++++++++++++++++++------------
> 1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 8dd81ddd9e4e..50593289d53c 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2684,26 +2684,29 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
> return 0;
> }
>
> -static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
> +static int cs_etm__setup_timeless_decoding(struct cs_etm_auxtrace *etm)

minor nit: "setup" sound more like prepare to do what is required to
do a timeless decoding, while we are doing more like, check if we
have to do a timeless decoding. So may be:

cs_etm_check_timeless_decoding() ?

> {
> struct evsel *evsel;
> struct evlist *evlist = etm->session->evlist;
> - bool timeless_decoding = true;
>
> /* Override timeless mode with user input from --itrace=Z */
> - if (etm->synth_opts.timeless_decoding)
> - return true;
> + if (etm->synth_opts.timeless_decoding) {
> + etm->timeless_decoding = true;
> + return 0;
> + }
>
> /*
> - * Circle through the list of event and complain if we find one
> - * with the time bit set.
> + * Find the cs_etm evsel and look at what its timestamp setting was
> */
> - evlist__for_each_entry(evlist, evsel) {
> - if ((evsel->core.attr.sample_type & PERF_SAMPLE_TIME))
> - timeless_decoding = false;
> - }
> + evlist__for_each_entry(evlist, evsel)

minor nit: please retain the braces

> + if (cs_etm__evsel_is_auxtrace(etm->session, evsel)) {
> + etm->timeless_decoding =
> + !(evsel->core.attr.config & BIT(ETM_OPT_TS));


> + return 0;
> + }

Otherwise, looks good to me

Suzuki


2023-04-24 15:45:16

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH 2/7] perf tools: Add util function for overriding user set config values

On 24/04/23 16:47, James Clark wrote:
> There is some duplicated code to only override config values if they
> haven't already been set by the user so make a util function for this.
>
> Signed-off-by: James Clark <[email protected]>

One minor comment, nevertheless:

Acked-by: Adrian Hunter <[email protected]>

> ---
> tools/perf/arch/arm64/util/arm-spe.c | 26 ++-----------------------
> tools/perf/arch/x86/util/intel-pt.c | 22 ++-------------------
> tools/perf/util/evsel.c | 29 ++++++++++++++++++++++++++++
> tools/perf/util/evsel.h | 3 +++
> 4 files changed, 36 insertions(+), 44 deletions(-)
>
> diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
> index ef497a29e814..3b1676ff03f9 100644
> --- a/tools/perf/arch/arm64/util/arm-spe.c
> +++ b/tools/perf/arch/arm64/util/arm-spe.c
> @@ -36,29 +36,6 @@ struct arm_spe_recording {
> bool *wrapped;
> };
>
> -static void arm_spe_set_timestamp(struct auxtrace_record *itr,
> - struct evsel *evsel)
> -{
> - struct arm_spe_recording *ptr;
> - struct perf_pmu *arm_spe_pmu;
> - struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
> - u64 user_bits = 0, bit;
> -
> - ptr = container_of(itr, struct arm_spe_recording, itr);
> - arm_spe_pmu = ptr->arm_spe_pmu;
> -
> - if (term)
> - user_bits = term->val.cfg_chg;
> -
> - bit = perf_pmu__format_bits(&arm_spe_pmu->format, "ts_enable");
> -
> - /* Skip if user has set it */
> - if (bit & user_bits)
> - return;
> -
> - evsel->core.attr.config |= bit;
> -}
> -
> static size_t
> arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
> struct evlist *evlist __maybe_unused)
> @@ -238,7 +215,8 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
> */
> if (!perf_cpu_map__empty(cpus)) {
> evsel__set_sample_bit(arm_spe_evsel, CPU);
> - arm_spe_set_timestamp(itr, arm_spe_evsel);
> + evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
> + "ts_enable", 1);
> }
>
> /*
> diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
> index 2cff11de9d8a..17336da08b58 100644
> --- a/tools/perf/arch/x86/util/intel-pt.c
> +++ b/tools/perf/arch/x86/util/intel-pt.c
> @@ -576,25 +576,6 @@ static int intel_pt_validate_config(struct perf_pmu *intel_pt_pmu,
> return err;
> }
>
> -static void intel_pt_config_sample_mode(struct perf_pmu *intel_pt_pmu,
> - struct evsel *evsel)
> -{
> - u64 user_bits = 0, bits;
> - struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
> -
> - if (term)
> - user_bits = term->val.cfg_chg;
> -
> - bits = perf_pmu__format_bits(&intel_pt_pmu->format, "psb_period");
> -
> - /* Did user change psb_period */
> - if (bits & user_bits)
> - return;
> -
> - /* Set psb_period to 0 */
> - evsel->core.attr.config &= ~bits;
> -}
> -
> static void intel_pt_min_max_sample_sz(struct evlist *evlist,
> size_t *min_sz, size_t *max_sz)
> {
> @@ -686,7 +667,8 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
> return 0;
>
> if (opts->auxtrace_sample_mode)
> - intel_pt_config_sample_mode(intel_pt_pmu, intel_pt_evsel);
> + evsel__set_config_if_unset(intel_pt_pmu, intel_pt_evsel,
> + "psb_period", 0);
>
> err = intel_pt_validate_config(intel_pt_pmu, intel_pt_evsel);
> if (err)
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index a85a987128aa..cdf1445136ad 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -3216,3 +3216,32 @@ void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader)
> leader->core.nr_members--;
> }
> }
> +
> +/*
> + * Set @config_name to @val as long as the user hasn't already set or cleared it
> + * by passing a config term on the command line.
> + *
> + * @val is the value to put into the bits specified by @config_name rather than
> + * the bit pattern. It is shifted into position by this function, so to set
> + * something to true, pass 1 for val rather than a pre shifted value.
> + */
> +#define field_prep(_mask, _val) (((_val) << (ffsll(_mask) - 1)) & (_mask))

I notice there is already tools/include/linux/bitfield.h
so may FIELD_PREP from there could be used?

> +void evsel__set_config_if_unset(struct perf_pmu *pmu, struct evsel *evsel,
> + const char *config_name, u64 val)
> +{
> + u64 user_bits = 0, bits;
> + struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
> +
> + if (term)
> + user_bits = term->val.cfg_chg;
> +
> + bits = perf_pmu__format_bits(&pmu->format, config_name);
> +
> + /* Do nothing if the user changed the value */
> + if (bits & user_bits)
> + return;
> +
> + /* Otherwise replace it */
> + evsel->core.attr.config &= ~bits;
> + evsel->core.attr.config |= field_prep(bits, val);
> +}
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 68072ec655ce..4120f5ff673d 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -529,4 +529,7 @@ bool arch_evsel__must_be_in_group(const struct evsel *evsel);
> ((((src) >> (pos)) & ((1ull << (size)) - 1)) << (63 - ((pos) + (size) - 1)))
>
> u64 evsel__bitfield_swap_branch_flags(u64 value);
> +void evsel__set_config_if_unset(struct perf_pmu *pmu, struct evsel *evsel,
> + const char *config_name, u64 val);
> +
> #endif /* __PERF_EVSEL_H */

2023-04-24 16:46:16

by James Clark

[permalink] [raw]
Subject: Re: [PATCH 2/7] perf tools: Add util function for overriding user set config values



On 24/04/2023 16:36, Adrian Hunter wrote:
> On 24/04/23 16:47, James Clark wrote:
>> There is some duplicated code to only override config values if they
>> haven't already been set by the user so make a util function for this.
>>
>> Signed-off-by: James Clark <[email protected]>
>
> One minor comment, nevertheless:
>
> Acked-by: Adrian Hunter <[email protected]>
>

Thanks for the review

>> ---
>> tools/perf/arch/arm64/util/arm-spe.c | 26 ++-----------------------
>> tools/perf/arch/x86/util/intel-pt.c | 22 ++-------------------
>> tools/perf/util/evsel.c | 29 ++++++++++++++++++++++++++++
>> tools/perf/util/evsel.h | 3 +++
>> 4 files changed, 36 insertions(+), 44 deletions(-)
>>
[...]
>> }
>> }
>> +
>> +/*
>> + * Set @config_name to @val as long as the user hasn't already set or cleared it
>> + * by passing a config term on the command line.
>> + *
>> + * @val is the value to put into the bits specified by @config_name rather than
>> + * the bit pattern. It is shifted into position by this function, so to set
>> + * something to true, pass 1 for val rather than a pre shifted value.
>> + */
>> +#define field_prep(_mask, _val) (((_val) << (ffsll(_mask) - 1)) & (_mask))
>
> I notice there is already tools/include/linux/bitfield.h
> so may FIELD_PREP from there could be used?

I tried that first, but it seems like quite a lot of effort went into it
to make it work on const only values so it doesn't work here. There is
this [1] change to make a non const one but it doesn't look like it went
anywhere:

[1]:
https://patchwork.kernel.org/project/linux-omap/patch/3a54a6703879d10f08cf0275a2a69297ebd2b1d4.1637592133.git.geert+renesas@glider.be/#24610749

2023-04-24 17:54:16

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 2/7] perf tools: Add util function for overriding user set config values

Em Mon, Apr 24, 2023 at 06:36:14PM +0300, Adrian Hunter escreveu:
> On 24/04/23 16:47, James Clark wrote:
> > There is some duplicated code to only override config values if they
> > haven't already been set by the user so make a util function for this.
> >
> > Signed-off-by: James Clark <[email protected]>
>
> One minor comment, nevertheless:
>
> Acked-by: Adrian Hunter <[email protected]>

I just moved to evsel__set_config_if_unset() to util/pmu.c, next to
some other evsel__ functions to not break the python.so binding, before
I was getting:

[acme@quaco perf-tools-next]$ perf test -v python
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
19: 'import perf' in python :
--- start ---
test child forked, pid 500086
python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' "
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: perf_pmu__format_bits
test child finished with -1
---- end ----
'import perf' in python: FAILED!
[acme@quaco perf-tools-next]$

Please run 'perf test' and 'make -C tools/perf build-test' prior to
sending pull requests,

Thanks, applied.

- Arnaldo

2023-04-25 11:07:22

by James Clark

[permalink] [raw]
Subject: Re: [PATCH 2/7] perf tools: Add util function for overriding user set config values



On 24/04/2023 18:45, Arnaldo Carvalho de Melo wrote:
> Em Mon, Apr 24, 2023 at 06:36:14PM +0300, Adrian Hunter escreveu:
>> On 24/04/23 16:47, James Clark wrote:
>>> There is some duplicated code to only override config values if they
>>> haven't already been set by the user so make a util function for this.
>>>
>>> Signed-off-by: James Clark <[email protected]>
>>
>> One minor comment, nevertheless:
>>
>> Acked-by: Adrian Hunter <[email protected]>
>
> I just moved to evsel__set_config_if_unset() to util/pmu.c, next to
> some other evsel__ functions to not break the python.so binding, before
> I was getting:
>
> [acme@quaco perf-tools-next]$ perf test -v python
> Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
> 19: 'import perf' in python :
> --- start ---
> test child forked, pid 500086
> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' "
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: perf_pmu__format_bits
> test child finished with -1
> ---- end ----
> 'import perf' in python: FAILED!
> [acme@quaco perf-tools-next]$
>
> Please run 'perf test' and 'make -C tools/perf build-test' prior to
> sending pull requests,
>
> Thanks, applied.
>

Ah sorry! I ran it from an in source build and got the python import
error so I ignored that test. I see the new error if I run it from
tools/perf instead.

Interestingly with an out of source build it doesn't matter which cwd
you run the Python test from because $OUTPUT is an absolute path.
Normally I do an out of source build, but the Coresight tests don't
currently work with that. Which I will submit another fix for...

I don't know if it's worth getting rid of that edge by making sure
PYTHONPATH is always absolute even for in source builds or if it will
break something else like a make install? It's because of this line:

-DPYTHONPATH="BUILD_STR($(OUTPUT)python)"

Will make sure that they all pass next time. I also sent a fix for the
build-test target on my platform.

> - Arnaldo
>

2023-04-26 03:18:28

by Denis Nikitin

[permalink] [raw]
Subject: Re: [PATCH 0/7] perf: cs-etm: Fixes around timestamped and timeless decoding

Thanks James!

Some of the patches failed to apply on ToT.

Acked-by: Denis Nikitin <[email protected]>

- Denis

On Mon, Apr 24, 2023 at 6:48 AM James Clark <[email protected]> wrote:
>
> The first commit contains a fix for a recently introduced regression,
> but was always a shortcoming in the Coresight code anyway.
>
> The following commits are a tidyup in preparation for the last commit,
> which is a fairly major change to the decode logic that's also
> indirectly related to the regression so I thought it would be good time
> to fix that now.
>
> Applies to perf/core (9be6ab181b7b)
>
> James Clark (7):
> perf: cs-etm: Fix timeless decode mode detection
> perf tools: Add util function for overriding user set config values
> perf: cs-etm: Don't test full_auxtrace because it's always set
> perf: cs-etm: Validate options after applying them
> perf: cs-etm: Allow user to override timestamp and contextid settings
> perf: cs-etm: Use bool type for boolean values
> perf: cs-etm: Add separate decode paths for timeless and per-thread
> modes
>
> tools/perf/arch/arm/util/cs-etm.c | 223 +++++++++---------
> tools/perf/arch/arm/util/pmu.c | 2 +
> tools/perf/arch/arm64/util/arm-spe.c | 26 +-
> tools/perf/arch/x86/util/intel-pt.c | 22 +-
> tools/perf/tests/shell/test_arm_coresight.sh | 24 ++
> .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 +-
> tools/perf/util/cs-etm.c | 200 +++++++++++-----
> tools/perf/util/cs-etm.h | 6 +-
> tools/perf/util/evsel.c | 29 +++
> tools/perf/util/evsel.h | 3 +
> 10 files changed, 325 insertions(+), 218 deletions(-)
>
> --
> 2.34.1
>
> _______________________________________________
> CoreSight mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

2023-04-26 05:48:42

by Denis Nikitin

[permalink] [raw]
Subject: Re: [PATCH 1/7] perf: cs-etm: Fix timeless decode mode detection

On Mon, Apr 24, 2023 at 8:14 AM Suzuki K Poulose <[email protected]> wrote:
>
> On 24/04/2023 14:47, James Clark wrote:
> > In this context, timeless refers to the trace data rather than the perf
> > event data. But when detecting whether there are timestamps in the trace
> > data or not, the presence of a timestamp flag on any perf event is used.
> >
> > Since commit f42c0ce573df ("perf record: Always get text_poke events
> > with --kcore option") timestamps were added to a tracking event when
> > --kcore is used which breaks this detection mechanism. Fix it by
> > detecting if trace timestamps exist by looking at the ETM config flags.
> > This would have always been a more accurate way of doing it anyway.
> >
> > This fixes the following error message when using --kcore with
> > Coresight:
> >
> > $ perf record --kcore -e cs_etm// --per-thread
> > $ perf report
> > The perf.data/data data has no samples!
> >
> > Fixes: f42c0ce573df ("perf record: Always get text_poke events with --kcore option")
> > Reported-by: Yang Shi <[email protected]>
> > Link: https://lore.kernel.org/lkml/CAHbLzkrJQTrYBtPkf=jf3OpQ-yBcJe7XkvQstX9j2frz4WF-SQ@mail.gmail.com/
> > Signed-off-by: James Clark <[email protected]>
> > ---
> > tools/perf/util/cs-etm.c | 30 ++++++++++++++++++------------
> > 1 file changed, 18 insertions(+), 12 deletions(-)
> >
> > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> > index 8dd81ddd9e4e..50593289d53c 100644
> > --- a/tools/perf/util/cs-etm.c
> > +++ b/tools/perf/util/cs-etm.c
> > @@ -2684,26 +2684,29 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session,
> > return 0;
> > }
> >
> > -static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
> > +static int cs_etm__setup_timeless_decoding(struct cs_etm_auxtrace *etm)
>
> minor nit: "setup" sound more like prepare to do what is required to
> do a timeless decoding, while we are doing more like, check if we
> have to do a timeless decoding. So may be:
>
> cs_etm_check_timeless_decoding() ?
>

I didn't catch that "setup_timeless_decoding" can be treated as the
initialization
of the timeless decoding. On the other hand _check_ doesn't imply that
it changes
the flag.
Maybe "_setup_timeless_decoding_flag" will be more specific about its intention?

- Denis

>
> Otherwise, looks good to me
>
> Suzuki
>
>
> _______________________________________________
> CoreSight mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

2023-04-26 16:13:23

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH 0/7] perf: cs-etm: Fixes around timestamped and timeless decoding

On Mon, Apr 24, 2023 at 6:48 AM James Clark <[email protected]> wrote:
>
> The first commit contains a fix for a recently introduced regression,
> but was always a shortcoming in the Coresight code anyway.
>
> The following commits are a tidyup in preparation for the last commit,
> which is a fairly major change to the decode logic that's also
> indirectly related to the regression so I thought it would be good time
> to fix that now.
>
> Applies to perf/core (9be6ab181b7b)

Thanks for working on this. I tested with perf/core branch on acme
tree, it does solve the "no samples" problem.

Please feel free to add: Tested-by: Yang Shi <[email protected]>

>
> James Clark (7):
> perf: cs-etm: Fix timeless decode mode detection
> perf tools: Add util function for overriding user set config values
> perf: cs-etm: Don't test full_auxtrace because it's always set
> perf: cs-etm: Validate options after applying them
> perf: cs-etm: Allow user to override timestamp and contextid settings
> perf: cs-etm: Use bool type for boolean values
> perf: cs-etm: Add separate decode paths for timeless and per-thread
> modes
>
> tools/perf/arch/arm/util/cs-etm.c | 223 +++++++++---------
> tools/perf/arch/arm/util/pmu.c | 2 +
> tools/perf/arch/arm64/util/arm-spe.c | 26 +-
> tools/perf/arch/x86/util/intel-pt.c | 22 +-
> tools/perf/tests/shell/test_arm_coresight.sh | 24 ++
> .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 +-
> tools/perf/util/cs-etm.c | 200 +++++++++++-----
> tools/perf/util/cs-etm.h | 6 +-
> tools/perf/util/evsel.c | 29 +++
> tools/perf/util/evsel.h | 3 +
> 10 files changed, 325 insertions(+), 218 deletions(-)
>
> --
> 2.34.1
>

2023-04-27 15:15:43

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH 4/7] perf: cs-etm: Validate options after applying themperf_pmu__format_bits

Hi James,

On Mon, Apr 24, 2023 at 02:47:44PM +0100, James Clark wrote:
> Currently the cs_etm_set_option() function both validates and applies
> the config options. Because it's only called when they are added
> automatically, there are some paths where the user can apply the option
> on the command line and skip the validation. By moving it to the end it
> covers both cases.
>
> Also, options don't need to be re-applied anyway, Perf handles parsing
> and applying the config terms automatically.
>
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/arch/arm/util/cs-etm.c | 152 +++++++++++++-----------------
> 1 file changed, 68 insertions(+), 84 deletions(-)
>
> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> index f9b9ebf7fffc..af0a2400c655 100644
> --- a/tools/perf/arch/arm/util/cs-etm.c
> +++ b/tools/perf/arch/arm/util/cs-etm.c
> @@ -69,21 +69,29 @@ static const char * const metadata_ete_ro[] = {
> static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu);
> static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu);
>
> -static int cs_etm_set_context_id(struct auxtrace_record *itr,
> - struct evsel *evsel, int cpu)
> +static int cs_etm_validate_context_id(struct auxtrace_record *itr,
> + struct evsel *evsel, int cpu)
> {
> - struct cs_etm_recording *ptr;
> - struct perf_pmu *cs_etm_pmu;
> + struct cs_etm_recording *ptr =
> + container_of(itr, struct cs_etm_recording, itr);
> + struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
> char path[PATH_MAX];
> - int err = -EINVAL;
> + int err;
> u32 val;
> - u64 contextid;
> + u64 contextid =
> + evsel->core.attr.config &
> + (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));

Seems to me, this would break backward compability.

The old kernel (before 5.11) doesn't provide 'contextid1' and
'contextid2', so we always check the entry 'contextid' rather than
'contextid1' and 'contextid2'.

With this change, if a kernel doesn't contain 'contextid1' and
'contextid2' formats, will perf tool never trace for contexid?

Thanks,
Leo

>
> - ptr = container_of(itr, struct cs_etm_recording, itr);
> - cs_etm_pmu = ptr->cs_etm_pmu;
> + if (!contextid)
> + return 0;
>
> - if (!cs_etm_is_etmv4(itr, cpu))
> - goto out;
> + /* Not supported in etmv3 */
> + if (!cs_etm_is_etmv4(itr, cpu)) {
> + pr_err("%s: contextid not supported in ETMv3, disable with %s/contextid=0/\n",
> + CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
> + return -EINVAL;
> + }
>
> /* Get a handle on TRCIDR2 */
> snprintf(path, PATH_MAX, "cpu%d/%s",
> @@ -92,27 +100,13 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
>
> /* There was a problem reading the file, bailing out */
> if (err != 1) {
> - pr_err("%s: can't read file %s\n",
> - CORESIGHT_ETM_PMU_NAME, path);
> - goto out;
> + pr_err("%s: can't read file %s\n", CORESIGHT_ETM_PMU_NAME,
> + path);
> + return err;
> }
>
> - /* User has configured for PID tracing, respects it. */
> - contextid = evsel->core.attr.config &
> - (BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_CTXTID2));
> -
> - /*
> - * If user doesn't configure the contextid format, parse PMU format and
> - * enable PID tracing according to the "contextid" format bits:
> - *
> - * If bit ETM_OPT_CTXTID is set, trace CONTEXTIDR_EL1;
> - * If bit ETM_OPT_CTXTID2 is set, trace CONTEXTIDR_EL2.
> - */
> - if (!contextid)
> - contextid = perf_pmu__format_bits(&cs_etm_pmu->format,
> - "contextid");
> -
> - if (contextid & BIT(ETM_OPT_CTXTID)) {
> + if (contextid &
> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1")) {
> /*
> * TRCIDR2.CIDSIZE, bit [9-5], indicates whether contextID
> * tracing is supported:
> @@ -122,14 +116,14 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
> */
> val = BMVAL(val, 5, 9);
> if (!val || val != 0x4) {
> - pr_err("%s: CONTEXTIDR_EL1 isn't supported\n",
> - CORESIGHT_ETM_PMU_NAME);
> - err = -EINVAL;
> - goto out;
> + pr_err("%s: CONTEXTIDR_EL1 isn't supported, disable with %s/contextid1=0/\n",
> + CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
> + return -EINVAL;
> }
> }
>
> - if (contextid & BIT(ETM_OPT_CTXTID2)) {
> + if (contextid &
> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2")) {
> /*
> * TRCIDR2.VMIDOPT[30:29] != 0 and
> * TRCIDR2.VMIDSIZE[14:10] == 0b00100 (32bit virtual contextid)
> @@ -138,35 +132,34 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
> * Any value of VMIDSIZE >= 4 (i.e, > 32bit) is fine for us.
> */
> if (!BMVAL(val, 29, 30) || BMVAL(val, 10, 14) < 4) {
> - pr_err("%s: CONTEXTIDR_EL2 isn't supported\n",
> - CORESIGHT_ETM_PMU_NAME);
> - err = -EINVAL;
> - goto out;
> + pr_err("%s: CONTEXTIDR_EL2 isn't supported, disable with %s/contextid2=0/\n",
> + CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
> + return -EINVAL;
> }
> }
>
> - /* All good, let the kernel know */
> - evsel->core.attr.config |= contextid;
> - err = 0;
> -
> -out:
> - return err;
> + return 0;
> }
>
> -static int cs_etm_set_timestamp(struct auxtrace_record *itr,
> - struct evsel *evsel, int cpu)
> +static int cs_etm_validate_timestamp(struct auxtrace_record *itr,
> + struct evsel *evsel, int cpu)
> {
> - struct cs_etm_recording *ptr;
> - struct perf_pmu *cs_etm_pmu;
> + struct cs_etm_recording *ptr =
> + container_of(itr, struct cs_etm_recording, itr);
> + struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
> char path[PATH_MAX];
> - int err = -EINVAL;
> + int err;
> u32 val;
>
> - ptr = container_of(itr, struct cs_etm_recording, itr);
> - cs_etm_pmu = ptr->cs_etm_pmu;
> + if (!(evsel->core.attr.config &
> + perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp")))
> + return 0;
>
> - if (!cs_etm_is_etmv4(itr, cpu))
> - goto out;
> + if (!cs_etm_is_etmv4(itr, cpu)) {
> + pr_err("%s: timestamp not supported in ETMv3, disable with %s/timestamp=0/\n",
> + CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
> + return -EINVAL;
> + }
>
> /* Get a handle on TRCIRD0 */
> snprintf(path, PATH_MAX, "cpu%d/%s",
> @@ -177,7 +170,7 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
> if (err != 1) {
> pr_err("%s: can't read file %s\n",
> CORESIGHT_ETM_PMU_NAME, path);
> - goto out;
> + return err;
> }
>
> /*
> @@ -189,24 +182,21 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
> */
> val &= GENMASK(28, 24);
> if (!val) {
> - err = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> - /* All good, let the kernel know */
> - evsel->core.attr.config |= (1 << ETM_OPT_TS);
> - err = 0;
> -
> -out:
> - return err;
> + return 0;
> }
>
> -#define ETM_SET_OPT_CTXTID (1 << 0)
> -#define ETM_SET_OPT_TS (1 << 1)
> -#define ETM_SET_OPT_MASK (ETM_SET_OPT_CTXTID | ETM_SET_OPT_TS)
> -
> -static int cs_etm_set_option(struct auxtrace_record *itr,
> - struct evsel *evsel, u32 option)
> +/*
> + * Check whether the requested timestamp and contextid options should be
> + * available on all requested CPUs and if not, tell the user how to override.
> + * The kernel will silently disable any unavailable options so a warning here
> + * first is better. In theory the kernel could still disable the option for
> + * some other reason so this is best effort only.
> + */
> +static int cs_etm_validate_config(struct auxtrace_record *itr,
> + struct evsel *evsel)
> {
> int i, err = -EINVAL;
> struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
> @@ -220,18 +210,11 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
> !perf_cpu_map__has(online_cpus, cpu))
> continue;
>
> - if (option & BIT(ETM_OPT_CTXTID)) {
> - err = cs_etm_set_context_id(itr, evsel, i);
> - if (err)
> - goto out;
> - }
> - if (option & BIT(ETM_OPT_TS)) {
> - err = cs_etm_set_timestamp(itr, evsel, i);
> - if (err)
> - goto out;
> - }
> - if (option & ~(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS)))
> - /* Nothing else is currently supported */
> + err = cs_etm_validate_context_id(itr, evsel, i);
> + if (err)
> + goto out;
> + err = cs_etm_validate_timestamp(itr, evsel, i);
> + if (err)
> goto out;
> }
>
> @@ -447,10 +430,10 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
> * when a context switch happened.
> */
> if (!perf_cpu_map__empty(cpus)) {
> - err = cs_etm_set_option(itr, cs_etm_evsel,
> - BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
> - if (err)
> - goto out;
> + cs_etm_evsel->core.attr.config |=
> + perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp");
> + cs_etm_evsel->core.attr.config |=
> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid");
> }
>
> /* Add dummy event to keep tracking */
> @@ -466,6 +449,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
> if (!perf_cpu_map__empty(cpus))
> evsel__set_sample_bit(evsel, TIME);
>
> + err = cs_etm_validate_config(itr, cs_etm_evsel);
> out:
> return err;
> }
> --
> 2.34.1
>

2023-04-27 15:59:58

by James Clark

[permalink] [raw]
Subject: Re: [PATCH 4/7] perf: cs-etm: Validate options after applying themperf_pmu__format_bits



On 27/04/2023 16:12, Leo Yan wrote:
> Hi James,
>
> On Mon, Apr 24, 2023 at 02:47:44PM +0100, James Clark wrote:
>> Currently the cs_etm_set_option() function both validates and applies
>> the config options. Because it's only called when they are added
>> automatically, there are some paths where the user can apply the option
>> on the command line and skip the validation. By moving it to the end it
>> covers both cases.
>>
>> Also, options don't need to be re-applied anyway, Perf handles parsing
>> and applying the config terms automatically.
>>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> tools/perf/arch/arm/util/cs-etm.c | 152 +++++++++++++-----------------
>> 1 file changed, 68 insertions(+), 84 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
>> index f9b9ebf7fffc..af0a2400c655 100644
>> --- a/tools/perf/arch/arm/util/cs-etm.c
>> +++ b/tools/perf/arch/arm/util/cs-etm.c
>> @@ -69,21 +69,29 @@ static const char * const metadata_ete_ro[] = {
>> static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu);
>> static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu);
>>
>> -static int cs_etm_set_context_id(struct auxtrace_record *itr,
>> - struct evsel *evsel, int cpu)
>> +static int cs_etm_validate_context_id(struct auxtrace_record *itr,
>> + struct evsel *evsel, int cpu)
>> {
>> - struct cs_etm_recording *ptr;
>> - struct perf_pmu *cs_etm_pmu;
>> + struct cs_etm_recording *ptr =
>> + container_of(itr, struct cs_etm_recording, itr);
>> + struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
>> char path[PATH_MAX];
>> - int err = -EINVAL;
>> + int err;
>> u32 val;
>> - u64 contextid;
>> + u64 contextid =
>> + evsel->core.attr.config &
>> + (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
>> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));
>
> Seems to me, this would break backward compability.
>
> The old kernel (before 5.11) doesn't provide 'contextid1' and
> 'contextid2', so we always check the entry 'contextid' rather than
> 'contextid1' and 'contextid2'.
>
> With this change, if a kernel doesn't contain 'contextid1' and
> 'contextid2' formats, will perf tool never trace for contexid?
>

No because I changed to to be purely validation, so the format flags
would still be applied. But yes I think you are right there is a small
issue.

Now validation of 'contextid' isn't done on pre 5.11 kernels. But that
only checks for ETMv3 anyway. Validation of 'contextid1' and
'contextid2' isn't a problem, because if the kernel doesn't support them
they can't be applied on the command line anyway.

I can fix it by checking for 'contextid' and ETMv3 first and then doing
'contextid1' and 'contextid2' after.

> Thanks,
> Leo
>

2023-04-27 22:18:25

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH 4/7] perf: cs-etm: Validate options after applying themperf_pmu__format_bits

On Thu, Apr 27, 2023 at 04:52:06PM +0100, James Clark wrote:

[...]

> >> -static int cs_etm_set_context_id(struct auxtrace_record *itr,
> >> - struct evsel *evsel, int cpu)
> >> +static int cs_etm_validate_context_id(struct auxtrace_record *itr,
> >> + struct evsel *evsel, int cpu)
> >> {
> >> - struct cs_etm_recording *ptr;
> >> - struct perf_pmu *cs_etm_pmu;
> >> + struct cs_etm_recording *ptr =
> >> + container_of(itr, struct cs_etm_recording, itr);
> >> + struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
> >> char path[PATH_MAX];
> >> - int err = -EINVAL;
> >> + int err;
> >> u32 val;
> >> - u64 contextid;
> >> + u64 contextid =
> >> + evsel->core.attr.config &
> >> + (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
> >> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));
> >
> > Seems to me, this would break backward compability.
> >
> > The old kernel (before 5.11) doesn't provide 'contextid1' and
> > 'contextid2', so we always check the entry 'contextid' rather than
> > 'contextid1' and 'contextid2'.
> >
> > With this change, if a kernel doesn't contain 'contextid1' and
> > 'contextid2' formats, will perf tool never trace for contexid?
> >
>
> No because I changed to to be purely validation, so the format flags
> would still be applied. But yes I think you are right there is a small
> issue.
>
> Now validation of 'contextid' isn't done on pre 5.11 kernels. But that
> only checks for ETMv3 anyway.

IIUC, 'contextid' is not only used for ETMv3. Just quotes the comments
from drivers/hwtracing/coresight/coresight-etm-perf.c:

73 /*
74 * contextid always traces the "PID". The PID is in CONTEXTIDR_EL1
75 * when the kernel is running at EL1; when the kernel is at EL2,
76 * the PID is in CONTEXTIDR_EL2.
77 */

ETMv4 uses 'contextid' as well, since the user space needs to know which
exception level's PID should be traced, e.g. when CPU runs in EL2
'contextid' is set as ETM_OPT_CTXTID2, the perf tool will set 'contextid2'
to tell driver to trace CONTEXTIDR_EL2.

We can only verify 'contextid', and set 'contextid1' or 'contextid2' based
on CPU running exception level, finally driver knows how to trace PID.

Thanks,
Leo

> Validation of 'contextid1' and
> 'contextid2' isn't a problem, because if the kernel doesn't support them
> they can't be applied on the command line anyway.
>
> I can fix it by checking for 'contextid' and ETMv3 first and then doing
> 'contextid1' and 'contextid2' after.

2023-04-28 12:37:28

by James Clark

[permalink] [raw]
Subject: Re: [PATCH 4/7] perf: cs-etm: Validate options after applying themperf_pmu__format_bits



On 27/04/2023 23:10, Leo Yan wrote:
> On Thu, Apr 27, 2023 at 04:52:06PM +0100, James Clark wrote:
>
> [...]
>
>>>> -static int cs_etm_set_context_id(struct auxtrace_record *itr,
>>>> - struct evsel *evsel, int cpu)
>>>> +static int cs_etm_validate_context_id(struct auxtrace_record *itr,
>>>> + struct evsel *evsel, int cpu)
>>>> {
>>>> - struct cs_etm_recording *ptr;
>>>> - struct perf_pmu *cs_etm_pmu;
>>>> + struct cs_etm_recording *ptr =
>>>> + container_of(itr, struct cs_etm_recording, itr);
>>>> + struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
>>>> char path[PATH_MAX];
>>>> - int err = -EINVAL;
>>>> + int err;
>>>> u32 val;
>>>> - u64 contextid;
>>>> + u64 contextid =
>>>> + evsel->core.attr.config &
>>>> + (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
>>>> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));
>>>
>>> Seems to me, this would break backward compability.
>>>
>>> The old kernel (before 5.11) doesn't provide 'contextid1' and
>>> 'contextid2', so we always check the entry 'contextid' rather than
>>> 'contextid1' and 'contextid2'.
>>>
>>> With this change, if a kernel doesn't contain 'contextid1' and
>>> 'contextid2' formats, will perf tool never trace for contexid?
>>>
>>
>> No because I changed to to be purely validation, so the format flags
>> would still be applied. But yes I think you are right there is a small
>> issue.
>>
>> Now validation of 'contextid' isn't done on pre 5.11 kernels. But that
>> only checks for ETMv3 anyway.
>
> IIUC, 'contextid' is not only used for ETMv3. Just quotes the comments
> from drivers/hwtracing/coresight/coresight-etm-perf.c:
>

I meant that the validation only looks for the presence of ETMv3 and
shows a warning in that scenario, so that was the only thing not working
any more. Not that it's only used for ETMv3.

> 73 /*
> 74 * contextid always traces the "PID". The PID is in CONTEXTIDR_EL1
> 75 * when the kernel is running at EL1; when the kernel is at EL2,
> 76 * the PID is in CONTEXTIDR_EL2.
> 77 */
>
> ETMv4 uses 'contextid' as well, since the user space needs to know which
> exception level's PID should be traced, e.g. when CPU runs in EL2
> 'contextid' is set as ETM_OPT_CTXTID2, the perf tool will set 'contextid2'
> to tell driver to trace CONTEXTIDR_EL2.
>

That's still working because it reads the config term in the setup
function rather than setting any one bit manually:

if (!perf_cpu_map__empty(cpus)) {
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
"timestamp", 1);
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
"contextid", 1);
}

> We can only verify 'contextid', and set 'contextid1' or 'contextid2' based
> on CPU running exception level, finally driver knows how to trace PID.
>
> Thanks,
> Leo
>

I'm not 100% sure what you mean by this. But previously the validation
was looking at both contextid1 and contextid2 options and checking if
either were supported if either were set.

I have the following change in mind, it fixes the backwards
compatibility issue. And the validation should be exactly the same as it
was before. Except for one bug that I found when both bits are requested
which I've also fixed here:

From f1b9f56df29dfb4f2a7be25f009c79c86335587a Mon Sep 17 00:00:00 2001
From: James Clark <[email protected]>
Date: Fri, 28 Apr 2023 10:29:52 +0100
Subject: [PATCH] perf cs-etm: Fix contextid validation

Pre 5.11 kernels don't support 'contextid1' and 'contextid2' so
validation would be skipped. By adding an additional check for
'contextid', old kernels will still have validation done even though
contextid would either be contextid1 or contextid2.

Additionally now that it's possible to override options, an existing bug
in the validation is revealed. 'val' is overwritten by the contextid1
validation, and re-used for contextid2 validation causing it to always
fail. '!val || val != 0x4' is the same as 'val != 0x4' because 0 is also
!= 4, so that expression can be simplified and the temp variable not
overwritten.

Fixes: 35c51f83dd1e ("perf cs-etm: Validate options after applying them")
Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm/util/cs-etm.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 77cb03e6ff87..9ca040bfb1aa 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -78,9 +78,9 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr,
char path[PATH_MAX];
int err;
u32 val;
- u64 contextid =
- evsel->core.attr.config &
- (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
+ u64 contextid = evsel->core.attr.config &
+ (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid") |
+ perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));

if (!contextid)
@@ -114,8 +114,7 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr,
* 0b00100 Maximum of 32-bit Context ID size.
* All other values are reserved.
*/
- val = BMVAL(val, 5, 9);
- if (!val || val != 0x4) {
+ if (BMVAL(val, 5, 9) != 0x4) {
pr_err("%s: CONTEXTIDR_EL1 isn't supported, disable with %s/contextid1=0/\n",
CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
--
2.34.1

2023-05-01 08:09:00

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH 4/7] perf: cs-etm: Validate options after applying themperf_pmu__format_bits

On Fri, Apr 28, 2023 at 01:33:16PM +0100, James Clark wrote:

[...]

> > ETMv4 uses 'contextid' as well, since the user space needs to know which
> > exception level's PID should be traced, e.g. when CPU runs in EL2
> > 'contextid' is set as ETM_OPT_CTXTID2, the perf tool will set 'contextid2'
> > to tell driver to trace CONTEXTIDR_EL2.
> >
>
> That's still working because it reads the config term in the setup
> function rather than setting any one bit manually:
>
> if (!perf_cpu_map__empty(cpus)) {
> evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
> "timestamp", 1);
> evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
> "contextid", 1);
> }

Okay, yeah, this change can set CTXTID or CTXTID2 bit if user doesn't
set anything.

> > We can only verify 'contextid', and set 'contextid1' or 'contextid2' based
> > on CPU running exception level, finally driver knows how to trace PID.
> >
> > Thanks,
> > Leo
> >
>
> I'm not 100% sure what you mean by this. But previously the validation
> was looking at both contextid1 and contextid2 options and checking if
> either were supported if either were set.
>
> I have the following change in mind, it fixes the backwards
> compatibility issue. And the validation should be exactly the same as it
> was before. Except for one bug that I found when both bits are requested
> which I've also fixed here:
>
> From f1b9f56df29dfb4f2a7be25f009c79c86335587a Mon Sep 17 00:00:00 2001
> From: James Clark <[email protected]>
> Date: Fri, 28 Apr 2023 10:29:52 +0100
> Subject: [PATCH] perf cs-etm: Fix contextid validation
>
> Pre 5.11 kernels don't support 'contextid1' and 'contextid2' so
> validation would be skipped. By adding an additional check for
> 'contextid', old kernels will still have validation done even though
> contextid would either be contextid1 or contextid2.
>
> Additionally now that it's possible to override options, an existing bug
> in the validation is revealed. 'val' is overwritten by the contextid1
> validation, and re-used for contextid2 validation causing it to always
> fail. '!val || val != 0x4' is the same as 'val != 0x4' because 0 is also
> != 4, so that expression can be simplified and the temp variable not
> overwritten.
>
> Fixes: 35c51f83dd1e ("perf cs-etm: Validate options after applying them")
> Signed-off-by: James Clark <[email protected]>

LGTM:

Reviewed-by: Leo Yan <[email protected]

Thanks for the fixing!

> ---
> tools/perf/arch/arm/util/cs-etm.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
> index 77cb03e6ff87..9ca040bfb1aa 100644
> --- a/tools/perf/arch/arm/util/cs-etm.c
> +++ b/tools/perf/arch/arm/util/cs-etm.c
> @@ -78,9 +78,9 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr,
> char path[PATH_MAX];
> int err;
> u32 val;
> - u64 contextid =
> - evsel->core.attr.config &
> - (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
> + u64 contextid = evsel->core.attr.config &
> + (perf_pmu__format_bits(&cs_etm_pmu->format, "contextid") |
> + perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
> perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));
>
> if (!contextid)
> @@ -114,8 +114,7 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr,
> * 0b00100 Maximum of 32-bit Context ID size.
> * All other values are reserved.
> */
> - val = BMVAL(val, 5, 9);
> - if (!val || val != 0x4) {
> + if (BMVAL(val, 5, 9) != 0x4) {
> pr_err("%s: CONTEXTIDR_EL1 isn't supported, disable with %s/contextid1=0/\n",
> CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
> return -EINVAL;
> --
> 2.34.1