2019-12-20 11:07:21

by James Clark

Subject: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

This patch fixes an issue when non Arm SPE events are specified after an
Arm SPE event. In that case, perf will exit with an error code and not
produce a record file. This is because a loop index is used to store the
location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
index will be overwritten. Fix this issue by saving the PMU into a
variable instead of using the index, and also add an error message.

Before the fix:
./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
237

After the fix:
./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
...
0

Signed-off-by: James Clark <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Igor Lubashev <[email protected]>
---
 tools/perf/arch/arm/util/auxtrace.c  | 10 +++++-----
 tools/perf/arch/arm64/util/arm-spe.c |  1 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 0a6e75b8777a..230f03b622e1 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -54,9 +54,9 @@ struct auxtrace_record
 *auxtrace_record__init(struct evlist *evlist, int *err)
 {
 	struct perf_pmu *cs_etm_pmu;
+	struct perf_pmu *arm_spe_pmu = NULL;
 	struct evsel *evsel;
 	bool found_etm = false;
-	bool found_spe = false;
 	static struct perf_pmu **arm_spe_pmus = NULL;
 	static int nr_spes = 0;
 	int i = 0;
@@ -79,13 +79,13 @@ struct auxtrace_record

 		for (i = 0; i < nr_spes; i++) {
 			if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-				found_spe = true;
+				arm_spe_pmu = arm_spe_pmus[i];
 				break;
 			}
 		}
 	}

-	if (found_etm && found_spe) {
+	if (found_etm && arm_spe_pmu) {
 		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
 		*err = -EOPNOTSUPP;
 		return NULL;
@@ -95,8 +95,8 @@ struct auxtrace_record
 		return cs_etm_record_init(err);

 #if defined(__aarch64__)
-	if (found_spe)
-		return arm_spe_recording_init(err, arm_spe_pmus[i]);
+	if (arm_spe_pmu)
+		return arm_spe_recording_init(err, arm_spe_pmu);
 #endif

 	/*
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index eba6541ec0f1..b7d17d8724df 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -178,6 +178,7 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
 	struct arm_spe_recording *sper;

 	if (!arm_spe_pmu) {
+		pr_err("Attempted to initialise null SPE PMU\n");
 		*err = -ENODEV;
 		return NULL;
 	}
--
2.24.0
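
The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the perf code itself: `struct pmu`, `find_spe_by_index()` and `find_spe_by_pointer()` are hypothetical stand-ins for `struct perf_pmu` and the lookup in `auxtrace_record__init()`, and events are reduced to plain integer types. The sketch returns NULL where the original code would index past the valid SPE entries.

```c
#include <stddef.h>

/* Hypothetical stand-in for perf's struct perf_pmu. */
struct pmu {
	int type;
	const char *name;
};

/*
 * Pre-fix pattern: remember only that an SPE event was seen, then reuse
 * the inner loop index after the outer loop has finished. Any non-SPE
 * event that follows resets i and leaves it at nr_spes.
 */
static const struct pmu *find_spe_by_index(const int *event_types, int nr_events,
					   const struct pmu *spes, int nr_spes)
{
	int found_spe = 0;
	int i = 0;

	for (int e = 0; e < nr_events; e++) {
		for (i = 0; i < nr_spes; i++) {
			if (event_types[e] == spes[i].type) {
				found_spe = 1;
				break;
			}
		}
	}

	/* The original code indexed the array here without this bounds check. */
	if (found_spe && i < nr_spes)
		return &spes[i];
	return NULL;
}

/* Post-fix pattern: save the matching PMU itself, so later events cannot lose it. */
static const struct pmu *find_spe_by_pointer(const int *event_types, int nr_events,
					     const struct pmu *spes, int nr_spes)
{
	const struct pmu *spe = NULL;

	for (int e = 0; e < nr_events; e++) {
		for (int i = 0; i < nr_spes; i++) {
			if (event_types[e] == spes[i].type) {
				spe = &spes[i];
				break;
			}
		}
	}
	return spe;
}
```

With one SPE PMU and an event list of an SPE event followed by a non-SPE event, the index-based lookup loses the PMU while the pointer-based lookup keeps it, which mirrors the before/after behaviour reported above.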


2019-12-23 03:50:03

by Leo Yan

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi James,

On Fri, Dec 20, 2019 at 11:05:25AM +0000, James Clark wrote:
> This patch fixes an issue when non Arm SPE events are specified after an
> Arm SPE event. In that case, perf will exit with an error code and not
> produce a record file. This is because a loop index is used to store the
> location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
> index will be overwritten. Fix this issue by saving the PMU into a
> variable instead of using the index, and also add an error message.
>
> Before the fix:
> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> 237
>
> After the fix:
> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> ...
> 0

Just to bring up a question related to PMU event registration: let's
look at the DT binding in arch/arm64/boot/dts/arm/fvp-base-revc.dts:

	spe-pmu {
		compatible = "arm,statistical-profiling-extension-v1";
		interrupts = <GIC_PPI 5 IRQ_TYPE_LEVEL_HIGH>;
	};


Currently SPE registers a PMU event for every CPU. It seems to me that,
although SPE is an IP bound per CPU, it should register with the perf
framework as a single PMU event. Arm's PMU, ETM, and Intel PT all
register their PMU events this way, and it would make the perf tool
logic neater.

After the driver changes to use a single PMU registration, the perf tool
code can find the perf_pmu in a simple way, and this data structure no
longer needs to be bound to a specific CPU. This bug would then go away
naturally.

Thanks,
Leo


2020-01-02 11:09:09

by James Clark

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi Leo,

Do you mean that you would never expect there to be more than one SPE file like /sys/bus/event_source/devices/arm_spe_0?

If that is the case then do you know why there is still a number appended to the file?


Thanks
James


2020-01-02 11:45:37

by Leo Yan

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi James,

On Thu, Jan 02, 2020 at 11:05:53AM +0000, James Clark wrote:
> Hi Leo,
>
> Do you mean that you would never expect there to be more than one SPE file like /sys/bus/event_source/devices/arm_spe_0?

Yeah.

To be more accurate, I'd suggest there be only one SPE file if all CPUs
have the same SPE version. If there are multiple SPE files under
'/sys/bus/event_source/devices/', that would mean the CPUs have
different SPE versions.

Let's look at an example for an SMP platform with four Cortex-A53 CPUs:
/sys/bus/event_source/devices/armv8_cortex_a53

For a big.LITTLE system, we can see the nodes below:
/sys/bus/event_source/devices/armv8_cortex_a53
/sys/bus/event_source/devices/armv8_cortex_a72

If SPE is the same IP for all CPUs, it would be simple to create only
one file under /sys/bus/event_source/devices/.

> If that is the case then do you know why there is still a number appended to the file?

This is caused by the code at [1], but I don't know the reason for
adding an index to the PMU event name.

Thanks,
Leo Yan

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_spe_pmu.c?h=v5.5-rc4#n911
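
To check how many SPE PMUs a given system exposes, the device nodes discussed above can be counted directly. This is a minimal sketch under the assumption that every SPE PMU appears as an entry whose name starts with "arm_spe_" (as in arm_spe_0); `count_spe_pmus()` is a hypothetical helper, not part of perf, and the directory is passed as a parameter so it can be pointed at /sys/bus/event_source/devices or at a test directory.

```c
#include <dirent.h>
#include <string.h>

#define SPE_PREFIX "arm_spe_"

/*
 * Count the entries under dir whose names start with "arm_spe_".
 * Returns -1 if the directory cannot be opened.
 */
static int count_spe_pmus(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *ent;
	int n = 0;

	if (!d)
		return -1;

	while ((ent = readdir(d)) != NULL) {
		if (strncmp(ent->d_name, SPE_PREFIX, strlen(SPE_PREFIX)) == 0)
			n++;
	}

	closedir(d);
	return n;
}
```

Calling `count_spe_pmus("/sys/bus/event_source/devices")` on a target would show whether the driver registered one PMU or several; a result greater than one is the situation the perf tool loop over arm_spe_pmus has to handle.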


2020-01-13 12:28:46

by Suzuki K Poulose

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi Leo,

On 23/12/2019 03:48, Leo Yan wrote:
> Hi James,
>
> On Fri, Dec 20, 2019 at 11:05:25AM +0000, James Clark wrote:
>> This patch fixes an issue when non Arm SPE events are specified after an
>> Arm SPE event. In that case, perf will exit with an error code and not
>> produce a record file. This is because a loop index is used to store the
>> location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
>> index will be overwritten. Fix this issue by saving the PMU into a
>> variable instead of using the index, and also add an error message.
>>
>> Before the fix:
>> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
>> 237
>>
>> After the fix:
>> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
>> ...
>> 0
>
> Just bring up a question related with PMU event registration. Let's
> see the DT binding in arch/arm64/boot/dts/arm/fvp-base-revc.dts:
>
> spe-pmu {
> compatible = "arm,statistical-profiling-extension-v1";
> interrupts = <GIC_PPI 5 IRQ_TYPE_LEVEL_HIGH>;
> };
>
>
> Now SPE registers PMU event for every CPU; seem to me, though SPE is an

Do you mean "SPE PMU" here? SPE is different from ETM: the SPE trace
data is micro-architecture dependent, and thus you cannot mix the trace
from CPUs with different micro-architectures.

As such I don't see any issue with this patch.

Suzuki

2020-01-13 12:29:45

by Suzuki K Poulose

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

On 20/12/2019 11:05, James Clark wrote:
> This patch fixes an issue when non Arm SPE events are specified after an
> Arm SPE event. In that case, perf will exit with an error code and not
> produce a record file. This is because a loop index is used to store the
> location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
> index will be overwritten. Fix this issue by saving the PMU into a
> variable instead of using the index, and also add an error message.
>
> Before the fix:
> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> 237
>
> After the fix:
> ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> ...
> 0
>
> Signed-off-by: James Clark <[email protected]>
> Cc: Mathieu Poirier <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Igor Lubashev <[email protected]>

Reviewed-by: Suzuki K Poulose <[email protected]>

2020-01-13 14:19:13

by Leo Yan

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi Suzuki,

On Mon, Jan 13, 2020 at 12:27:38PM +0000, Suzuki Kuruppassery Poulose wrote:
> Hi Leo,
>
> On 23/12/2019 03:48, Leo Yan wrote:
> > Hi James,
> >
> > On Fri, Dec 20, 2019 at 11:05:25AM +0000, James Clark wrote:
> > > This patch fixes an issue when non Arm SPE events are specified after an
> > > Arm SPE event. In that case, perf will exit with an error code and not
> > > produce a record file. This is because a loop index is used to store the
> > > location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
> > > index will be overwritten. Fix this issue by saving the PMU into a
> > > variable instead of using the index, and also add an error message.
> > >
> > > Before the fix:
> > > ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> > > 237
> > >
> > > After the fix:
> > > ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> > > ...
> > > 0
> >
> > Just bring up a question related with PMU event registration. Let's
> > see the DT binding in arch/arm64/boot/dts/arm/fvp-base-revc.dts:
> >
> > spe-pmu {
> > compatible = "arm,statistical-profiling-extension-v1";
> > interrupts = <GIC_PPI 5 IRQ_TYPE_LEVEL_HIGH>;
> > };
> >
> >
> > Now SPE registers PMU event for every CPU; seem to me, though SPE is an
>
> Do you mean "SPE PMU" here ? SPE is different from ETM, where the trace
> data is micro-architecture dependent. And thus you cannot mix the trace
> on different CPUs with different micro-archs.

Understood that SPE is micro-architecture dependent.

Usually, we should register the PMU event once for a given SPE version,
and the CPUs can then create multiple instances. My concern here is that
the PMU event is registered multiple times for the same SPE version.
Please correct me if I have misunderstood.

> As such I don't see any issue with this patch.

Regarding this patch, it does fix the issue given the current PMU event
registration, so I'm fine with merging it if changing the PMU event
registration isn't necessary for now.

Thanks,
Leo Yan

2020-01-13 14:59:26

by Leo Yan

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

On Mon, Jan 13, 2020 at 10:17:51PM +0800, Leo Yan wrote:

[...]

> > > On Fri, Dec 20, 2019 at 11:05:25AM +0000, James Clark wrote:
> > > > This patch fixes an issue when non Arm SPE events are specified after an
> > > > Arm SPE event. In that case, perf will exit with an error code and not
> > > > produce a record file. This is because a loop index is used to store the
> > > > location of the relevant Arm SPE PMU, but if non SPE PMUs follow, that
> > > > index will be overwritten. Fix this issue by saving the PMU into a
> > > > variable instead of using the index, and also add an error message.
> > > >
> > > > Before the fix:
> > > > ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> > > > 237
> > > >
> > > > After the fix:
> > > > ./perf record -e arm_spe/ts_enable=1/ -e branch-misses ls; echo $?
> > > > ...
> > > > 0
> > >
> > > Just bring up a question related with PMU event registration. Let's
> > > see the DT binding in arch/arm64/boot/dts/arm/fvp-base-revc.dts:
> > >
> > > spe-pmu {
> > > compatible = "arm,statistical-profiling-extension-v1";
> > > interrupts = <GIC_PPI 5 IRQ_TYPE_LEVEL_HIGH>;
> > > };
> > >
> > >
> > > Now SPE registers PMU event for every CPU; seem to me, though SPE is an
> >
> > Do you mean "SPE PMU" here ? SPE is different from ETM, where the trace
> > data is micro-architecture dependent. And thus you cannot mix the trace
> > on different CPUs with different micro-archs.
>
> Understood that SPE is micro-architecture dependent.

Maybe SPE is more general than we think :)

Since SPE is defined in the ARMv8 Architecture Reference Manual (ARM DDI
0487D.a), shouldn't the SPE trace data format be unified, as defined in
Chapter D9 "Statistical Profiling Extension Sample Record Specification"?

Thanks,
Leo

2020-01-15 13:17:09

by James Clark

Subject: Re: [PATCH] perf tools: Fix bug when recording SPE and non SPE events

Hi Leo,

> Since SPE is defined in the ARMv8 Architecture Reference Manual (ARM DDI
> 0487D.a), shouldn't the SPE trace data format be unified, as defined in
> Chapter D9 "Statistical Profiling Extension Sample Record Specification"?
>
I'm not sure what you mean exactly, but the trace data format is described in
section D10.

Thanks
James