2022-12-19 16:23:47

by James Clark

Subject: [PATCH 0/4] Enable display of partial and empty SVE predicates from Arm SPE data

Hi,

I'm submitting this on behalf of German, who moved on to work on other
things in Arm before he could finish it off.

The predicate information is available on SPE samples from
Armv8.3 (FEAT_SPEv1p1). This could be useful info for profiling SVE
code, as partial and empty predicates indicate that the full vector
width isn't being used. There is a good example in the last commit
message.
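
To make the terminology concrete, here is a minimal sketch of what
"full", "partial" and "empty" mean for a predicate. The enum and the
function name are made up for illustration and are not part of perf or
of the SPE decoder; as far as I understand it, SPE reports the
condition with each sample rather than anything counting lanes in
software.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum pred_kind { PRED_FULL, PRED_PARTIAL, PRED_EMPTY };

/*
 * Classify a predicate by how many of the vector's lanes are active.
 * active_lanes and total_lanes are assumed inputs; nothing in perf
 * counts lanes like this.
 */
static enum pred_kind classify_pred(size_t active_lanes, size_t total_lanes)
{
	if (active_lanes == 0)
		return PRED_EMPTY;	/* the operation does no useful work */
	if (active_lanes < total_lanes)
		return PRED_PARTIAL;	/* part of the vector width is unused */
	return PRED_FULL;
}
```

For example, a loop over 10 floats with 4 lanes per vector runs two
full iterations and a final one with 2 of 4 lanes active, which
classify_pred(2, 4) would call partial.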

Currently there isn't a suitable field to store the info on Perf
samples, so this change also adds a new SIMD field. This field could
be used by other architectures, but for now only one bit is reserved,
to identify SVE. It's only added to struct perf_sample on the
userspace side and isn't part of the kernel ABI, so it doesn't survive
a perf inject. This is the same behavior as some other fields like
branch flags, though, so I don't think it should be an issue to do
something similar here. Perhaps in the future we could make sure
everything that is synthesised from auxtrace data also makes it back
into the new Perf inject file without being lost.
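
As a rough sketch of how a userspace-only field like this might be
filled in and consumed: struct simd_flags and the SIMD_OP_FLAGS_*
values below mirror patch 1 (with uint64_t standing in for u64), while
struct demo_sample, demo_set_sve_flags(), demo_simd_str() and the
label strings are hypothetical and are not what the series itself
uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the definitions added to sample.h in patch 1. */
struct simd_flags {
	uint64_t arch:1,	/* architecture (isa) */
		 pred:2;	/* predication */
};

#define SIMD_OP_FLAGS_ARCH_SVE		0x01
#define SIMD_OP_FLAGS_PRED_PARTIAL	0x01
#define SIMD_OP_FLAGS_PRED_EMPTY	0x02

/* Hypothetical stand-in for the userspace-only struct perf_sample. */
struct demo_sample {
	uint64_t ip;
	struct simd_flags simd_flags;
};

/* Hypothetical helper: record what a decoder saw for one SVE op. */
static void demo_set_sve_flags(struct demo_sample *s, bool partial, bool empty)
{
	s->simd_flags.arch |= SIMD_OP_FLAGS_ARCH_SVE;
	if (partial)
		s->simd_flags.pred |= SIMD_OP_FLAGS_PRED_PARTIAL;
	if (empty)
		s->simd_flags.pred |= SIMD_OP_FLAGS_PRED_EMPTY;
}

/* Hypothetical helper: short label; the strings are invented here. */
static const char *demo_simd_str(const struct demo_sample *s)
{
	if (!s->simd_flags.arch)
		return "";
	if (s->simd_flags.pred & SIMD_OP_FLAGS_PRED_EMPTY)
		return "SVE empty";
	if (s->simd_flags.pred & SIMD_OP_FLAGS_PRED_PARTIAL)
		return "SVE partial";
	return "SVE";
}
```

Nothing in this sketch serialises the flags, which matches the point
above: the field only exists in the in-memory sample.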

German Gomez (4):
  perf event: Add simd_flags field to perf_sample
  perf arm-spe: Refactor arm-spe to support operation packet type
  perf arm-spe: Add SVE flags to the SPE samples
  perf report: Add 'simd' sort field

 tools/perf/Documentation/perf-report.txt      |  1 +
 .../util/arm-spe-decoder/arm-spe-decoder.c    | 30 ++++++++++--
 .../util/arm-spe-decoder/arm-spe-decoder.h    | 47 +++++++++++++++----
 tools/perf/util/arm-spe.c                     | 28 +++++++++--
 tools/perf/util/hist.c                        |  1 +
 tools/perf/util/hist.h                        |  1 +
 tools/perf/util/sample.h                      | 13 +++++
 tools/perf/util/sort.c                        | 47 +++++++++++++++++++
 tools/perf/util/sort.h                        |  2 +
 9 files changed, 152 insertions(+), 18 deletions(-)


base-commit: 573de010917836f198a4e579d40674991659668b
--
2.25.1


2022-12-19 16:45:03

by James Clark

Subject: [PATCH 1/4] perf event: Add simd_flags field to perf_sample

From: German Gomez <[email protected]>

Add new field to the struct perf_sample to store flags related to SIMD
ops.

It will be used to store SIMD information from SVE and NEON when
profiling using ARM SPE.

Signed-off-by: German Gomez <[email protected]>
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/sample.h | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
index 60ec79d4eea4..bdf52faf165f 100644
--- a/tools/perf/util/sample.h
+++ b/tools/perf/util/sample.h
@@ -66,6 +66,18 @@ struct aux_sample {
 	void *data;
 };

+struct simd_flags {
+	u64	arch:1,	/* architecture (isa) */
+		pred:2;	/* predication */
+};
+
+/* simd architecture flags */
+#define SIMD_OP_FLAGS_ARCH_SVE		0x01	/* ARM SVE */
+
+/* simd predicate flags */
+#define SIMD_OP_FLAGS_PRED_PARTIAL	0x01	/* partial predicate */
+#define SIMD_OP_FLAGS_PRED_EMPTY	0x02	/* empty predicate */
+
 struct perf_sample {
 	u64 ip;
 	u32 pid, tid;
@@ -103,6 +115,7 @@ struct perf_sample {
 	struct stack_dump user_stack;
 	struct sample_read read;
 	struct aux_sample aux_sample;
+	struct simd_flags simd_flags;
 };

 /*
--
2.25.1

2022-12-19 18:32:23

by Namhyung Kim

Subject: Re: [PATCH 1/4] perf event: Add simd_flags field to perf_sample

Hi James,

On Mon, Dec 19, 2022 at 8:13 AM James Clark <[email protected]> wrote:
>
> From: German Gomez <[email protected]>
>
> Add new field to the struct perf_sample to store flags related to SIMD
> ops.
>
> It will be used to store SIMD information from SVE and NEON when
> profiling using ARM SPE.
>
> Signed-off-by: German Gomez <[email protected]>
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/util/sample.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
> index 60ec79d4eea4..bdf52faf165f 100644
> --- a/tools/perf/util/sample.h
> +++ b/tools/perf/util/sample.h
> @@ -66,6 +66,18 @@ struct aux_sample {
> void *data;
> };
>
> +struct simd_flags {
> + u64 arch:1, /* architecture (isa) */
> + pred:2; /* predication */

Can we reserve more bits for possible future extension or
other arch support? It seems to be too tight for each field.
Do you plan to add more info to the struct in the future?

Thanks,
Namhyung


> +};
> +
> +/* simd architecture flags */
> +#define SIMD_OP_FLAGS_ARCH_SVE 0x01 /* ARM SVE */
> +
> +/* simd predicate flags */
> +#define SIMD_OP_FLAGS_PRED_PARTIAL 0x01 /* partial predicate */
> +#define SIMD_OP_FLAGS_PRED_EMPTY 0x02 /* empty predicate */
> +
> struct perf_sample {
> u64 ip;
> u32 pid, tid;
> @@ -103,6 +115,7 @@ struct perf_sample {
> struct stack_dump user_stack;
> struct sample_read read;
> struct aux_sample aux_sample;
> + struct simd_flags simd_flags;
> };
>
> /*
> --
> 2.25.1
>

2022-12-20 11:53:37

by James Clark

Subject: Re: [PATCH 1/4] perf event: Add simd_flags field to perf_sample



On 19/12/2022 18:21, Namhyung Kim wrote:
> Hi James,
>
> On Mon, Dec 19, 2022 at 8:13 AM James Clark <[email protected]> wrote:
>>
>> From: German Gomez <[email protected]>
>>
>> Add new field to the struct perf_sample to store flags related to SIMD
>> ops.
>>
>> It will be used to store SIMD information from SVE and NEON when
>> profiling using ARM SPE.
>>
>> Signed-off-by: German Gomez <[email protected]>
>> Signed-off-by: James Clark <[email protected]>
>> ---
>> tools/perf/util/sample.h | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
>> index 60ec79d4eea4..bdf52faf165f 100644
>> --- a/tools/perf/util/sample.h
>> +++ b/tools/perf/util/sample.h
>> @@ -66,6 +66,18 @@ struct aux_sample {
>> void *data;
>> };
>>
>> +struct simd_flags {
>> + u64 arch:1, /* architecture (isa) */
>> + pred:2; /* predication */
>
> Can we reserve more bits for possible future extension or
> other arch support? It seems to be too tight for each field.
> Do you plan to add more info to the struct in the future?

As far as I can see, because this is userspace only, reserving bits
doesn't need to be done ahead of time. When we need more bits we can
just add them. It never gets written to a file either, so there is no
need for backwards compatibility.
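
To illustrate, here is a sketch only: simd_flags_v1 is the layout from
this patch (with uint64_t standing in for u64), simd_flags_v2 is a
hypothetical later layout with arbitrarily chosen widths, and
SIMD_IS_EMPTY_PRED() is an invented macro. Code that accesses the
fields by name builds unchanged against either layout, which is all
that matters when nothing is ever serialised.

```c
#include <stdint.h>

/* Layout as posted in this patch. */
struct simd_flags_v1 {
	uint64_t arch:1,	/* architecture (isa) */
		 pred:2;	/* predication */
};

/* Hypothetical later layout; the widths are picked arbitrarily. */
struct simd_flags_v2 {
	uint64_t arch:4,
		 pred:4;
};

#define SIMD_OP_FLAGS_ARCH_SVE		0x01
#define SIMD_OP_FLAGS_PRED_PARTIAL	0x01
#define SIMD_OP_FLAGS_PRED_EMPTY	0x02

/*
 * Hypothetical accessor: it names the field rather than a bit
 * position, so the same source works with either layout.
 */
#define SIMD_IS_EMPTY_PRED(f) \
	(((f)->pred & SIMD_OP_FLAGS_PRED_EMPTY) != 0)
```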

>
> Thanks,
> Namhyung
>
>
>> +};
>> +
>> +/* simd architecture flags */
>> +#define SIMD_OP_FLAGS_ARCH_SVE 0x01 /* ARM SVE */
>> +
>> +/* simd predicate flags */
>> +#define SIMD_OP_FLAGS_PRED_PARTIAL 0x01 /* partial predicate */
>> +#define SIMD_OP_FLAGS_PRED_EMPTY 0x02 /* empty predicate */
>> +
>> struct perf_sample {
>> u64 ip;
>> u32 pid, tid;
>> @@ -103,6 +115,7 @@ struct perf_sample {
>> struct stack_dump user_stack;
>> struct sample_read read;
>> struct aux_sample aux_sample;
>> + struct simd_flags simd_flags;
>> };
>>
>> /*
>> --
>> 2.25.1
>>