2020-10-22 20:26:23

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 00/20] perf arm-spe: Refactor decoding & dumping flow

This is patch set v3 for refactoring Arm SPE trace decoding and dumping.
In this version, it mainly addressed the comments and suggestions from
mailing list (mainly from Andre Przywara, thanks!).

This patch set is to refactor the Arm SPE decoding with:

- Patches 01, 02 are minor cleans up for header, typos;
- Patches 03, 04 and 05 are used to fix and polish the packet and
payload length calculation;
- Patch 06 is to add a helper to wrap up printing strings, this can
avoid bunch of duplicate code lines;
- Patches 07 ~ 18 are used to refactor decoding for different types
packet one by one (address packet, context packet, counter packet,
event packet, operation packet); it also introduces separate functions
for parsing specific packet, this can allow the code more readable and
easier to manage and extend code;
- Patch 19 comes from Andre to dump memory tagging;
- Patch 20 comes from Wei Li to add decoding for ARMv8.3 SVE extension.

This patch set is cleanly applied on the top of perf/core branch
with commit 7cf726a59435 ("Merge tag 'linux-kselftest-kunit-5.10-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest"),
And I tested this patch set on Hisilicon D06 platform with commands
"perf script" and "perf script -D".

Changes from v2:
- Tried best to address Andre's comments and refined patches;
- Added new patches 08, 11, 13, 16 for introducing new functions for
packets parsing (Andre);
- Removed size condition checking for event packet (Andre);
- Used PKT_XXX_GET() form to replace PKT_XXX_MASK()/PKT_XXX_SHIFT()
(Andre).

Changes from v1:
- Heavily rewrote the patch 05 for refactoring printing strings; this
is fundamental change, so adjusted the sequence for patches and moved
the printing string patch ahead from patch 10 (v1) to patch 05;
- Changed to use GENMASK_ULL() for bits mask;
- Added Andre's patch 13 for dumping memory tagging;
- Refined patch 12 for adding sub classes for Operation packet, merged
some commit log from Andre's patch, which allows commit log and code
to be more clear; Added "Co-developed-by: Andre Przywara" tag to
reflect this.


Andre Przywara (1):
perf arm_spe: Decode memory tagging properties

Leo Yan (18):
perf arm-spe: Include bitops.h for BIT() macro
perf arm-spe: Fix a typo in comment
perf arm-spe: Refactor payload size calculation
perf arm-spe: Refactor arm_spe_get_events()
perf arm-spe: Fix packet length handling
perf arm-spe: Refactor printing string to buffer
perf arm-spe: Refactor packet header parsing
perf arm-spe: Add new function arm_spe_pkt_desc_addr()
perf arm-spe: Refactor address packet handling
perf arm-spe: Refactor context packet handling
perf arm-spe: Add new function arm_spe_pkt_desc_counter()
perf arm-spe: Refactor counter packet handling
perf arm-spe: Add new function arm_spe_pkt_desc_event()
perf arm-spe: Refactor event type handling
perf arm-spe: Remove size condition checking for events
perf arm-spe: Add new function arm_spe_pkt_desc_op_type()
perf arm-spe: Refactor operation packet handling
perf arm-spe: Add more sub classes for operation packet

Wei Li (1):
perf arm-spe: Add support for ARMv8.3-SPE

.../util/arm-spe-decoder/arm-spe-decoder.c | 43 +-
.../util/arm-spe-decoder/arm-spe-decoder.h | 17 -
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 658 +++++++++++-------
.../arm-spe-decoder/arm-spe-pkt-decoder.h | 132 +++-
4 files changed, 536 insertions(+), 314 deletions(-)

--
2.17.1


2020-10-22 20:26:24

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 04/20] perf arm-spe: Refactor arm_spe_get_events()

In function arm_spe_get_events(), the event packet's 'index' is assigned
as payload length, but the flow is not directive: it firstly gets the
packet length from the return value of arm_spe_get_payload(), the value
includes header length (1) and payload length:

int ret = arm_spe_get_payload(buf, len, packet);

and then reduces header length from packet length, so finally get the
payload length:

packet->index = ret - 1;

To simplify the code, this patch directly assigns payload length to
event packet's index; and at the end it calls arm_spe_get_payload() to
return the payload value.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 4294c133a465..f3bb8bf102aa 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -136,8 +136,6 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
static int arm_spe_get_events(const unsigned char *buf, size_t len,
struct arm_spe_pkt *packet)
{
- int ret = arm_spe_get_payload(buf, len, packet);
-
packet->type = ARM_SPE_EVENTS;

/* we use index to identify Events with a less number of
@@ -145,9 +143,9 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
* LLC-REFILL, and REMOTE-ACCESS events are identified if
* index > 1.
*/
- packet->index = ret - 1;
+ packet->index = arm_spe_payload_len(buf[0]);

- return ret;
+ return arm_spe_get_payload(buf, len, packet);
}

static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
--
2.17.1

2020-10-22 20:26:26

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 13/20] perf arm-spe: Add new function arm_spe_pkt_desc_event()

This patch moves out the event packet parsing from arm_spe_pkt_desc()
to the new function arm_spe_pkt_desc_event().

Signed-off-by: Leo Yan <[email protected]>
---
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 136 ++++++++++--------
1 file changed, 73 insertions(+), 63 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 6eebd30f3d78..8a6b50f32a52 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -266,6 +266,78 @@ static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
return ret;
}

+static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
+ char *buf, size_t buf_len)
+{
+ u64 payload = packet->payload;
+ size_t blen = buf_len;
+ int ret;
+
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
+ if (ret < 0)
+ return ret;
+
+ if (payload & 0x1) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x2) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x4) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x8) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x10) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x20) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x40) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x80) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
+ if (ret < 0)
+ return ret;
+ }
+ if (packet->index > 1) {
+ if (payload & 0x100) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x200) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & 0x400) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
+ if (ret < 0)
+ return ret;
+ }
+ }
+
+ return buf_len - blen;
+}
+
static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
char *buf, size_t buf_len)
{
@@ -344,69 +416,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
case ARM_SPE_END:
return arm_spe_pkt_snprintf(&buf, &blen, "%s", name);
case ARM_SPE_EVENTS:
- ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
- if (ret < 0)
- return ret;
-
- if (payload & 0x1) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x2) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x4) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x8) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x10) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x20) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x40) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x80) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
- if (ret < 0)
- return ret;
- }
- if (idx > 1) {
- if (payload & 0x100) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x200) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
- if (ret < 0)
- return ret;
- }
- if (payload & 0x400) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
- if (ret < 0)
- return ret;
- }
- }
- return buf_len - blen;
-
+ return arm_spe_pkt_desc_event(packet, buf, buf_len);
case ARM_SPE_OP_TYPE:
switch (idx) {
case 0:
--
2.17.1

2020-10-22 20:26:37

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 09/20] perf arm-spe: Refactor address packet handling

This patch is to refactor address packet handling, it defines macros for
address packet's header and payload, these macros are used by decoder
and the dump flow.

Signed-off-by: Leo Yan <[email protected]>
---
.../util/arm-spe-decoder/arm-spe-decoder.c | 29 ++++++++--------
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 +++++++-------
.../arm-spe-decoder/arm-spe-pkt-decoder.h | 34 ++++++++++++-------
3 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index cc18a1e8c212..776b3e6628bb 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -24,36 +24,35 @@

static u64 arm_spe_calc_ip(int index, u64 payload)
{
- u8 *addr = (u8 *)&payload;
- int ns, el;
+ u64 ns, el;

/* Instruction virtual address or Branch target address */
if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
- ns = addr[7] & SPE_ADDR_PKT_NS;
- el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
+ ns = SPE_ADDR_PKT_GET_NS(payload);
+ el = SPE_ADDR_PKT_GET_EL(payload);
+
+ /* Clean highest byte */
+ payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);

/* Fill highest byte for EL1 or EL2 (VHE) mode */
if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
- addr[7] = 0xff;
- /* Clean highest byte for other cases */
- else
- addr[7] = 0x0;
+ payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;

/* Data access virtual address */
} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {

+ /* Clean tags */
+ payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
+
/* Fill highest byte if bits [48..55] is 0xff */
- if (addr[6] == 0xff)
- addr[7] = 0xff;
- /* Otherwise, cleanup tags */
- else
- addr[7] = 0x0;
+ if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
+ payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;

/* Data access physical address */
} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
- /* Cleanup byte 7 */
- addr[7] = 0x0;
+ /* Clean highest byte */
+ payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
} else {
pr_err("unsupported address packet index: 0x%x\n", index);
}
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 550cd7648c73..156f98d6b8b2 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -13,9 +13,6 @@

#include "arm-spe-pkt-decoder.h"

-#define NS_FLAG BIT(63)
-#define EL_FLAG (BIT(62) | BIT(61))
-
#if __BYTE_ORDER == __BIG_ENDIAN
#define le16_to_cpu bswap_16
#define le32_to_cpu bswap_32
@@ -167,10 +164,11 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
const unsigned char ext_hdr, struct arm_spe_pkt *packet)
{
packet->type = ARM_SPE_ADDRESS;
+
if (ext_hdr)
- packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+ packet->index = SPE_ADDR_PKT_HDR_EXTENDED_INDEX(buf[0], buf[1]);
else
- packet->index = buf[0] & 0x7;
+ packet->index = SPE_ADDR_PKT_HDR_SHORT_INDEX(buf[0]);

return arm_spe_get_payload(buf, len, ext_hdr, packet);
}
@@ -274,20 +272,20 @@ static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
u64 payload = packet->payload;

switch (idx) {
- case 0:
- case 1:
- ns = !!(packet->payload & NS_FLAG);
- el = (packet->payload & EL_FLAG) >> 61;
- payload &= ~(0xffULL << 56);
+ case SPE_ADDR_PKT_HDR_INDEX_INS:
+ case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
+ ns = !!SPE_ADDR_PKT_GET_NS(payload);
+ el = SPE_ADDR_PKT_GET_EL(payload);
+ payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
return arm_spe_pkt_snprintf(&buf, &buf_len,
"%s 0x%llx el%d ns=%d",
(idx == 1) ? "TGT" : "PC", payload, el, ns);
- case 2:
+ case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
return arm_spe_pkt_snprintf(&buf, &buf_len,
"VA 0x%llx", payload);
- case 3:
- ns = !!(packet->payload & NS_FLAG);
- payload &= ~(0xffULL << 56);
+ case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
+ ns = !!SPE_ADDR_PKT_GET_NS(payload);
+ payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
return arm_spe_pkt_snprintf(&buf, &buf_len,
"PA 0x%llx ns=%d", payload, ns);
default:
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 68552ff8a8f7..4111550d2bde 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -59,19 +59,27 @@ struct arm_spe_pkt {

#define SPE_HEADER_SZ(val) ((val & GENMASK_ULL(5, 4)) >> 4)

-#define SPE_ADDR_PKT_HDR_INDEX_INS (0x0)
-#define SPE_ADDR_PKT_HDR_INDEX_BRANCH (0x1)
-#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT (0x2)
-#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS (0x3)
-
-#define SPE_ADDR_PKT_NS BIT(7)
-#define SPE_ADDR_PKT_CH BIT(6)
-#define SPE_ADDR_PKT_EL_OFFSET (5)
-#define SPE_ADDR_PKT_EL_MASK (0x3 << SPE_ADDR_PKT_EL_OFFSET)
-#define SPE_ADDR_PKT_EL0 (0)
-#define SPE_ADDR_PKT_EL1 (1)
-#define SPE_ADDR_PKT_EL2 (2)
-#define SPE_ADDR_PKT_EL3 (3)
+/* Address packet header */
+#define SPE_ADDR_PKT_HDR_SHORT_INDEX(h) ((h) & GENMASK_ULL(2, 0))
+#define SPE_ADDR_PKT_HDR_EXTENDED_INDEX(h0, h1) (((h0) & GENMASK_ULL(1, 0)) << 3 | \
+ SPE_ADDR_PKT_HDR_SHORT_INDEX(h1))
+#define SPE_ADDR_PKT_HDR_INDEX_INS 0x0
+#define SPE_ADDR_PKT_HDR_INDEX_BRANCH 0x1
+#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT 0x2
+#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS 0x3
+
+/* Address packet payload */
+#define SPE_ADDR_PKT_ADDR_BYTE7_SHIFT 56
+#define SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(v) ((v) & GENMASK_ULL(55, 0))
+#define SPE_ADDR_PKT_ADDR_GET_BYTE_6(v) (((v) & GENMASK_ULL(55, 48)) >> 48)
+
+#define SPE_ADDR_PKT_GET_NS(v) (((v) & BIT(63)) >> 63)
+#define SPE_ADDR_PKT_GET_EL(v) (((v) & GENMASK_ULL(62, 61)) >> 61)
+
+#define SPE_ADDR_PKT_EL0 0
+#define SPE_ADDR_PKT_EL1 1
+#define SPE_ADDR_PKT_EL2 2
+#define SPE_ADDR_PKT_EL3 3

const char *arm_spe_pkt_name(enum arm_spe_pkt_type);

--
2.17.1

2020-10-22 20:26:49

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 07/20] perf arm-spe: Refactor packet header parsing

The packet header parsing uses the hard coded values and it uses nested
if-else statements.

To improve the readability, this patch refactors the macros for packet
header format so it removes the hard coded values. Furthermore, based
on the new mask macros it reduces the nested if-else statements and
changes to use the flat conditions checking, this is directive and can
easily map to the descriptions in ARMv8-a architecture reference manual
(ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
protocol packet headers'.

Signed-off-by: Leo Yan <[email protected]>
---
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 92 +++++++++----------
.../arm-spe-decoder/arm-spe-pkt-decoder.h | 20 ++++
2 files changed, 61 insertions(+), 51 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index c7b6dc016f11..6f2329990729 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -16,28 +16,6 @@
#define NS_FLAG BIT(63)
#define EL_FLAG (BIT(62) | BIT(61))

-#define SPE_HEADER0_PAD 0x0
-#define SPE_HEADER0_END 0x1
-#define SPE_HEADER0_ADDRESS 0x30 /* address packet (short) */
-#define SPE_HEADER0_ADDRESS_MASK 0x38
-#define SPE_HEADER0_COUNTER 0x18 /* counter packet (short) */
-#define SPE_HEADER0_COUNTER_MASK 0x38
-#define SPE_HEADER0_TIMESTAMP 0x71
-#define SPE_HEADER0_TIMESTAMP 0x71
-#define SPE_HEADER0_EVENTS 0x2
-#define SPE_HEADER0_EVENTS_MASK 0xf
-#define SPE_HEADER0_SOURCE 0x3
-#define SPE_HEADER0_SOURCE_MASK 0xf
-#define SPE_HEADER0_CONTEXT 0x24
-#define SPE_HEADER0_CONTEXT_MASK 0x3c
-#define SPE_HEADER0_OP_TYPE 0x8
-#define SPE_HEADER0_OP_TYPE_MASK 0x3c
-#define SPE_HEADER1_ALIGNMENT 0x0
-#define SPE_HEADER1_ADDRESS 0xb0 /* address packet (extended) */
-#define SPE_HEADER1_ADDRESS_MASK 0xf8
-#define SPE_HEADER1_COUNTER 0x98 /* counter packet (extended) */
-#define SPE_HEADER1_COUNTER_MASK 0xf8
-
#if __BYTE_ORDER == __BIG_ENDIAN
#define le16_to_cpu bswap_16
#define le32_to_cpu bswap_32
@@ -200,46 +178,58 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
struct arm_spe_pkt *packet)
{
- unsigned int byte;
+ unsigned int hdr;
+ unsigned char ext_hdr = 0;

memset(packet, 0, sizeof(struct arm_spe_pkt));

if (!len)
return ARM_SPE_NEED_MORE_BYTES;

- byte = buf[0];
- if (byte == SPE_HEADER0_PAD)
+ hdr = buf[0];
+
+ if (hdr == SPE_HEADER0_PAD)
return arm_spe_get_pad(packet);
- else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
+
+ if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
return arm_spe_get_end(packet);
- else if (byte & 0xc0 /* 0y11xxxxxx */) {
- if (byte & 0x80) {
- if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
- return arm_spe_get_addr(buf, len, 0, packet);
- if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
- return arm_spe_get_counter(buf, len, 0, packet);
- } else
- if (byte == SPE_HEADER0_TIMESTAMP)
- return arm_spe_get_timestamp(buf, len, packet);
- else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
- return arm_spe_get_events(buf, len, packet);
- else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
- return arm_spe_get_data_source(buf, len, packet);
- else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
- return arm_spe_get_context(buf, len, packet);
- else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
- return arm_spe_get_op_type(buf, len, packet);
- } else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
- /* 16-bit header */
- byte = buf[1];
- if (byte == SPE_HEADER1_ALIGNMENT)
+
+ if (hdr == SPE_HEADER0_TIMESTAMP)
+ return arm_spe_get_timestamp(buf, len, packet);
+
+ if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_EVENTS)
+ return arm_spe_get_events(buf, len, packet);
+
+ if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_SOURCE)
+ return arm_spe_get_data_source(buf, len, packet);
+
+ if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_CONTEXT)
+ return arm_spe_get_context(buf, len, packet);
+
+ if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_OP_TYPE)
+ return arm_spe_get_op_type(buf, len, packet);
+
+ if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_EXTENDED) {
+ /* 16-bit extended format header */
+ ext_hdr = 1;
+
+ hdr = buf[1];
+ if (hdr == SPE_HEADER1_ALIGNMENT)
return arm_spe_get_alignment(buf, len, packet);
- else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
- return arm_spe_get_addr(buf, len, 1, packet);
- else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
- return arm_spe_get_counter(buf, len, 1, packet);
}

+ /*
+ * The short format header's byte 0 or the extended format header's
+ * byte 1 has been assigned to 'hdr', which uses the same encoding for
+ * address packet and counter packet, so don't need to distinguish if
+ * it's short format or extended format and handle in once.
+ */
+ if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_ADDRESS)
+ return arm_spe_get_addr(buf, len, ext_hdr, packet);
+
+ if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_COUNTER)
+ return arm_spe_get_counter(buf, len, ext_hdr, packet);
+
return ARM_SPE_BAD_PACKET;
}

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index e9ea8e3ead5d..68552ff8a8f7 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -37,6 +37,26 @@ struct arm_spe_pkt {
uint64_t payload;
};

+/* Short header (HEADER0) and extended header (HEADER1) */
+#define SPE_HEADER0_PAD 0x0
+#define SPE_HEADER0_END 0x1
+#define SPE_HEADER0_TIMESTAMP 0x71
+/* Mask for event & data source */
+#define SPE_HEADER0_MASK1 (GENMASK_ULL(7, 6) | GENMASK_ULL(3, 0))
+#define SPE_HEADER0_EVENTS 0x42
+#define SPE_HEADER0_SOURCE 0x43
+/* Mask for context & operation */
+#define SPE_HEADER0_MASK2 GENMASK_ULL(7, 2)
+#define SPE_HEADER0_CONTEXT 0x64
+#define SPE_HEADER0_OP_TYPE 0x48
+/* Mask for extended format */
+#define SPE_HEADER0_EXTENDED 0x20
+/* Mask for address & counter */
+#define SPE_HEADER0_MASK3 GENMASK_ULL(7, 3)
+#define SPE_HEADER0_ADDRESS 0xb0
+#define SPE_HEADER0_COUNTER 0x98
+#define SPE_HEADER1_ALIGNMENT 0x0
+
#define SPE_HEADER_SZ(val) ((val & GENMASK_ULL(5, 4)) >> 4)

#define SPE_ADDR_PKT_HDR_INDEX_INS (0x0)
--
2.17.1

2020-10-22 20:27:03

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 06/20] perf arm-spe: Refactor printing string to buffer

When outputs strings to the decoding buffer with function snprintf(),
SPE decoder needs to detects if any error returns from snprintf() and if
so needs to directly bail out. If snprintf() returns success, it needs
to update buffer pointer and reduce the buffer length so can continue to
output the next string into the consequent memory space.

This complex logics are spreading in the function arm_spe_pkt_desc() so
there has many duplicate codes for handling error detecting, increment
buffer pointer and decrement buffer size.

To avoid the duplicate code, this patch introduces a new helper function
arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
the caller arm_spe_pkt_desc() will call it and simply check the returns
value.

This patch also moves the variable 'blen' as the function's local
variable, this allows to remove the unnecessary braces and improve the
readability.

Suggested-by: Dave Martin <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
---
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 247 ++++++++++--------
1 file changed, 135 insertions(+), 112 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index e9ec7edb51a0..c7b6dc016f11 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -9,6 +9,7 @@
#include <endian.h>
#include <byteswap.h>
#include <linux/bitops.h>
+#include <stdarg.h>

#include "arm-spe-pkt-decoder.h"

@@ -258,192 +259,214 @@ int arm_spe_get_packet(const unsigned char *buf, size_t len,
return ret;
}

+static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
+ const char *fmt, ...)
+{
+ va_list ap;
+ int ret;
+
+ va_start(ap, fmt);
+ ret = vsnprintf(*buf_p, *blen, fmt, ap);
+ va_end(ap);
+
+ if (ret < 0)
+ return ret;
+
+ *buf_p += ret;
+ *blen -= ret;
+ return ret;
+}
+
int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
size_t buf_len)
{
int ret, ns, el, idx = packet->index;
unsigned long long payload = packet->payload;
const char *name = arm_spe_pkt_name(packet->type);
+ size_t blen = buf_len;

switch (packet->type) {
case ARM_SPE_BAD:
case ARM_SPE_PAD:
case ARM_SPE_END:
- return snprintf(buf, buf_len, "%s", name);
- case ARM_SPE_EVENTS: {
- size_t blen = buf_len;
-
- ret = 0;
- ret = snprintf(buf, buf_len, "EV");
- buf += ret;
- blen -= ret;
+ return arm_spe_pkt_snprintf(&buf, &blen, "%s", name);
+ case ARM_SPE_EVENTS:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
+ if (ret < 0)
+ return ret;
+
if (payload & 0x1) {
- ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x2) {
- ret = snprintf(buf, buf_len, " RETIRED");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x4) {
- ret = snprintf(buf, buf_len, " L1D-ACCESS");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x8) {
- ret = snprintf(buf, buf_len, " L1D-REFILL");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x10) {
- ret = snprintf(buf, buf_len, " TLB-ACCESS");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x20) {
- ret = snprintf(buf, buf_len, " TLB-REFILL");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x40) {
- ret = snprintf(buf, buf_len, " NOT-TAKEN");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x80) {
- ret = snprintf(buf, buf_len, " MISPRED");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
+ if (ret < 0)
+ return ret;
}
if (idx > 1) {
if (payload & 0x100) {
- ret = snprintf(buf, buf_len, " LLC-ACCESS");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x200) {
- ret = snprintf(buf, buf_len, " LLC-REFILL");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x400) {
- ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
+ if (ret < 0)
+ return ret;
}
}
- if (ret < 0)
- return ret;
- blen -= ret;
return buf_len - blen;
- }
+
case ARM_SPE_OP_TYPE:
switch (idx) {
- case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ?
- "COND-SELECT" : "INSN-OTHER");
- case 1: {
- size_t blen = buf_len;
-
- if (payload & 0x1)
- ret = snprintf(buf, buf_len, "ST");
- else
- ret = snprintf(buf, buf_len, "LD");
- buf += ret;
- blen -= ret;
+ case 0:
+ return arm_spe_pkt_snprintf(&buf, &blen,
+ payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
+ case 1:
+ ret = arm_spe_pkt_snprintf(&buf, &blen,
+ payload & 0x1 ? "ST" : "LD");
+ if (ret < 0)
+ return ret;
+
if (payload & 0x2) {
if (payload & 0x4) {
- ret = snprintf(buf, buf_len, " AT");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x8) {
- ret = snprintf(buf, buf_len, " EXCL");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x10) {
- ret = snprintf(buf, buf_len, " AR");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
+ if (ret < 0)
+ return ret;
}
} else if (payload & 0x4) {
- ret = snprintf(buf, buf_len, " SIMD-FP");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
+ if (ret < 0)
+ return ret;
}
+
+ return buf_len - blen;
+
+ case 2:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
if (ret < 0)
return ret;
- blen -= ret;
- return buf_len - blen;
- }
- case 2: {
- size_t blen = buf_len;

- ret = snprintf(buf, buf_len, "B");
- buf += ret;
- blen -= ret;
if (payload & 0x1) {
- ret = snprintf(buf, buf_len, " COND");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
+ if (ret < 0)
+ return ret;
}
if (payload & 0x2) {
- ret = snprintf(buf, buf_len, " IND");
- buf += ret;
- blen -= ret;
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
+ if (ret < 0)
+ return ret;
}
- if (ret < 0)
- return ret;
- blen -= ret;
+
return buf_len - blen;
- }
- default: return 0;
+
+ default:
+ return 0;
}
case ARM_SPE_DATA_SOURCE:
case ARM_SPE_TIMESTAMP:
- return snprintf(buf, buf_len, "%s %lld", name, payload);
+ return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
case ARM_SPE_ADDRESS:
switch (idx) {
case 0:
case 1: ns = !!(packet->payload & NS_FLAG);
el = (packet->payload & EL_FLAG) >> 61;
payload &= ~(0xffULL << 56);
- return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
+ return arm_spe_pkt_snprintf(&buf, &blen,
+ "%s 0x%llx el%d ns=%d",
(idx == 1) ? "TGT" : "PC", payload, el, ns);
- case 2: return snprintf(buf, buf_len, "VA 0x%llx", payload);
+ case 2:
+ return arm_spe_pkt_snprintf(&buf, &blen,
+ "VA 0x%llx", payload);
case 3: ns = !!(packet->payload & NS_FLAG);
payload &= ~(0xffULL << 56);
- return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
- payload, ns);
- default: return 0;
+ return arm_spe_pkt_snprintf(&buf, &blen,
+ "PA 0x%llx ns=%d", payload, ns);
+ default:
+ return 0;
}
case ARM_SPE_CONTEXT:
- return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
- (unsigned long)payload, idx + 1);
- case ARM_SPE_COUNTER: {
- size_t blen = buf_len;
-
- ret = snprintf(buf, buf_len, "%s %d ", name,
- (unsigned short)payload);
- buf += ret;
- blen -= ret;
- switch (idx) {
- case 0: ret = snprintf(buf, buf_len, "TOT"); break;
- case 1: ret = snprintf(buf, buf_len, "ISSUE"); break;
- case 2: ret = snprintf(buf, buf_len, "XLAT"); break;
- default: ret = 0;
- }
+ return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
+ name, (unsigned long)payload, idx + 1);
+ case ARM_SPE_COUNTER:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
+ (unsigned short)payload);
if (ret < 0)
return ret;
- blen -= ret;
+
+ switch (idx) {
+ case 0:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
+ if (ret < 0)
+ return ret;
+ break;
+ case 1:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
+ if (ret < 0)
+ return ret;
+ break;
+ case 2:
+ ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
+ if (ret < 0)
+ return ret;
+ break;
+ default:
+ break;
+ }
+
return buf_len - blen;
- }
+
default:
break;
}

- return snprintf(buf, buf_len, "%s 0x%llx (%d)",
- name, payload, packet->index);
+ return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%llx (%d)",
+ name, payload, packet->index);
}
--
2.17.1

2020-10-22 23:11:59

by Leo Yan

[permalink] [raw]
Subject: [PATCH v3 15/20] perf arm-spe: Remove size condition checking for events

In the Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.6 Events packet", it
describes the event bit is valid with specific payload requirement. For
example, the Last Level cache access event, the bit is defined as:

E[8], byte 1 bit [0], when SZ == 0b01 , when SZ == 0b10 ,
or when SZ == 0b11

It requires the payload size is at least 2 bytes, when byte 1 (start
counting from 0) is valid, E[8] (bit 0 in byte 1) can be used for LLC
access event type. For safety, the code checks the condition for
payload size firstly, if meet the requirement for payload size, then
continue to parse event type.

If review function arm_spe_get_payload(), it has used cast, so any bytes
beyond the valid size have been set to zeros.

For this reason, we don't need to check payload size anymore afterwards
when parse events, thus this patch removes payload size conditions.

Suggested-by: Andre Przywara <[email protected]>
Signed-off-by: Leo Yan <[email protected]>
---
.../util/arm-spe-decoder/arm-spe-decoder.c | 9 ++----
.../arm-spe-decoder/arm-spe-pkt-decoder.c | 30 +++++++++----------
2 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 776b3e6628bb..a5d7509d5daa 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -178,16 +178,13 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
if (payload & BIT(EV_TLB_ACCESS))
decoder->record.type |= ARM_SPE_TLB_ACCESS;

- if ((idx == 2 || idx == 4 || idx == 8) &&
- (payload & BIT(EV_LLC_MISS)))
+ if (payload & BIT(EV_LLC_MISS))
decoder->record.type |= ARM_SPE_LLC_MISS;

- if ((idx == 2 || idx == 4 || idx == 8) &&
- (payload & BIT(EV_LLC_ACCESS)))
+ if (payload & BIT(EV_LLC_ACCESS))
decoder->record.type |= ARM_SPE_LLC_ACCESS;

- if ((idx == 2 || idx == 4 || idx == 8) &&
- (payload & BIT(EV_REMOTE_ACCESS)))
+ if (payload & BIT(EV_REMOTE_ACCESS))
decoder->record.type |= ARM_SPE_REMOTE_ACCESS;

if (payload & BIT(EV_MISPRED))
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 58a1390b7a43..2cb019999016 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -317,22 +317,20 @@ static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
if (ret < 0)
return ret;
}
- if (packet->index > 1) {
- if (payload & BIT(EV_LLC_ACCESS)) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
- if (ret < 0)
- return ret;
- }
- if (payload & BIT(EV_LLC_MISS)) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
- if (ret < 0)
- return ret;
- }
- if (payload & BIT(EV_REMOTE_ACCESS)) {
- ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
- if (ret < 0)
- return ret;
- }
+ if (payload & BIT(EV_LLC_ACCESS)) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & BIT(EV_LLC_MISS)) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
+ if (ret < 0)
+ return ret;
+ }
+ if (payload & BIT(EV_REMOTE_ACCESS)) {
+ ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
+ if (ret < 0)
+ return ret;
}

return buf_len - blen;
--
2.17.1

2020-10-23 18:59:33

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH v3 04/20] perf arm-spe: Refactor arm_spe_get_events()

On 22/10/2020 15:58, Leo Yan wrote:
> In function arm_spe_get_events(), the event packet's 'index' is assigned
> as payload length, but the flow is not directive: it firstly gets the
> packet length from the return value of arm_spe_get_payload(), the value
> includes header length (1) and payload length:
>
> int ret = arm_spe_get_payload(buf, len, packet);
>
> and then reduces header length from packet length, so finally get the
> payload length:
>
> packet->index = ret - 1;
>
> To simplify the code, this patch directly assigns payload length to
> event packet's index; and at the end it calls arm_spe_get_payload() to
> return the payload value.
>
> Signed-off-by: Leo Yan <[email protected]>

Reviewed-by: Andre Przywara <[email protected]>

Cheers,
Andre

> ---
> tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 4294c133a465..f3bb8bf102aa 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -136,8 +136,6 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
> static int arm_spe_get_events(const unsigned char *buf, size_t len,
> struct arm_spe_pkt *packet)
> {
> - int ret = arm_spe_get_payload(buf, len, packet);
> -
> packet->type = ARM_SPE_EVENTS;
>
> /* we use index to identify Events with a less number of
> @@ -145,9 +143,9 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
> * LLC-REFILL, and REMOTE-ACCESS events are identified if
> * index > 1.
> */
> - packet->index = ret - 1;
> + packet->index = arm_spe_payload_len(buf[0]);
>
> - return ret;
> + return arm_spe_get_payload(buf, len, packet);
> }
>
> static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
>

2020-10-23 19:04:13

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH v3 09/20] perf arm-spe: Refactor address packet handling

On 22/10/2020 15:58, Leo Yan wrote:

Hi Leo,

> This patch is to refactor address packet handling, it defines macros for
> address packet's header and payload, these macros are used by decoder
> and the dump flow.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> .../util/arm-spe-decoder/arm-spe-decoder.c | 29 ++++++++--------
> .../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 +++++++-------
> .../arm-spe-decoder/arm-spe-pkt-decoder.h | 34 ++++++++++++-------
> 3 files changed, 47 insertions(+), 42 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index cc18a1e8c212..776b3e6628bb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,36 +24,35 @@
>
> static u64 arm_spe_calc_ip(int index, u64 payload)
> {
> - u8 *addr = (u8 *)&payload;
> - int ns, el;
> + u64 ns, el;
>
> /* Instruction virtual address or Branch target address */
> if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
> index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> - ns = addr[7] & SPE_ADDR_PKT_NS;
> - el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> + ns = SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> +
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
>
> /* Fill highest byte for EL1 or EL2 (VHE) mode */
> if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> - addr[7] = 0xff;
> - /* Clean highest byte for other cases */
> - else
> - addr[7] = 0x0;
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>
> /* Data access virtual address */
> } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
>
> + /* Clean tags */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> +
> /* Fill highest byte if bits [48..55] is 0xff */

Do you know where this comes from? If yes, can you replace the comment
with the reason, so that it says *why* and not *what* the code does?

> - if (addr[6] == 0xff)
> - addr[7] = 0xff;
> - /* Otherwise, cleanup tags */
> - else
> - addr[7] = 0x0;
> + if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
> + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>
> /* Data access physical address */
> } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> - /* Cleanup byte 7 */
> - addr[7] = 0x0;
> + /* Clean highest byte */
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> } else {
> pr_err("unsupported address packet index: 0x%x\n", index);
> }
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 550cd7648c73..156f98d6b8b2 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -13,9 +13,6 @@
>
> #include "arm-spe-pkt-decoder.h"
>
> -#define NS_FLAG BIT(63)
> -#define EL_FLAG (BIT(62) | BIT(61))
> -
> #if __BYTE_ORDER == __BIG_ENDIAN
> #define le16_to_cpu bswap_16
> #define le32_to_cpu bswap_32
> @@ -167,10 +164,11 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
> const unsigned char ext_hdr, struct arm_spe_pkt *packet)
> {
> packet->type = ARM_SPE_ADDRESS;
> +
> if (ext_hdr)
> - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> + packet->index = SPE_ADDR_PKT_HDR_EXTENDED_INDEX(buf[0], buf[1]);
> else
> - packet->index = buf[0] & 0x7;
> + packet->index = SPE_ADDR_PKT_HDR_SHORT_INDEX(buf[0]);
>
> return arm_spe_get_payload(buf, len, ext_hdr, packet);
> }
> @@ -274,20 +272,20 @@ static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
> u64 payload = packet->payload;
>
> switch (idx) {
> - case 0:
> - case 1:
> - ns = !!(packet->payload & NS_FLAG);
> - el = (packet->payload & EL_FLAG) >> 61;
> - payload &= ~(0xffULL << 56);
> + case SPE_ADDR_PKT_HDR_INDEX_INS:
> + case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
> + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> + el = SPE_ADDR_PKT_GET_EL(payload);
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> return arm_spe_pkt_snprintf(&buf, &buf_len,
> "%s 0x%llx el%d ns=%d",
> (idx == 1) ? "TGT" : "PC", payload, el, ns);
> - case 2:
> + case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
> return arm_spe_pkt_snprintf(&buf, &buf_len,
> "VA 0x%llx", payload);
> - case 3:
> - ns = !!(packet->payload & NS_FLAG);
> - payload &= ~(0xffULL << 56);
> + case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
> + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> return arm_spe_pkt_snprintf(&buf, &buf_len,
> "PA 0x%llx ns=%d", payload, ns);
> default:
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 68552ff8a8f7..4111550d2bde 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -59,19 +59,27 @@ struct arm_spe_pkt {
>
> #define SPE_HEADER_SZ(val) ((val & GENMASK_ULL(5, 4)) >> 4)
>
> -#define SPE_ADDR_PKT_HDR_INDEX_INS (0x0)
> -#define SPE_ADDR_PKT_HDR_INDEX_BRANCH (0x1)
> -#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT (0x2)
> -#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS (0x3)
> -
> -#define SPE_ADDR_PKT_NS BIT(7)
> -#define SPE_ADDR_PKT_CH BIT(6)
> -#define SPE_ADDR_PKT_EL_OFFSET (5)
> -#define SPE_ADDR_PKT_EL_MASK (0x3 << SPE_ADDR_PKT_EL_OFFSET)
> -#define SPE_ADDR_PKT_EL0 (0)
> -#define SPE_ADDR_PKT_EL1 (1)
> -#define SPE_ADDR_PKT_EL2 (2)
> -#define SPE_ADDR_PKT_EL3 (3)
> +/* Address packet header */
> +#define SPE_ADDR_PKT_HDR_SHORT_INDEX(h) ((h) & GENMASK_ULL(2, 0))
> +#define SPE_ADDR_PKT_HDR_EXTENDED_INDEX(h0, h1) (((h0) & GENMASK_ULL(1, 0)) << 3 | \
> + SPE_ADDR_PKT_HDR_SHORT_INDEX(h1))

Did you consider sharing those two with the identical definition for the
extended counter packet? This extended packet seems more like a generic
concept, regardless of the packet type.

> +#define SPE_ADDR_PKT_HDR_INDEX_INS 0x0
> +#define SPE_ADDR_PKT_HDR_INDEX_BRANCH 0x1
> +#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT 0x2
> +#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS 0x3
> +
> +/* Address packet payload */
> +#define SPE_ADDR_PKT_ADDR_BYTE7_SHIFT 56
> +#define SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(v) ((v) & GENMASK_ULL(55, 0))
> +#define SPE_ADDR_PKT_ADDR_GET_BYTE_6(v) (((v) & GENMASK_ULL(55, 48)) >> 48)
> +
> +#define SPE_ADDR_PKT_GET_NS(v) (((v) & BIT(63)) >> 63)

You need BIT_ULL(63) here to make this work on 32-bit systems.

Cheers,
Andre

> +#define SPE_ADDR_PKT_GET_EL(v) (((v) & GENMASK_ULL(62, 61)) >> 61)
> +
> +#define SPE_ADDR_PKT_EL0 0
> +#define SPE_ADDR_PKT_EL1 1
> +#define SPE_ADDR_PKT_EL2 2
> +#define SPE_ADDR_PKT_EL3 3
>
> const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>
>

2020-10-23 22:35:15

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH v3 07/20] perf arm-spe: Refactor packet header parsing

On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> The packet header parsing uses the hard coded values and it uses nested
> if-else statements.
>
> To improve the readability, this patch refactors the macros for packet
> header format so it removes the hard coded values. Furthermore, based
> on the new mask macros it reduces the nested if-else statements and
> changes to use the flat conditions checking, this is directive and can
> easily map to the descriptions in ARMv8-a architecture reference manual
> (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
> protocol packet headers'.
>
> Signed-off-by: Leo Yan <[email protected]>

Compared against the previous version, which I had checked already
against the manual. Thanks for the fixes!

Reviewed-by: Andre Przywara <[email protected]>

Thanks,
Andre


> ---
> .../arm-spe-decoder/arm-spe-pkt-decoder.c | 92 +++++++++----------
> .../arm-spe-decoder/arm-spe-pkt-decoder.h | 20 ++++
> 2 files changed, 61 insertions(+), 51 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index c7b6dc016f11..6f2329990729 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -16,28 +16,6 @@
> #define NS_FLAG BIT(63)
> #define EL_FLAG (BIT(62) | BIT(61))
>
> -#define SPE_HEADER0_PAD 0x0
> -#define SPE_HEADER0_END 0x1
> -#define SPE_HEADER0_ADDRESS 0x30 /* address packet (short) */
> -#define SPE_HEADER0_ADDRESS_MASK 0x38
> -#define SPE_HEADER0_COUNTER 0x18 /* counter packet (short) */
> -#define SPE_HEADER0_COUNTER_MASK 0x38
> -#define SPE_HEADER0_TIMESTAMP 0x71
> -#define SPE_HEADER0_TIMESTAMP 0x71
> -#define SPE_HEADER0_EVENTS 0x2
> -#define SPE_HEADER0_EVENTS_MASK 0xf
> -#define SPE_HEADER0_SOURCE 0x3
> -#define SPE_HEADER0_SOURCE_MASK 0xf
> -#define SPE_HEADER0_CONTEXT 0x24
> -#define SPE_HEADER0_CONTEXT_MASK 0x3c
> -#define SPE_HEADER0_OP_TYPE 0x8
> -#define SPE_HEADER0_OP_TYPE_MASK 0x3c
> -#define SPE_HEADER1_ALIGNMENT 0x0
> -#define SPE_HEADER1_ADDRESS 0xb0 /* address packet (extended) */
> -#define SPE_HEADER1_ADDRESS_MASK 0xf8
> -#define SPE_HEADER1_COUNTER 0x98 /* counter packet (extended) */
> -#define SPE_HEADER1_COUNTER_MASK 0xf8
> -
> #if __BYTE_ORDER == __BIG_ENDIAN
> #define le16_to_cpu bswap_16
> #define le32_to_cpu bswap_32
> @@ -200,46 +178,58 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
> static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
> struct arm_spe_pkt *packet)
> {
> - unsigned int byte;
> + unsigned int hdr;
> + unsigned char ext_hdr = 0;
>
> memset(packet, 0, sizeof(struct arm_spe_pkt));
>
> if (!len)
> return ARM_SPE_NEED_MORE_BYTES;
>
> - byte = buf[0];
> - if (byte == SPE_HEADER0_PAD)
> + hdr = buf[0];
> +
> + if (hdr == SPE_HEADER0_PAD)
> return arm_spe_get_pad(packet);
> - else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> +
> + if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
> return arm_spe_get_end(packet);
> - else if (byte & 0xc0 /* 0y11xxxxxx */) {
> - if (byte & 0x80) {
> - if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
> - return arm_spe_get_addr(buf, len, 0, packet);
> - if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
> - return arm_spe_get_counter(buf, len, 0, packet);
> - } else
> - if (byte == SPE_HEADER0_TIMESTAMP)
> - return arm_spe_get_timestamp(buf, len, packet);
> - else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
> - return arm_spe_get_events(buf, len, packet);
> - else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
> - return arm_spe_get_data_source(buf, len, packet);
> - else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
> - return arm_spe_get_context(buf, len, packet);
> - else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
> - return arm_spe_get_op_type(buf, len, packet);
> - } else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
> - /* 16-bit header */
> - byte = buf[1];
> - if (byte == SPE_HEADER1_ALIGNMENT)
> +
> + if (hdr == SPE_HEADER0_TIMESTAMP)
> + return arm_spe_get_timestamp(buf, len, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_EVENTS)
> + return arm_spe_get_events(buf, len, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_SOURCE)
> + return arm_spe_get_data_source(buf, len, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_CONTEXT)
> + return arm_spe_get_context(buf, len, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_OP_TYPE)
> + return arm_spe_get_op_type(buf, len, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_EXTENDED) {
> + /* 16-bit extended format header */
> + ext_hdr = 1;
> +
> + hdr = buf[1];
> + if (hdr == SPE_HEADER1_ALIGNMENT)
> return arm_spe_get_alignment(buf, len, packet);
> - else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
> - return arm_spe_get_addr(buf, len, 1, packet);
> - else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
> - return arm_spe_get_counter(buf, len, 1, packet);
> }
>
> + /*
> + * The short format header's byte 0 or the extended format header's
> + * byte 1 has been assigned to 'hdr', which uses the same encoding for
> + * address packet and counter packet, so don't need to distinguish if
> + * it's short format or extended format and handle in once.
> + */
> + if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_ADDRESS)
> + return arm_spe_get_addr(buf, len, ext_hdr, packet);
> +
> + if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_COUNTER)
> + return arm_spe_get_counter(buf, len, ext_hdr, packet);
> +
> return ARM_SPE_BAD_PACKET;
> }
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index e9ea8e3ead5d..68552ff8a8f7 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -37,6 +37,26 @@ struct arm_spe_pkt {
> uint64_t payload;
> };
>
> +/* Short header (HEADER0) and extended header (HEADER1) */
> +#define SPE_HEADER0_PAD 0x0
> +#define SPE_HEADER0_END 0x1
> +#define SPE_HEADER0_TIMESTAMP 0x71
> +/* Mask for event & data source */
> +#define SPE_HEADER0_MASK1 (GENMASK_ULL(7, 6) | GENMASK_ULL(3, 0))
> +#define SPE_HEADER0_EVENTS 0x42
> +#define SPE_HEADER0_SOURCE 0x43
> +/* Mask for context & operation */
> +#define SPE_HEADER0_MASK2 GENMASK_ULL(7, 2)
> +#define SPE_HEADER0_CONTEXT 0x64
> +#define SPE_HEADER0_OP_TYPE 0x48
> +/* Mask for extended format */
> +#define SPE_HEADER0_EXTENDED 0x20
> +/* Mask for address & counter */
> +#define SPE_HEADER0_MASK3 GENMASK_ULL(7, 3)
> +#define SPE_HEADER0_ADDRESS 0xb0
> +#define SPE_HEADER0_COUNTER 0x98
> +#define SPE_HEADER1_ALIGNMENT 0x0
> +
> #define SPE_HEADER_SZ(val) ((val & GENMASK_ULL(5, 4)) >> 4)
>
> #define SPE_ADDR_PKT_HDR_INDEX_INS (0x0)
>

2020-10-23 22:36:31

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH v3 15/20] perf arm-spe: Remove size condition checking for events

On 22/10/2020 15:58, Leo Yan wrote:
> In the Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.6 Events packet", it
> describes the event bit is valid with specific payload requirement. For
> example, the Last Level cache access event, the bit is defined as:
>
> E[8], byte 1 bit [0], when SZ == 0b01 , when SZ == 0b10 ,
> or when SZ == 0b11
>
> It requires the payload size is at least 2 bytes, when byte 1 (start
> counting from 0) is valid, E[8] (bit 0 in byte 1) can be used for LLC
> access event type. For safety, the code checks the condition for
> payload size firstly, if meet the requirement for payload size, then
> continue to parse event type.
>
> If review function arm_spe_get_payload(), it has used cast, so any bytes
> beyond the valid size have been set to zeros.
>
> For this reason, we don't need to check payload size anymore afterwards
> when parse events, thus this patch removes payload size conditions.
>
> Suggested-by: Andre Przywara <[email protected]>
> Signed-off-by: Leo Yan <[email protected]>

Thanks, that looks better now!

Reviewed-by: Andre Przywara <[email protected]>

Cheers,
Andre

> ---
> .../util/arm-spe-decoder/arm-spe-decoder.c | 9 ++----
> .../arm-spe-decoder/arm-spe-pkt-decoder.c | 30 +++++++++----------
> 2 files changed, 17 insertions(+), 22 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 776b3e6628bb..a5d7509d5daa 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -178,16 +178,13 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
> if (payload & BIT(EV_TLB_ACCESS))
> decoder->record.type |= ARM_SPE_TLB_ACCESS;
>
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_MISS)))
> + if (payload & BIT(EV_LLC_MISS))
> decoder->record.type |= ARM_SPE_LLC_MISS;
>
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_LLC_ACCESS)))
> + if (payload & BIT(EV_LLC_ACCESS))
> decoder->record.type |= ARM_SPE_LLC_ACCESS;
>
> - if ((idx == 2 || idx == 4 || idx == 8) &&
> - (payload & BIT(EV_REMOTE_ACCESS)))
> + if (payload & BIT(EV_REMOTE_ACCESS))
> decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>
> if (payload & BIT(EV_MISPRED))
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 58a1390b7a43..2cb019999016 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -317,22 +317,20 @@ static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
> if (ret < 0)
> return ret;
> }
> - if (packet->index > 1) {
> - if (payload & BIT(EV_LLC_ACCESS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & BIT(EV_LLC_MISS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & BIT(EV_REMOTE_ACCESS)) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> + if (payload & BIT(EV_LLC_ACCESS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_LLC_MISS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & BIT(EV_REMOTE_ACCESS)) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> + if (ret < 0)
> + return ret;
> }
>
> return buf_len - blen;
>

2020-10-24 02:10:05

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH v3 13/20] perf arm-spe: Add new function arm_spe_pkt_desc_event()

On 22/10/2020 15:58, Leo Yan wrote:

Hi,

> This patch moves out the event packet parsing from arm_spe_pkt_desc()
> to the new function arm_spe_pkt_desc_event().
>
> Signed-off-by: Leo Yan <[email protected]>

diff -w says this is correct, so:

Reviewed-by: Andre Przywara <[email protected]>

Thanks!
Andre

> ---
> .../arm-spe-decoder/arm-spe-pkt-decoder.c | 136 ++++++++++--------
> 1 file changed, 73 insertions(+), 63 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 6eebd30f3d78..8a6b50f32a52 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -266,6 +266,78 @@ static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
> return ret;
> }
>
> +static int arm_spe_pkt_desc_event(const struct arm_spe_pkt *packet,
> + char *buf, size_t buf_len)
> +{
> + u64 payload = packet->payload;
> + size_t blen = buf_len;
> + int ret;
> +
> + ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> + if (ret < 0)
> + return ret;
> +
> + if (payload & 0x1) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x2) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x4) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x8) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x10) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x20) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x40) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x80) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> + if (ret < 0)
> + return ret;
> + }
> + if (packet->index > 1) {
> + if (payload & 0x100) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x200) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> + if (ret < 0)
> + return ret;
> + }
> + if (payload & 0x400) {
> + ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> + if (ret < 0)
> + return ret;
> + }
> + }
> +
> + return buf_len - blen;
> +}
> +
> static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
> char *buf, size_t buf_len)
> {
> @@ -344,69 +416,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> case ARM_SPE_END:
> return arm_spe_pkt_snprintf(&buf, &blen, "%s", name);
> case ARM_SPE_EVENTS:
> - ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> - if (ret < 0)
> - return ret;
> -
> - if (payload & 0x1) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x2) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x4) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x8) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x10) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x20) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x40) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x80) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> - if (ret < 0)
> - return ret;
> - }
> - if (idx > 1) {
> - if (payload & 0x100) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x200) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> - if (ret < 0)
> - return ret;
> - }
> - if (payload & 0x400) {
> - ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> - if (ret < 0)
> - return ret;
> - }
> - }
> - return buf_len - blen;
> -
> + return arm_spe_pkt_desc_event(packet, buf, buf_len);
> case ARM_SPE_OP_TYPE:
> switch (idx) {
> case 0:
>

2020-10-26 07:03:08

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v3 09/20] perf arm-spe: Refactor address packet handling

On Fri, Oct 23, 2020 at 06:53:35PM +0100, Andr? Przywara wrote:
> On 22/10/2020 15:58, Leo Yan wrote:
>
> Hi Leo,
>
> > This patch is to refactor address packet handling, it defines macros for
> > address packet's header and payload, these macros are used by decoder
> > and the dump flow.
> >
> > Signed-off-by: Leo Yan <[email protected]>
> > ---
> > .../util/arm-spe-decoder/arm-spe-decoder.c | 29 ++++++++--------
> > .../arm-spe-decoder/arm-spe-pkt-decoder.c | 26 +++++++-------
> > .../arm-spe-decoder/arm-spe-pkt-decoder.h | 34 ++++++++++++-------
> > 3 files changed, 47 insertions(+), 42 deletions(-)
> >
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > index cc18a1e8c212..776b3e6628bb 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > @@ -24,36 +24,35 @@
> >
> > static u64 arm_spe_calc_ip(int index, u64 payload)
> > {
> > - u8 *addr = (u8 *)&payload;
> > - int ns, el;
> > + u64 ns, el;
> >
> > /* Instruction virtual address or Branch target address */
> > if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
> > index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> > - ns = addr[7] & SPE_ADDR_PKT_NS;
> > - el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> > + ns = SPE_ADDR_PKT_GET_NS(payload);
> > + el = SPE_ADDR_PKT_GET_EL(payload);
> > +
> > + /* Clean highest byte */
> > + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> >
> > /* Fill highest byte for EL1 or EL2 (VHE) mode */
> > if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> > - addr[7] = 0xff;
> > - /* Clean highest byte for other cases */
> > - else
> > - addr[7] = 0x0;
> > + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
> >
> > /* Data access virtual address */
> > } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
> >
> > + /* Clean tags */
> > + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> > +
> > /* Fill highest byte if bits [48..55] is 0xff */
>
> Do you know where this comes from? If yes, can you replace the comment
> with the reason, so that it says *why* and not *what* the code does?

I think I know where this comes from :)

Armv8 ARM defines the Address packet payload, the data virtual address
highest byte (byte7 for bits [63:56]) is assignment for TAG; if connect
with the kernel document Documentation/arm64/memory.rst, we can see the
prededined the "AArch64 Linux memory layout with 4KB pages + 4 levels",
though the payload's byte7 is for TAG, but we can check bits [48..55]
(byte6) and if it's 0xff, then we need fill the byte7 so can create a
valid virtual address for data accessing, this finally can allow perf
tool to find correct symbol for data structure.

Start End Size Use
-----------------------------------------------------------------------
0000000000000000 0000ffffffffffff 256TB user
ffff000000000000 ffff7fffffffffff 128TB kernel logical memory map
ffff800000000000 ffff9fffffffffff 32TB kasan shadow region
ffffa00000000000 ffffa00007ffffff 128MB bpf jit region
ffffa00008000000 ffffa0000fffffff 128MB modules
ffffa00010000000 fffffdffbffeffff ~93TB vmalloc
fffffdffbfff0000 fffffdfffe5f8fff ~998MB [guard region]
fffffdfffe5f9000 fffffdfffe9fffff 4124KB fixed mappings
fffffdfffea00000 fffffdfffebfffff 2MB [guard region]
fffffdfffec00000 fffffdffffbfffff 16MB PCI I/O space
fffffdffffc00000 fffffdffffdfffff 2MB [guard region]
fffffdffffe00000 ffffffffffdfffff 2TB vmemmap
ffffffffffe00000 ffffffffffffffff 2MB [guard region]

But I find it misses to support the "AArch64 Linux memory layout with
64KB pages + 3 levels (52-bit with HW support)":

Start End Size Use
-----------------------------------------------------------------------
0000000000000000 000fffffffffffff 4PB user
fff0000000000000 fff7ffffffffffff 2PB kernel logical memory map
fff8000000000000 fffd9fffffffffff 1440TB [gap]
fffda00000000000 ffff9fffffffffff 512TB kasan shadow region
ffffa00000000000 ffffa00007ffffff 128MB bpf jit region
ffffa00008000000 ffffa0000fffffff 128MB modules
ffffa00010000000 fffff81ffffeffff ~88TB vmalloc
fffff81fffff0000 fffffc1ffe58ffff ~3TB [guard region]
fffffc1ffe590000 fffffc1ffe9fffff 4544KB fixed mappings
fffffc1ffea00000 fffffc1ffebfffff 2MB [guard region]
fffffc1ffec00000 fffffc1fffbfffff 16MB PCI I/O space
fffffc1fffc00000 fffffc1fffdfffff 2MB [guard region]
fffffc1fffe00000 ffffffffffdfffff 3968GB vmemmap
ffffffffffe00000 ffffffffffffffff 2MB [guard region]

So if detect byte6 is "0xf0", "0xf8, "0xfd", we also need to fixup the
byte7 and fill 0xff to it.

If you think this is right way to process address packet, I will use a
separate to fix comment and support 64KB page fixup.

> > - if (addr[6] == 0xff)
> > - addr[7] = 0xff;
> > - /* Otherwise, cleanup tags */
> > - else
> > - addr[7] = 0x0;
> > + if (SPE_ADDR_PKT_ADDR_GET_BYTE_6(payload) == 0xffULL)
> > + payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
> >
> > /* Data access physical address */
> > } else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> > - /* Cleanup byte 7 */
> > - addr[7] = 0x0;
> > + /* Clean highest byte */
> > + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> > } else {
> > pr_err("unsupported address packet index: 0x%x\n", index);
> > }
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 550cd7648c73..156f98d6b8b2 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -13,9 +13,6 @@
> >
> > #include "arm-spe-pkt-decoder.h"
> >
> > -#define NS_FLAG BIT(63)
> > -#define EL_FLAG (BIT(62) | BIT(61))
> > -
> > #if __BYTE_ORDER == __BIG_ENDIAN
> > #define le16_to_cpu bswap_16
> > #define le32_to_cpu bswap_32
> > @@ -167,10 +164,11 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
> > const unsigned char ext_hdr, struct arm_spe_pkt *packet)
> > {
> > packet->type = ARM_SPE_ADDRESS;
> > +
> > if (ext_hdr)
> > - packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> > + packet->index = SPE_ADDR_PKT_HDR_EXTENDED_INDEX(buf[0], buf[1]);
> > else
> > - packet->index = buf[0] & 0x7;
> > + packet->index = SPE_ADDR_PKT_HDR_SHORT_INDEX(buf[0]);
> >
> > return arm_spe_get_payload(buf, len, ext_hdr, packet);
> > }
> > @@ -274,20 +272,20 @@ static int arm_spe_pkt_desc_addr(const struct arm_spe_pkt *packet,
> > u64 payload = packet->payload;
> >
> > switch (idx) {
> > - case 0:
> > - case 1:
> > - ns = !!(packet->payload & NS_FLAG);
> > - el = (packet->payload & EL_FLAG) >> 61;
> > - payload &= ~(0xffULL << 56);
> > + case SPE_ADDR_PKT_HDR_INDEX_INS:
> > + case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
> > + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> > + el = SPE_ADDR_PKT_GET_EL(payload);
> > + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> > return arm_spe_pkt_snprintf(&buf, &buf_len,
> > "%s 0x%llx el%d ns=%d",
> > (idx == 1) ? "TGT" : "PC", payload, el, ns);
> > - case 2:
> > + case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
> > return arm_spe_pkt_snprintf(&buf, &buf_len,
> > "VA 0x%llx", payload);
> > - case 3:
> > - ns = !!(packet->payload & NS_FLAG);
> > - payload &= ~(0xffULL << 56);
> > + case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
> > + ns = !!SPE_ADDR_PKT_GET_NS(payload);
> > + payload = SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(payload);
> > return arm_spe_pkt_snprintf(&buf, &buf_len,
> > "PA 0x%llx ns=%d", payload, ns);
> > default:
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > index 68552ff8a8f7..4111550d2bde 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > @@ -59,19 +59,27 @@ struct arm_spe_pkt {
> >
> > #define SPE_HEADER_SZ(val) ((val & GENMASK_ULL(5, 4)) >> 4)
> >
> > -#define SPE_ADDR_PKT_HDR_INDEX_INS (0x0)
> > -#define SPE_ADDR_PKT_HDR_INDEX_BRANCH (0x1)
> > -#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT (0x2)
> > -#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS (0x3)
> > -
> > -#define SPE_ADDR_PKT_NS BIT(7)
> > -#define SPE_ADDR_PKT_CH BIT(6)
> > -#define SPE_ADDR_PKT_EL_OFFSET (5)
> > -#define SPE_ADDR_PKT_EL_MASK (0x3 << SPE_ADDR_PKT_EL_OFFSET)
> > -#define SPE_ADDR_PKT_EL0 (0)
> > -#define SPE_ADDR_PKT_EL1 (1)
> > -#define SPE_ADDR_PKT_EL2 (2)
> > -#define SPE_ADDR_PKT_EL3 (3)
> > +/* Address packet header */
> > +#define SPE_ADDR_PKT_HDR_SHORT_INDEX(h) ((h) & GENMASK_ULL(2, 0))
> > +#define SPE_ADDR_PKT_HDR_EXTENDED_INDEX(h0, h1) (((h0) & GENMASK_ULL(1, 0)) << 3 | \
> > + SPE_ADDR_PKT_HDR_SHORT_INDEX(h1))
>
> Did you consider sharing those two with the identical definition for the
> extended counter packet? This extended packet seems more like a generic
> concept, regardless of the packet type.

TBH, I have considered for this :) An exception is the Context packet,
it uses different format for index (bits [1:0]).

I think I can consolidate the index for Address packet and conter
packet for sharing definitions.

#define SPE_PKT_HDR_SHORT_INDEX(h) ((h) & GENMASK_ULL(2, 0))
#define SPE_PKT_HDR_EXTENDED_INDEX(h0, h1) (((h0) & GENMASK_ULL(1, 0)) << 3 | \
SPE_PKT_HDR_SHORT_INDEX(h1))

> > +#define SPE_ADDR_PKT_HDR_INDEX_INS 0x0
> > +#define SPE_ADDR_PKT_HDR_INDEX_BRANCH 0x1
> > +#define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT 0x2
> > +#define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS 0x3
> > +
> > +/* Address packet payload */
> > +#define SPE_ADDR_PKT_ADDR_BYTE7_SHIFT 56
> > +#define SPE_ADDR_PKT_ADDR_GET_BYTES_0_6(v) ((v) & GENMASK_ULL(55, 0))
> > +#define SPE_ADDR_PKT_ADDR_GET_BYTE_6(v) (((v) & GENMASK_ULL(55, 48)) >> 48)
> > +
> > +#define SPE_ADDR_PKT_GET_NS(v) (((v) & BIT(63)) >> 63)
>
> You need BIT_ULL(63) here to make this work on 32-bit systems.

Will do. Thanks!

Leo