2022-06-06 12:03:50

by Yicong Yang

Subject: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
integrated Endpoint (RCiEP) device, providing the capability
to dynamically monitor and tune the PCIe traffic (tune),
and trace the TLP headers (trace).

PTT tune is designed for monitoring and adjusting PCIe link parameters.
The device exposes several parameters of the PCIe link. Through the driver,
users can adjust the value of a given parameter to influence the PCIe link
and improve performance in certain situations.

PTT trace is designed for dumping the TLP headers to memory, which
can be used to analyze the transactions and usage of the PCIe
link. Users can choose filters to trace headers, by either requester
ID, or those downstream of a set of Root Ports on the same core as the
PTT device. Tracing headers of a particular type and direction is also
supported.

The driver registers a PMU device for each PTT device. The trace can
be used through `perf record` and the traced headers can be decoded
by `perf report`. Support for the device in the perf tool is also
added in this patchset. The tune function can be used through the sysfs
attributes of the related PMU device. See the documentation for
detailed usage.

Change since v8:
- Cleanups and one minor fix from Jonathan and John, thanks
Link: https://lore.kernel.org/lkml/[email protected]/

Change since v7:
- Configure the DMA in probe rather than at runtime. Also use devres to manage
the PMU device as there is no ordering problem now
- Refactor the config validation function per John and Leo
- Use a spinlock hisi_ptt::pmu_lock instead of mutex to serialize the perf process
in pmu::start as it's in atomic context
- Only commit the traced data when stop, per Leo and James
- Drop the patch that dynamically updates the filters from this series to simplify
the review of the driver. That patch will be sent separately.
- add a cpumask sysfs attribute and handle the cpu hotplug events, following the
uncore PMU convention
- Other cleanups and fixes, both in driver and perf tool
Link: https://lore.kernel.org/lkml/[email protected]/

Change since v6:
- Fix W=1 errors reported by lkp test, thanks

Change since v5:
- Squash the PMU patch into PATCH 2 as suggested by John
- refine the commit message of PATCH 1 and some comments
Link: https://lore.kernel.org/lkml/[email protected]/

Change since v4:
Address the comments from Jonathan, John and Ma Ca, thanks.
- Use devm* also for allocating the DMA buffers
- Remove the IRQ handler stub in Patch 2
- Make functions waiting for hardware state return boolean
- Manually remove the PMU device as it should be removed first
- Modify the order in probe and removal so that they match
- Make available {directions,type,format} array const and non-global
- Use the right filter list in filters show and protect the
list properly with a mutex
- Record the trace status with a boolean @started rather than enum
- Optimize the process of finding the PTT devices of the perf-tool
Link: https://lore.kernel.org/linux-pci/[email protected]/

Change since v3:
Address the comments from Jonathan and John, thanks.
- drop members in the common struct which can be obtained on the fly
- reduce buffer struct and organize the buffers with an array instead of a list
- reduce the DMA reset wait time to avoid a long busy loop
- split the available_filters sysfs attribute into two files, for root port
and requester respectively. Update the documentation accordingly
- make the IOMMU mapping check earlier in probe to avoid a race condition. Also
move the IOMMU quirk patch ahead of the driver in the series
- Cleanups and typo fixes from John and Jonathan
Link: https://lore.kernel.org/linux-pci/[email protected]/

Change since v2:
- address the comments from Mathieu, thanks.
- rename the directory to ptt to match the function of the device
- spin off the declarations to a separate header
- split the trace function to several patches
- some other comments.
- make default smmu domain type of PTT device to identity
Drop the RMR as it's not recommended and use an iommu_def_domain_type
quirk to passthrough the device DMA as suggested by Robin.
Link: https://lore.kernel.org/linux-pci/[email protected]/

Change since v1:
- switch the user interface of trace to perf from debugfs
- switch the user interface of tune to sysfs from debugfs
- add perf tool support to start trace and decode the trace data
- address the comments of documentation from Bjorn
- add RMR[1] support of the device as trace works in RMR mode or
direct DMA mode. RMR support is achieved by common APIs rather
than the APIs implemented in [1].
Link: https://lore.kernel.org/lkml/[email protected]/
[1] https://lore.kernel.org/linux-acpi/[email protected]/

Qi Liu (3):
perf tool: arm: Refactor event list iteration in
auxtrace_record__init()
perf tool: Add support for HiSilicon PCIe Tune and Trace device driver
perf tool: Add support for parsing HiSilicon PCIe Trace packet

Yicong Yang (5):
iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to
identity
hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe
Tune and Trace device
hwtracing: hisi_ptt: Add tune function support for HiSilicon PCIe Tune
and Trace device
docs: trace: Add HiSilicon PTT device driver documentation
MAINTAINERS: Add maintainer for HiSilicon PTT driver

Documentation/trace/hisi-ptt.rst | 307 +++++
Documentation/trace/index.rst | 1 +
MAINTAINERS | 7 +
drivers/Makefile | 1 +
drivers/hwtracing/Kconfig | 2 +
drivers/hwtracing/ptt/Kconfig | 12 +
drivers/hwtracing/ptt/Makefile | 2 +
drivers/hwtracing/ptt/hisi_ptt.c | 1092 +++++++++++++++++
drivers/hwtracing/ptt/hisi_ptt.h | 200 +++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +
tools/perf/arch/arm/util/auxtrace.c | 116 +-
tools/perf/arch/arm/util/pmu.c | 3 +
tools/perf/arch/arm64/util/Build | 2 +-
tools/perf/arch/arm64/util/hisi-ptt.c | 187 +++
tools/perf/util/Build | 2 +
tools/perf/util/auxtrace.c | 4 +
tools/perf/util/auxtrace.h | 1 +
tools/perf/util/hisi-ptt-decoder/Build | 1 +
.../hisi-ptt-decoder/hisi-ptt-pkt-decoder.c | 164 +++
.../hisi-ptt-decoder/hisi-ptt-pkt-decoder.h | 31 +
tools/perf/util/hisi-ptt.c | 192 +++
tools/perf/util/hisi-ptt.h | 19 +
22 files changed, 2347 insertions(+), 20 deletions(-)
create mode 100644 Documentation/trace/hisi-ptt.rst
create mode 100644 drivers/hwtracing/ptt/Kconfig
create mode 100644 drivers/hwtracing/ptt/Makefile
create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c
create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h
create mode 100644 tools/perf/arch/arm64/util/hisi-ptt.c
create mode 100644 tools/perf/util/hisi-ptt-decoder/Build
create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
create mode 100644 tools/perf/util/hisi-ptt.c
create mode 100644 tools/perf/util/hisi-ptt.h

--
2.24.0


2022-06-06 12:03:56

by Yicong Yang

Subject: [PATCH v9 6/8] perf tool: Add support for parsing HiSilicon PCIe Trace packet

From: Qi Liu <[email protected]>

Add support for using 'perf report --dump-raw-trace' to parse PTT packets.

Example usage:

 perf report --dump-raw-trace

Output will contain raw PTT data and its textual representation, such
as:

0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000 offset: 0
ref: 0xa5d50c725 idx: 0 tid: -1 cpu: 0
.
. ... HISI PTT data: size 4194304 bytes
. 00000000: 00 00 00 00 Prefix
. 00000004: 08 20 00 60 Header DW0
. 00000008: ff 02 00 01 Header DW1
. 0000000c: 20 08 00 00 Header DW2
. 00000010: 10 e7 44 ab Header DW3
. 00000014: 2a a8 1e 01 Time
. 00000020: 00 00 00 00 Prefix
. 00000024: 01 00 00 60 Header DW0
. 00000028: 0f 1e 00 01 Header DW1
. 0000002c: 04 00 00 00 Header DW2
. 00000030: 40 00 81 02 Header DW3
. 00000034: ee 02 00 00 Time
....
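
For readers following the dump above, the format check that the decoder
relies on can be sketched on its own. This is a minimal illustration, not
the code in this patch: the names ptt_classify, ptt_record_size and
ptt_count_records are hypothetical, the mask value is GENMASK(31, 11)
written out by hand, and the first word is read in host byte order, as the
patch itself does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bits [31:11] set, i.e. GENMASK(31, 11) written out (assumption: 32-bit DW). */
#define PTT_8DW_MASK 0xfffff800u

enum ptt_fmt { PTT_FMT_4DW, PTT_FMT_8DW };

/* Read one 32-bit word in host byte order without alignment problems. */
static uint32_t ptt_read_dw(const unsigned char *buf, size_t pos)
{
	uint32_t v;

	assert(buf != NULL);
	memcpy(&v, buf + pos, sizeof(v));
	return v;
}

/* A record is 8DW when bits [31:11] of its first word are all ones. */
static enum ptt_fmt ptt_classify(const unsigned char *buf)
{
	return (ptt_read_dw(buf, 0) & PTT_8DW_MASK) == PTT_8DW_MASK ?
	       PTT_FMT_8DW : PTT_FMT_4DW;
}

/* Record size in bytes: 8 or 4 words of 4 bytes each. */
static size_t ptt_record_size(enum ptt_fmt fmt)
{
	return fmt == PTT_FMT_8DW ? 32 : 16;
}

/* Number of whole records in a buffer; a trailing partial record is ignored. */
static size_t ptt_count_records(const unsigned char *buf, size_t len)
{
	if (len < sizeof(uint32_t))
		return 0;
	return len / ptt_record_size(ptt_classify(buf));
}
```

As in the patch, the format is decided once from the first word and then
assumed for the whole buffer, so a buffer mixing both formats is outside
what this sketch handles.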

Signed-off-by: Qi Liu <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
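Note on the 4DW layout: the comment in hisi-ptt-pkt-decoder.c documents
DW0 as Fmt [31:30], Type [29:25], T9 [24], T8 [23], TH [22], SO [21],
Length [20:11] and Time [10:0]. Since the placement of C bit-fields within
a word is implementation-defined, the same fields can also be pulled out
with explicit shifts and masks. The sketch below is only an illustration
under that documented layout; struct ptt_4dw_dw0, ptt_field and
ptt_decode_4dw_dw0 are hypothetical names that do not appear in the patch,
and the input is assumed to be a 32-bit value already in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of DW0 in the 4DW format (hypothetical helper type). */
struct ptt_4dw_dw0 {
	uint32_t format; /* bits [31:30] */
	uint32_t type;   /* bits [29:25] */
	uint32_t t9;     /* bit  24      */
	uint32_t t8;     /* bit  23      */
	uint32_t th;     /* bit  22      */
	uint32_t so;     /* bit  21      */
	uint32_t len;    /* bits [20:11] */
	uint32_t time;   /* bits [10:0]  */
};

/* Extract a field of `width` bits whose lowest bit is at `shift`. */
static uint32_t ptt_field(uint32_t dw, unsigned int shift, unsigned int width)
{
	assert(width > 0 && width < 32);
	return (dw >> shift) & ((1u << width) - 1u);
}

/* Split DW0 into its fields using only shifts and masks. */
static struct ptt_4dw_dw0 ptt_decode_4dw_dw0(uint32_t dw)
{
	struct ptt_4dw_dw0 f = {
		.format = ptt_field(dw, 30, 2),
		.type   = ptt_field(dw, 25, 5),
		.t9     = ptt_field(dw, 24, 1),
		.t8     = ptt_field(dw, 23, 1),
		.th     = ptt_field(dw, 22, 1),
		.so     = ptt_field(dw, 21, 1),
		.len    = ptt_field(dw, 11, 10),
		.time   = ptt_field(dw, 0, 11),
	};

	return f;
}
```

The result does not depend on how a given compiler orders bit-fields, only
on the bit positions listed in the comment.
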
tools/perf/util/Build | 2 +
tools/perf/util/auxtrace.c | 3 +
tools/perf/util/hisi-ptt-decoder/Build | 1 +
.../hisi-ptt-decoder/hisi-ptt-pkt-decoder.c | 164 +++++++++++++++
.../hisi-ptt-decoder/hisi-ptt-pkt-decoder.h | 31 +++
tools/perf/util/hisi-ptt.c | 192 ++++++++++++++++++
6 files changed, 393 insertions(+)
create mode 100644 tools/perf/util/hisi-ptt-decoder/Build
create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
create mode 100644 tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
create mode 100644 tools/perf/util/hisi-ptt.c

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a51267d88ca9..e22df9e7fd10 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -116,6 +116,8 @@ perf-$(CONFIG_AUXTRACE) += intel-pt.o
perf-$(CONFIG_AUXTRACE) += intel-bts.o
perf-$(CONFIG_AUXTRACE) += arm-spe.o
perf-$(CONFIG_AUXTRACE) += arm-spe-decoder/
+perf-$(CONFIG_AUXTRACE) += hisi-ptt.o
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-decoder/
perf-$(CONFIG_AUXTRACE) += s390-cpumsf.o

ifdef CONFIG_LIBOPENCSD
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c5ef322a30b8..3371a0feec68 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -51,6 +51,7 @@
#include "intel-pt.h"
#include "intel-bts.h"
#include "arm-spe.h"
+#include "hisi-ptt.h"
#include "s390-cpumsf.h"
#include "util/mmap.h"

@@ -1305,6 +1306,8 @@ int perf_event__process_auxtrace_info(struct perf_session *session,
err = s390_cpumsf_process_auxtrace_info(event, session);
break;
case PERF_AUXTRACE_HISI_PTT:
+ err = hisi_ptt_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;
diff --git a/tools/perf/util/hisi-ptt-decoder/Build b/tools/perf/util/hisi-ptt-decoder/Build
new file mode 100644
index 000000000000..db3db8b75033
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/Build
@@ -0,0 +1 @@
+perf-$(CONFIG_AUXTRACE) += hisi-ptt-pkt-decoder.o
diff --git a/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
new file mode 100644
index 000000000000..dc8f19914628
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HiSilicon PCIe Trace and Tuning (PTT) support
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <endian.h>
+#include <byteswap.h>
+#include <linux/bitops.h>
+#include <stdarg.h>
+
+#include "../color.h"
+#include "hisi-ptt-pkt-decoder.h"
+
+/*
+ * For 8DW format, the bit[31:11] of DW0 is always 0x1fffff, which can be
+ * used to distinguish the data format.
+ * 8DW format is like:
+ * bits [ 31:11 ][ 10:0 ]
+ * |---------------------------------------|-------------------|
+ * DW0 [ 0x1fffff ][ Reserved (0x7ff) ]
+ * DW1 [ Prefix ]
+ * DW2 [ Header DW0 ]
+ * DW3 [ Header DW1 ]
+ * DW4 [ Header DW2 ]
+ * DW5 [ Header DW3 ]
+ * DW6 [ Reserved (0x0) ]
+ * DW7 [ Time ]
+ *
+ * 4DW format is like:
+ * bits [31:30] [ 29:25 ][24][23][22][21][ 20:11 ][ 10:0 ]
+ * |-----|---------|---|---|---|---|-------------|-------------|
+ * DW0 [ Fmt ][ Type ][T9][T8][TH][SO][ Length ][ Time ]
+ * DW1 [ Header DW1 ]
+ * DW2 [ Header DW2 ]
+ * DW3 [ Header DW3 ]
+ */
+
+enum hisi_ptt_8dw_pkt_field_type {
+ HISI_PTT_8DW_RSV0,
+ HISI_PTT_8DW_PREFIX,
+ HISI_PTT_8DW_HEAD0,
+ HISI_PTT_8DW_HEAD1,
+ HISI_PTT_8DW_HEAD2,
+ HISI_PTT_8DW_HEAD3,
+ HISI_PTT_8DW_RSV1,
+ HISI_PTT_8DW_TIME,
+ HISI_PTT_8DW_TYPE_MAX
+};
+
+enum hisi_ptt_4dw_pkt_field_type {
+ HISI_PTT_4DW_HEAD1,
+ HISI_PTT_4DW_HEAD2,
+ HISI_PTT_4DW_HEAD3,
+ HISI_PTT_4DW_TYPE_MAX
+};
+
+static const char * const hisi_ptt_8dw_pkt_field_name[] = {
+ [HISI_PTT_8DW_PREFIX] = "Prefix",
+ [HISI_PTT_8DW_HEAD0] = "Header DW0",
+ [HISI_PTT_8DW_HEAD1] = "Header DW1",
+ [HISI_PTT_8DW_HEAD2] = "Header DW2",
+ [HISI_PTT_8DW_HEAD3] = "Header DW3",
+ [HISI_PTT_8DW_TIME] = "Time"
+};
+
+static const char * const hisi_ptt_4dw_pkt_field_name[] = {
+ [HISI_PTT_4DW_HEAD1] = "Header DW1",
+ [HISI_PTT_4DW_HEAD2] = "Header DW2",
+ [HISI_PTT_4DW_HEAD3] = "Header DW3",
+};
+
+union hisi_ptt_4dw {
+ struct {
+ uint32_t format : 2;
+ uint32_t type : 5;
+ uint32_t t9 : 1;
+ uint32_t t8 : 1;
+ uint32_t th : 1;
+ uint32_t so : 1;
+ uint32_t len : 10;
+ uint32_t time : 11;
+ };
+ uint32_t value;
+};
+
+static void hisi_ptt_print_pkt(const unsigned char *buf, int pos, const char *desc)
+{
+ const char *color = PERF_COLOR_BLUE;
+ int i;
+
+ printf(".");
+ color_fprintf(stdout, color, " %08x: ", pos);
+ for (i = 0; i < HISI_PTT_FIELD_LENTH; i++)
+ color_fprintf(stdout, color, "%02x ", buf[pos + i]);
+ for (i = 0; i < HISI_PTT_MAX_SPACE_LEN; i++)
+ color_fprintf(stdout, color, " ");
+ color_fprintf(stdout, color, " %s\n", desc);
+}
+
+static int hisi_ptt_8dw_kpt_desc(const unsigned char *buf, int pos)
+{
+ int i;
+
+ for (i = 0; i < HISI_PTT_8DW_TYPE_MAX; i++) {
+ /* Do not show reserved field */
+ if (i == HISI_PTT_8DW_RSV0 || i == HISI_PTT_8DW_RSV1) {
+ pos += HISI_PTT_FIELD_LENTH;
+ continue;
+ }
+
+ hisi_ptt_print_pkt(buf, pos, hisi_ptt_8dw_pkt_field_name[i]);
+ pos += HISI_PTT_FIELD_LENTH;
+ }
+
+ return hisi_ptt_pkt_size[HISI_PTT_8DW_PKT];
+}
+
+static void hisi_ptt_4dw_print_dw0(const unsigned char *buf, int pos)
+{
+ const char *color = PERF_COLOR_BLUE;
+ union hisi_ptt_4dw dw0;
+ int i;
+
+ dw0.value = *(uint32_t *)(buf + pos);
+ printf(".");
+ color_fprintf(stdout, color, " %08x: ", pos);
+ for (i = 0; i < HISI_PTT_FIELD_LENTH; i++)
+ color_fprintf(stdout, color, "%02x ", buf[pos + i]);
+ for (i = 0; i < HISI_PTT_MAX_SPACE_LEN; i++)
+ color_fprintf(stdout, color, " ");
+
+ color_fprintf(stdout, color,
+ " %s %x %s %x %s %x %s %x %s %x %s %x %s %x %s %x\n",
+ "Format", dw0.format, "Type", dw0.type, "T9", dw0.t9,
+ "T8", dw0.t8, "TH", dw0.th, "SO", dw0.so, "Length",
+ dw0.len, "Time", dw0.time);
+}
+
+static int hisi_ptt_4dw_kpt_desc(const unsigned char *buf, int pos)
+{
+ int i;
+
+ hisi_ptt_4dw_print_dw0(buf, pos);
+ pos += HISI_PTT_FIELD_LENTH;
+
+ for (i = 0; i < HISI_PTT_4DW_TYPE_MAX; i++) {
+ hisi_ptt_print_pkt(buf, pos, hisi_ptt_4dw_pkt_field_name[i]);
+ pos += HISI_PTT_FIELD_LENTH;
+ }
+
+ return hisi_ptt_pkt_size[HISI_PTT_4DW_PKT];
+}
+
+int hisi_ptt_pkt_desc(const unsigned char *buf, int pos, enum hisi_ptt_pkt_type type)
+{
+ if (type == HISI_PTT_8DW_PKT)
+ return hisi_ptt_8dw_kpt_desc(buf, pos);
+
+ return hisi_ptt_4dw_kpt_desc(buf, pos);
+}
diff --git a/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
new file mode 100644
index 000000000000..e78f1b5bc836
--- /dev/null
+++ b/tools/perf/util/hisi-ptt-decoder/hisi-ptt-pkt-decoder.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * HiSilicon PCIe Trace and Tuning (PTT) support
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ */
+
+#ifndef INCLUDE__HISI_PTT_PKT_DECODER_H__
+#define INCLUDE__HISI_PTT_PKT_DECODER_H__
+
+#include <stddef.h>
+#include <stdint.h>
+
+#define HISI_PTT_8DW_CHECK_MASK GENMASK(31, 11)
+#define HISI_PTT_IS_8DW_PKT GENMASK(31, 11)
+#define HISI_PTT_MAX_SPACE_LEN 10
+#define HISI_PTT_FIELD_LENTH 4
+
+enum hisi_ptt_pkt_type {
+ HISI_PTT_4DW_PKT,
+ HISI_PTT_8DW_PKT,
+ HISI_PTT_PKT_MAX
+};
+
+static int hisi_ptt_pkt_size[] = {
+ [HISI_PTT_4DW_PKT] = 16,
+ [HISI_PTT_8DW_PKT] = 32,
+};
+
+int hisi_ptt_pkt_desc(const unsigned char *buf, int pos, enum hisi_ptt_pkt_type type);
+
+#endif
diff --git a/tools/perf/util/hisi-ptt.c b/tools/perf/util/hisi-ptt.c
new file mode 100644
index 000000000000..9798e297e7ab
--- /dev/null
+++ b/tools/perf/util/hisi-ptt.c
@@ -0,0 +1,192 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * HiSilicon PCIe Trace and Tuning (PTT) support
+ * Copyright (c) 2022 HiSilicon Technologies Co., Ltd.
+ */
+
+#include <byteswap.h>
+#include <endian.h>
+#include <errno.h>
+#include <inttypes.h>
+#include <linux/bitops.h>
+#include <linux/kernel.h>
+#include <linux/log2.h>
+#include <linux/types.h>
+#include <linux/zalloc.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include "auxtrace.h"
+#include "color.h"
+#include "debug.h"
+#include "evlist.h"
+#include "evsel.h"
+#include "hisi-ptt.h"
+#include "hisi-ptt-decoder/hisi-ptt-pkt-decoder.h"
+#include "machine.h"
+#include "session.h"
+#include "symbol.h"
+#include "tool.h"
+#include "util/synthetic-events.h"
+#include <internal/lib.h>
+
+struct hisi_ptt {
+ struct auxtrace auxtrace;
+ u32 auxtrace_type;
+ struct perf_session *session;
+ struct machine *machine;
+ u32 pmu_type;
+};
+
+struct hisi_ptt_queue {
+ struct hisi_ptt *ptt;
+ struct auxtrace_buffer *buffer;
+};
+
+static enum hisi_ptt_pkt_type hisi_ptt_check_packet_type(unsigned char *buf)
+{
+ uint32_t head = *(uint32_t *)buf;
+
+ if ((HISI_PTT_8DW_CHECK_MASK & head) == HISI_PTT_IS_8DW_PKT)
+ return HISI_PTT_8DW_PKT;
+
+ return HISI_PTT_4DW_PKT;
+}
+
+static void hisi_ptt_dump(struct hisi_ptt *ptt __maybe_unused,
+ unsigned char *buf, size_t len)
+{
+ const char *color = PERF_COLOR_BLUE;
+ enum hisi_ptt_pkt_type type;
+ size_t pos = 0;
+ int pkt_len;
+
+ type = hisi_ptt_check_packet_type(buf);
+ len = round_down(len, hisi_ptt_pkt_size[type]);
+ color_fprintf(stdout, color, ". ... HISI PTT data: size %zu bytes\n",
+ len);
+
+ while (len > 0) {
+ pkt_len = hisi_ptt_pkt_desc(buf, pos, type);
+ if (!pkt_len)
+ color_fprintf(stdout, color, " Bad packet!\n");
+
+ pos += pkt_len;
+ len -= pkt_len;
+ }
+}
+
+static void hisi_ptt_dump_event(struct hisi_ptt *ptt, unsigned char *buf,
+ size_t len)
+{
+ printf(".\n");
+
+ hisi_ptt_dump(ptt, buf, len);
+}
+
+static int hisi_ptt_process_event(struct perf_session *session __maybe_unused,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample __maybe_unused,
+ struct perf_tool *tool __maybe_unused)
+{
+ return 0;
+}
+
+static int hisi_ptt_process_auxtrace_event(struct perf_session *session,
+ union perf_event *event,
+ struct perf_tool *tool __maybe_unused)
+{
+ struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
+ auxtrace);
+ int fd = perf_data__fd(session->data);
+ int size = event->auxtrace.size;
+ void *data = malloc(size);
+ off_t data_offset;
+ int err;
+
+ if (perf_data__is_pipe(session->data)) {
+ data_offset = 0;
+ } else {
+ data_offset = lseek(fd, 0, SEEK_CUR);
+ if (data_offset == -1)
+ return -errno;
+ }
+
+ err = readn(fd, data, size);
+ if (err != (ssize_t)size) {
+ free(data);
+ return -errno;
+ }
+
+ if (dump_trace)
+ hisi_ptt_dump_event(ptt, data, size);
+ free(data);
+ return 0;
+}
+
+static int hisi_ptt_flush(struct perf_session *session __maybe_unused,
+ struct perf_tool *tool __maybe_unused)
+{
+ return 0;
+}
+
+static void hisi_ptt_free_events(struct perf_session *session __maybe_unused)
+{
+}
+
+static void hisi_ptt_free(struct perf_session *session)
+{
+ struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt,
+ auxtrace);
+
+ session->auxtrace = NULL;
+ free(ptt);
+}
+
+static bool hisi_ptt_evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ struct hisi_ptt *ptt = container_of(session->auxtrace, struct hisi_ptt, auxtrace);
+
+ return evsel->core.attr.type == ptt->pmu_type;
+}
+
+static void hisi_ptt_print_info(__u64 type)
+{
+ if (!dump_trace)
+ return;
+
+ fprintf(stdout, " PMU Type %" PRId64 "\n", (s64) type);
+}
+
+int hisi_ptt_process_auxtrace_info(union perf_event *event,
+ struct perf_session *session)
+{
+ struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
+ struct hisi_ptt *ptt;
+
+ if (auxtrace_info->header.size < HISI_PTT_AUXTRACE_PRIV_SIZE +
+ sizeof(struct perf_record_auxtrace_info))
+ return -EINVAL;
+
+ ptt = zalloc(sizeof(*ptt));
+ if (!ptt)
+ return -ENOMEM;
+
+ ptt->session = session;
+ ptt->machine = &session->machines.host; /* No kvm support */
+ ptt->auxtrace_type = auxtrace_info->type;
+ ptt->pmu_type = auxtrace_info->priv[0];
+
+ ptt->auxtrace.process_event = hisi_ptt_process_event;
+ ptt->auxtrace.process_auxtrace_event = hisi_ptt_process_auxtrace_event;
+ ptt->auxtrace.flush_events = hisi_ptt_flush;
+ ptt->auxtrace.free_events = hisi_ptt_free_events;
+ ptt->auxtrace.free = hisi_ptt_free;
+ ptt->auxtrace.evsel_is_auxtrace = hisi_ptt_evsel_is_auxtrace;
+ session->auxtrace = &ptt->auxtrace;
+
+ hisi_ptt_print_info(auxtrace_info->priv[0]);
+
+ return 0;
+}
--
2.24.0

2022-06-27 12:30:01

by Yicong Yang

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

Hi Greg,

Since the kernel side of this device has been reviewed for 8 versions with
all comments addressed, and there have been no more comments since v9 was
posted at 5.19-rc1, is it ok to merge it first (Patches 1-3 and 7-8)?

Thanks.


2022-06-27 13:34:00

by Yicong Yang

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

On 2022/6/27 21:12, Greg KH wrote:
> On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
>> Hi Greg,
>>
>> Since the kernel side of this device has been reviewed for 8 versions with
>> all comments addressed and no more comment since v9 posted in 5.19-rc1,
>> is it ok to merge it first (for Patch 1-3 and 7-8)?
>
> I am not the maintainer of this subsystem, so I do not understand why
> you are asking me :(
>

I checked the log of drivers/hwtracing, and it seems the coresight/intel_th/stm
patches are applied by different maintainers; I see you applied some of the
intel_th/stm patches. Could any of these three maintainers, or you, help apply this?

Thanks.

2022-06-27 13:39:05

by Greg Kroah-Hartman

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
> Hi Greg,
>
> Since the kernel side of this device has been reviewed for 8 versions with
> all comments addressed and no more comment since v9 posted in 5.19-rc1,
> is it ok to merge it first (for Patch 1-3 and 7-8)?

I am not the maintainer of this subsystem, so I do not understand why
you are asking me :(

thanks,

greg k-h

2022-06-27 14:15:31

by Peter Zijlstra

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

On Mon, Jun 27, 2022 at 09:25:42PM +0800, Yicong Yang wrote:
> On 2022/6/27 21:12, Greg KH wrote:
> > On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
> >> Hi Greg,
> >>
> >> Since the kernel side of this device has been reviewed for 8 versions with
> >> all comments addressed and no more comment since v9 posted in 5.19-rc1,
> >> is it ok to merge it first (for Patch 1-3 and 7-8)?
> >
> > I am not the maintainer of this subsystem, so I do not understand why
> > you are asking me :(
> >
>
> I checked the log of drivers/hwtracing and seems patches of coresight/intel_th/stm
> applied by different maintainers and I see you applied some patches of intel_th/stm.
> Should any of these three maintainers or you can help applied this?

I was hoping Mark would have a look, since he knows this ARM stuff
better than me. But ISTR he's somewhat busy atm too. But an ACK from the
CoreSight people would also be appreciated.

And Arnaldo usually doesn't pick up the userspace perf bits until the
kernel side is sorted.

2022-06-28 07:29:34

by Yicong Yang

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

On 2022/6/27 22:01, Peter Zijlstra wrote:
> On Mon, Jun 27, 2022 at 09:25:42PM +0800, Yicong Yang wrote:
>> On 2022/6/27 21:12, Greg KH wrote:
>>> On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
>>>> Hi Greg,
>>>>
>>>> Since the kernel side of this device has been reviewed for 8 versions with
>>>> all comments addressed and no more comment since v9 posted in 5.19-rc1,
>>>> is it ok to merge it first (for Patch 1-3 and 7-8)?
>>>
>>> I am not the maintainer of this subsystem, so I do not understand why
>>> you are asking me :(
>>>
>>
>> I checked the log of drivers/hwtracing and seems patches of coresight/intel_th/stm
>> applied by different maintainers and I see you applied some patches of intel_th/stm.
>> Should any of these three maintainers or you can help applied this?
>
> I was hoping Mark would have a look, since he knows this ARM stuff
> better than me. But ISTR he's somewhat busy atm too. But an ACK from the
> CoreSight people would also be appreciated.
>

Thanks for the instruction.

Hi Mark, Mathieu and Suzuki,

May I have an ack from you to have the driver part of this device merged?

Thanks!

2022-06-29 16:54:42

by Mathieu Poirier

Subject: Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

On Mon, Jun 27, 2022 at 04:01:31PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 27, 2022 at 09:25:42PM +0800, Yicong Yang wrote:
> > On 2022/6/27 21:12, Greg KH wrote:
> > > On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
> > >> Hi Greg,
> > >>
> > >> Since the kernel side of this device has been reviewed for 8 versions with
> > >> all comments addressed and no more comment since v9 posted in 5.19-rc1,
> > >> is it ok to merge it first (for Patch 1-3 and 7-8)?
> > >
> > > I am not the maintainer of this subsystem, so I do not understand why
> > > you are asking me :(
> > >
> >
> > I checked the log of drivers/hwtracing and seems patches of coresight/intel_th/stm
> > applied by different maintainers and I see you applied some patches of intel_th/stm.
> > Should any of these three maintainers or you can help applied this?
>
> I was hoping Mark would have a look, since he knows this ARM stuff
> better than me. But ISTR he's somewhat busy atm too. But an ACK from the
> CoreSight people would also be appreciated.
>

I'll spend some time on it next week.

Thanks,
Mathieu

> And Arnaldo usually doesn't pick up the userspace perf bits until the
> kernel side is sorted.