LinuxLists.cc - [RFC V2 PATCH 00/11] CXL: Process event logs

2022-10-10 23:05:49

Subject: [RFC V2 PATCH 00/11] CXL: Process event logs

From: Ira Weiny <[email protected]>

Changes from RFC v1
Add event irqs
General simplification of the code.
Resolve field alignment questions
Update to rev 3.0 for comments and structures
Add reserved fields and output them

Event records inform the OS of various device events. Events are not needed
for any kernel operation but various user level software will want to track
events.

Add event reporting through the trace event mechanism. On driver load read and
clear all device events.

Enable all event logs for interrupts and process each log on interrupt.

Testing of this was performed with additions to QEMU posted here:

https://lore.kernel.org/all/[email protected]/

Ira Weiny (11):
cxl/mbox: Add debug of hardware error code
cxl/mem: Implement Get Event Records command
cxl/mem: Implement Clear Event Records command
cxl/mem: Clear events on driver load
cxl/mem: Trace General Media Event Record
cxl/mem: Trace DRAM Event Record
cxl/mem: Trace Memory Module Event Record
cxl/test: Add generic mock events
cxl/test: Add specific events
cxl/test: Simulate event log overflow
cxl/mem: Wire up event interrupts

MAINTAINERS | 1 +
drivers/cxl/core/mbox.c | 186 ++++++++++++-
drivers/cxl/cxlmem.h | 193 +++++++++++++
drivers/cxl/pci.c | 154 ++++++++++
include/trace/events/cxl.h | 478 ++++++++++++++++++++++++++++++++
include/uapi/linux/cxl_mem.h | 4 +
tools/testing/cxl/test/Kbuild | 2 +-
tools/testing/cxl/test/events.c | 329 ++++++++++++++++++++++
tools/testing/cxl/test/events.h | 9 +
tools/testing/cxl/test/mem.c | 34 +++
10 files changed, 1388 insertions(+), 2 deletions(-)
create mode 100644 include/trace/events/cxl.h
create mode 100644 tools/testing/cxl/test/events.c
create mode 100644 tools/testing/cxl/test/events.h

base-commit: e2302539dd4f1c62d96651c07ddb05aa2461d29c
--
2.37.2
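
The read-then-clear flow described in the cover letter can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct mock_event_log`, `mock_get_event_records()`, `mock_clear_event_records()` and the batch size of 3 are hypothetical stand-ins, not the driver's types or the actual mailbox commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MOCK_BATCH 3	/* assumed number of records returned per read */

/* Hypothetical stand-in for one device event log. */
struct mock_event_log {
	int records[8];	/* pending record handles */
	size_t nr;	/* number of pending records */
};

/* Copy up to max pending handles into out without removing them. */
static size_t mock_get_event_records(const struct mock_event_log *log,
				     int *out, size_t max)
{
	size_t n = log->nr < max ? log->nr : max;

	for (size_t i = 0; i < n; i++)
		out[i] = log->records[i];
	return n;
}

/* Drop the first n records and shift the remainder forward. */
static void mock_clear_event_records(struct mock_event_log *log, size_t n)
{
	if (n > log->nr)
		n = log->nr;
	for (size_t i = n; i < log->nr; i++)
		log->records[i - n] = log->records[i];
	log->nr -= n;
}

/*
 * Read a batch, report each record, clear the batch, and repeat until
 * the log is empty. Returns the number of records handled.
 */
static size_t mock_drain_event_log(struct mock_event_log *log)
{
	int batch[MOCK_BATCH];
	size_t total = 0, n;

	assert(log != NULL);
	while ((n = mock_get_event_records(log, batch, MOCK_BATCH)) > 0) {
		for (size_t i = 0; i < n; i++)
			printf("event handle %d\n", batch[i]);
		mock_clear_event_records(log, n);
		total += n;
	}
	return total;
}
```

Clearing only what was just read keeps the sketch safe if new records arrive between the read and the clear; whether the real driver batches this way is not stated in the cover letter.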

2022-10-10 23:19:31

by Ira Weiny


Subject: [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load

From: Ira Weiny <[email protected]>

The information contained in the events prior to the driver loading can
be queried at any time through other mailbox commands.

Ensure a clean slate of events by reading and clearing the events. The
events are sent to the trace buffer but it is not anticipated to have
anyone listening to it at driver load time.

Signed-off-by: Ira Weiny <[email protected]>
---
drivers/cxl/pci.c | 2 ++
tools/testing/cxl/test/mem.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index faeb5d9d7a7a..5f1b492bd388 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -498,6 +498,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);

+ cxl_mem_get_event_records(cxlds);
+
if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index aa2df3a15051..e2f5445d24ff 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);

+ cxl_mem_get_event_records(cxlds);
+
if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
rc = devm_cxl_add_nvdimm(dev, cxlmd);

--
2.37.2

2022-10-10 23:38:49

by Ira Weiny


Subject: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code

From: Ira Weiny <[email protected]>

If a mailbox command fails the driver always reports ENXIO. But this
may not be enough information to understand why the hardware, or in my
case Qemu, was failing.

Add a debug print of the error code returned from the hardware.

Signed-off-by: Ira Weiny <[email protected]>
---
drivers/cxl/core/mbox.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 16176b9278b4..6c4d024ad0e8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
if (rc)
return rc;

- if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
+ if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
+ dev_dbg(cxlds->dev, "MB error : %s\n",
+ cxl_mbox_cmd_rc2str(&mbox_cmd));
return cxl_mbox_cmd_rc2errno(&mbox_cmd);
+ }

/*
* Variable sized commands can't be validated and so it's up to the
--
2.37.2
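
The idea of logging the device return code before collapsing it to an errno can be sketched in user-space C. Everything named `mock_*` or `MOCK_*` below is hypothetical, including the set of return codes and their strings; the sketch only mirrors the shape of the check in the hunk above and is not the driver's table.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical return codes; the real list lives in the driver. */
enum mock_rc {
	MOCK_RC_SUCCESS,
	MOCK_RC_INVALID_INPUT,
	MOCK_RC_UNSUPPORTED,
	MOCK_RC_MAX
};

struct mock_rc_desc {
	const char *str;	/* human-readable name for debug output */
	int err;		/* errno reported to the caller */
};

static const struct mock_rc_desc mock_rc_table[MOCK_RC_MAX] = {
	[MOCK_RC_SUCCESS]	= { "Success", 0 },
	[MOCK_RC_INVALID_INPUT]	= { "Invalid Input", -ENXIO },
	[MOCK_RC_UNSUPPORTED]	= { "Unsupported", -ENXIO },
};

static_assert(sizeof(mock_rc_table) / sizeof(mock_rc_table[0]) == MOCK_RC_MAX,
	      "table must cover every return code");

static const char *mock_rc2str(unsigned int rc)
{
	return rc < MOCK_RC_MAX ? mock_rc_table[rc].str : "Unknown";
}

static int mock_rc2errno(unsigned int rc)
{
	return rc < MOCK_RC_MAX ? mock_rc_table[rc].err : -ENXIO;
}

/* Print the device's reason on failure, then return the errno. */
static int mock_check_rc(unsigned int rc)
{
	if (rc != MOCK_RC_SUCCESS) {
		fprintf(stderr, "mailbox command failed: %s\n",
			mock_rc2str(rc));
		return mock_rc2errno(rc);
	}
	return 0;
}
```

Because every failure maps to the same errno in this sketch, the string is the only place the caller can tell the failures apart, which is the motivation given in the commit message.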

2022-10-10 23:59:05

by Ira Weiny


Subject: [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record

From: Ira Weiny <[email protected]>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is a memory module record and if so trace the
record.

Signed-off-by: Ira Weiny <[email protected]>

---
Changes from RFC:
Clean up spec reference
Add reserved data
Use new CXL header macros
Jonathan
Use else if
Use get_unaligned_le*() for unaligned fields
Dave Jiang
s/cxl_mem_mod_event/memory_module
s/cxl_evt_mem_mod_rec/cxl_event_mem_module
---
drivers/cxl/core/mbox.c | 14 ++++
drivers/cxl/cxlmem.h | 27 +++++++
include/trace/events/cxl.h | 146 +++++++++++++++++++++++++++++++++++++
3 files changed, 187 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 72b589edc074..6b3119bc83d2 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -728,6 +728,14 @@ static const uuid_t dram_event_uuid =
UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);

+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+ UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
static void cxl_trace_event_record(const char *dev_name,
enum cxl_event_log_type type,
struct cxl_get_event_payload *payload)
@@ -746,6 +754,12 @@ static void cxl_trace_event_record(const char *dev_name,

trace_dram(dev_name, type, rec);
return;
+ } else if (uuid_equal(id, &mem_mod_event_uuid)) {
+ struct cxl_event_mem_module *rec =
+ (struct cxl_event_mem_module *)&payload->record;
+
+ trace_memory_module(dev_name, type, rec);
+ return;
}

/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d0253e5f1187..79b3fac6d9ef 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -455,6 +455,33 @@ struct cxl_event_dram {
u8 reserved[CXL_EVENT_DER_RES_SIZE];
} __packed;

+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+ u8 health_status;
+ u8 media_status;
+ u8 add_status;
+ u8 life_used;
+ u8 device_temp[2];
+ u8 dirty_shutdown_cnt[4];
+ u8 cor_vol_err_cnt[4];
+ u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_EVENT_MEM_MOD_RES_SIZE 0x3d
+struct cxl_event_mem_module {
+ struct cxl_event_record_hdr hdr;
+ u8 event_type;
+ struct cxl_get_health_info info;
+ u8 reserved[CXL_EVENT_MEM_MOD_RES_SIZE];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 7a90cfea348b..e2082862ed94 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -324,6 +324,152 @@ TRACE_EVENT(dram,
)
);

+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE 0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE 0x01
+#define CXL_MMER_LIFE_USED_CHANGE 0x02
+#define CXL_MMER_TEMP_CHANGE 0x03
+#define CXL_MMER_DATA_PATH_ERROR 0x04
+#define CXL_MMER_LSA_ERROR 0x05
+#define show_dev_evt_type(type) __print_symbolic(type, \
+ { CXL_MMER_HEALTH_STATUS_CHANGE, "Health Status Change" }, \
+ { CXL_MMER_MEDIA_STATUS_CHANGE, "Media Status Change" }, \
+ { CXL_MMER_LIFE_USED_CHANGE, "Life Used Change" }, \
+ { CXL_MMER_TEMP_CHANGE, "Temperature Change" }, \
+ { CXL_MMER_DATA_PATH_ERROR, "Data Path Error" }, \
+ { CXL_MMER_LSA_ERROR, "LSA Error" } \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED BIT(2)
+#define show_health_status_flags(flags) __print_flags(flags, "|", \
+ { CXL_DHI_HS_MAINTENANCE_NEEDED, "Maintenance Needed" }, \
+ { CXL_DHI_HS_PERFORMANCE_DEGRADED, "Performance Degraded" }, \
+ { CXL_DHI_HS_HW_REPLACEMENT_NEEDED, "Replacement Needed" } \
+)
+
+#define CXL_DHI_MS_NORMAL 0x00
+#define CXL_DHI_MS_NOT_READY 0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST 0x02
+#define CXL_DHI_MS_ALL_DATA_LOST 0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS 0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN 0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT 0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS 0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN 0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT 0x09
+#define show_media_status(ms) __print_symbolic(ms, \
+ { CXL_DHI_MS_NORMAL, \
+ "Normal" }, \
+ { CXL_DHI_MS_NOT_READY, \
+ "Not Ready" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOST, \
+ "Write Persistency Lost" }, \
+ { CXL_DHI_MS_ALL_DATA_LOST, \
+ "All Data Lost" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS, \
+ "Write Persistency Loss in the Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN, \
+ "Write Persistency Loss in Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT, \
+ "Write Persistency Loss Imminent" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS, \
+ "All Data Loss in Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN, \
+ "All Data Loss in the Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT, \
+ "All Data Loss Imminent" } \
+)
+
+#define CXL_DHI_AS_NORMAL 0x0
+#define CXL_DHI_AS_WARNING 0x1
+#define CXL_DHI_AS_CRITICAL 0x2
+#define show_add_status(as) __print_symbolic(as, \
+ { CXL_DHI_AS_NORMAL, "Normal" }, \
+ { CXL_DHI_AS_WARNING, "Warning" }, \
+ { CXL_DHI_AS_CRITICAL, "Critical" } \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as) ((as) & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as) (((as) & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as) (((as) & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as) (((as) & 0x20) >> 5)
+
+TRACE_EVENT(memory_module,
+
+ TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+ struct cxl_event_mem_module *rec),
+
+ TP_ARGS(dev_name, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+
+ /* Memory Module Event */
+ __field(u8, event_type)
+
+ /* Device Health Info */
+ __field(u8, health_status)
+ __field(u8, media_status)
+ __field(u8, life_used)
+ __field(u32, dirty_shutdown_cnt)
+ __field(u32, cor_vol_err_cnt)
+ __field(u32, cor_per_err_cnt)
+ __field(s16, device_temp)
+ __field(u8, add_status)
+
+ __array(u8, reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+ /* Memory Module Event */
+ __entry->event_type = rec->event_type;
+
+ /* Device Health Info */
+ __entry->health_status = rec->info.health_status;
+ __entry->media_status = rec->info.media_status;
+ __entry->life_used = rec->info.life_used;
+ __entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+ __entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+ __entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+ __entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+ __entry->add_status = rec->info.add_status;
+ memcpy(__entry->reserved, &rec->reserved,
+ CXL_EVENT_MEM_MOD_RES_SIZE);
+ ),
+
+ CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
+ "as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+ "as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
+ "dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
+ "reserved=%s",
+ show_dev_evt_type(__entry->event_type),
+ show_health_status_flags(__entry->health_status),
+ show_media_status(__entry->media_status),
+ show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+ show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+ show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+ show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+ __entry->life_used, __entry->device_temp,
+ __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+ __entry->cor_per_err_cnt,
+ __print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
+ )
+);
+
+
#endif /* _CXL_TRACE_EVENTS_H */

/* This part must be outside protection */
--
2.37.2
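
The health info fields are declared as byte arrays and read with get_unaligned_le16()/get_unaligned_le32(). A user-space sketch of the same decoding is shown here, assuming the 18-byte layout of struct cxl_get_health_info from the hunk above. The names `load_le16()`, `load_le32()`, `struct health_info_decoded` and `decode_health_info()` are hypothetical; they are not kernel helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEALTH_INFO_RAW_SIZE 18	/* 4 x u8 + 2 + 4 + 4 + 4 bytes */

/* CPU-endian view of the record; field names follow the patch. */
struct health_info_decoded {
	uint8_t health_status;
	uint8_t media_status;
	uint8_t add_status;
	uint8_t life_used;
	int16_t device_temp;
	uint32_t dirty_shutdown_cnt;
	uint32_t cor_vol_err_cnt;
	uint32_t cor_per_err_cnt;
};

/* Assemble little-endian values byte by byte; no alignment required. */
static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void decode_health_info(const uint8_t raw[HEALTH_INFO_RAW_SIZE],
			       struct health_info_decoded *out)
{
	assert(raw != NULL && out != NULL);

	out->health_status = raw[0];
	out->media_status = raw[1];
	out->add_status = raw[2];
	out->life_used = raw[3];
	/* Temperature is signed; reinterpret the 16-bit pattern. */
	out->device_temp = (int16_t)load_le16(raw + 4);
	out->dirty_shutdown_cnt = load_le32(raw + 6);
	out->cor_vol_err_cnt = load_le32(raw + 10);
	out->cor_per_err_cnt = load_le32(raw + 14);
}
```

Reading byte by byte avoids both misaligned loads and host-endianness assumptions, which is why the multi-byte fields at odd offsets are not declared as __le16/__le32 in the packed struct.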

2022-10-11 00:03:05

by Ira Weiny


Subject: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record

From: Ira Weiny <[email protected]>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and if so trace the
record.

Signed-off-by: Ira Weiny <[email protected]>

---
Changes from RFC:
Add reserved byte data
Use new CXL header macros
Jonathan
Use get_unaligned_le{24,16}() for unaligned fields
Use 'else if'
Dave Jiang
s/cxl_dram_event/dram
s/cxl_evt_dram_rec/cxl_event_dram
Adjust for new phys addr mask
---
drivers/cxl/core/mbox.c | 14 ++++++
drivers/cxl/cxlmem.h | 24 ++++++++++
include/trace/events/cxl.h | 94 ++++++++++++++++++++++++++++++++++++++
3 files changed, 132 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 1097250c115a..72b589edc074 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -720,6 +720,14 @@ static const uuid_t gen_media_event_uuid =
UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);

+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+ UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+ 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
static void cxl_trace_event_record(const char *dev_name,
enum cxl_event_log_type type,
struct cxl_get_event_payload *payload)
@@ -732,6 +740,12 @@ static void cxl_trace_event_record(const char *dev_name,

trace_general_media(dev_name, type, rec);
return;
+ } else if (uuid_equal(id, &dram_event_uuid)) {
+ struct cxl_event_dram *rec =
+ (struct cxl_event_dram *)&payload->record;
+
+ trace_dram(dev_name, type, rec);
+ return;
}

/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index b5c120bd4068..d0253e5f1187 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -431,6 +431,30 @@ struct cxl_event_gen_media {
u8 reserved[CXL_EVENT_GEN_MED_RES_SIZE];
} __packed;

+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
+#define CXL_EVENT_DER_RES_SIZE 0x17
+struct cxl_event_dram {
+ struct cxl_event_record_hdr hdr;
+ __le64 phys_addr;
+ u8 descriptor;
+ u8 type;
+ u8 transaction_type;
+ u8 validity_flags[2];
+ u8 channel;
+ u8 rank;
+ u8 nibble_mask[3];
+ u8 bank_group;
+ u8 bank;
+ u8 row[3];
+ u8 column[2];
+ u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+ u8 reserved[CXL_EVENT_DER_RES_SIZE];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 82a8d3b750a2..7a90cfea348b 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
)
);

+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record. Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL BIT(0)
+#define CXL_DER_VALID_RANK BIT(1)
+#define CXL_DER_VALID_NIBBLE BIT(2)
+#define CXL_DER_VALID_BANK_GROUP BIT(3)
+#define CXL_DER_VALID_BANK BIT(4)
+#define CXL_DER_VALID_ROW BIT(5)
+#define CXL_DER_VALID_COLUMN BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK BIT(7)
+#define show_dram_valid_flags(flags) __print_flags(flags, "|", \
+ { CXL_DER_VALID_CHANNEL, "CHANNEL" }, \
+ { CXL_DER_VALID_RANK, "RANK" }, \
+ { CXL_DER_VALID_NIBBLE, "NIBBLE" }, \
+ { CXL_DER_VALID_BANK_GROUP, "BANK GROUP" }, \
+ { CXL_DER_VALID_BANK, "BANK" }, \
+ { CXL_DER_VALID_ROW, "ROW" }, \
+ { CXL_DER_VALID_COLUMN, "COLUMN" }, \
+ { CXL_DER_VALID_CORRECTION_MASK, "CORRECTION MASK" } \
+)
+
+TRACE_EVENT(dram,
+
+ TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+ struct cxl_event_dram *rec),
+
+ TP_ARGS(dev_name, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+ /* DRAM */
+ __field(u64, phys_addr)
+ __field(u8, descriptor)
+ __field(u8, type)
+ __field(u8, transaction_type)
+ __field(u8, channel)
+ __field(u16, validity_flags)
+ __field(u16, column) /* Out of order to pack trace record */
+ __field(u32, nibble_mask)
+ __field(u32, row)
+ __array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+ __array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
+ __field(u8, rank) /* Out of order to pack trace record */
+ __field(u8, bank_group) /* Out of order to pack trace record */
+ __field(u8, bank) /* Out of order to pack trace record */
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+ /* DRAM */
+ __entry->phys_addr = le64_to_cpu(rec->phys_addr);
+ __entry->descriptor = rec->descriptor;
+ __entry->type = rec->type;
+ __entry->transaction_type = rec->transaction_type;
+ __entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+ __entry->channel = rec->channel;
+ __entry->rank = rec->rank;
+ __entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+ __entry->bank_group = rec->bank_group;
+ __entry->bank = rec->bank;
+ __entry->row = get_unaligned_le24(rec->row);
+ __entry->column = get_unaligned_le16(rec->column);
+ memcpy(__entry->cor_mask, &rec->correction_mask,
+ CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+ memcpy(__entry->reserved, &rec->reserved,
+ CXL_EVENT_DER_RES_SIZE);
+ ),
+
+ CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
+ "trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
+ "bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+ "valid_flags='%s' reserved=%s",
+ __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
+ (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
+ show_event_desc_flags(__entry->descriptor),
+ show_mem_event_type(__entry->type),
+ show_trans_type(__entry->transaction_type),
+ __entry->channel, __entry->rank, __entry->nibble_mask,
+ __entry->bank_group, __entry->bank,
+ __entry->row, __entry->column,
+ __print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+ show_dram_valid_flags(__entry->validity_flags),
+ __print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
+ )
+);
+
#endif /* _CXL_TRACE_EVENTS_H */

/* This part must be outside protection */
--
2.37.2
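
Two pieces of the DRAM trace are easy to try outside the kernel: the 24-bit little-endian reads used for nibble_mask and row, and the '|'-joined rendering of the validity flags. The sketch below is a simplified user-space approximation; `load_le24()`, `struct flag_name` and `format_flags()` are hypothetical names, and the formatting only approximates what __print_flags() conveys rather than reproducing the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
	uint16_t mask;
	const char *name;
};

/* Bit positions and names taken from show_dram_valid_flags() above. */
static const struct flag_name dram_valid_flags[] = {
	{ 1u << 0, "CHANNEL" },
	{ 1u << 1, "RANK" },
	{ 1u << 2, "NIBBLE" },
	{ 1u << 3, "BANK GROUP" },
	{ 1u << 4, "BANK" },
	{ 1u << 5, "ROW" },
	{ 1u << 6, "COLUMN" },
	{ 1u << 7, "CORRECTION MASK" },
};

/* Read a 3-byte little-endian value from any byte offset. */
static uint32_t load_le24(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Write the names of all set flags into buf, separated by '|'. */
static char *format_flags(uint16_t flags, char *buf, size_t len)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0;
	     i < sizeof(dram_valid_flags) / sizeof(dram_valid_flags[0]); i++) {
		int n;

		if (!(flags & dram_valid_flags[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", dram_valid_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated; stop appending */
		used += (size_t)n;
	}
	return buf;
}
```

A consumer of the trace would typically look at valid_flags first and ignore fields whose bit is clear; the tracepoint in this patch prints every field regardless.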

2022-10-11 10:48:44

by Jonathan Cameron


Subject: Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code

On Mon, 10 Oct 2022 15:41:21 -0700
[email protected] wrote:

> From: Ira Weiny <[email protected]>
>
> If a mailbox command fails the driver always reports ENXIO. But this
> may not be enough information to understand why the hardware, or in my
> case Qemu, was failing.
>
> Add a debug print of the error code returned from the hardware.
>
> Signed-off-by: Ira Weiny <[email protected]>
Seems very sensible to me.

Reviewed-by: Jonathan Cameron <[email protected]>

> ---
> drivers/cxl/core/mbox.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..6c4d024ad0e8 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
> if (rc)
> return rc;
>
> - if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> + if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> + dev_dbg(cxlds->dev, "MB error : %s\n",
> + cxl_mbox_cmd_rc2str(&mbox_cmd));
> return cxl_mbox_cmd_rc2errno(&mbox_cmd);
> + }
>
> /*
> * Variable sized commands can't be validated and so it's up to the

2022-10-11 13:04:08

by Jonathan Cameron


Subject: Re: [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load

On Mon, 10 Oct 2022 15:41:24 -0700
[email protected] wrote:

> From: Ira Weiny <[email protected]>
>
> The information contained in the events prior to the driver loading can
> be queried at any time through other mailbox commands.
>
> Ensure a clean slate of events by reading and clearing the events. The
> events are sent to the trace buffer but it is not anticipated to have
> anyone listening to it at driver load time.
>
> Signed-off-by: Ira Weiny <[email protected]>
Makes sense I think.

Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/cxl/pci.c | 2 ++
> tools/testing/cxl/test/mem.c | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..5f1b492bd388 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -498,6 +498,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
>
> + cxl_mem_get_event_records(cxlds);
> +
> if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index aa2df3a15051..e2f5445d24ff 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
>
> + cxl_mem_get_event_records(cxlds);
> +
> if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> rc = devm_cxl_add_nvdimm(dev, cxlmd);
>

2022-10-11 14:08:21

by Jonathan Cameron


Subject: Re: [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record

On Mon, 10 Oct 2022 15:41:27 -0700
[email protected] wrote:

> From: Ira Weiny <[email protected]>
>
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
>
> Determine if the event read is memory module record and if so trace the
> record.
>
> Signed-off-by: Ira Weiny <[email protected]>
>
A few trivial comments inline. I'm happy either way
Reviewed-by: Jonathan Cameron <[email protected]>

> +#define show_add_status(as) __print_symbolic(as, \
> + { CXL_DHI_AS_NORMAL, "Normal" }, \
> + { CXL_DHI_AS_WARNING, "Warning" }, \
> + { CXL_DHI_AS_CRITICAL, "Critical" } \
> +)
> +
> +#define CXL_DHI_AS_LIFE_USED(as) (as & 0x3)
> +#define CXL_DHI_AS_DEV_TEMP(as) ((as & 0xC) >> 2)
> +#define CXL_DHI_AS_COR_VOL_ERR_CNT(as) ((as & 0x10) >> 4)
> +#define CXL_DHI_AS_COR_PER_ERR_CNT(as) ((as & 0x20) >> 5)
>

> + ),
> +
> + CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
> + "as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> + "as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
> + "dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
> + "reserved=%s",
> + show_dev_evt_type(__entry->event_type),
> + show_health_status_flags(__entry->health_status),
> + show_media_status(__entry->media_status),
> + show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
A little nasty to use same show_add_status() for the 2 bit and 1 bit versions.
Obviously it works, but maybe it's worth two macros for the 1 and 2 bit version?

> + __entry->life_used, __entry->device_temp,
> + __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> + __entry->cor_per_err_cnt,
> + __print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> + )
Aligned one tab too far?
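
The suggestion to split the 1-bit and 2-bit cases can be sketched in user-space C as below. The masks and shifts follow the CXL_DHI_AS_* macros quoted above; the function names `show_add_status_2bit()` and `show_add_status_1bit()` and the "Reserved" string for the unused 2-bit value are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction, with the macro argument parenthesized. */
#define AS_LIFE_USED(as)	((as) & 0x3)
#define AS_DEV_TEMP(as)		(((as) & 0xC) >> 2)
#define AS_COR_VOL_ERR_CNT(as)	(((as) & 0x10) >> 4)
#define AS_COR_PER_ERR_CNT(as)	(((as) & 0x20) >> 5)

/* 2-bit fields: Normal, Warning, Critical; 3 has no name in the patch. */
static const char *show_add_status_2bit(unsigned int v)
{
	assert(v <= 0x3);
	switch (v) {
	case 0:
		return "Normal";
	case 1:
		return "Warning";
	case 2:
		return "Critical";
	default:
		return "Reserved";	/* assumed label, not from the patch */
	}
}

/* 1-bit fields can only be Normal or Warning. */
static const char *show_add_status_1bit(unsigned int v)
{
	return v ? "Warning" : "Normal";
}
```

With separate helpers the 1-bit fields can never be rendered as "Critical", which makes the intent clearer than sharing one three-entry table.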

2022-10-11 14:13:32

by Jonathan Cameron


Subject: Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record

On Mon, 10 Oct 2022 15:41:26 -0700
[email protected] wrote:

> From: Ira Weiny <[email protected]>
>
> CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
>
> Determine if the event read is a DRAM event record and if so trace the
> record.
>
> Signed-off-by: Ira Weiny <[email protected]>
>

Trivial comments inline

Reviewed-by: Jonathan Cameron <[email protected]>

> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index 82a8d3b750a2..7a90cfea348b 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
> )
> );
>

> +
> +TRACE_EVENT(dram,
> +
> + TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> + struct cxl_event_dram *rec),
> +
> + TP_ARGS(dev_name, log, rec),
> +
> + TP_STRUCT__entry(
> + CXL_EVT_TP_entry
> + /* DRAM */
> + __field(u64, phys_addr)
> + __field(u8, descriptor)
> + __field(u8, type)
> + __field(u8, transaction_type)
> + __field(u8, channel)
> + __field(u16, validity_flags)
> + __field(u16, column) /* Out of order to pack trace record */
> + __field(u32, nibble_mask)
> + __field(u32, row)
> + __array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> + __array(u8, reserved, CXL_EVENT_DER_RES_SIZE)

If we are going to have this, why not put it at the end? Will that affect the
packing badly?

> + __field(u8, rank) /* Out of order to pack trace record */
> + __field(u8, bank_group) /* Out of order to pack trace record */
> + __field(u8, bank) /* Out of order to pack trace record */
> + ),
> +
> + TP_fast_assign(
> + CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> + /* DRAM */
> + __entry->phys_addr = le64_to_cpu(rec->phys_addr);
> + __entry->descriptor = rec->descriptor;
> + __entry->type = rec->type;
> + __entry->transaction_type = rec->transaction_type;
> + __entry->validity_flags = get_unaligned_le16(rec->validity_flags);
> + __entry->channel = rec->channel;
> + __entry->rank = rec->rank;
> + __entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> + __entry->bank_group = rec->bank_group;
> + __entry->bank = rec->bank;
> + __entry->row = get_unaligned_le24(rec->row);
> + __entry->column = get_unaligned_le16(rec->column);
> + memcpy(__entry->cor_mask, &rec->correction_mask,
> + CXL_EVENT_DER_CORRECTION_MASK_SIZE);
> + memcpy(__entry->reserved, &rec->reserved,
> + CXL_EVENT_DER_RES_SIZE);
> + ),
> +
> + CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> + "trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> + "bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> + "valid_flags='%s' reserved=%s",
> + __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> + (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> + show_event_desc_flags(__entry->descriptor),
> + show_mem_event_type(__entry->type),
> + show_trans_type(__entry->transaction_type),
> + __entry->channel, __entry->rank, __entry->nibble_mask,
> + __entry->bank_group, __entry->bank,
> + __entry->row, __entry->column,
> + __print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> + show_dram_valid_flags(__entry->validity_flags),
> + __print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> + )
Probably one less tab on that trailing )?

> +);
> +
> #endif /* _CXL_TRACE_EVENTS_H */
>
> /* This part must be outside protection */
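
The packing question can be checked with a small user-space experiment. The two structs below copy only the DRAM-specific fields from TP_STRUCT__entry; the common header added by CXL_EVT_TP_entry is left out because its layout is not shown in this thread, so the absolute sizes are not those of the real trace record. The struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COR_MASK_SIZE	0x20
#define RES_SIZE	0x17

/* Field order as posted: byte arrays before the trailing u8 fields. */
struct entry_as_posted {
	uint64_t phys_addr;
	uint8_t descriptor;
	uint8_t type;
	uint8_t transaction_type;
	uint8_t channel;
	uint16_t validity_flags;
	uint16_t column;
	uint32_t nibble_mask;
	uint32_t row;
	uint8_t cor_mask[COR_MASK_SIZE];
	uint8_t reserved[RES_SIZE];
	uint8_t rank;
	uint8_t bank_group;
	uint8_t bank;
};

/* Same fields with the reserved array moved to the end. */
struct entry_reserved_last {
	uint64_t phys_addr;
	uint8_t descriptor;
	uint8_t type;
	uint8_t transaction_type;
	uint8_t channel;
	uint16_t validity_flags;
	uint16_t column;
	uint32_t nibble_mask;
	uint32_t row;
	uint8_t cor_mask[COR_MASK_SIZE];
	uint8_t rank;
	uint8_t bank_group;
	uint8_t bank;
	uint8_t reserved[RES_SIZE];
};

/* Byte-sized members have no alignment needs, so the order is free. */
static_assert(sizeof(struct entry_as_posted) ==
	      sizeof(struct entry_reserved_last),
	      "moving the reserved array must not change the size");
```

Since everything after row is an array of u8 or a single u8, reordering those members introduces no padding, so placing reserved last should not hurt the packing of this part of the record.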

2022-10-14 17:21:19

by Davidlohr Bueso


Subject: Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code

On Fri, 14 Oct 2022, Davidlohr Bueso wrote:
>>- if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>>+ if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
>>+ dev_dbg(cxlds->dev, "MB error : %s\n",
>
>Maybe s/MB/mbox?

Actually 'Mailbox' seems to be the standard:

core/regs.c: dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
core/regs.c: dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
pci.c: dev_dbg(dev, "Mailbox operation had an error\n");
pci.c: dev_err(cxlds->dev, "Mailbox is too small (%zub)",
pci.c: dev_dbg(cxlds->dev, "Mailbox payload sized %zu",

2022-10-14 17:26:11

by Davidlohr Bueso


Subject: Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code

On Mon, 10 Oct 2022, [email protected] wrote:

>From: Ira Weiny <[email protected]>
>
>If a mailbox command fails the driver always reports ENXIO. But this
>may not be enough information to understand why the hardware, or in my
>case Qemu, was failing.
>
>Add a debug print of the error code returned from the hardware.

Reviewed-by: Davidlohr Bueso <[email protected]>

with a nit below.

>
>Signed-off-by: Ira Weiny <[email protected]>
>---
> drivers/cxl/core/mbox.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>index 16176b9278b4..6c4d024ad0e8 100644
>--- a/drivers/cxl/core/mbox.c
>+++ b/drivers/cxl/core/mbox.c
>@@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
> if (rc)
> return rc;
>
>- if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>+ if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
>+ dev_dbg(cxlds->dev, "MB error : %s\n",

Maybe s/MB/mbox?

>+ cxl_mbox_cmd_rc2str(&mbox_cmd));
> return cxl_mbox_cmd_rc2errno(&mbox_cmd);
>+ }
>
> /*
> * Variable sized commands can't be validated and so it's up to the
>--
>2.37.2
>

2022-10-14 17:33:47

by Ira Weiny


Subject: Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code

On Fri, Oct 14, 2022 at 09:31:49AM -0700, Davidlohr Bueso wrote:
> On Fri, 14 Oct 2022, Davidlohr Bueso wrote:
> > > - if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> > > + if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> > > + dev_dbg(cxlds->dev, "MB error : %s\n",
> >
> > Maybe s/MB/mbox?
>
> Actually 'Mailbox' seems to be the standard:

Good point! Changed.

Thanks!
Ira

>
> core/regs.c: dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
> core/regs.c: dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
> pci.c: dev_dbg(dev, "Mailbox operation had an error\n");
> pci.c: dev_err(cxlds->dev, "Mailbox is too small (%zub)",
> pci.c: dev_dbg(cxlds->dev, "Mailbox payload sized %zu",

2022-10-14 23:51:01

by Ira Weiny


Subject: Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record

On Tue, Oct 11, 2022 at 02:47:12PM +0100, Jonathan Cameron wrote:
> On Mon, 10 Oct 2022 15:41:26 -0700
> [email protected] wrote:
>
> > From: Ira Weiny <[email protected]>
> >
> > CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> >
> > Determine if the event read is a DRAM event record and if so trace the
> > record.
> >
> > Signed-off-by: Ira Weiny <[email protected]>
> >
>
> Trivial comments inline
>
> Reviewed-by: Jonathan Cameron <[email protected]>
>
> > diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> > index 82a8d3b750a2..7a90cfea348b 100644
> > --- a/include/trace/events/cxl.h
> > +++ b/include/trace/events/cxl.h
> > @@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
> > )
> > );
> >
>
> > +
> > +TRACE_EVENT(dram,
> > +
> > + TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> > + struct cxl_event_dram *rec),
> > +
> > + TP_ARGS(dev_name, log, rec),
> > +
> > + TP_STRUCT__entry(
> > + CXL_EVT_TP_entry
> > + /* DRAM */
> > + __field(u64, phys_addr)
> > + __field(u8, descriptor)
> > + __field(u8, type)
> > + __field(u8, transaction_type)
> > + __field(u8, channel)
> > + __field(u16, validity_flags)
> > + __field(u16, column) /* Out of order to pack trace record */
> > + __field(u32, nibble_mask)
> > + __field(u32, row)
> > + __array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> > + __array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
>
> If we are going to have this, why not put it at the end? Will that affect the
> packing badly?

I removed it.

[snip]

> > +
> > + CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> > + "trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> > + "bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> > + "valid_flags='%s' reserved=%s",
> > + __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> > + (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> > + show_event_desc_flags(__entry->descriptor),
> > + show_mem_event_type(__entry->type),
> > + show_trans_type(__entry->transaction_type),
> > + __entry->channel, __entry->rank, __entry->nibble_mask,
> > + __entry->bank_group, __entry->bank,
> > + __entry->row, __entry->column,
> > + __print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> > + show_dram_valid_flags(__entry->validity_flags),
> > + __print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> > + )
> Probably one less tab on that trailing )?

Done.

Thanks!
Ira
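
The "Out of order to pack trace record" comments and the question about packing both come down to struct padding: the fields listed in TP_STRUCT__entry end up as members of a C struct, so the compiler may insert padding between members of different sizes. Below is a minimal standalone sketch of that effect. The struct names, helper functions, and the reduced field set are hypothetical and chosen only for illustration; they are not the layout generated by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of the member sizes used in both illustrative structs below. */
#define MEMBER_BYTES (8 + 4 + 4 + 2 + 2 + 1 + 1 + 1 + 1)

/* Hypothetical "record order": small and large members interleaved. */
struct spec_order {
	uint64_t phys_addr;
	uint8_t  descriptor;
	uint16_t validity_flags;	/* compiler typically pads before this */
	uint8_t  channel;
	uint32_t nibble_mask;		/* ...and before this */
	uint8_t  rank;
	uint32_t row;			/* ...and before this */
	uint8_t  bank;
	uint16_t column;		/* ...and before this */
};

/* Same members, sorted from largest to smallest to avoid padding. */
struct packed_order {
	uint64_t phys_addr;
	uint32_t nibble_mask;
	uint32_t row;
	uint16_t validity_flags;
	uint16_t column;
	uint8_t  descriptor;
	uint8_t  channel;
	uint8_t  rank;
	uint8_t  bank;
};

/* Bytes of compiler-inserted padding in each layout. */
size_t spec_order_padding(void)
{
	return sizeof(struct spec_order) - MEMBER_BYTES;
}

size_t packed_order_padding(void)
{
	return sizeof(struct packed_order) - MEMBER_BYTES;
}
```

On common ABIs with natural alignment the interleaved layout carries several bytes of padding while the sorted one carries none, which is the reason for moving fields such as column and rank away from their spec order. Where a trailing byte array such as the reserved field is placed does not change this, since a u8 array has no alignment requirement of its own.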

2022-10-15 11:47:45

by Steven Rostedt

Subject: Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record

On Mon, 10 Oct 2022 15:41:26 -0700
[email protected] wrote:

> +TRACE_EVENT(dram,

Call this "cxl_dram"

-- Steve

> +
> + TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> + struct cxl_event_dram *rec),
> +
> + TP_ARGS(dev_name, log, rec),
> +
> + TP_STRUCT__entry(
> + CXL_EVT_TP_entry
> + /* DRAM */
> + __field(u64, phys_addr)
> + __field(u8, descriptor)
> + __field(u8, type)
> + __field(u8, transaction_type)
> + __field(u8, channel)
> + __field(u16, validity_flags)
> + __field(u16, column) /* Out of order to pack trace record */
> + __field(u32, nibble_mask)
> + __field(u32, row)
> + __array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> + __array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
> + __field(u8, rank) /* Out of order to pack trace record */
> + __field(u8, bank_group) /* Out of order to pack trace record */
> + __field(u8, bank) /* Out of order to pack trace record */
> + ),
> +
> + TP_fast_assign(
> + CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> + /* DRAM */
> + __entry->phys_addr = le64_to_cpu(rec->phys_addr);
> + __entry->descriptor = rec->descriptor;
> + __entry->type = rec->type;
> + __entry->transaction_type = rec->transaction_type;
> + __entry->validity_flags = get_unaligned_le16(rec->validity_flags);
> + __entry->channel = rec->channel;
> + __entry->rank = rec->rank;
> + __entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> + __entry->bank_group = rec->bank_group;
> + __entry->bank = rec->bank;
> + __entry->row = get_unaligned_le24(rec->row);
> + __entry->column = get_unaligned_le16(rec->column);
> + memcpy(__entry->cor_mask, &rec->correction_mask,
> + CXL_EVENT_DER_CORRECTION_MASK_SIZE);
> + memcpy(__entry->reserved, &rec->reserved,
> + CXL_EVENT_DER_RES_SIZE);
> + ),
> +
> + CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> + "trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> + "bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> + "valid_flags='%s' reserved=%s",
> + __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> + (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> + show_event_desc_flags(__entry->descriptor),
> + show_mem_event_type(__entry->type),
> + show_trans_type(__entry->transaction_type),
> + __entry->channel, __entry->rank, __entry->nibble_mask,
> + __entry->bank_group, __entry->bank,
> + __entry->row, __entry->column,
> + __print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> + show_dram_valid_flags(__entry->validity_flags),
> + __print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> + )
> +);
> +
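
For readers unfamiliar with the get_unaligned_le16()/get_unaligned_le24() calls in TP_fast_assign above: the record stores nibble_mask and row as 3-byte little-endian fields, so they cannot be read with a plain integer load. A minimal userspace sketch of the same byte assembly follows. The function names read_le16 and read_le24 are hypothetical stand-ins for illustration and are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble a 16-bit value from two bytes stored least significant
 * byte first. Reading byte by byte works at any alignment and on any
 * host endianness.
 */
uint16_t read_le16(const uint8_t *p)
{
	assert(p != NULL);
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Assemble a 24-bit value from three bytes stored least significant
 * byte first. The result is widened to 32 bits, matching the u32
 * trace fields used for nibble_mask and row.
 */
uint32_t read_le24(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16);
}
```

A 3-byte field holding 0x56 0x34 0x12 in memory decodes to 0x123456 with this approach, which is why the trace entry declares these fields as u32 even though the record only carries 24 bits.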

2022-10-15 11:48:01

by Steven Rostedt

Subject: Re: [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record

On Mon, 10 Oct 2022 15:41:27 -0700
[email protected] wrote:

> +TRACE_EVENT(memory_module,

Make sure all your new events have the "cxl_" prefix. "cxl_memory_module".

This goes for all events in this series.

Thanks,

-- Steve

> +
> + TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> + struct cxl_event_mem_module *rec),
> +
> + TP_ARGS(dev_name, log, rec),
> +
> + TP_STRUCT__entry(
> + CXL_EVT_TP_entry
> +
> + /* Memory Module Event */
> + __field(u8, event_type)
> +
> + /* Device Health Info */
> + __field(u8, health_status)
> + __field(u8, media_status)
> + __field(u8, life_used)
> + __field(u32, dirty_shutdown_cnt)
> + __field(u32, cor_vol_err_cnt)
> + __field(u32, cor_per_err_cnt)
> + __field(s16, device_temp)
> + __field(u8, add_status)
> +
> + __array(u8, reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> + ),
> +
> + TP_fast_assign(
> + CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> + /* Memory Module Event */
> + __entry->event_type = rec->event_type;
> +
> + /* Device Health Info */
> + __entry->health_status = rec->info.health_status;
> + __entry->media_status = rec->info.media_status;
> + __entry->life_used = rec->info.life_used;
> + __entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
> + __entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
> + __entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
> + __entry->device_temp = get_unaligned_le16(rec->info.device_temp);
> + __entry->add_status = rec->info.add_status;
> + memcpy(__entry->reserved, &rec->reserved,
> + CXL_EVENT_MEM_MOD_RES_SIZE);
> + ),
> +
> + CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
> + "as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> + "as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
> + "dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
> + "reserved=%s",
> + show_dev_evt_type(__entry->event_type),
> + show_health_status_flags(__entry->health_status),
> + show_media_status(__entry->media_status),
> + show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> + show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
> + __entry->life_used, __entry->device_temp,
> + __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> + __entry->cor_per_err_cnt,
> + __print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> + )
> +);
> +
> +
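
The printk above passes add_status through four CXL_DHI_AS_*() macros, each extracting one sub-field of a single byte. Their definitions are not shown in this part of the thread, so the sketch below assumes a bit layout (two 2-bit fields followed by two 1-bit fields) purely for illustration. The function names and bit positions are assumptions and should be checked against the patch and the CXL specification.

```c
#include <stdint.h>

/*
 * Assumed layout of the Additional Status byte (illustrative only):
 *   bits 1:0  life used status
 *   bits 3:2  device temperature status
 *   bit  4    corrected volatile error count status
 *   bit  5    corrected persistent error count status
 */
unsigned int as_life_used(uint8_t as)
{
	return as & 0x3;
}

unsigned int as_dev_temp(uint8_t as)
{
	return (as >> 2) & 0x3;
}

unsigned int as_cor_vol_err_cnt(uint8_t as)
{
	return (as >> 4) & 0x1;
}

unsigned int as_cor_per_err_cnt(uint8_t as)
{
	return (as >> 5) & 0x1;
}
```

Under that assumed layout, an add_status value of 0x26 would decode to life used 2, device temperature 1, corrected volatile count status 0, and corrected persistent count status 1. Storing the raw byte in the trace entry and decoding it only at print time keeps the record small.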