2023-01-09 20:32:08

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v6 0/8] cxl: Process event logs

I accidentally missed some comments from Dan in my V5 that I sent.[*] Dan
asked me to do a quick spin to pick them up.

[*] https://lore.kernel.org/all/[email protected]/

This code has been tested with a newer qemu which allows for more events to be
returned at a time as well an additional QMP event and interrupt injection.
Those patches will follow once they have been cleaned up.

The series is now in 3 parts:

1) Base functionality including interrupts
2) Tracing specific events (Dynamic Capacity Event Record is defered)
3) cxl-test infrastructure for basic tests

To: Dan Williams <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Alison Schofield <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ben Widawsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ira Weiny <[email protected]>

---
Changes in v6:
- Dan: address the missed comments on V4
- Link to v5: https://lore.kernel.org/r/[email protected]

---
Davidlohr Bueso (1):
cxl/mem: Wire up event interrupts

Ira Weiny (7):
cxl/mem: Read, trace, and clear events on driver load
cxl/mem: Trace General Media Event Record
cxl/mem: Trace DRAM Event Record
cxl/mem: Trace Memory Module Event Record
cxl/test: Add generic mock events
cxl/test: Add specific events
cxl/test: Simulate event log overflow

drivers/cxl/core/mbox.c | 187 +++++++++++++++++
drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxl.h | 16 ++
drivers/cxl/cxlmem.h | 173 +++++++++++++++
drivers/cxl/cxlpci.h | 6 +
drivers/cxl/pci.c | 241 +++++++++++++++++++++
tools/testing/cxl/test/Kbuild | 2 +-
tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++++++++
8 files changed, 1455 insertions(+), 1 deletion(-)
---
base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
change-id: 20221216-cxl-ev-log-f06383197541

Best regards,
--
Ira Weiny <[email protected]>


2023-01-09 20:33:45

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v6 3/8] cxl/mem: Trace General Media Event Record

CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.

Determine if the event read is a general media record and if so trace
the record as a General Media Event Record.

Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from V4:
Pick up tag
---
drivers/cxl/core/mbox.c | 29 ++++++++++-
drivers/cxl/core/trace.h | 124 +++++++++++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 19 ++++++++
3 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 5ad4716f2e11..0ece1766f613 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -718,6 +718,31 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
}
EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);

+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+static const uuid_t gen_media_event_uuid =
+ UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+ 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
+
+static void cxl_event_trace_record(const struct device *dev,
+ enum cxl_event_log_type type,
+ struct cxl_event_record_raw *record)
+{
+ uuid_t *id = &record->hdr.id;
+
+ if (uuid_equal(id, &gen_media_event_uuid)) {
+ struct cxl_event_gen_media *rec =
+ (struct cxl_event_gen_media *)record;
+
+ trace_cxl_general_media(dev, type, rec);
+ } else {
+ /* For unknown record types print just the header */
+ trace_cxl_generic_event(dev, type, record);
+ }
+}
+
static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
enum cxl_event_log_type log,
struct cxl_get_event_payload *get_pl)
@@ -811,8 +836,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
break;

for (i = 0; i < nr_rec; i++)
- trace_cxl_generic_event(cxlds->dev, type,
- &payload->records[i]);
+ cxl_event_trace_record(cxlds->dev, type,
+ &payload->records[i]);

if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
trace_cxl_overflow(cxlds->dev, type, payload);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 6898212fcb47..d85f0481661d 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -223,6 +223,130 @@ TRACE_EVENT(cxl_generic_event,
__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
);

+/*
+ * Physical Address field masks
+ *
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ *
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
+
+#define CXL_DPA_VOLATILE BIT(0)
+#define CXL_DPA_NOT_REPAIRABLE BIT(1)
+#define show_dpa_flags(flags) __print_flags(flags, "|", \
+ { CXL_DPA_VOLATILE, "VOLATILE" }, \
+ { CXL_DPA_NOT_REPAIRABLE, "NOT_REPAIRABLE" } \
+)
+
+/*
+ * General Media Event Record - GMER
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT BIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW BIT(2)
+#define show_event_desc_flags(flags) __print_flags(flags, "|", \
+ { CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT, "UNCORRECTABLE_EVENT" }, \
+ { CXL_GMER_EVT_DESC_THRESHOLD_EVENT, "THRESHOLD_EVENT" }, \
+ { CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW, "POISON_LIST_OVERFLOW" } \
+)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR 0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR 0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR 0x02
+#define show_mem_event_type(type) __print_symbolic(type, \
+ { CXL_GMER_MEM_EVT_TYPE_ECC_ERROR, "ECC Error" }, \
+ { CXL_GMER_MEM_EVT_TYPE_INV_ADDR, "Invalid Address" }, \
+ { CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR, "Data Path Error" } \
+)
+
+#define CXL_GMER_TRANS_UNKNOWN 0x00
+#define CXL_GMER_TRANS_HOST_READ 0x01
+#define CXL_GMER_TRANS_HOST_WRITE 0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA 0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON 0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB 0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT 0x06
+#define show_trans_type(type) __print_symbolic(type, \
+ { CXL_GMER_TRANS_UNKNOWN, "Unknown" }, \
+ { CXL_GMER_TRANS_HOST_READ, "Host Read" }, \
+ { CXL_GMER_TRANS_HOST_WRITE, "Host Write" }, \
+ { CXL_GMER_TRANS_HOST_SCAN_MEDIA, "Host Scan Media" }, \
+ { CXL_GMER_TRANS_HOST_INJECT_POISON, "Host Inject Poison" }, \
+ { CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB, "Internal Media Scrub" }, \
+ { CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT, "Internal Media Management" } \
+)
+
+#define CXL_GMER_VALID_CHANNEL BIT(0)
+#define CXL_GMER_VALID_RANK BIT(1)
+#define CXL_GMER_VALID_DEVICE BIT(2)
+#define CXL_GMER_VALID_COMPONENT BIT(3)
+#define show_valid_flags(flags) __print_flags(flags, "|", \
+ { CXL_GMER_VALID_CHANNEL, "CHANNEL" }, \
+ { CXL_GMER_VALID_RANK, "RANK" }, \
+ { CXL_GMER_VALID_DEVICE, "DEVICE" }, \
+ { CXL_GMER_VALID_COMPONENT, "COMPONENT" } \
+)
+
+TRACE_EVENT(cxl_general_media,
+
+ TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+ struct cxl_event_gen_media *rec),
+
+ TP_ARGS(dev, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+ /* General Media */
+ __field(u64, dpa)
+ __field(u8, descriptor)
+ __field(u8, type)
+ __field(u8, transaction_type)
+ __field(u8, channel)
+ __field(u32, device)
+ __array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
+ __field(u16, validity_flags)
+ /* Following are out of order to pack trace record */
+ __field(u8, rank)
+ __field(u8, dpa_flags)
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+ /* General Media */
+ __entry->dpa = le64_to_cpu(rec->phys_addr);
+ __entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+ /* Mask after flags have been parsed */
+ __entry->dpa &= CXL_DPA_MASK;
+ __entry->descriptor = rec->descriptor;
+ __entry->type = rec->type;
+ __entry->transaction_type = rec->transaction_type;
+ __entry->channel = rec->channel;
+ __entry->rank = rec->rank;
+ __entry->device = get_unaligned_le24(rec->device);
+ memcpy(__entry->comp_id, &rec->component_id,
+ CXL_EVENT_GEN_MED_COMP_ID_SIZE);
+ __entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+ ),
+
+ CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
+ "descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
+ "device=%x comp_id=%s validity_flags='%s'",
+ __entry->dpa, show_dpa_flags(__entry->dpa_flags),
+ show_event_desc_flags(__entry->descriptor),
+ show_mem_event_type(__entry->type),
+ show_trans_type(__entry->transaction_type),
+ __entry->channel, __entry->rank, __entry->device,
+ __print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
+ show_valid_flags(__entry->validity_flags)
+ )
+);
+
#endif /* _CXL_EVENTS_H */

#define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index e6b27e5d5009..7f6355bd6d23 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -451,6 +451,25 @@ struct cxl_mbox_clear_event_payload {
(offsetof(struct cxl_mbox_clear_event_payload, handle) + \
(nr_handles * sizeof(__le16)))

+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE 0x10
+struct cxl_event_gen_media {
+ struct cxl_event_record_hdr hdr;
+ __le64 phys_addr;
+ u8 descriptor;
+ u8 type;
+ u8 transaction_type;
+ u8 validity_flags[2];
+ u8 channel;
+ u8 rank;
+ u8 device[3];
+ u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+ u8 reserved[46];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;

--
2.39.0

2023-01-09 20:35:00

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v6 5/8] cxl/mem: Trace Memory Module Event Record

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is memory module record and if so trace the
record.

Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from V4:
Pick up tag
---
drivers/cxl/core/mbox.c | 13 +++++
drivers/cxl/core/trace.h | 143 +++++++++++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 26 +++++++++
3 files changed, 182 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 0101b4de700f..4ac9dc184ccf 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -734,6 +734,14 @@ static const uuid_t dram_event_uuid =
UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);

+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+ UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
static void cxl_event_trace_record(const struct device *dev,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
@@ -749,6 +757,11 @@ static void cxl_event_trace_record(const struct device *dev,
struct cxl_event_dram *rec = (struct cxl_event_dram *)record;

trace_cxl_dram(dev, type, rec);
+ } else if (uuid_equal(id, &mem_mod_event_uuid)) {
+ struct cxl_event_mem_module *rec =
+ (struct cxl_event_mem_module *)record;
+
+ trace_cxl_memory_module(dev, type, rec);
} else {
/* For unknown record types print just the header */
trace_cxl_generic_event(dev, type, record);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index b6321cfb1d9f..ebb4c8ce8587 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
)
);

+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE 0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE 0x01
+#define CXL_MMER_LIFE_USED_CHANGE 0x02
+#define CXL_MMER_TEMP_CHANGE 0x03
+#define CXL_MMER_DATA_PATH_ERROR 0x04
+#define CXL_MMER_LAS_ERROR 0x05
+#define show_dev_evt_type(type) __print_symbolic(type, \
+ { CXL_MMER_HEALTH_STATUS_CHANGE, "Health Status Change" }, \
+ { CXL_MMER_MEDIA_STATUS_CHANGE, "Media Status Change" }, \
+ { CXL_MMER_LIFE_USED_CHANGE, "Life Used Change" }, \
+ { CXL_MMER_TEMP_CHANGE, "Temperature Change" }, \
+ { CXL_MMER_DATA_PATH_ERROR, "Data Path Error" }, \
+ { CXL_MMER_LAS_ERROR, "LSA Error" } \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED BIT(2)
+#define show_health_status_flags(flags) __print_flags(flags, "|", \
+ { CXL_DHI_HS_MAINTENANCE_NEEDED, "MAINTENANCE_NEEDED" }, \
+ { CXL_DHI_HS_PERFORMANCE_DEGRADED, "PERFORMANCE_DEGRADED" }, \
+ { CXL_DHI_HS_HW_REPLACEMENT_NEEDED, "REPLACEMENT_NEEDED" } \
+)
+
+#define CXL_DHI_MS_NORMAL 0x00
+#define CXL_DHI_MS_NOT_READY 0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST 0x02
+#define CXL_DHI_MS_ALL_DATA_LOST 0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS 0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN 0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT 0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS 0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN 0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT 0x09
+#define show_media_status(ms) __print_symbolic(ms, \
+ { CXL_DHI_MS_NORMAL, \
+ "Normal" }, \
+ { CXL_DHI_MS_NOT_READY, \
+ "Not Ready" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOST, \
+ "Write Persistency Lost" }, \
+ { CXL_DHI_MS_ALL_DATA_LOST, \
+ "All Data Lost" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS, \
+ "Write Persistency Loss in the Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN, \
+ "Write Persistency Loss in Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT, \
+ "Write Persistency Loss Imminent" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS, \
+ "All Data Loss in Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN, \
+ "All Data loss in the Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT, \
+ "All Data Loss Imminent" } \
+)
+
+#define CXL_DHI_AS_NORMAL 0x0
+#define CXL_DHI_AS_WARNING 0x1
+#define CXL_DHI_AS_CRITICAL 0x2
+#define show_two_bit_status(as) __print_symbolic(as, \
+ { CXL_DHI_AS_NORMAL, "Normal" }, \
+ { CXL_DHI_AS_WARNING, "Warning" }, \
+ { CXL_DHI_AS_CRITICAL, "Critical" } \
+)
+#define show_one_bit_status(as) __print_symbolic(as, \
+ { CXL_DHI_AS_NORMAL, "Normal" }, \
+ { CXL_DHI_AS_WARNING, "Warning" } \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as) (as & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as) ((as & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as) ((as & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as) ((as & 0x20) >> 5)
+
+TRACE_EVENT(cxl_memory_module,
+
+ TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+ struct cxl_event_mem_module *rec),
+
+ TP_ARGS(dev, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+
+ /* Memory Module Event */
+ __field(u8, event_type)
+
+ /* Device Health Info */
+ __field(u8, health_status)
+ __field(u8, media_status)
+ __field(u8, life_used)
+ __field(u32, dirty_shutdown_cnt)
+ __field(u32, cor_vol_err_cnt)
+ __field(u32, cor_per_err_cnt)
+ __field(s16, device_temp)
+ __field(u8, add_status)
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+ /* Memory Module Event */
+ __entry->event_type = rec->event_type;
+
+ /* Device Health Info */
+ __entry->health_status = rec->info.health_status;
+ __entry->media_status = rec->info.media_status;
+ __entry->life_used = rec->info.life_used;
+ __entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+ __entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+ __entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+ __entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+ __entry->add_status = rec->info.add_status;
+ ),
+
+ CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
+ "as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+ "as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
+ "dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
+ show_dev_evt_type(__entry->event_type),
+ show_health_status_flags(__entry->health_status),
+ show_media_status(__entry->media_status),
+ show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+ show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+ show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+ show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+ __entry->life_used, __entry->device_temp,
+ __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+ __entry->cor_per_err_cnt
+ )
+);
+
#endif /* _CXL_EVENTS_H */

#define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 27dc89d84037..4eae1cf4e445 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -493,6 +493,32 @@ struct cxl_event_dram {
u8 reserved[0x17];
} __packed;

+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+ u8 health_status;
+ u8 media_status;
+ u8 add_status;
+ u8 life_used;
+ u8 device_temp[2];
+ u8 dirty_shutdown_cnt[4];
+ u8 cor_vol_err_cnt[4];
+ u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+ struct cxl_event_record_hdr hdr;
+ u8 event_type;
+ struct cxl_get_health_info info;
+ u8 reserved[0x3d];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;

--
2.39.0

2023-01-13 12:57:25

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v6 3/8] cxl/mem: Trace General Media Event Record

On Mon, 09 Jan 2023 11:42:22 -0800
Ira Weiny <[email protected]> wrote:

> CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
>
> Determine if the event read is a general media record and if so trace
> the record as a General Media Event Record.
>
> Reviewed-by: Dan Williams <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
LGTM
Reviewed-by: Jonathan Cameron <[email protected]>

2023-01-13 13:55:08

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v6 5/8] cxl/mem: Trace Memory Module Event Record

On Mon, 09 Jan 2023 11:42:24 -0800
Ira Weiny <[email protected]> wrote:

> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
>
> Determine if the event read is memory module record and if so trace the
> record.
>
> Reviewed-by: Dan Williams <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
Typo inline. With that fixed
Reviewed-by: Jonathan Cameron <[email protected]>

> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index b6321cfb1d9f..ebb4c8ce8587 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
> )
> );
>
> +/*
> + * Memory Module Event Record - MMER
> + *
> + * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +#define CXL_MMER_HEALTH_STATUS_CHANGE 0x00
> +#define CXL_MMER_MEDIA_STATUS_CHANGE 0x01
> +#define CXL_MMER_LIFE_USED_CHANGE 0x02
> +#define CXL_MMER_TEMP_CHANGE 0x03
> +#define CXL_MMER_DATA_PATH_ERROR 0x04
> +#define CXL_MMER_LAS_ERROR 0x05

CXL_MMER_LSA_ERROR

> +#define show_dev_evt_type(type) __print_symbolic(type, \
> + { CXL_MMER_HEALTH_STATUS_CHANGE, "Health Status Change" }, \
> + { CXL_MMER_MEDIA_STATUS_CHANGE, "Media Status Change" }, \
> + { CXL_MMER_LIFE_USED_CHANGE, "Life Used Change" }, \
> + { CXL_MMER_TEMP_CHANGE, "Temperature Change" }, \
> + { CXL_MMER_DATA_PATH_ERROR, "Data Path Error" }, \
> + { CXL_MMER_LAS_ERROR, "LSA Error" } \
> +)

2023-01-13 18:44:59

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH v6 3/8] cxl/mem: Trace General Media Event Record

On Mon, 09 Jan 2023, Ira Weiny wrote:

>CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
>
>Determine if the event read is a general media record and if so trace
>the record as a General Media Event Record.
>
>Reviewed-by: Dan Williams <[email protected]>
>Signed-off-by: Ira Weiny <[email protected]>

Reviewed-by: Davidlohr Bueso <[email protected]>

2023-01-18 01:27:03

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v6 5/8] cxl/mem: Trace Memory Module Event Record

Jonathan Cameron wrote:
> On Mon, 09 Jan 2023 11:42:24 -0800
> Ira Weiny <[email protected]> wrote:
>
> > CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> >
> > Determine if the event read is memory module record and if so trace the
> > record.
> >
> > Reviewed-by: Dan Williams <[email protected]>
> > Signed-off-by: Ira Weiny <[email protected]>
> Typo inline. With that fixed
> Reviewed-by: Jonathan Cameron <[email protected]>
>
> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index b6321cfb1d9f..ebb4c8ce8587 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
> > )
> > );
> >
> > +/*
> > + * Memory Module Event Record - MMER
> > + *
> > + * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
> > + */
> > +#define CXL_MMER_HEALTH_STATUS_CHANGE 0x00
> > +#define CXL_MMER_MEDIA_STATUS_CHANGE 0x01
> > +#define CXL_MMER_LIFE_USED_CHANGE 0x02
> > +#define CXL_MMER_TEMP_CHANGE 0x03
> > +#define CXL_MMER_DATA_PATH_ERROR 0x04
> > +#define CXL_MMER_LAS_ERROR 0x05
>
> CXL_MMER_LSA_ERROR

Done.

Thanks,
Ira