2023-12-14 01:20:24

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 0/7] efi/cxl-cper: Report CPER CXL component events through trace events

Series status/background
========================

Smita has been a great help with this series. This includes testing the
last RFC version.[1]

Unfortunately, Dan had a better idea for how to register the call
between the efi and cxl subsystems so this is reworked for V2.

[1] https://lore.kernel.org/all/[email protected]/

Dependencies
============

NOTE this patch depends on Dan's addition of a device guard[1].
Therefore, the base commit is not a stable commit. I've pushed a branch
with this commit included for testing if folks are interested.[2]

[1] https://lore.kernel.org/all/170250854466.1522182.17555361077409628655.stgit@dwillia2-xfh.jf.intel.com/
[2] https://github.com/weiny2/linux-kernel/tree/cxl-cper-2023-12-13

Cover letter
============

CXL Component Events, as defined by EFI 2.10 Section N.2.14, wrap a
mostly CXL event payload in an EFI Common Platform Error Record (CPER)
record. If a device is configured for firmware first CXL event records
are not sent directly to the host.

The CXL sub-system uniquely has DPA to HPA translation information. It
also already has event format tracing. Restructure the code to make
sharing the data between CPER/event logs most efficient. Then send the
CXL CPER records to the CXL sub-system for processing.

With event logs the events interrupt the driver directly. In the EFI
case events are wrapped with device information which allows the CXL
subsystem to identify the PCI device.

Previous version considered matching the memdev differently. However,
the most robust was to find the PCI device via Bus, Device, Function and
use the PCI device to find the driver data.

CPER records are identified with GUID's while CXL event logs contain
UUID's. The UUID is reported for all events. While the UUID is
redundant for the known events the UUID's are already used by rasdaemon.
To keep compatibility UUIDs are injected for CPER records based on the
record type.

In addition this series cleans up the UUID defines used between the
event processing and cxl_test code.

Signed-off-by: Ira Weiny <[email protected]>
---
Changes in v3:
- djbw: use scope based cleanup functions
- iweiny: further clean ups
- Link to v2: https://lore.kernel.org/r/[email protected]

---
Ira Weiny (7):
cxl/trace: Pass uuid explicitly to event traces
cxl/events: Promote CXL event structures to a core header
cxl/events: Create common event UUID defines
cxl/events: Separate UUID from event structures
cxl/events: Create a CXL event union
firmware/efi: Process CXL Component Events
cxl/memdev: Register for and process CPER events

drivers/cxl/core/mbox.c | 93 +++++++++++------------
drivers/cxl/core/trace.h | 30 ++++----
drivers/cxl/cxlmem.h | 110 ++++++---------------------
drivers/cxl/pci.c | 55 +++++++++++++-
drivers/firmware/efi/cper.c | 15 ++++
drivers/firmware/efi/cper_cxl.c | 46 ++++++++++++
drivers/firmware/efi/cper_cxl.h | 29 +++++++
include/linux/cxl-event.h | 154 +++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 2 +
tools/testing/cxl/test/mem.c | 163 +++++++++++++++++++++++-----------------
10 files changed, 477 insertions(+), 220 deletions(-)
---
base-commit: 6436863dfabce0d7ac416c8dc661fd513b967d39
change-id: 20230601-cxl-cper-26ffc839c6c6

Best regards,
--
Ira Weiny <[email protected]>


2023-12-14 01:20:34

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 2/7] cxl/events: Promote CXL event structures to a core header

UEFI code can process CXL events through CPER records. Those records
use almost the same format as the CXL events.

Lift the CXL event structures to a core header to be shared in later
patches.

Signed-off-by: Ira Weiny <[email protected]>
---
drivers/cxl/cxlmem.h | 90 +----------------------------------------
include/linux/cxl-event.h | 100 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 101 insertions(+), 89 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a2fcbca253f3..f0e7ebb84f02 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -6,6 +6,7 @@
#include <linux/cdev.h>
#include <linux/uuid.h>
#include <linux/rcuwait.h>
+#include <linux/cxl-event.h>
#include "cxl.h"

/* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -579,27 +580,6 @@ struct cxl_mbox_identify {
u8 qos_telemetry_caps;
} __packed;

-/*
- * Common Event Record Format
- * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
- */
-struct cxl_event_record_hdr {
- uuid_t id;
- u8 length;
- u8 flags[3];
- __le16 handle;
- __le16 related_handle;
- __le64 timestamp;
- u8 maint_op_class;
- u8 reserved[15];
-} __packed;
-
-#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
-struct cxl_event_record_raw {
- struct cxl_event_record_hdr hdr;
- u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
-} __packed;
-
/*
* Get Event Records output payload
* CXL rev 3.0 section 8.2.9.2.2; Table 8-50
@@ -641,74 +621,6 @@ struct cxl_mbox_clear_event_payload {
} __packed;
#define CXL_CLEAR_EVENT_MAX_HANDLES U8_MAX

-/*
- * General Media Event Record
- * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
- */
-#define CXL_EVENT_GEN_MED_COMP_ID_SIZE 0x10
-struct cxl_event_gen_media {
- struct cxl_event_record_hdr hdr;
- __le64 phys_addr;
- u8 descriptor;
- u8 type;
- u8 transaction_type;
- u8 validity_flags[2];
- u8 channel;
- u8 rank;
- u8 device[3];
- u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
- u8 reserved[46];
-} __packed;
-
-/*
- * DRAM Event Record - DER
- * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
- */
-#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
-struct cxl_event_dram {
- struct cxl_event_record_hdr hdr;
- __le64 phys_addr;
- u8 descriptor;
- u8 type;
- u8 transaction_type;
- u8 validity_flags[2];
- u8 channel;
- u8 rank;
- u8 nibble_mask[3];
- u8 bank_group;
- u8 bank;
- u8 row[3];
- u8 column[2];
- u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
- u8 reserved[0x17];
-} __packed;
-
-/*
- * Get Health Info Record
- * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
- */
-struct cxl_get_health_info {
- u8 health_status;
- u8 media_status;
- u8 add_status;
- u8 life_used;
- u8 device_temp[2];
- u8 dirty_shutdown_cnt[4];
- u8 cor_vol_err_cnt[4];
- u8 cor_per_err_cnt[4];
-} __packed;
-
-/*
- * Memory Module Event Record
- * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
- */
-struct cxl_event_mem_module {
- struct cxl_event_record_hdr hdr;
- u8 event_type;
- struct cxl_get_health_info info;
- u8 reserved[0x3d];
-} __packed;
-
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
new file mode 100644
index 000000000000..1c94e8fdd227
--- /dev/null
+++ b/include/linux/cxl-event.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CXL_EVENT_H
+#define _LINUX_CXL_EVENT_H
+
+/*
+ * CXL event records; CXL rev 3.0
+ *
+ * Copyright(c) 2023 Intel Corporation.
+ */
+
+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+struct cxl_event_record_hdr {
+ uuid_t id;
+ u8 length;
+ u8 flags[3];
+ __le16 handle;
+ __le16 related_handle;
+ __le64 timestamp;
+ u8 maint_op_class;
+ u8 reserved[15];
+} __packed;
+
+#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
+struct cxl_event_record_raw {
+ struct cxl_event_record_hdr hdr;
+ u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
+} __packed;
+
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE 0x10
+struct cxl_event_gen_media {
+ struct cxl_event_record_hdr hdr;
+ __le64 phys_addr;
+ u8 descriptor;
+ u8 type;
+ u8 transaction_type;
+ u8 validity_flags[2];
+ u8 channel;
+ u8 rank;
+ u8 device[3];
+ u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+ u8 reserved[46];
+} __packed;
+
+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
+struct cxl_event_dram {
+ struct cxl_event_record_hdr hdr;
+ __le64 phys_addr;
+ u8 descriptor;
+ u8 type;
+ u8 transaction_type;
+ u8 validity_flags[2];
+ u8 channel;
+ u8 rank;
+ u8 nibble_mask[3];
+ u8 bank_group;
+ u8 bank;
+ u8 row[3];
+ u8 column[2];
+ u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+ u8 reserved[0x17];
+} __packed;
+
+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+ u8 health_status;
+ u8 media_status;
+ u8 add_status;
+ u8 life_used;
+ u8 device_temp[2];
+ u8 dirty_shutdown_cnt[4];
+ u8 cor_vol_err_cnt[4];
+ u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+ struct cxl_event_record_hdr hdr;
+ u8 event_type;
+ struct cxl_get_health_info info;
+ u8 reserved[0x3d];
+} __packed;
+
+#endif /* _LINUX_CXL_EVENT_H */

--
2.43.0

2023-12-14 01:20:39

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 6/7] firmware/efi: Process CXL Component Events

BIOS can configure memory devices as firmware first. This will send CXL
events to the firmware instead of the OS. The firmware can then send
these events to the OS via UEFI.

UEFI v2.10 section N.2.14 defines a Common Platform Error Record (CPER)
format for CXL Component Events. The format is mostly the same as the
CXL Common Event Record Format. The difference is the use of a GUID in
the Section Type rather than a UUID as part of the event itself.

Add EFI support to detect CXL CPER records and call a registered
notifier with the event. Enforce that only one notifier call can be
registered at any time.

Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes for v3:
[djbw: check notifier vs registered and error if not the same]
[djbw: check for previous notifier and fail if already set]
[djbw: use guard for rw sem]
[djbw: pass the event type and record data as 2 parameters]
[djbw: prefer __packed over pragma packed]
[djbw: clean up commit message]
[iweiny: clean structure names]
---
drivers/firmware/efi/cper.c | 15 ++++++++++++++
drivers/firmware/efi/cper_cxl.c | 46 +++++++++++++++++++++++++++++++++++++++++
drivers/firmware/efi/cper_cxl.h | 29 ++++++++++++++++++++++++++
include/linux/cxl-event.h | 43 ++++++++++++++++++++++++++++++++++++++
4 files changed, 133 insertions(+)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 35c37f667781..2ee5011c4a8e 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -22,6 +22,7 @@
#include <linux/aer.h>
#include <linux/printk.h>
#include <linux/bcd.h>
+#include <linux/cxl-event.h>
#include <acpi/ghes.h>
#include <ras/ras_event.h>
#include "cper_cxl.h"
@@ -607,6 +608,20 @@ cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata
cper_print_prot_err(newpfx, prot_err);
else
goto err_section_too_small;
+ } else if (guid_equal(sec_type, &CPER_SEC_CXL_GEN_MEDIA_GUID) ||
+ guid_equal(sec_type, &CPER_SEC_CXL_DRAM_GUID) ||
+ guid_equal(sec_type, &CPER_SEC_CXL_MEM_MODULE_GUID)) {
+ struct cxl_cper_event_rec *rec = acpi_hest_get_payload(gdata);
+
+ if (rec->hdr.length <= sizeof(rec->hdr))
+ goto err_section_too_small;
+
+ if (rec->hdr.length > sizeof(*rec)) {
+ pr_err(FW_WARN "error section length is too big\n");
+ return;
+ }
+
+ cxl_cper_post_event(newpfx, sec_type, rec);
} else {
const void *err = acpi_hest_get_payload(gdata);

diff --git a/drivers/firmware/efi/cper_cxl.c b/drivers/firmware/efi/cper_cxl.c
index a55771b99a97..da6ae0912bc3 100644
--- a/drivers/firmware/efi/cper_cxl.c
+++ b/drivers/firmware/efi/cper_cxl.c
@@ -8,6 +8,7 @@
*/

#include <linux/cper.h>
+#include <linux/cxl-event.h>
#include "cper_cxl.h"

#define PROT_ERR_VALID_AGENT_TYPE BIT_ULL(0)
@@ -187,3 +188,48 @@ void cper_print_prot_err(const char *pfx, const struct cper_sec_prot_err *prot_e
sizeof(cxl_ras->header_log), 0);
}
}
+
+DECLARE_RWSEM(cxl_cper_rw_sem);
+static cxl_cper_notifier cper_notifier;
+
+void cxl_cper_post_event(const char *pfx, guid_t *sec_type,
+ struct cxl_cper_event_rec *rec)
+{
+ enum cxl_event_type event_type;
+
+ if (!(rec->hdr.validation_bits & CPER_CXL_COMP_EVENT_LOG_VALID)) {
+ pr_err(FW_WARN "cxl event no Component Event Log present\n");
+ return;
+ }
+
+ if (guid_equal(sec_type, &CPER_SEC_CXL_GEN_MEDIA_GUID))
+ event_type = CXL_CPER_EVENT_GEN_MEDIA;
+ else if (guid_equal(sec_type, &CPER_SEC_CXL_DRAM_GUID))
+ event_type = CXL_CPER_EVENT_DRAM;
+ else if (guid_equal(sec_type, &CPER_SEC_CXL_MEM_MODULE_GUID))
+ event_type = CXL_CPER_EVENT_MEM_MODULE;
+
+ guard(rwsem_read)(&cxl_cper_rw_sem);
+ if (cper_notifier)
+ cper_notifier(event_type, rec);
+}
+
+int cxl_cper_register_notifier(cxl_cper_notifier notifier)
+{
+ guard(rwsem_write)(&cxl_cper_rw_sem);
+ if (cper_notifier)
+ return -EINVAL;
+ cper_notifier = notifier;
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cper_register_notifier, CXL);
+
+int cxl_cper_unregister_notifier(cxl_cper_notifier notifier)
+{
+ guard(rwsem_write)(&cxl_cper_rw_sem);
+ if (notifier != cper_notifier)
+ return -EINVAL;
+ cper_notifier = NULL;
+ return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_cper_unregister_notifier, CXL);
diff --git a/drivers/firmware/efi/cper_cxl.h b/drivers/firmware/efi/cper_cxl.h
index 86bfcf7909ec..71f27b3e2810 100644
--- a/drivers/firmware/efi/cper_cxl.h
+++ b/drivers/firmware/efi/cper_cxl.h
@@ -10,11 +10,38 @@
#ifndef LINUX_CPER_CXL_H
#define LINUX_CPER_CXL_H

+#include <linux/cxl-event.h>
+
/* CXL Protocol Error Section */
#define CPER_SEC_CXL_PROT_ERR \
GUID_INIT(0x80B9EFB4, 0x52B5, 0x4DE3, 0xA7, 0x77, 0x68, 0x78, \
0x4B, 0x77, 0x10, 0x48)

+/* CXL Event record UUIDs are formated at GUIDs and reported in section type */
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CPER_SEC_CXL_GEN_MEDIA_GUID \
+ GUID_INIT(0xfbcd0a77, 0xc260, 0x417f, \
+ 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6)
+
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CPER_SEC_CXL_DRAM_GUID \
+ GUID_INIT(0x601dcbb3, 0x9c06, 0x4eab, \
+ 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24)
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CPER_SEC_CXL_MEM_MODULE_GUID \
+ GUID_INIT(0xfe927475, 0xdd59, 0x4339, \
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74)
+
#pragma pack(1)

/* Compute Express Link Protocol Error Section, UEFI v2.10 sec N.2.13 */
@@ -62,5 +89,7 @@ struct cper_sec_prot_err {
#pragma pack()

void cper_print_prot_err(const char *pfx, const struct cper_sec_prot_err *prot_err);
+void cxl_cper_post_event(const char *pfx, guid_t *sec_type,
+ struct cxl_cper_event_rec *rec);

#endif //__CPER_CXL_
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 18dab4d90dc8..167cc812f212 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -108,4 +108,47 @@ struct cxl_event_record_raw {
union cxl_event event;
} __packed;

+enum cxl_event_type {
+ CXL_CPER_EVENT_GEN_MEDIA,
+ CXL_CPER_EVENT_DRAM,
+ CXL_CPER_EVENT_MEM_MODULE,
+};
+
+#define CPER_CXL_DEVICE_ID_VALID BIT(0)
+#define CPER_CXL_DEVICE_SN_VALID BIT(1)
+#define CPER_CXL_COMP_EVENT_LOG_VALID BIT(2)
+struct cxl_cper_event_rec {
+ struct {
+ u32 length;
+ u64 validation_bits;
+ struct cper_cxl_event_devid {
+ u16 vendor_id;
+ u16 device_id;
+ u8 func_num;
+ u8 device_num;
+ u8 bus_num;
+ u16 segment_num;
+ u16 slot_num; /* bits 2:0 reserved */
+ u8 reserved;
+ } device_id __packed;
+ struct cper_cxl_event_sn {
+ u32 lower_dw;
+ u32 upper_dw;
+ } dev_serial_num __packed;
+ } hdr __packed;
+
+ union cxl_event event;
+} __packed;
+
+typedef void (*cxl_cper_notifier)(enum cxl_event_type type,
+ struct cxl_cper_event_rec *rec);
+
+#ifdef CONFIG_UEFI_CPER
+int cxl_cper_register_notifier(cxl_cper_notifier notifier);
+int cxl_cper_unregister_notifier(cxl_cper_notifier notifier);
+#else
+static inline int cxl_cper_register_notifier(cxl_cper_notifier notifier) { }
+static inline int cxl_cper_unregister_notifier(cxl_cper_notifier notifier) { }
+#endif
+
#endif /* _LINUX_CXL_EVENT_H */

--
2.43.0

2023-12-14 01:21:12

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 5/7] cxl/events: Create a CXL event union

The CXL CPER and event log records share everything but a UUID/GUID in
their structures.

Define a cxl_event union without the UUID/GUID to be shared between the
CPER and event log record formats. Adjust the code to use this union.

Signed-off-by: Ira Weiny <[email protected]>
---
drivers/cxl/core/mbox.c | 32 +++++++++++++-------------------
drivers/cxl/core/trace.h | 8 ++++----
include/linux/cxl-event.h | 23 +++++++++++++++++------
tools/testing/cxl/test/mem.c | 31 ++++++++++++++++++-------------
4 files changed, 52 insertions(+), 42 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 5c4e4a692dea..b7efa058a100 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -844,26 +844,17 @@ static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
{
+ union cxl_event *evt = &record->event;
uuid_t *id = &record->id;

- if (uuid_equal(id, &gen_media_event_uuid)) {
- struct cxl_event_gen_media *rec =
- (struct cxl_event_gen_media *)record;
-
- trace_cxl_general_media(cxlmd, type, id, rec);
- } else if (uuid_equal(id, &dram_event_uuid)) {
- struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
-
- trace_cxl_dram(cxlmd, type, id, rec);
- } else if (uuid_equal(id, &mem_mod_event_uuid)) {
- struct cxl_event_mem_module *rec =
- (struct cxl_event_mem_module *)record;
-
- trace_cxl_memory_module(cxlmd, type, id, rec);
- } else {
- /* For unknown record types print just the header */
- trace_cxl_generic_event(cxlmd, type, id, record);
- }
+ if (uuid_equal(id, &gen_media_event_uuid))
+ trace_cxl_general_media(cxlmd, type, id, &evt->gen_media);
+ else if (uuid_equal(id, &dram_event_uuid))
+ trace_cxl_dram(cxlmd, type, id, &evt->dram);
+ else if (uuid_equal(id, &mem_mod_event_uuid))
+ trace_cxl_memory_module(cxlmd, type, id, &evt->mem_module);
+ else
+ trace_cxl_generic_event(cxlmd, type, id, &evt->generic);
}

static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -906,7 +897,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
*/
i = 0;
for (cnt = 0; cnt < total; cnt++) {
- payload->handles[i++] = get_pl->records[cnt].hdr.handle;
+ struct cxl_event_record_raw *raw = &get_pl->records[cnt];
+ struct cxl_event_generic *gen = &raw->event.generic;
+
+ payload->handles[i++] = gen->hdr.handle;
dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
le16_to_cpu(payload->handles[i]));

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 3da16026b8db..862b50cb1829 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -225,9 +225,9 @@ TRACE_EVENT(cxl_overflow,
TRACE_EVENT(cxl_generic_event,

TP_PROTO(const struct cxl_memdev *cxlmd, enum cxl_event_log_type log,
- const uuid_t *uuid, struct cxl_event_record_raw *rec),
+ const uuid_t *uuid, struct cxl_event_generic *gen_rec),

- TP_ARGS(cxlmd, log, uuid, rec),
+ TP_ARGS(cxlmd, log, uuid, gen_rec),

TP_STRUCT__entry(
CXL_EVT_TP_entry
@@ -235,8 +235,8 @@ TRACE_EVENT(cxl_generic_event,
),

TP_fast_assign(
- CXL_EVT_TP_fast_assign(cxlmd, log, uuid, rec->hdr);
- memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+ CXL_EVT_TP_fast_assign(cxlmd, log, uuid, gen_rec->hdr);
+ memcpy(__entry->data, gen_rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
),

CXL_EVT_TP_printk("%s",
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index ebb00ead1496..18dab4d90dc8 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -18,13 +18,8 @@ struct cxl_event_record_hdr {
u8 reserved[15];
} __packed;

-/*
- * Common Event Record Format
- * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
- */
#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
-struct cxl_event_record_raw {
- uuid_t id;
+struct cxl_event_generic {
struct cxl_event_record_hdr hdr;
u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
} __packed;
@@ -97,4 +92,20 @@ struct cxl_event_mem_module {
u8 reserved[0x3d];
} __packed;

+union cxl_event {
+ struct cxl_event_generic generic;
+ struct cxl_event_gen_media gen_media;
+ struct cxl_event_dram dram;
+ struct cxl_event_mem_module mem_module;
+} __packed;
+
+/*
+ * Common Event Record Format; in event logs
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+struct cxl_event_record_raw {
+ uuid_t id;
+ union cxl_event event;
+} __packed;
+
#endif /* _LINUX_CXL_EVENT_H */
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 9cc2b8ce1efd..35ee41e435ab 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -251,7 +251,8 @@ static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
for (i = 0; i < CXL_TEST_EVENT_CNT && !event_log_empty(log); i++) {
memcpy(&pl->records[i], event_get_current(log),
sizeof(pl->records[i]));
- pl->records[i].hdr.handle = event_get_cur_event_handle(log);
+ pl->records[i].event.generic.hdr.handle =
+ event_get_cur_event_handle(log);
log->cur_idx++;
}

@@ -339,25 +340,29 @@ static void cxl_mock_event_trigger(struct device *dev)
struct cxl_event_record_raw maint_needed = {
.id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
- .hdr = {
- .length = sizeof(struct cxl_event_record_raw),
- .flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
- /* .handle = Set dynamically */
- .related_handle = cpu_to_le16(0xa5b6),
+ .event.generic = {
+ .hdr = {
+ .length = sizeof(struct cxl_event_record_raw),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0xa5b6),
+ },
+ .data = { 0xDE, 0xAD, 0xBE, 0xEF },
},
- .data = { 0xDE, 0xAD, 0xBE, 0xEF },
};

struct cxl_event_record_raw hardware_replace = {
.id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
- .hdr = {
- .length = sizeof(struct cxl_event_record_raw),
- .flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
- /* .handle = Set dynamically */
- .related_handle = cpu_to_le16(0xb6a5),
+ .event.generic = {
+ .hdr = {
+ .length = sizeof(struct cxl_event_record_raw),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0xb6a5),
+ },
+ .data = { 0xDE, 0xAD, 0xBE, 0xEF },
},
- .data = { 0xDE, 0xAD, 0xBE, 0xEF },
};

struct cxl_test_gen_media {

--
2.43.0

2023-12-14 01:21:36

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 4/7] cxl/events: Separate UUID from event structures

The UEFI CXL CPER structure does not include the UUID. Now that the
UUID is passed separately to the trace event there is no need to have
the UUID in those structures.

Move UUID from the event record header to the raw structures. Adjust
cxl-test to Create dummy structures for creating test records.

Signed-off-by: Ira Weiny <[email protected]>
---
drivers/cxl/core/mbox.c | 2 +-
include/linux/cxl-event.h | 10 ++--
tools/testing/cxl/test/mem.c | 129 +++++++++++++++++++++++++------------------
3 files changed, 81 insertions(+), 60 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index be2ad3b6f431..5c4e4a692dea 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -844,7 +844,7 @@ static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
{
- uuid_t *id = &record->hdr.id;
+ uuid_t *id = &record->id;

if (uuid_equal(id, &gen_media_event_uuid)) {
struct cxl_event_gen_media *rec =
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 1c94e8fdd227..ebb00ead1496 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -8,12 +8,7 @@
* Copyright(c) 2023 Intel Corporation.
*/

-/*
- * Common Event Record Format
- * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
- */
struct cxl_event_record_hdr {
- uuid_t id;
u8 length;
u8 flags[3];
__le16 handle;
@@ -23,8 +18,13 @@ struct cxl_event_record_hdr {
u8 reserved[15];
} __packed;

+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
struct cxl_event_record_raw {
+ uuid_t id;
struct cxl_event_record_hdr hdr;
u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
} __packed;
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 5a95b04b329a..9cc2b8ce1efd 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -337,9 +337,9 @@ static void cxl_mock_event_trigger(struct device *dev)
}

struct cxl_event_record_raw maint_needed = {
+ .id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
+ 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
.hdr = {
- .id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
- 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
.length = sizeof(struct cxl_event_record_raw),
.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
/* .handle = Set dynamically */
@@ -349,9 +349,9 @@ struct cxl_event_record_raw maint_needed = {
};

struct cxl_event_record_raw hardware_replace = {
+ .id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
+ 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
.hdr = {
- .id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
- 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
.length = sizeof(struct cxl_event_record_raw),
.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
/* .handle = Set dynamically */
@@ -360,61 +360,82 @@ struct cxl_event_record_raw hardware_replace = {
.data = { 0xDE, 0xAD, 0xBE, 0xEF },
};

-struct cxl_event_gen_media gen_media = {
- .hdr = {
- .id = CXL_EVENT_GEN_MEDIA_UUID,
- .length = sizeof(struct cxl_event_gen_media),
- .flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
- /* .handle = Set dynamically */
- .related_handle = cpu_to_le16(0),
+struct cxl_test_gen_media {
+ uuid_t id;
+ struct cxl_event_gen_media rec;
+} __packed;
+
+struct cxl_test_gen_media gen_media = {
+ .id = CXL_EVENT_GEN_MEDIA_UUID,
+ .rec = {
+ .hdr = {
+ .length = sizeof(struct cxl_test_gen_media),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .phys_addr = cpu_to_le64(0x2000),
+ .descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+ .type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+ .transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+ /* .validity_flags = <set below> */
+ .channel = 1,
+ .rank = 30
},
- .phys_addr = cpu_to_le64(0x2000),
- .descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
- .type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
- .transaction_type = CXL_GMER_TRANS_HOST_WRITE,
- /* .validity_flags = <set below> */
- .channel = 1,
- .rank = 30
};

-struct cxl_event_dram dram = {
- .hdr = {
- .id = CXL_EVENT_DRAM_UUID,
- .length = sizeof(struct cxl_event_dram),
- .flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
- /* .handle = Set dynamically */
- .related_handle = cpu_to_le16(0),
+struct cxl_test_dram {
+ uuid_t id;
+ struct cxl_event_dram rec;
+} __packed;
+
+struct cxl_test_dram dram = {
+ .id = CXL_EVENT_DRAM_UUID,
+ .rec = {
+ .hdr = {
+ .length = sizeof(struct cxl_test_dram),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .phys_addr = cpu_to_le64(0x8000),
+ .descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+ .type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+ .transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+ /* .validity_flags = <set below> */
+ .channel = 1,
+ .bank_group = 5,
+ .bank = 2,
+ .column = {0xDE, 0xAD},
},
- .phys_addr = cpu_to_le64(0x8000),
- .descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
- .type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
- .transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
- /* .validity_flags = <set below> */
- .channel = 1,
- .bank_group = 5,
- .bank = 2,
- .column = {0xDE, 0xAD},
};

-struct cxl_event_mem_module mem_module = {
- .hdr = {
- .id = CXL_EVENT_MEM_MODULE_UUID,
- .length = sizeof(struct cxl_event_mem_module),
- /* .handle = Set dynamically */
- .related_handle = cpu_to_le16(0),
+struct cxl_test_mem_module {
+ uuid_t id;
+ struct cxl_event_mem_module rec;
+} __packed;
+
+struct cxl_test_mem_module mem_module = {
+ .id = CXL_EVENT_MEM_MODULE_UUID,
+ .rec = {
+ .hdr = {
+ .length = sizeof(struct cxl_test_mem_module),
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .event_type = CXL_MMER_TEMP_CHANGE,
+ .info = {
+ .health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+ .media_status = CXL_DHI_MS_ALL_DATA_LOST,
+ .add_status = (CXL_DHI_AS_CRITICAL << 2) |
+ (CXL_DHI_AS_WARNING << 4) |
+ (CXL_DHI_AS_WARNING << 5),
+ .device_temp = { 0xDE, 0xAD},
+ .dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ .cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ .cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ }
},
- .event_type = CXL_MMER_TEMP_CHANGE,
- .info = {
- .health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
- .media_status = CXL_DHI_MS_ALL_DATA_LOST,
- .add_status = (CXL_DHI_AS_CRITICAL << 2) |
- (CXL_DHI_AS_WARNING << 4) |
- (CXL_DHI_AS_WARNING << 5),
- .device_temp = { 0xDE, 0xAD},
- .dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
- .cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
- .cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
- }
};

static int mock_set_timestamp(struct cxl_dev_state *cxlds,
@@ -436,11 +457,11 @@ static int mock_set_timestamp(struct cxl_dev_state *cxlds,
static void cxl_mock_add_event_logs(struct mock_event_store *mes)
{
put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
- &gen_media.validity_flags);
+ &gen_media.rec.validity_flags);

put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
- &dram.validity_flags);
+ &dram.rec.validity_flags);

mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
mes_add_event(mes, CXL_EVENT_TYPE_INFO,

--
2.43.0

2023-12-14 01:21:36

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 7/7] cxl/memdev: Register for and process CPER events

If the firmware has configured CXL event support to be firmware first
the OS can process those events through CPER records. The CXL layer has
unique DPA to HPA knowledge and standard event trace parsing in place.

CPER records contain Bus, Device, Function information which can be used
to identify the PCI device which is sending the event.

Change pci driver registration to include registration for a CXL CPER
notifier to process the events through the trace subsystem.

Define and use scoped based management to simplify the handling of the
pci device object.

Cc: Bjorn Helgaas <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
NOTE this patch depends on Dan's addition of a device guard[1].

[1] https://lore.kernel.org/all/170250854466.1522182.17555361077409628655.stgit@dwillia2-xfh.jf.intel.com/

Changes for v3:
[djbw: define a __free(pci_dev_put) to release the device automatically]
[djbw: use device guard from Vishal]
[iweiny: delete old notifier block structure]
[iweiny: adjust for new notifier interface]
---
drivers/cxl/core/mbox.c | 31 +++++++++++++++++++++++-----
drivers/cxl/cxlmem.h | 4 ++++
drivers/cxl/pci.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++-
include/linux/pci.h | 2 ++
4 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index b7efa058a100..c9aa723e3391 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -840,9 +840,30 @@ static const uuid_t gen_media_event_uuid = CXL_EVENT_GEN_MEDIA_UUID;
static const uuid_t dram_event_uuid = CXL_EVENT_DRAM_UUID;
static const uuid_t mem_mod_event_uuid = CXL_EVENT_MEM_MODULE_UUID;

-static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
- enum cxl_event_log_type type,
- struct cxl_event_record_raw *record)
+void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+ enum cxl_event_log_type type,
+ enum cxl_event_type event_type,
+ union cxl_event *event)
+{
+ switch (event_type) {
+ case CXL_CPER_EVENT_GEN_MEDIA:
+ trace_cxl_general_media(cxlmd, type, &gen_media_event_uuid,
+ &event->gen_media);
+ break;
+ case CXL_CPER_EVENT_DRAM:
+ trace_cxl_dram(cxlmd, type, &dram_event_uuid, &event->dram);
+ break;
+ case CXL_CPER_EVENT_MEM_MODULE:
+ trace_cxl_memory_module(cxlmd, type, &mem_mod_event_uuid,
+ &event->mem_module);
+ break;
+ }
+}
+EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+
+static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+ enum cxl_event_log_type type,
+ struct cxl_event_record_raw *record)
{
union cxl_event *evt = &record->event;
uuid_t *id = &record->id;
@@ -965,8 +986,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
break;

for (i = 0; i < nr_rec; i++)
- cxl_event_trace_record(cxlmd, type,
- &payload->records[i]);
+ __cxl_event_trace_record(cxlmd, type,
+ &payload->records[i]);

if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index e5d770e26e02..e7e9508fecac 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -802,6 +802,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
unsigned long *cmds);
void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
+void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
+ enum cxl_event_log_type type,
+ enum cxl_event_type event_type,
+ union cxl_event *event);
int cxl_set_timestamp(struct cxl_memdev_state *mds);
int cxl_poison_state_init(struct cxl_memdev_state *mds);
int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 0155fb66b580..638275569d63 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+#include <asm-generic/unaligned.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/moduleparam.h>
#include <linux/module.h>
@@ -969,6 +970,58 @@ static struct pci_driver cxl_pci_driver = {
},
};

+#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0)
+static void cxl_cper_event_call(enum cxl_event_type ev_type,
+ struct cxl_cper_event_rec *rec)
+{
+ struct cper_cxl_event_devid *device_id = &rec->hdr.device_id;
+ struct pci_dev *pdev __free(pci_dev_put) = NULL;
+ struct cxl_dev_state *cxlds = NULL;
+ enum cxl_event_log_type log_type;
+ unsigned int devfn;
+ u32 hdr_flags;
+
+ devfn = PCI_DEVFN(device_id->device_num, device_id->func_num);
+ pdev = pci_get_domain_bus_and_slot(device_id->segment_num,
+ device_id->bus_num, devfn);
+ if (!pdev)
+ return;
+
+ guard(device)(&pdev->dev);
+ if (pdev->driver == &cxl_pci_driver)
+ cxlds = pci_get_drvdata(pdev);
+ if (!cxlds)
+ return;
+
+ /* Fabricate a log type */
+ hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
+ log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
+
+ cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type, &rec->event);
+}
+
+static int __init cxl_pci_driver_init(void)
+{
+ int rc;
+
+ rc = pci_register_driver(&cxl_pci_driver);
+ if (rc)
+ return rc;
+
+ rc = cxl_cper_register_notifier(cxl_cper_event_call);
+ if (rc)
+ pci_unregister_driver(&cxl_pci_driver);
+
+ return rc;
+}
+
+static void __exit cxl_pci_driver_exit(void)
+{
+ cxl_cper_unregister_notifier(cxl_cper_event_call);
+ pci_unregister_driver(&cxl_pci_driver);
+}
+
+module_init(cxl_pci_driver_init);
+module_exit(cxl_pci_driver_exit);
MODULE_LICENSE("GPL v2");
-module_pci_driver(cxl_pci_driver);
MODULE_IMPORT_NS(CXL);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 60ca768bc867..290d0a2651b2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1170,6 +1170,7 @@ int pci_get_interrupt_pin(struct pci_dev *dev, struct pci_dev **bridge);
u8 pci_common_swizzle(struct pci_dev *dev, u8 *pinp);
struct pci_dev *pci_dev_get(struct pci_dev *dev);
void pci_dev_put(struct pci_dev *dev);
+DEFINE_FREE(pci_dev_put, struct pci_dev *, if (_T) pci_dev_put(_T))
void pci_remove_bus(struct pci_bus *b);
void pci_stop_and_remove_bus_device(struct pci_dev *dev);
void pci_stop_and_remove_bus_device_locked(struct pci_dev *dev);
@@ -1871,6 +1872,7 @@ void pci_cfg_access_unlock(struct pci_dev *dev);
void pci_dev_lock(struct pci_dev *dev);
int pci_dev_trylock(struct pci_dev *dev);
void pci_dev_unlock(struct pci_dev *dev);
+DEFINE_GUARD(pci_dev, struct pci_dev *, pci_dev_lock(_T), pci_dev_unlock(_T))

/*
* PCI domain support. Sometimes called PCI segment (eg by ACPI),

--
2.43.0

2023-12-14 01:21:38

by Ira Weiny

[permalink] [raw]
Subject: [PATCH v3 3/7] cxl/events: Create common event UUID defines

Dan points out in review that the cxl_test code could be made better
through the use of UUID's defines rather than being open coded.[1]
Static const variables still need to exist to be passed to the trace
code.

Create UUID defines and use them rather than open coding them.

[1] https://lore.kernel.org/all/[email protected]/

Suggested-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes for v3:
[iweiny: move comments along with defines]
---
drivers/cxl/core/mbox.c | 26 +++-----------------------
drivers/cxl/cxlmem.h | 24 ++++++++++++++++++++++++
tools/testing/cxl/test/mem.c | 9 +++------
3 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 00f429c440df..be2ad3b6f431 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -836,29 +836,9 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
}
EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);

-/*
- * General Media Event Record
- * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
- */
-static const uuid_t gen_media_event_uuid =
- UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
- 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
-
-/*
- * DRAM Event Record
- * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
- */
-static const uuid_t dram_event_uuid =
- UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
- 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
-
-/*
- * Memory Module Event Record
- * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
- */
-static const uuid_t mem_mod_event_uuid =
- UUID_INIT(0xfe927475, 0xdd59, 0x4339,
- 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+static const uuid_t gen_media_event_uuid = CXL_EVENT_GEN_MEDIA_UUID;
+static const uuid_t dram_event_uuid = CXL_EVENT_DRAM_UUID;
+static const uuid_t mem_mod_event_uuid = CXL_EVENT_MEM_MODULE_UUID;

static void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
enum cxl_event_log_type type,
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f0e7ebb84f02..e5d770e26e02 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -580,6 +580,30 @@ struct cxl_mbox_identify {
u8 qos_telemetry_caps;
} __packed;

+/*
+ * General Media Event Record UUID
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MEDIA_UUID \
+ UUID_INIT(0xfbcd0a77, 0xc260, 0x417f, \
+ 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6)
+
+/*
+ * DRAM Event Record UUID
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_EVENT_DRAM_UUID \
+ UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab, \
+ 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24)
+
+/*
+ * Memory Module Event Record UUID
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_EVENT_MEM_MODULE_UUID \
+ UUID_INIT(0xfe927475, 0xdd59, 0x4339, \
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74)
+
/*
* Get Event Records output payload
* CXL rev 3.0 section 8.2.9.2.2; Table 8-50
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index ee61fa3a2411..5a95b04b329a 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -362,8 +362,7 @@ struct cxl_event_record_raw hardware_replace = {

struct cxl_event_gen_media gen_media = {
.hdr = {
- .id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
- 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+ .id = CXL_EVENT_GEN_MEDIA_UUID,
.length = sizeof(struct cxl_event_gen_media),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
/* .handle = Set dynamically */
@@ -380,8 +379,7 @@ struct cxl_event_gen_media gen_media = {

struct cxl_event_dram dram = {
.hdr = {
- .id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
- 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+ .id = CXL_EVENT_DRAM_UUID,
.length = sizeof(struct cxl_event_dram),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
/* .handle = Set dynamically */
@@ -400,8 +398,7 @@ struct cxl_event_dram dram = {

struct cxl_event_mem_module mem_module = {
.hdr = {
- .id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
- 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+ .id = CXL_EVENT_MEM_MODULE_UUID,
.length = sizeof(struct cxl_event_mem_module),
/* .handle = Set dynamically */
.related_handle = cpu_to_le16(0),

--
2.43.0

2023-12-14 15:46:09

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] efi/cxl-cper: Report CPER CXL component events through trace events

Ira Weiny wrote:
> Series status/background
> ========================

I'll need to send a V4 on this because 0day caught that the changes I made
to the notifier functions were not updated for CONFIG_UEFI_CPER=n case.

Apologies,
Ira

2023-12-14 18:24:49

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] firmware/efi: Process CXL Component Events

Ira Weiny wrote:
> BIOS can configure memory devices as firmware first. This will send CXL
> events to the firmware instead of the OS. The firmware can then send
> these events to the OS via UEFI.
>
> UEFI v2.10 section N.2.14 defines a Common Platform Error Record (CPER)
> format for CXL Component Events. The format is mostly the same as the
> CXL Common Event Record Format. The difference is the use of a GUID in
> the Section Type rather than a UUID as part of the event itself.
>
> Add EFI support to detect CXL CPER records and call a registered
> notifier with the event. Enforce that only one notifier call can be
> registered at any time.
>
> Cc: Ard Biesheuvel <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>

V4 coming with 0day build fixes for the CONFIG_UEFI_CPER=n case.

Ira