LinuxLists.cc - [PATCH V4 0/9] CXL: Process event logs

2022-12-12 07:19:06

Subject: [PATCH V4 0/9] CXL: Process event logs

From: Ira Weiny <[email protected]>

This code has been tested with a newer qemu which allows for more events to be
returned at a time as well an additional QMP event and interrupt injection.
Those patches will follow once they have been cleaned up.

The series is now in 3 parts:

1) Base functionality including interrupts
2) Tracing specific events (Dynamic Capacity Event Record is defered)
3) cxl-test infrastructure for basic tests

Changes from V3
Feedback from Dan
Spit out ACPI changes for Bjorn

- Link to v3: https://lore.kernel.org/all/[email protected]/

Davidlohr Bueso (1):
cxl/mem: Wire up event interrupts

Ira Weiny (8):
PCI/CXL: Export native CXL error reporting control
cxl/mem: Read, trace, and clear events on driver load
cxl/mem: Trace General Media Event Record
cxl/mem: Trace DRAM Event Record
cxl/mem: Trace Memory Module Event Record
cxl/test: Add generic mock events
cxl/test: Add specific events
cxl/test: Simulate event log overflow

drivers/acpi/pci_root.c | 3 +
drivers/cxl/core/mbox.c | 186 +++++++++++++
drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++
drivers/cxl/cxl.h | 16 ++
drivers/cxl/cxlmem.h | 171 ++++++++++++
drivers/cxl/cxlpci.h | 6 +
drivers/cxl/pci.c | 236 +++++++++++++++++
drivers/pci/probe.c | 1 +
include/linux/pci.h | 1 +
tools/testing/cxl/test/Kbuild | 2 +-
tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++
11 files changed, 1452 insertions(+), 1 deletion(-)

base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
--
2.37.2

2022-12-12 07:19:34

by Ira Weiny

[permalink] [raw]

Subject: [PATCH V4 8/9] cxl/test: Add specific events

From: Ira Weiny <[email protected]>

Each type of event has different trace point outputs.

Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.

Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from V3:
Dan
Move code to mem.c
---
tools/testing/cxl/test/mem.c | 73 ++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 036f27f9c18e..73db722a8879 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -277,12 +277,85 @@ struct cxl_event_record_raw hardware_replace = {
.data = { 0xDE, 0xAD, 0xBE, 0xEF },
};

+struct cxl_event_gen_media gen_media = {
+ .hdr = {
+ .id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+ 0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+ .length = sizeof(struct cxl_event_gen_media),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .phys_addr = cpu_to_le64(0x2000),
+ .descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+ .type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+ .transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+ /* .validity_flags = <set below> */
+ .channel = 1,
+ .rank = 30
+};
+
+struct cxl_event_dram dram = {
+ .hdr = {
+ .id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+ 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+ .length = sizeof(struct cxl_event_dram),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .phys_addr = cpu_to_le64(0x8000),
+ .descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+ .type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+ .transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+ /* .validity_flags = <set below> */
+ .channel = 1,
+ .bank_group = 5,
+ .bank = 2,
+ .column = {0xDE, 0xAD},
+};
+
+struct cxl_event_mem_module mem_module = {
+ .hdr = {
+ .id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+ .length = sizeof(struct cxl_event_mem_module),
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0),
+ },
+ .event_type = CXL_MMER_TEMP_CHANGE,
+ .info = {
+ .health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+ .media_status = CXL_DHI_MS_ALL_DATA_LOST,
+ .add_status = (CXL_DHI_AS_CRITICAL << 2) |
+ (CXL_DHI_AS_WARNING << 4) |
+ (CXL_DHI_AS_WARNING << 5),
+ .device_temp = { 0xDE, 0xAD},
+ .dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ .cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ .cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+ }
+};
+
static void cxl_mock_add_event_logs(struct mock_event_store *mes)
{
+ put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
+ &gen_media.validity_flags);
+
+ put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
+ CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
+ &dram.validity_flags);
+
mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+ mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+ (struct cxl_event_record_raw *)&gen_media);
+ mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+ (struct cxl_event_record_raw *)&mem_module);
mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;

mes_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+ mes_add_event(mes, CXL_EVENT_TYPE_FATAL,
+ (struct cxl_event_record_raw *)&dram);
mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
}

--
2.37.2

2022-12-12 07:20:53

by Ira Weiny

[permalink] [raw]

Subject: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

From: Davidlohr Bueso <[email protected]>

Currently the only CXL features targeted for irq support require their
message numbers to be within the first 16 entries. The device may
however support less than 16 entries depending on the support it
provides.

Attempt to allocate these 16 irq vectors. If the device supports less
then the PCI infrastructure will allocate that number. Upon successful
allocation, users can plug in their respective isr at any point
thereafter.

CXL device events are signaled via interrupts. Each event log may have
a different interrupt message number. These message numbers are
reported in the Get Event Interrupt Policy mailbox command.

Add interrupt support for event logs. Interrupts are allocated as
shared interrupts. Therefore, all or some event logs can share the same
message number.

In addition all logs are queried on any interrupt in order of the most
to least severe based on the status register.

Cc: Bjorn Helgaas <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Co-developed-by: Ira Weiny <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Davidlohr Bueso <[email protected]>

---
Changes from V3:
Adjust based on changes in patch 1
Consolidate event setup into cxl_event_config()
Consistently use cxl_event_* for function names
Remove cxl_event_int_is_msi()
Ensure DCD log is ignored in status
Simplify event status loop logic
Dan
Fail driver load if the irq's are not allocated
move cxl_event_config_msgnums() to pci.c
s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
s/devm_kmalloc/devm_kzalloc
Fix up pci_alloc_irq_vectors() comment
Pass pdev to cxl_alloc_irq_vectors()
Check FW irq policy prior to configuration
Jonathan
Use FIELD_GET instead of manual masking
---
drivers/cxl/cxl.h | 4 +
drivers/cxl/cxlmem.h | 19 ++++
drivers/cxl/cxlpci.h | 6 ++
drivers/cxl/pci.c | 208 +++++++++++++++++++++++++++++++++++++++++--
4 files changed, 231 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5974d1082210..b3964149c77b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
CXLDEV_EVENT_STATUS_FAIL | \
CXLDEV_EVENT_STATUS_FATAL)

+/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
+#define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
+#define CXLDEV_EVENT_INT_MSGNUM_MASK GENMASK(7, 4)
+
/* CXL 2.0 8.2.8.4 Mailbox Registers */
#define CXLDEV_MBOX_CAPS_OFFSET 0x00
#define CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index dd9aa3dd738e..bd8bfbe61ec8 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
struct range dvsec_range[2];
};

+/**
+ * Event Interrupt Policy
+ *
+ * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
+ */
+enum cxl_event_int_mode {
+ CXL_INT_NONE = 0x00,
+ CXL_INT_MSI_MSIX = 0x01,
+ CXL_INT_FW = 0x02
+};
+struct cxl_event_interrupt_policy {
+ u8 info_settings;
+ u8 warn_settings;
+ u8 failure_settings;
+ u8 fatal_settings;
+} __packed;
+
/**
* struct cxl_event_state - Event log driver state
*
@@ -288,6 +305,8 @@ enum cxl_opcode {
CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
+ CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
+ CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
CXL_MBOX_OP_GET_FW_INFO = 0x0200,
CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 77dbdb980b12..a8ea04f536ab 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -53,6 +53,12 @@
#define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
#define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)

+/*
+ * NOTE: Currently all the functions which are enabled for CXL require their
+ * vectors to be in the first 16. Use this as the default max.
+ */
+#define CXL_PCI_DEFAULT_MAX_VECTORS 16
+
/* Register Block Identifier (RBI) */
enum cxl_regloc_type {
CXL_REGLOC_RBI_EMPTY = 0,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index a2d8382bc593..d42d87faddb8 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
return 0;
}

+static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
+{
+ int nvecs;
+
+ /*
+ * CXL requires MSI/MSIX support.
+ *
+ * Additionally pci_alloc_irq_vectors() handles calling
+ * pci_free_irq_vectors() automatically despite not being called
+ * pcim_*. See pci_setup_msi_context().
+ */
+ nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
+ PCI_IRQ_MSIX | PCI_IRQ_MSI);
+ if (nvecs < 1) {
+ dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
+ return -ENXIO;
+ }
+ return 0;
+}
+
+struct cxl_dev_id {
+ struct cxl_dev_state *cxlds;
+};
+
+static irqreturn_t cxl_event_thread(int irq, void *id)
+{
+ struct cxl_dev_id *dev_id = id;
+ struct cxl_dev_state *cxlds = dev_id->cxlds;
+ u32 status;
+
+ do {
+ /*
+ * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
+ * ignore the reserved upper 32 bits
+ */
+ status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+ /* Ignore logs unknown to the driver */
+ status &= CXLDEV_EVENT_STATUS_ALL;
+ if (!status)
+ break;
+ cxl_mem_get_event_records(cxlds, status);
+ cond_resched();
+ } while (status);
+
+ return IRQ_HANDLED;
+}
+
+static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
+{
+ struct device *dev = cxlds->dev;
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct cxl_dev_id *dev_id;
+ int irq;
+
+ if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
+ return -ENXIO;
+
+ /* dev_id must be globally unique and must contain the cxlds */
+ dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
+ if (!dev_id)
+ return -ENOMEM;
+ dev_id->cxlds = cxlds;
+
+ irq = pci_irq_vector(pdev,
+ FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
+ if (irq < 0)
+ return irq;
+
+ return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
+ IRQF_SHARED, NULL, dev_id);
+}
+
+static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
+ struct cxl_event_interrupt_policy *policy)
+{
+ struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
+ .opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
+ .payload_out = policy,
+ .size_out = sizeof(*policy),
+ };
+ int rc;
+
+ rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+ if (rc < 0)
+ dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
+ rc);
+
+ return rc;
+}
+
+static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+ struct cxl_event_interrupt_policy *policy)
+{
+ struct cxl_mbox_cmd mbox_cmd;
+ int rc;
+
+ policy->info_settings = CXL_INT_MSI_MSIX;
+ policy->warn_settings = CXL_INT_MSI_MSIX;
+ policy->failure_settings = CXL_INT_MSI_MSIX;
+ policy->fatal_settings = CXL_INT_MSI_MSIX;
+
+ mbox_cmd = (struct cxl_mbox_cmd) {
+ .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
+ .payload_in = policy,
+ .size_in = sizeof(*policy),
+ };
+
+ rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+ if (rc < 0) {
+ dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+ rc);
+ return rc;
+ }
+
+ /* Retrieve final interrupt settings */
+ return cxl_event_get_int_policy(cxlds, policy);
+}
+
+static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+{
+ struct cxl_event_interrupt_policy policy;
+ int rc;
+
+ rc = cxl_event_config_msgnums(cxlds, &policy);
+ if (rc)
+ return rc;
+
+ rc = cxl_event_req_irq(cxlds, policy.info_settings);
+ if (rc) {
+ dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
+ return rc;
+ }
+
+ rc = cxl_event_req_irq(cxlds, policy.warn_settings);
+ if (rc) {
+ dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
+ return rc;
+ }
+
+ rc = cxl_event_req_irq(cxlds, policy.failure_settings);
+ if (rc) {
+ dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
+ return rc;
+ }
+
+ rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
+ if (rc) {
+ dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
+ return rc;
+ }
+
+ return 0;
+}
+
+static bool cxl_event_int_is_fw(u8 setting)
+{
+ u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
+
+ return mode == CXL_INT_FW;
+}
+
+static int cxl_event_config(struct pci_host_bridge *host_bridge,
+ struct cxl_dev_state *cxlds)
+{
+ struct cxl_event_interrupt_policy policy;
+ int rc;
+
+ /*
+ * When BIOS maintains CXL error reporting control, it will process
+ * event records. Only one agent can do so.
+ */
+ if (!host_bridge->native_cxl_error)
+ return 0;
+
+ rc = cxl_event_get_int_policy(cxlds, &policy);
+ if (rc)
+ return rc;
+
+ if (cxl_event_int_is_fw(policy.info_settings) ||
+ cxl_event_int_is_fw(policy.warn_settings) ||
+ cxl_event_int_is_fw(policy.failure_settings) ||
+ cxl_event_int_is_fw(policy.fatal_settings)) {
+ dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
+ return -EBUSY;
+ }
+
+ rc = cxl_event_irqsetup(cxlds);
+ if (rc)
+ return rc;
+
+ cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
+ return 0;
+}
+
static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
@@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (rc)
return rc;

+ rc = cxl_alloc_irq_vectors(pdev);
+ if (rc)
+ return rc;
+
cxlmd = devm_cxl_add_memdev(cxlds);
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);
@@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (rc)
return rc;

- /*
- * When BIOS maintains CXL error reporting control, it will process
- * event records. Only one agent can do so.
- */
- if (host_bridge->native_cxl_error)
- cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+ rc = cxl_event_config(host_bridge, cxlds);
+ if (rc)
+ return rc;

if (cxlds->regs.ras) {
pci_enable_pcie_error_reporting(pdev);
--
2.37.2

2022-12-12 07:21:27

by Ira Weiny

[permalink] [raw]

Subject: [PATCH V4 6/9] cxl/mem: Trace Memory Module Event Record

From: Ira Weiny <[email protected]>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is memory module record and if so trace the
record.

Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from V3:
Dan
Use if/else if/else instead of return
---
drivers/cxl/core/mbox.c | 13 ++++
drivers/cxl/core/trace.h | 143 +++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 26 +++++++
3 files changed, 182 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index be66cb5816e5..b011804ed994 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -734,6 +734,14 @@ static const uuid_t dram_event_uuid =
UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);

+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+ UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+ 0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
static void cxl_event_trace_record(const struct device *dev,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
@@ -749,6 +757,11 @@ static void cxl_event_trace_record(const struct device *dev,
struct cxl_event_dram *rec = (struct cxl_event_dram *)record;

trace_cxl_dram(dev, type, rec);
+ } else if (uuid_equal(id, &mem_mod_event_uuid)) {
+ struct cxl_event_mem_module *rec =
+ (struct cxl_event_mem_module *)record;
+
+ trace_cxl_memory_module(dev, type, rec);
} else {
/* For unknown record types print just the header */
trace_cxl_generic_event(dev, type, record);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index b6321cfb1d9f..ebb4c8ce8587 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
)
);

+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE 0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE 0x01
+#define CXL_MMER_LIFE_USED_CHANGE 0x02
+#define CXL_MMER_TEMP_CHANGE 0x03
+#define CXL_MMER_DATA_PATH_ERROR 0x04
+#define CXL_MMER_LAS_ERROR 0x05
+#define show_dev_evt_type(type) __print_symbolic(type, \
+ { CXL_MMER_HEALTH_STATUS_CHANGE, "Health Status Change" }, \
+ { CXL_MMER_MEDIA_STATUS_CHANGE, "Media Status Change" }, \
+ { CXL_MMER_LIFE_USED_CHANGE, "Life Used Change" }, \
+ { CXL_MMER_TEMP_CHANGE, "Temperature Change" }, \
+ { CXL_MMER_DATA_PATH_ERROR, "Data Path Error" }, \
+ { CXL_MMER_LAS_ERROR, "LSA Error" } \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED BIT(2)
+#define show_health_status_flags(flags) __print_flags(flags, "|", \
+ { CXL_DHI_HS_MAINTENANCE_NEEDED, "MAINTENANCE_NEEDED" }, \
+ { CXL_DHI_HS_PERFORMANCE_DEGRADED, "PERFORMANCE_DEGRADED" }, \
+ { CXL_DHI_HS_HW_REPLACEMENT_NEEDED, "REPLACEMENT_NEEDED" } \
+)
+
+#define CXL_DHI_MS_NORMAL 0x00
+#define CXL_DHI_MS_NOT_READY 0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST 0x02
+#define CXL_DHI_MS_ALL_DATA_LOST 0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS 0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN 0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT 0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS 0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN 0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT 0x09
+#define show_media_status(ms) __print_symbolic(ms, \
+ { CXL_DHI_MS_NORMAL, \
+ "Normal" }, \
+ { CXL_DHI_MS_NOT_READY, \
+ "Not Ready" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOST, \
+ "Write Persistency Lost" }, \
+ { CXL_DHI_MS_ALL_DATA_LOST, \
+ "All Data Lost" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS, \
+ "Write Persistency Loss in the Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN, \
+ "Write Persistency Loss in Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT, \
+ "Write Persistency Loss Imminent" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS, \
+ "All Data Loss in Event of Power Loss" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN, \
+ "All Data loss in the Event of Shutdown" }, \
+ { CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT, \
+ "All Data Loss Imminent" } \
+)
+
+#define CXL_DHI_AS_NORMAL 0x0
+#define CXL_DHI_AS_WARNING 0x1
+#define CXL_DHI_AS_CRITICAL 0x2
+#define show_two_bit_status(as) __print_symbolic(as, \
+ { CXL_DHI_AS_NORMAL, "Normal" }, \
+ { CXL_DHI_AS_WARNING, "Warning" }, \
+ { CXL_DHI_AS_CRITICAL, "Critical" } \
+)
+#define show_one_bit_status(as) __print_symbolic(as, \
+ { CXL_DHI_AS_NORMAL, "Normal" }, \
+ { CXL_DHI_AS_WARNING, "Warning" } \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as) (as & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as) ((as & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as) ((as & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as) ((as & 0x20) >> 5)
+
+TRACE_EVENT(cxl_memory_module,
+
+ TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+ struct cxl_event_mem_module *rec),
+
+ TP_ARGS(dev, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+
+ /* Memory Module Event */
+ __field(u8, event_type)
+
+ /* Device Health Info */
+ __field(u8, health_status)
+ __field(u8, media_status)
+ __field(u8, life_used)
+ __field(u32, dirty_shutdown_cnt)
+ __field(u32, cor_vol_err_cnt)
+ __field(u32, cor_per_err_cnt)
+ __field(s16, device_temp)
+ __field(u8, add_status)
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+ /* Memory Module Event */
+ __entry->event_type = rec->event_type;
+
+ /* Device Health Info */
+ __entry->health_status = rec->info.health_status;
+ __entry->media_status = rec->info.media_status;
+ __entry->life_used = rec->info.life_used;
+ __entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+ __entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+ __entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+ __entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+ __entry->add_status = rec->info.add_status;
+ ),
+
+ CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
+ "as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+ "as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
+ "dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
+ show_dev_evt_type(__entry->event_type),
+ show_health_status_flags(__entry->health_status),
+ show_media_status(__entry->media_status),
+ show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+ show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+ show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+ show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+ __entry->life_used, __entry->device_temp,
+ __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+ __entry->cor_per_err_cnt
+ )
+);
+
#endif /* _CXL_EVENTS_H */

#define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 78c453278910..459f161c7c86 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -491,6 +491,32 @@ struct cxl_event_dram {
u8 reserved[0x17];
} __packed;

+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+ u8 health_status;
+ u8 media_status;
+ u8 add_status;
+ u8 life_used;
+ u8 device_temp[2];
+ u8 dirty_shutdown_cnt[4];
+ u8 cor_vol_err_cnt[4];
+ u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+ struct cxl_event_record_hdr hdr;
+ u8 event_type;
+ struct cxl_get_health_info info;
+ u8 reserved[0x3d];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;
--
2.37.2

2022-12-12 07:21:43

by Ira Weiny

[permalink] [raw]

Subject: [PATCH V4 5/9] cxl/mem: Trace DRAM Event Record

From: Ira Weiny <[email protected]>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and if so trace the
record.

Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from v3:
Dan
Use if/else if/else instead of returns
---
drivers/cxl/core/mbox.c | 12 ++++++
drivers/cxl/core/trace.h | 92 ++++++++++++++++++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 23 ++++++++++
3 files changed, 127 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index ffa311f94baa..be66cb5816e5 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -726,6 +726,14 @@ static const uuid_t gen_media_event_uuid =
UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);

+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+ UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+ 0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
static void cxl_event_trace_record(const struct device *dev,
enum cxl_event_log_type type,
struct cxl_event_record_raw *record)
@@ -737,6 +745,10 @@ static void cxl_event_trace_record(const struct device *dev,
(struct cxl_event_gen_media *)record;

trace_cxl_general_media(dev, type, rec);
+ } else if (uuid_equal(id, &dram_event_uuid)) {
+ struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
+
+ trace_cxl_dram(dev, type, rec);
} else {
/* For unknown record types print just the header */
trace_cxl_generic_event(dev, type, record);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index d85f0481661d..b6321cfb1d9f 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -347,6 +347,98 @@ TRACE_EVENT(cxl_general_media,
)
);

+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record. Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL BIT(0)
+#define CXL_DER_VALID_RANK BIT(1)
+#define CXL_DER_VALID_NIBBLE BIT(2)
+#define CXL_DER_VALID_BANK_GROUP BIT(3)
+#define CXL_DER_VALID_BANK BIT(4)
+#define CXL_DER_VALID_ROW BIT(5)
+#define CXL_DER_VALID_COLUMN BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK BIT(7)
+#define show_dram_valid_flags(flags) __print_flags(flags, "|", \
+ { CXL_DER_VALID_CHANNEL, "CHANNEL" }, \
+ { CXL_DER_VALID_RANK, "RANK" }, \
+ { CXL_DER_VALID_NIBBLE, "NIBBLE" }, \
+ { CXL_DER_VALID_BANK_GROUP, "BANK GROUP" }, \
+ { CXL_DER_VALID_BANK, "BANK" }, \
+ { CXL_DER_VALID_ROW, "ROW" }, \
+ { CXL_DER_VALID_COLUMN, "COLUMN" }, \
+ { CXL_DER_VALID_CORRECTION_MASK, "CORRECTION MASK" } \
+)
+
+TRACE_EVENT(cxl_dram,
+
+ TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+ struct cxl_event_dram *rec),
+
+ TP_ARGS(dev, log, rec),
+
+ TP_STRUCT__entry(
+ CXL_EVT_TP_entry
+ /* DRAM */
+ __field(u64, dpa)
+ __field(u8, descriptor)
+ __field(u8, type)
+ __field(u8, transaction_type)
+ __field(u8, channel)
+ __field(u16, validity_flags)
+ __field(u16, column) /* Out of order to pack trace record */
+ __field(u32, nibble_mask)
+ __field(u32, row)
+ __array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+ __field(u8, rank) /* Out of order to pack trace record */
+ __field(u8, bank_group) /* Out of order to pack trace record */
+ __field(u8, bank) /* Out of order to pack trace record */
+ __field(u8, dpa_flags) /* Out of order to pack trace record */
+ ),
+
+ TP_fast_assign(
+ CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+ /* DRAM */
+ __entry->dpa = le64_to_cpu(rec->phys_addr);
+ __entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+ __entry->dpa &= CXL_DPA_MASK;
+ __entry->descriptor = rec->descriptor;
+ __entry->type = rec->type;
+ __entry->transaction_type = rec->transaction_type;
+ __entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+ __entry->channel = rec->channel;
+ __entry->rank = rec->rank;
+ __entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+ __entry->bank_group = rec->bank_group;
+ __entry->bank = rec->bank;
+ __entry->row = get_unaligned_le24(rec->row);
+ __entry->column = get_unaligned_le16(rec->column);
+ memcpy(__entry->cor_mask, &rec->correction_mask,
+ CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+ ),
+
+ CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
+ "transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
+ "bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+ "validity_flags='%s'",
+ __entry->dpa, show_dpa_flags(__entry->dpa_flags),
+ show_event_desc_flags(__entry->descriptor),
+ show_mem_event_type(__entry->type),
+ show_trans_type(__entry->transaction_type),
+ __entry->channel, __entry->rank, __entry->nibble_mask,
+ __entry->bank_group, __entry->bank,
+ __entry->row, __entry->column,
+ __print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+ show_dram_valid_flags(__entry->validity_flags)
+ )
+);
+
#endif /* _CXL_EVENTS_H */

#define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 9cfd20abc3a1..78c453278910 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -468,6 +468,29 @@ struct cxl_event_gen_media {
u8 reserved[46];
} __packed;

+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
+struct cxl_event_dram {
+ struct cxl_event_record_hdr hdr;
+ __le64 phys_addr;
+ u8 descriptor;
+ u8 type;
+ u8 transaction_type;
+ u8 validity_flags[2];
+ u8 channel;
+ u8 rank;
+ u8 nibble_mask[3];
+ u8 bank_group;
+ u8 bank;
+ u8 row[3];
+ u8 column[2];
+ u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+ u8 reserved[0x17];
+} __packed;
+
struct cxl_mbox_get_partition_info {
__le64 active_volatile_cap;
__le64 active_persistent_cap;
--
2.37.2

2022-12-12 07:38:06

by Ira Weiny

[permalink] [raw]

Subject: [PATCH V4 7/9] cxl/test: Add generic mock events

From: Ira Weiny <[email protected]>

Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made up UUID's.

Data is completely made up with data patterns which should be easy to
spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Test traces are easy to obtain with a small script such as this:

#!/bin/bash -x

devices=`find /sys/devices/platform -name cxl_mem*`

# Turn on tracing
echo "" > /sys/kernel/tracing/trace
echo 1 > /sys/kernel/tracing/events/cxl/enable
echo 1 > /sys/kernel/tracing/tracing_on

# Generate fake interrupt
for device in $devices; do
echo 1 > $device/event_trigger
done

# Turn off tracing and report events
echo 0 > /sys/kernel/tracing/tracing_on
cat /sys/kernel/tracing/trace

Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>

---
Changes from v3:
Dan
Move code to mem.c and remove event.c and overhead
Clean up function names
---
tools/testing/cxl/test/Kbuild | 2 +-
tools/testing/cxl/test/mem.c | 231 ++++++++++++++++++++++++++++++++++
2 files changed, 232 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 4e59e2c911f6..61d5f7bcddf9 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-ccflags-y := -I$(srctree)/drivers/cxl/
+ccflags-y := -I$(srctree)/drivers/cxl/ -I$(srctree)/drivers/cxl/core

obj-m += cxl_test.o
obj-m += cxl_mock.o
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 5e4ecd93f1d2..036f27f9c18e 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -9,6 +9,8 @@
#include <linux/bits.h>
#include <cxlmem.h>

+#include "trace.h"
+
#define LSA_SIZE SZ_128K
#define DEV_SIZE SZ_2G
#define EFFECT(x) (1U << x)
@@ -67,6 +69,24 @@ static struct {

#define PASS_TRY_LIMIT 3

+#define CXL_TEST_EVENT_CNT_MAX 15
+
+/* Set a number of events to return at a time for simulation. */
+#define CXL_TEST_EVENT_CNT 3
+
+struct mock_event_log {
+ u16 clear_idx;
+ u16 cur_idx;
+ u16 nr_events;
+ struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+};
+
+struct mock_event_store {
+ struct cxl_dev_state *cxlds;
+ struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
+ u32 ev_status;
+};
+
struct cxl_mockmem_data {
void *lsa;
u32 security_state;
@@ -74,9 +94,198 @@ struct cxl_mockmem_data {
u8 master_pass[NVDIMM_PASSPHRASE_LEN];
int user_limit;
int master_limit;
+ struct mock_event_store mes;
+ u8 event_buf[SZ_4K];
+};
+
+static struct mock_event_log *event_find_log(struct device *dev, int log_type)
+{
+ struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+
+ if (log_type >= CXL_EVENT_TYPE_MAX)
+ return NULL;
+ return &mdata->mes.mock_logs[log_type];
+}
+
+static struct cxl_event_record_raw *event_get_current(struct mock_event_log *log)
+{
+ return log->events[log->cur_idx];
+}
+
+static void event_reset_log(struct mock_event_log *log)
+{
+ log->cur_idx = 0;
+ log->clear_idx = 0;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+static u16 event_get_clear_handle(struct mock_event_log *log)
+{
+ return log->clear_idx + 1;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+static __le16 event_get_cur_event_handle(struct mock_event_log *log)
+{
+ u16 cur_handle = log->cur_idx + 1;
+
+ return cpu_to_le16(cur_handle);
+}
+
+static bool event_log_empty(struct mock_event_log *log)
+{
+ return log->cur_idx == log->nr_events;
+}
+
+static void mes_add_event(struct mock_event_store *mes,
+ enum cxl_event_log_type log_type,
+ struct cxl_event_record_raw *event)
+{
+ struct mock_event_log *log;
+
+ if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
+ return;
+
+ log = &mes->mock_logs[log_type];
+ if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+ return;
+
+ log->events[log->nr_events] = event;
+ log->nr_events++;
+}
+
+static int mock_get_event(struct cxl_dev_state *cxlds,
+ struct cxl_mbox_cmd *cmd)
+{
+ struct cxl_get_event_payload *pl;
+ struct mock_event_log *log;
+ u8 log_type;
+ int i;
+
+ if (cmd->size_in != sizeof(log_type))
+ return -EINVAL;
+
+ if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
+ return -EINVAL;
+
+ log_type = *((u8 *)cmd->payload_in);
+ if (log_type >= CXL_EVENT_TYPE_MAX)
+ return -EINVAL;
+
+ memset(cmd->payload_out, 0, cmd->size_out);
+
+ log = event_find_log(cxlds->dev, log_type);
+ if (!log || event_log_empty(log))
+ return 0;
+
+ pl = cmd->payload_out;
+
+ for (i = 0; i < CXL_TEST_EVENT_CNT && !event_log_empty(log); i++) {
+ memcpy(&pl->records[i], event_get_current(log),
+ sizeof(pl->records[i]));
+ pl->records[i].hdr.handle = event_get_cur_event_handle(log);
+ log->cur_idx++;
+ }
+
+ pl->record_count = cpu_to_le16(i);
+ if (!event_log_empty(log))
+ pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+
+ return 0;
+}
+
+static int mock_clear_event(struct cxl_dev_state *cxlds,
+ struct cxl_mbox_cmd *cmd)
+{
+ struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
+ struct mock_event_log *log;
+ u8 log_type = pl->event_log;
+ u16 handle;
+ int nr;
+
+ if (log_type >= CXL_EVENT_TYPE_MAX)
+ return -EINVAL;
+
+ log = event_find_log(cxlds->dev, log_type);
+ if (!log)
+ return 0; /* No mock data in this log */
+
+ /*
+ * This check is technically not invalid per the specification AFAICS.
+ * (The host could 'guess' handles and clear them in order).
+ * However, this is not good behavior for the host so test it.
+ */
+ if (log->clear_idx + pl->nr_recs > log->cur_idx) {
+ dev_err(cxlds->dev,
+ "Attempting to clear more events than returned!\n");
+ return -EINVAL;
+ }
+
+ /* Check handle order prior to clearing events */
+ for (nr = 0, handle = event_get_clear_handle(log);
+ nr < pl->nr_recs;
+ nr++, handle++) {
+ if (handle != le16_to_cpu(pl->handle[nr])) {
+ dev_err(cxlds->dev, "Clearing events out of order\n");
+ return -EINVAL;
+ }
+ }
+
+ /* Clear events */
+ log->clear_idx += pl->nr_recs;
+ return 0;
+}
+
+static void cxl_mock_event_trigger(struct device *dev)
+{
+ struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+ struct mock_event_store *mes = &mdata->mes;
+ int i;
+
+ for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
+ struct mock_event_log *log;
+
+ log = event_find_log(dev, i);
+ if (log)
+ event_reset_log(log);
+ }
+
+ cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
+}
+
+struct cxl_event_record_raw maint_needed = {
+ .hdr = {
+ .id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
+ 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+ .length = sizeof(struct cxl_event_record_raw),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0xa5b6),
+ },
+ .data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};

+struct cxl_event_record_raw hardware_replace = {
+ .hdr = {
+ .id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
+ 0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+ .length = sizeof(struct cxl_event_record_raw),
+ .flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+ /* .handle = Set dynamically */
+ .related_handle = cpu_to_le16(0xb6a5),
+ },
+ .data = { 0xDE, 0xAD, 0xBE, 0xEF },
};

+static void cxl_mock_add_event_logs(struct mock_event_store *mes)
+{
+ mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+ mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
+
+ mes_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+ mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
+}
+
static int mock_gsl(struct cxl_mbox_cmd *cmd)
{
if (cmd->size_out < sizeof(mock_gsl_payload))
@@ -582,6 +791,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
case CXL_MBOX_OP_GET_PARTITION_INFO:
rc = mock_partition_info(cxlds, cmd);
break;
+ case CXL_MBOX_OP_GET_EVENT_RECORD:
+ rc = mock_get_event(cxlds, cmd);
+ break;
+ case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
+ rc = mock_clear_event(cxlds, cmd);
+ break;
case CXL_MBOX_OP_SET_LSA:
rc = mock_set_lsa(cxlds, cmd);
break;
@@ -628,6 +843,15 @@ static bool is_rcd(struct platform_device *pdev)
return !!id->driver_data;
}

+static ssize_t event_trigger_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ cxl_mock_event_trigger(dev);
+ return count;
+}
+static DEVICE_ATTR_WO(event_trigger);
+
static int cxl_mock_mem_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -655,6 +879,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
cxlds->serial = pdev->id;
cxlds->mbox_send = cxl_mock_mbox_send;
cxlds->payload_size = SZ_4K;
+ cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
if (is_rcd(pdev)) {
cxlds->rcd = true;
cxlds->component_reg_phys = CXL_RESOURCE_NONE;
@@ -672,10 +897,15 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
if (rc)
return rc;

+ mdata->mes.cxlds = cxlds;
+ cxl_mock_add_event_logs(&mdata->mes);
+
cxlmd = devm_cxl_add_memdev(cxlds);
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);

+ cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
return 0;
}

@@ -714,6 +944,7 @@ static DEVICE_ATTR_RW(security_lock);

static struct attribute *cxl_mock_mem_attrs[] = {
&dev_attr_security_lock.attr,
+ &dev_attr_event_trigger.attr,
NULL
};
ATTRIBUTE_GROUPS(cxl_mock_mem);
--
2.37.2

2022-12-13 20:26:24

by Dan Williams

[permalink] [raw]

Subject: RE: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

ira.weiny@ wrote:
> From: Davidlohr Bueso <[email protected]>
>
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries. The device may
> however support less than 16 entries depending on the support it
> provides.
>
> Attempt to allocate these 16 irq vectors. If the device supports less
> then the PCI infrastructure will allocate that number. Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
>
> CXL device events are signaled via interrupts. Each event log may have
> a different interrupt message number. These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
>
> Add interrupt support for event logs. Interrupts are allocated as
> shared interrupts. Therefore, all or some event logs can share the same
> message number.
>
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Co-developed-by: Ira Weiny <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
> Signed-off-by: Davidlohr Bueso <[email protected]>
>
> ---
> Changes from V3:
> Adjust based on changes in patch 1
> Consolidate event setup into cxl_event_config()
> Consistently use cxl_event_* for function names
> Remove cxl_event_int_is_msi()
> Ensure DCD log is ignored in status
> Simplify event status loop logic
> Dan
> Fail driver load if the irq's are not allocated
> move cxl_event_config_msgnums() to pci.c
> s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> s/devm_kmalloc/devm_kzalloc
> Fix up pci_alloc_irq_vectors() comment
> Pass pdev to cxl_alloc_irq_vectors()
> Check FW irq policy prior to configuration
> Jonathan
> Use FIELD_GET instead of manual masking
> ---
> drivers/cxl/cxl.h | 4 +
> drivers/cxl/cxlmem.h | 19 ++++
> drivers/cxl/cxlpci.h | 6 ++
> drivers/cxl/pci.c | 208 +++++++++++++++++++++++++++++++++++++++++--
> 4 files changed, 231 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> CXLDEV_EVENT_STATUS_FAIL | \
> CXLDEV_EVENT_STATUS_FATAL)
>
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK GENMASK(7, 4)
> +
> /* CXL 2.0 8.2.8.4 Mailbox Registers */
> #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> #define CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> struct range dvsec_range[2];
> };
>
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> + CXL_INT_NONE = 0x00,
> + CXL_INT_MSI_MSIX = 0x01,
> + CXL_INT_FW = 0x02
> +};
> +struct cxl_event_interrupt_policy {
> + u8 info_settings;
> + u8 warn_settings;
> + u8 failure_settings;
> + u8 fatal_settings;
> +} __packed;
> +
> /**
> * struct cxl_event_state - Event log driver state
> *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
> CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
> #define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
> #define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
>
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16. Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
> /* Register Block Identifier (RBI) */
> enum cxl_regloc_type {
> CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> return 0;
> }
>
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> + int nvecs;
> +
> + /*
> + * CXL requires MSI/MSIX support.

Spec reference please.

"Per CXL 3.0 3.1.1 CXL.io Endpoint a function on a CXL device must not
generate INTx messages if that function participates in CXL.cache or
CXL.mem."

> + *
> + * Additionally pci_alloc_irq_vectors() handles calling
> + * pci_free_irq_vectors() automatically despite not being called
> + * pcim_*. See pci_setup_msi_context().
> + */
> + nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> + if (nvecs < 1) {
> + dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> + return -ENXIO;
> + }
> + return 0;
> +}
> +
> +struct cxl_dev_id {
> + struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> + struct cxl_dev_id *dev_id = id;
> + struct cxl_dev_state *cxlds = dev_id->cxlds;
> + u32 status;
> +
> + do {
> + /*
> + * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> + * ignore the reserved upper 32 bits
> + */
> + status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> + /* Ignore logs unknown to the driver */
> + status &= CXLDEV_EVENT_STATUS_ALL;
> + if (!status)
> + break;
> + cxl_mem_get_event_records(cxlds, status);
> + cond_resched();
> + } while (status);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> + struct device *dev = cxlds->dev;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + struct cxl_dev_id *dev_id;
> + int irq;
> +
> + if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> + return -ENXIO;
> +
> + /* dev_id must be globally unique and must contain the cxlds */
> + dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> + if (!dev_id)
> + return -ENOMEM;
> + dev_id->cxlds = cxlds;
> +
> + irq = pci_irq_vector(pdev,
> + FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> + if (irq < 0)
> + return irq;
> +
> + return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> + IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> + struct cxl_event_interrupt_policy *policy)
> +{
> + struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> + .payload_out = policy,
> + .size_out = sizeof(*policy),
> + };
> + int rc;
> +
> + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> + if (rc < 0)
> + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> + rc);
> +
> + return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> + struct cxl_event_interrupt_policy *policy)
> +{
> + struct cxl_mbox_cmd mbox_cmd;
> + int rc;
> +
> + policy->info_settings = CXL_INT_MSI_MSIX;
> + policy->warn_settings = CXL_INT_MSI_MSIX;
> + policy->failure_settings = CXL_INT_MSI_MSIX;
> + policy->fatal_settings = CXL_INT_MSI_MSIX;

I prefer this syntax for command payload initialization if only because
it 0-fills all fields:

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index d42d87faddb8..e8b4bd5e517f 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -541,10 +541,12 @@ static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
struct cxl_mbox_cmd mbox_cmd;
int rc;

- policy->info_settings = CXL_INT_MSI_MSIX;
- policy->warn_settings = CXL_INT_MSI_MSIX;
- policy->failure_settings = CXL_INT_MSI_MSIX;
- policy->fatal_settings = CXL_INT_MSI_MSIX;
+ *policy = (struct cxl_event_interrupt_policy) {
+ .info_settings = CXL_INT_MSI_MSIX,
+ .warn_settings = CXL_INT_MSI_MSIX,
+ .failure_settings = CXL_INT_MSI_MSIX,
+ .fatal_settings = CXL_INT_MSI_MSIX,
+ };

mbox_cmd = (struct cxl_mbox_cmd) {
.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,

...but that can be a follow-on fixup because I am not seeing much else
to need respinning this patch.

> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> + .payload_in = policy,
> + .size_in = sizeof(*policy),
> + };
> +
> + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> + if (rc < 0) {
> + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> + rc);
> + return rc;
> + }
> +
> + /* Retrieve final interrupt settings */
> + return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> + struct cxl_event_interrupt_policy policy;
> + int rc;
> +
> + rc = cxl_event_config_msgnums(cxlds, &policy);
> + if (rc)
> + return rc;
> +
> + rc = cxl_event_req_irq(cxlds, policy.info_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> + return rc;
> + }
> +
> + return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> + u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> + return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> + struct cxl_dev_state *cxlds)
> +{
> + struct cxl_event_interrupt_policy policy;
> + int rc;
> +
> + /*
> + * When BIOS maintains CXL error reporting control, it will process
> + * event records. Only one agent can do so.
> + */
> + if (!host_bridge->native_cxl_error)
> + return 0;
> +
> + rc = cxl_event_get_int_policy(cxlds, &policy);
> + if (rc)
> + return rc;
> +
> + if (cxl_event_int_is_fw(policy.info_settings) ||
> + cxl_event_int_is_fw(policy.warn_settings) ||
> + cxl_event_int_is_fw(policy.failure_settings) ||
> + cxl_event_int_is_fw(policy.fatal_settings)) {
> + dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> + return -EBUSY;
> + }
> +
> + rc = cxl_event_irqsetup(cxlds);
> + if (rc)
> + return rc;
> +
> + cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> + return 0;
> +}
> +
> static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> return rc;
>
> + rc = cxl_alloc_irq_vectors(pdev);
> + if (rc)
> + return rc;
> +
> cxlmd = devm_cxl_add_memdev(cxlds);
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> return rc;
>
> - /*
> - * When BIOS maintains CXL error reporting control, it will process
> - * event records. Only one agent can do so.
> - */
> - if (host_bridge->native_cxl_error)
> - cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> + rc = cxl_event_config(host_bridge, cxlds);
> + if (rc)
> + return rc;

In the future, it would help me if these kinds of policy changes mid
patch set, if they cannot be avoided, are at least declared in the
changelog. It saps some reviewer resources to invalidate previous
understanding within the same review session.

2022-12-16 12:41:41

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 0/9] CXL: Process event logs

On Sun, 11 Dec 2022 23:06:18 -0800
[email protected] wrote:

> From: Ira Weiny <[email protected]>
>
> This code has been tested with a newer qemu which allows for more events to be
> returned at a time as well an additional QMP event and interrupt injection.
> Those patches will follow once they have been cleaned up.
>
> The series is now in 3 parts:
>
> 1) Base functionality including interrupts
> 2) Tracing specific events (Dynamic Capacity Event Record is defered)
> 3) cxl-test infrastructure for basic tests
>
> Changes from V3
> Feedback from Dan
> Spit out ACPI changes for Bjorn
>
> - Link to v3: https://lore.kernel.org/all/[email protected]/

Because I'm in a grumpy mood (as my colleagues will attest!)...
This is dependent on the patch that moves the trace definitions and
that's not upstream yet except in cxl/preview which is optimistic
place to use for a base commit. The id isn't the one below either which
isn't in either mailine or the current CXL trees.

Not that I actually checked the cover letter until it failed to apply
(and hence already knew what was missing) but still, please call out
dependencies unless they are in the branches Dan has queued up to push.

I just want to play with Dave's fix for the RAS errors so having to jump
through these other sets.

Thanks,

Jonathan

>
>
> Davidlohr Bueso (1):
> cxl/mem: Wire up event interrupts
>
> Ira Weiny (8):
> PCI/CXL: Export native CXL error reporting control
> cxl/mem: Read, trace, and clear events on driver load
> cxl/mem: Trace General Media Event Record
> cxl/mem: Trace DRAM Event Record
> cxl/mem: Trace Memory Module Event Record
> cxl/test: Add generic mock events
> cxl/test: Add specific events
> cxl/test: Simulate event log overflow
>
> drivers/acpi/pci_root.c | 3 +
> drivers/cxl/core/mbox.c | 186 +++++++++++++
> drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++
> drivers/cxl/cxl.h | 16 ++
> drivers/cxl/cxlmem.h | 171 ++++++++++++
> drivers/cxl/cxlpci.h | 6 +
> drivers/cxl/pci.c | 236 +++++++++++++++++
> drivers/pci/probe.c | 1 +
> include/linux/pci.h | 1 +
> tools/testing/cxl/test/Kbuild | 2 +-
> tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++
> 11 files changed, 1452 insertions(+), 1 deletion(-)
>
>
> base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e

2022-12-16 14:55:01

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

2022-12-16 18:13:53

by Dan Williams

[permalink] [raw]

Subject: Re: [PATCH V4 0/9] CXL: Process event logs

Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:18 -0800
> [email protected] wrote:
>
> > From: Ira Weiny <[email protected]>
> >
> > This code has been tested with a newer qemu which allows for more events to be
> > returned at a time as well an additional QMP event and interrupt injection.
> > Those patches will follow once they have been cleaned up.
> >
> > The series is now in 3 parts:
> >
> > 1) Base functionality including interrupts
> > 2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > 3) cxl-test infrastructure for basic tests
> >
> > Changes from V3
> > Feedback from Dan
> > Spit out ACPI changes for Bjorn
> >
> > - Link to v3: https://lore.kernel.org/all/[email protected]/
>
> Because I'm in a grumpy mood (as my colleagues will attest!)...
> This is dependent on the patch that moves the trace definitions and
> that's not upstream yet except in cxl/preview which is optimistic
> place to use for a base commit. The id isn't the one below either which
> isn't in either mailine or the current CXL trees.

I do not want to commit to a new baseline until after -rc1, so yes this
is in a messy period.

> Not that I actually checked the cover letter until it failed to apply
> (and hence already knew what was missing) but still, please call out
> dependencies unless they are in the branches Dan has queued up to push.
>
> I just want to play with Dave's fix for the RAS errors so having to jump
> through these other sets.

Yes, that is annoying, apologies.

>
> Thanks,
>
> Jonathan
>
> >
> >
> > Davidlohr Bueso (1):
> > cxl/mem: Wire up event interrupts
> >
> > Ira Weiny (8):
> > PCI/CXL: Export native CXL error reporting control
> > cxl/mem: Read, trace, and clear events on driver load
> > cxl/mem: Trace General Media Event Record
> > cxl/mem: Trace DRAM Event Record
> > cxl/mem: Trace Memory Module Event Record
> > cxl/test: Add generic mock events
> > cxl/test: Add specific events
> > cxl/test: Simulate event log overflow
> >
> > drivers/acpi/pci_root.c | 3 +
> > drivers/cxl/core/mbox.c | 186 +++++++++++++
> > drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++
> > drivers/cxl/cxl.h | 16 ++
> > drivers/cxl/cxlmem.h | 171 ++++++++++++
> > drivers/cxl/cxlpci.h | 6 +
> > drivers/cxl/pci.c | 236 +++++++++++++++++
> > drivers/pci/probe.c | 1 +
> > include/linux/pci.h | 1 +
> > tools/testing/cxl/test/Kbuild | 2 +-
> > tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++
> > 11 files changed, 1452 insertions(+), 1 deletion(-)
> >
> >
> > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
>

I think going forward these base-commits need to be something that are
reachable on cxl.git. For now I have pushed out a baseline for both Dave
and Ira's patches to cxl/preview which will rebase after -rc1 comes out.

Just the small matter of needing some acks/reviews on those lead in
patches so I can move them to through cxl/pending to cxl/next:

http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
http://lore.kernel.org/r/[email protected]

2022-12-16 18:20:21

by Ira Weiny

[permalink] [raw]

Subject: Re: [PATCH V4 0/9] CXL: Process event logs

On Fri, Dec 16, 2022 at 09:01:13AM -0800, Dan Williams wrote:
> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > [email protected] wrote:
> >
> > > From: Ira Weiny <[email protected]>
> > >
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time as well an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > >
> > > The series is now in 3 parts:
> > >
> > > 1) Base functionality including interrupts
> > > 2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > > 3) cxl-test infrastructure for basic tests
> > >
> > > Changes from V3
> > > Feedback from Dan
> > > Spit out ACPI changes for Bjorn
> > >
> > > - Link to v3: https://lore.kernel.org/all/[email protected]/
> >
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions and
> > that's not upstream yet except in cxl/preview which is optimistic
> > place to use for a base commit. The id isn't the one below either which
> > isn't in either mailine or the current CXL trees.
>
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.
>
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing) but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> >
> > I just want to play with Dave's fix for the RAS errors so having to jump
> > through these other sets.
>
> Yes, that is annoying, apologies.
>
> >
> > Thanks,
> >
> > Jonathan
> >
> > >
> > >
> > > Davidlohr Bueso (1):
> > > cxl/mem: Wire up event interrupts
> > >
> > > Ira Weiny (8):
> > > PCI/CXL: Export native CXL error reporting control
> > > cxl/mem: Read, trace, and clear events on driver load
> > > cxl/mem: Trace General Media Event Record
> > > cxl/mem: Trace DRAM Event Record
> > > cxl/mem: Trace Memory Module Event Record
> > > cxl/test: Add generic mock events
> > > cxl/test: Add specific events
> > > cxl/test: Simulate event log overflow
> > >
> > > drivers/acpi/pci_root.c | 3 +
> > > drivers/cxl/core/mbox.c | 186 +++++++++++++
> > > drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++
> > > drivers/cxl/cxl.h | 16 ++
> > > drivers/cxl/cxlmem.h | 171 ++++++++++++
> > > drivers/cxl/cxlpci.h | 6 +
> > > drivers/cxl/pci.c | 236 +++++++++++++++++
> > > drivers/pci/probe.c | 1 +
> > > include/linux/pci.h | 1 +
> > > tools/testing/cxl/test/Kbuild | 2 +-
> > > tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++
> > > 11 files changed, 1452 insertions(+), 1 deletion(-)
> > >
> > >
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> >
>
> I think going forward these base-commits need to be something that are
> reachable on cxl.git.

Agreed. I thought this was in preview. But even preview is not stable and I
should have waited and asked to see this land in next first.

Ira

> For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
>
> Just the small matter of needing some acks/reviews on those lead in
> patches so I can move them to through cxl/pending to cxl/next:
>
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/[email protected]

2022-12-16 18:51:50

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Sun, 11 Dec 2022 23:06:21 -0800
[email protected] wrote:

> From: Davidlohr Bueso <[email protected]>
>
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries. The device may
> however support less than 16 entries depending on the support it
> provides.
>
> Attempt to allocate these 16 irq vectors. If the device supports less
> then the PCI infrastructure will allocate that number. Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
>
> CXL device events are signaled via interrupts. Each event log may have
> a different interrupt message number. These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
>
> Add interrupt support for event logs. Interrupts are allocated as
> shared interrupts. Therefore, all or some event logs can share the same
> message number.
>
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Co-developed-by: Ira Weiny <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
> Signed-off-by: Davidlohr Bueso <[email protected]>

Sometimes it feels like we go around in circles as I'm sure we've
fixed this at least 3 times now in different sets that got dropped
(was definitely in the DOE interrupt support set and was controversial ;)

Nothing in this patch set calls pci_set_master() so no interrupts
will even be delivered. I don't think there are any sets in flight
that would fix that.

Jonathan

>
> ---
> Changes from V3:
> Adjust based on changes in patch 1
> Consolidate event setup into cxl_event_config()
> Consistently use cxl_event_* for function names
> Remove cxl_event_int_is_msi()
> Ensure DCD log is ignored in status
> Simplify event status loop logic
> Dan
> Fail driver load if the irq's are not allocated
> move cxl_event_config_msgnums() to pci.c
> s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> s/devm_kmalloc/devm_kzalloc
> Fix up pci_alloc_irq_vectors() comment
> Pass pdev to cxl_alloc_irq_vectors()
> Check FW irq policy prior to configuration
> Jonathan
> Use FIELD_GET instead of manual masking
> ---
> drivers/cxl/cxl.h | 4 +
> drivers/cxl/cxlmem.h | 19 ++++
> drivers/cxl/cxlpci.h | 6 ++
> drivers/cxl/pci.c | 208 +++++++++++++++++++++++++++++++++++++++++--
> 4 files changed, 231 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> CXLDEV_EVENT_STATUS_FAIL | \
> CXLDEV_EVENT_STATUS_FATAL)
>
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK GENMASK(7, 4)
> +
> /* CXL 2.0 8.2.8.4 Mailbox Registers */
> #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> #define CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> struct range dvsec_range[2];
> };
>
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> + CXL_INT_NONE = 0x00,
> + CXL_INT_MSI_MSIX = 0x01,
> + CXL_INT_FW = 0x02
> +};
> +struct cxl_event_interrupt_policy {
> + u8 info_settings;
> + u8 warn_settings;
> + u8 failure_settings;
> + u8 fatal_settings;
> +} __packed;
> +
> /**
> * struct cxl_event_state - Event log driver state
> *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
> CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
> #define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
> #define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
>
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16. Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
> /* Register Block Identifier (RBI) */
> enum cxl_regloc_type {
> CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> return 0;
> }
>
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> + int nvecs;
> +
> + /*
> + * CXL requires MSI/MSIX support.
> + *
> + * Additionally pci_alloc_irq_vectors() handles calling
> + * pci_free_irq_vectors() automatically despite not being called
> + * pcim_*. See pci_setup_msi_context().
> + */
> + nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> + if (nvecs < 1) {
> + dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> + return -ENXIO;
> + }
> + return 0;
> +}
> +
> +struct cxl_dev_id {
> + struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> + struct cxl_dev_id *dev_id = id;
> + struct cxl_dev_state *cxlds = dev_id->cxlds;
> + u32 status;
> +
> + do {
> + /*
> + * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> + * ignore the reserved upper 32 bits
> + */
> + status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> + /* Ignore logs unknown to the driver */
> + status &= CXLDEV_EVENT_STATUS_ALL;
> + if (!status)
> + break;
> + cxl_mem_get_event_records(cxlds, status);
> + cond_resched();
> + } while (status);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> + struct device *dev = cxlds->dev;
> + struct pci_dev *pdev = to_pci_dev(dev);
> + struct cxl_dev_id *dev_id;
> + int irq;
> +
> + if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> + return -ENXIO;
> +
> + /* dev_id must be globally unique and must contain the cxlds */
> + dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> + if (!dev_id)
> + return -ENOMEM;
> + dev_id->cxlds = cxlds;
> +
> + irq = pci_irq_vector(pdev,
> + FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> + if (irq < 0)
> + return irq;
> +
> + return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> + IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> + struct cxl_event_interrupt_policy *policy)
> +{
> + struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> + .payload_out = policy,
> + .size_out = sizeof(*policy),
> + };
> + int rc;
> +
> + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> + if (rc < 0)
> + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> + rc);
> +
> + return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> + struct cxl_event_interrupt_policy *policy)
> +{
> + struct cxl_mbox_cmd mbox_cmd;
> + int rc;
> +
> + policy->info_settings = CXL_INT_MSI_MSIX;
> + policy->warn_settings = CXL_INT_MSI_MSIX;
> + policy->failure_settings = CXL_INT_MSI_MSIX;
> + policy->fatal_settings = CXL_INT_MSI_MSIX;
> +
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> + .payload_in = policy,
> + .size_in = sizeof(*policy),
> + };
> +
> + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> + if (rc < 0) {
> + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> + rc);
> + return rc;
> + }
> +
> + /* Retrieve final interrupt settings */
> + return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> + struct cxl_event_interrupt_policy policy;
> + int rc;
> +
> + rc = cxl_event_config_msgnums(cxlds, &policy);
> + if (rc)
> + return rc;
> +
> + rc = cxl_event_req_irq(cxlds, policy.info_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> + return rc;
> + }
> +
> + rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> + return rc;
> + }
> +
> + return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> + u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> + return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> + struct cxl_dev_state *cxlds)
> +{
> + struct cxl_event_interrupt_policy policy;
> + int rc;
> +
> + /*
> + * When BIOS maintains CXL error reporting control, it will process
> + * event records. Only one agent can do so.
> + */
> + if (!host_bridge->native_cxl_error)
> + return 0;
> +
> + rc = cxl_event_get_int_policy(cxlds, &policy);
> + if (rc)
> + return rc;
> +
> + if (cxl_event_int_is_fw(policy.info_settings) ||
> + cxl_event_int_is_fw(policy.warn_settings) ||
> + cxl_event_int_is_fw(policy.failure_settings) ||
> + cxl_event_int_is_fw(policy.fatal_settings)) {
> + dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> + return -EBUSY;
> + }
> +
> + rc = cxl_event_irqsetup(cxlds);
> + if (rc)
> + return rc;
> +
> + cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> + return 0;
> +}
> +
> static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> {
> struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> return rc;
>
> + rc = cxl_alloc_irq_vectors(pdev);
> + if (rc)
> + return rc;
> +
> cxlmd = devm_cxl_add_memdev(cxlds);
> if (IS_ERR(cxlmd))
> return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> if (rc)
> return rc;
>
> - /*
> - * When BIOS maintains CXL error reporting control, it will process
> - * event records. Only one agent can do so.
> - */
> - if (host_bridge->native_cxl_error)
> - cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> + rc = cxl_event_config(host_bridge, cxlds);
> + if (rc)
> + return rc;
>
> if (cxlds->regs.ras) {
> pci_enable_pcie_error_reporting(pdev);

2022-12-16 18:54:31

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Fri, 16 Dec 2022 14:24:38 +0000
Jonathan Cameron <[email protected]> wrote:

> On Sun, 11 Dec 2022 23:06:21 -0800
> [email protected] wrote:
>
> > From: Davidlohr Bueso <[email protected]>
> >
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries. The device may
> > however support less than 16 entries depending on the support it
> > provides.
> >
> > Attempt to allocate these 16 irq vectors. If the device supports less
> > then the PCI infrastructure will allocate that number. Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> >
> > CXL device events are signaled via interrupts. Each event log may have
> > a different interrupt message number. These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> >
> > Add interrupt support for event logs. Interrupts are allocated as
> > shared interrupts. Therefore, all or some event logs can share the same
> > message number.
> >
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> >
> > Cc: Bjorn Helgaas <[email protected]>
> > Cc: Jonathan Cameron <[email protected]>
> > Co-developed-by: Ira Weiny <[email protected]>
> > Signed-off-by: Ira Weiny <[email protected]>
> > Signed-off-by: Davidlohr Bueso <[email protected]>
> >
>
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > + CXL_INT_NONE = 0x00,
> > + CXL_INT_MSI_MSIX = 0x01,
> > + CXL_INT_FW = 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > + u8 info_settings;
> > + u8 warn_settings;
> > + u8 failure_settings;
> > + u8 fatal_settings;
>
> This is an issue for your QEMU code which has this set at 5 bytes.
> Guess our handling of record lengths needs updating now we have two different
> spec versions to support and hence these can have multiple lengths.
>
> Btw, do you have an updated version of the QEMU patches you can share?

Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
stuff you have already done!

Jonathan

> I was planning on just doing the AER type RAS stuff for the first pull this cycle
> but given this set means we never reach that code I probably need to do QEMU
> support for this and the stuff to support those all in one go - otherwise
> no one will be able to test it :) We rather optimistically have the OSC set
> to say the OS can have control of these, but upstream code doesn't emulate
> anything yet. Oops. Should have pretended the hardware was handling them
> until we had this support in place in QEMU.
>
> Jonathan
>
> > +} __packed;
> > +
> > /**
> > * struct cxl_event_state - Event log driver state
> > *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> > CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> > CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> > + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> > + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> > CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> > CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> > CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,

2022-12-16 19:08:25

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 0/9] CXL: Process event logs

On Fri, 16 Dec 2022 09:01:13 -0800
Dan Williams <[email protected]> wrote:

> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > [email protected] wrote:
> >
> > > From: Ira Weiny <[email protected]>
> > >
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time as well an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > >
> > > The series is now in 3 parts:
> > >
> > > 1) Base functionality including interrupts
> > > 2) Tracing specific events (Dynamic Capacity Event Record is defered)
> > > 3) cxl-test infrastructure for basic tests
> > >
> > > Changes from V3
> > > Feedback from Dan
> > > Spit out ACPI changes for Bjorn
> > >
> > > - Link to v3: https://lore.kernel.org/all/[email protected]/
> >
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions and
> > that's not upstream yet except in cxl/preview which is optimistic
> > place to use for a base commit. The id isn't the one below either which
> > isn't in either mailine or the current CXL trees.
>
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.

Fully understood. I only push trees out as 'testing' for 0-day to hit
until I can rebase on rc1.

>
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing) but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> >
> > I just want to play with Dave's fix for the RAS errors so having to jump
> > through these other sets.
>
> Yes, that is annoying, apologies.
Not really a problem I just felt like grumbling :)

Have a good weekend.

Jonathan

>
> >
> > Thanks,
> >
> > Jonathan
> >
> > >
> > >
> > > Davidlohr Bueso (1):
> > > cxl/mem: Wire up event interrupts
> > >
> > > Ira Weiny (8):
> > > PCI/CXL: Export native CXL error reporting control
> > > cxl/mem: Read, trace, and clear events on driver load
> > > cxl/mem: Trace General Media Event Record
> > > cxl/mem: Trace DRAM Event Record
> > > cxl/mem: Trace Memory Module Event Record
> > > cxl/test: Add generic mock events
> > > cxl/test: Add specific events
> > > cxl/test: Simulate event log overflow
> > >
> > > drivers/acpi/pci_root.c | 3 +
> > > drivers/cxl/core/mbox.c | 186 +++++++++++++
> > > drivers/cxl/core/trace.h | 479 ++++++++++++++++++++++++++++++++++
> > > drivers/cxl/cxl.h | 16 ++
> > > drivers/cxl/cxlmem.h | 171 ++++++++++++
> > > drivers/cxl/cxlpci.h | 6 +
> > > drivers/cxl/pci.c | 236 +++++++++++++++++
> > > drivers/pci/probe.c | 1 +
> > > include/linux/pci.h | 1 +
> > > tools/testing/cxl/test/Kbuild | 2 +-
> > > tools/testing/cxl/test/mem.c | 352 +++++++++++++++++++++++++
> > > 11 files changed, 1452 insertions(+), 1 deletion(-)
> > >
> > >
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> >
>
> I think going forward these base-commits need to be something that are
> reachable on cxl.git. For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
>
> Just the small matter of needing some acks/reviews on those lead in
> patches so I can move them to through cxl/pending to cxl/next:

Don't move too fast with Ira's. Some issues coming up in testing..
(admittedly half of them were things where QEMU hadn't kept up with
what the kernel code now uses).

>
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/[email protected]

2022-12-16 21:47:21

by Ira Weiny

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:21 -0800
> [email protected] wrote:
>
> > From: Davidlohr Bueso <[email protected]>
> >
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries. The device may
> > however support less than 16 entries depending on the support it
> > provides.
> >
> > Attempt to allocate these 16 irq vectors. If the device supports less
> > then the PCI infrastructure will allocate that number. Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> >
> > CXL device events are signaled via interrupts. Each event log may have
> > a different interrupt message number. These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> >
> > Add interrupt support for event logs. Interrupts are allocated as
> > shared interrupts. Therefore, all or some event logs can share the same
> > message number.
> >
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> >
> > Cc: Bjorn Helgaas <[email protected]>
> > Cc: Jonathan Cameron <[email protected]>
> > Co-developed-by: Ira Weiny <[email protected]>
> > Signed-off-by: Ira Weiny <[email protected]>
> > Signed-off-by: Davidlohr Bueso <[email protected]>
>
> Sometimes it feels like we go around in circles as I'm sure we've
> fixed this at least 3 times now in different sets that got dropped
> (was definitely in the DOE interrupt support set and was controversial ;)

Well I certainly seem to be spinning ATM. I probably need more eggnog. ;-)

>
> Nothing in this patch set calls pci_set_master() so no interrupts
> will even be delivered.

Is this a bug in Qemu then? Because this _is_ tested and irqs _were_
delivered. I thought we determined that pci_set_master() was called in
pcim_enable_device() but I don't see that now. :-(

What I do know is this works with Qemu. :-/

Ira

> I don't think there are any sets in flight
> that would fix that.
>
> Jonathan
>
>
> >
> > ---
> > Changes from V3:
> > Adjust based on changes in patch 1
> > Consolidate event setup into cxl_event_config()
> > Consistently use cxl_event_* for function names
> > Remove cxl_event_int_is_msi()
> > Ensure DCD log is ignored in status
> > Simplify event status loop logic
> > Dan
> > Fail driver load if the irq's are not allocated
> > move cxl_event_config_msgnums() to pci.c
> > s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > s/devm_kmalloc/devm_kzalloc
> > Fix up pci_alloc_irq_vectors() comment
> > Pass pdev to cxl_alloc_irq_vectors()
> > Check FW irq policy prior to configuration
> > Jonathan
> > Use FIELD_GET instead of manual masking
> > ---
> > drivers/cxl/cxl.h | 4 +
> > drivers/cxl/cxlmem.h | 19 ++++
> > drivers/cxl/cxlpci.h | 6 ++
> > drivers/cxl/pci.c | 208 +++++++++++++++++++++++++++++++++++++++++--
> > 4 files changed, 231 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 5974d1082210..b3964149c77b 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> > CXLDEV_EVENT_STATUS_FAIL | \
> > CXLDEV_EVENT_STATUS_FATAL)
> >
> > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > +#define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> > +#define CXLDEV_EVENT_INT_MSGNUM_MASK GENMASK(7, 4)
> > +
> > /* CXL 2.0 8.2.8.4 Mailbox Registers */
> > #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> > #define CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> > struct range dvsec_range[2];
> > };
> >
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > + CXL_INT_NONE = 0x00,
> > + CXL_INT_MSI_MSIX = 0x01,
> > + CXL_INT_FW = 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > + u8 info_settings;
> > + u8 warn_settings;
> > + u8 failure_settings;
> > + u8 fatal_settings;
> > +} __packed;
> > +
> > /**
> > * struct cxl_event_state - Event log driver state
> > *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> > CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> > CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> > + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> > + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> > CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> > CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> > CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index 77dbdb980b12..a8ea04f536ab 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -53,6 +53,12 @@
> > #define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
> > #define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
> >
> > +/*
> > + * NOTE: Currently all the functions which are enabled for CXL require their
> > + * vectors to be in the first 16. Use this as the default max.
> > + */
> > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > +
> > /* Register Block Identifier (RBI) */
> > enum cxl_regloc_type {
> > CXL_REGLOC_RBI_EMPTY = 0,
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index a2d8382bc593..d42d87faddb8 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > return 0;
> > }
> >
> > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > +{
> > + int nvecs;
> > +
> > + /*
> > + * CXL requires MSI/MSIX support.
> > + *
> > + * Additionally pci_alloc_irq_vectors() handles calling
> > + * pci_free_irq_vectors() automatically despite not being called
> > + * pcim_*. See pci_setup_msi_context().
> > + */
> > + nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > + if (nvecs < 1) {
> > + dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > + return -ENXIO;
> > + }
> > + return 0;
> > +}
> > +
> > +struct cxl_dev_id {
> > + struct cxl_dev_state *cxlds;
> > +};
> > +
> > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > +{
> > + struct cxl_dev_id *dev_id = id;
> > + struct cxl_dev_state *cxlds = dev_id->cxlds;
> > + u32 status;
> > +
> > + do {
> > + /*
> > + * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > + * ignore the reserved upper 32 bits
> > + */
> > + status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > + /* Ignore logs unknown to the driver */
> > + status &= CXLDEV_EVENT_STATUS_ALL;
> > + if (!status)
> > + break;
> > + cxl_mem_get_event_records(cxlds, status);
> > + cond_resched();
> > + } while (status);
> > +
> > + return IRQ_HANDLED;
> > +}
> > +
> > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > +{
> > + struct device *dev = cxlds->dev;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct cxl_dev_id *dev_id;
> > + int irq;
> > +
> > + if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > + return -ENXIO;
> > +
> > + /* dev_id must be globally unique and must contain the cxlds */
> > + dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > + if (!dev_id)
> > + return -ENOMEM;
> > + dev_id->cxlds = cxlds;
> > +
> > + irq = pci_irq_vector(pdev,
> > + FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > + if (irq < 0)
> > + return irq;
> > +
> > + return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > + IRQF_SHARED, NULL, dev_id);
> > +}
> > +
> > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > + struct cxl_event_interrupt_policy *policy)
> > +{
> > + struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > + .opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > + .payload_out = policy,
> > + .size_out = sizeof(*policy),
> > + };
> > + int rc;
> > +
> > + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > + if (rc < 0)
> > + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > + rc);
> > +
> > + return rc;
> > +}
> > +
> > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > + struct cxl_event_interrupt_policy *policy)
> > +{
> > + struct cxl_mbox_cmd mbox_cmd;
> > + int rc;
> > +
> > + policy->info_settings = CXL_INT_MSI_MSIX;
> > + policy->warn_settings = CXL_INT_MSI_MSIX;
> > + policy->failure_settings = CXL_INT_MSI_MSIX;
> > + policy->fatal_settings = CXL_INT_MSI_MSIX;
> > +
> > + mbox_cmd = (struct cxl_mbox_cmd) {
> > + .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > + .payload_in = policy,
> > + .size_in = sizeof(*policy),
> > + };
> > +
> > + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > + if (rc < 0) {
> > + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > + rc);
> > + return rc;
> > + }
> > +
> > + /* Retrieve final interrupt settings */
> > + return cxl_event_get_int_policy(cxlds, policy);
> > +}
> > +
> > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > + struct cxl_event_interrupt_policy policy;
> > + int rc;
> > +
> > + rc = cxl_event_config_msgnums(cxlds, &policy);
> > + if (rc)
> > + return rc;
> > +
> > + rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > + if (rc) {
> > + dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > + return rc;
> > + }
> > +
> > + rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > + if (rc) {
> > + dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > + return rc;
> > + }
> > +
> > + rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > + if (rc) {
> > + dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > + return rc;
> > + }
> > +
> > + rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > + if (rc) {
> > + dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > + return rc;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static bool cxl_event_int_is_fw(u8 setting)
> > +{
> > + u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > +
> > + return mode == CXL_INT_FW;
> > +}
> > +
> > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > + struct cxl_dev_state *cxlds)
> > +{
> > + struct cxl_event_interrupt_policy policy;
> > + int rc;
> > +
> > + /*
> > + * When BIOS maintains CXL error reporting control, it will process
> > + * event records. Only one agent can do so.
> > + */
> > + if (!host_bridge->native_cxl_error)
> > + return 0;
> > +
> > + rc = cxl_event_get_int_policy(cxlds, &policy);
> > + if (rc)
> > + return rc;
> > +
> > + if (cxl_event_int_is_fw(policy.info_settings) ||
> > + cxl_event_int_is_fw(policy.warn_settings) ||
> > + cxl_event_int_is_fw(policy.failure_settings) ||
> > + cxl_event_int_is_fw(policy.fatal_settings)) {
> > + dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > + return -EBUSY;
> > + }
> > +
> > + rc = cxl_event_irqsetup(cxlds);
> > + if (rc)
> > + return rc;
> > +
> > + cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +
> > + return 0;
> > +}
> > +
> > static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > {
> > struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > if (rc)
> > return rc;
> >
> > + rc = cxl_alloc_irq_vectors(pdev);
> > + if (rc)
> > + return rc;
> > +
> > cxlmd = devm_cxl_add_memdev(cxlds);
> > if (IS_ERR(cxlmd))
> > return PTR_ERR(cxlmd);
> > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > if (rc)
> > return rc;
> >
> > - /*
> > - * When BIOS maintains CXL error reporting control, it will process
> > - * event records. Only one agent can do so.
> > - */
> > - if (host_bridge->native_cxl_error)
> > - cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > + rc = cxl_event_config(host_bridge, cxlds);
> > + if (rc)
> > + return rc;
> >
> > if (cxlds->regs.ras) {
> > pci_enable_pcie_error_reporting(pdev);
>

2022-12-16 21:56:12

by Ira Weiny

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> On Fri, 16 Dec 2022 14:24:38 +0000
> Jonathan Cameron <[email protected]> wrote:
>
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > [email protected] wrote:
> >
> > > From: Davidlohr Bueso <[email protected]>
> > >
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries. The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > >
> > > Attempt to allocate these 16 irq vectors. If the device supports less
> > > then the PCI infrastructure will allocate that number. Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > >
> > > CXL device events are signaled via interrupts. Each event log may have
> > > a different interrupt message number. These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > >
> > > Add interrupt support for event logs. Interrupts are allocated as
> > > shared interrupts. Therefore, all or some event logs can share the same
> > > message number.
> > >
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > >
> > > Cc: Bjorn Helgaas <[email protected]>
> > > Cc: Jonathan Cameron <[email protected]>
> > > Co-developed-by: Ira Weiny <[email protected]>
> > > Signed-off-by: Ira Weiny <[email protected]>
> > > Signed-off-by: Davidlohr Bueso <[email protected]>
> > >
> >
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > + CXL_INT_NONE = 0x00,
> > > + CXL_INT_MSI_MSIX = 0x01,
> > > + CXL_INT_FW = 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > + u8 info_settings;
> > > + u8 warn_settings;
> > > + u8 failure_settings;
> > > + u8 fatal_settings;
> >
> > This is an issue for your QEMU code which has this set at 5 bytes.
> > Guess our handling of record lengths needs updating now we have two different
> > spec versions to support and hence these can have multiple lengths.
> >
> > Btw, do you have an updated version of the QEMU patches you can share?
>
> Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> stuff you have already done!

I do have updates to handle more than 3 records at a time which I was polishing
until I saw your review.

I was getting all geared up to respin the patches but I'm not sure there is an
issue now that I look at it.

I'm still investigating.

Ira

>
> Jonathan
>
> > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > but given this set means we never reach that code I probably need to do QEMU
> > support for this and the stuff to support those all in one go - otherwise
> > no one will be able to test it :) We rather optimistically have the OSC set
> > to say the OS can have control of these, but upstream code doesn't emulate
> > anything yet. Oops. Should have pretended the hardware was handling them
> > until we had this support in place in QEMU.
> >
> > Jonathan
> >
> > > +} __packed;
> > > +
> > > /**
> > > * struct cxl_event_state - Event log driver state
> > > *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > > CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> > > CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> > > CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> > > + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> > > + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> > > CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> > > CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> > > CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
>

2022-12-17 16:54:04

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Fri, 16 Dec 2022 13:28:36 -0800
Ira Weiny <[email protected]> wrote:

> On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> > On Fri, 16 Dec 2022 14:24:38 +0000
> > Jonathan Cameron <[email protected]> wrote:
> >
> > > On Sun, 11 Dec 2022 23:06:21 -0800
> > > [email protected] wrote:
> > >
> > > > From: Davidlohr Bueso <[email protected]>
> > > >
> > > > Currently the only CXL features targeted for irq support require their
> > > > message numbers to be within the first 16 entries. The device may
> > > > however support less than 16 entries depending on the support it
> > > > provides.
> > > >
> > > > Attempt to allocate these 16 irq vectors. If the device supports less
> > > > then the PCI infrastructure will allocate that number. Upon successful
> > > > allocation, users can plug in their respective isr at any point
> > > > thereafter.
> > > >
> > > > CXL device events are signaled via interrupts. Each event log may have
> > > > a different interrupt message number. These message numbers are
> > > > reported in the Get Event Interrupt Policy mailbox command.
> > > >
> > > > Add interrupt support for event logs. Interrupts are allocated as
> > > > shared interrupts. Therefore, all or some event logs can share the same
> > > > message number.
> > > >
> > > > In addition all logs are queried on any interrupt in order of the most
> > > > to least severe based on the status register.
> > > >
> > > > Cc: Bjorn Helgaas <[email protected]>
> > > > Cc: Jonathan Cameron <[email protected]>
> > > > Co-developed-by: Ira Weiny <[email protected]>
> > > > Signed-off-by: Ira Weiny <[email protected]>
> > > > Signed-off-by: Davidlohr Bueso <[email protected]>
> > > >
> > >
> > > > +/**
> > > > + * Event Interrupt Policy
> > > > + *
> > > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > > + */
> > > > +enum cxl_event_int_mode {
> > > > + CXL_INT_NONE = 0x00,
> > > > + CXL_INT_MSI_MSIX = 0x01,
> > > > + CXL_INT_FW = 0x02
> > > > +};
> > > > +struct cxl_event_interrupt_policy {
> > > > + u8 info_settings;
> > > > + u8 warn_settings;
> > > > + u8 failure_settings;
> > > > + u8 fatal_settings;
> > >
> > > This is an issue for your QEMU code which has this set at 5 bytes.
> > > Guess our handling of record lengths needs updating now we have two different
> > > spec versions to support and hence these can have multiple lengths.
> > >
> > > Btw, do you have an updated version of the QEMU patches you can share?
> >
> > Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> > stuff you have already done!
>
> I do have updates to handle more than 3 records at a time which I was polishing
> until I saw your review.

Great. I'll work on other stuff in the meantime and hopefully we can get all the
error emulation upstream this QEMU cycle.

>
> I was getting all geared up to respin the patches but I'm not sure there is an
> issue now that I look at it.
>
> I'm still investigating.
>
> Ira
>
> >
> > Jonathan
> >
> > > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > > but given this set means we never reach that code I probably need to do QEMU
> > > support for this and the stuff to support those all in one go - otherwise
> > > no one will be able to test it :) We rather optimistically have the OSC set
> > > to say the OS can have control of these, but upstream code doesn't emulate
> > > anything yet. Oops. Should have pretended the hardware was handling them
> > > until we had this support in place in QEMU.
> > >
> > > Jonathan
> > >
> > > > +} __packed;
> > > > +
> > > > /**
> > > > * struct cxl_event_state - Event log driver state
> > > > *
> > > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > > > CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> > > > CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> > > > CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> > > > + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> > > > + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> > > > CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> > > > CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> > > > CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
> >

2022-12-17 16:55:21

by Jonathan Cameron

[permalink] [raw]

Subject: Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts

On Fri, 16 Dec 2022 13:33:43 -0800
Ira Weiny <[email protected]> wrote:

> On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > [email protected] wrote:
> >
> > > From: Davidlohr Bueso <[email protected]>
> > >
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries. The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > >
> > > Attempt to allocate these 16 irq vectors. If the device supports less
> > > then the PCI infrastructure will allocate that number. Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > >
> > > CXL device events are signaled via interrupts. Each event log may have
> > > a different interrupt message number. These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > >
> > > Add interrupt support for event logs. Interrupts are allocated as
> > > shared interrupts. Therefore, all or some event logs can share the same
> > > message number.
> > >
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > >
> > > Cc: Bjorn Helgaas <[email protected]>
> > > Cc: Jonathan Cameron <[email protected]>
> > > Co-developed-by: Ira Weiny <[email protected]>
> > > Signed-off-by: Ira Weiny <[email protected]>
> > > Signed-off-by: Davidlohr Bueso <[email protected]>
> >
> > Sometimes it feels like we go around in circles as I'm sure we've
> > fixed this at least 3 times now in different sets that got dropped
> > (was definitely in the DOE interrupt support set and was controversial ;)
>
> Well I certainly seem to be spinning ATM. I probably need more eggnog. ;-)
>
> >
> > Nothing in this patch set calls pci_set_master() so no interrupts
> > will even be delivered.
>
> Is this a bug in Qemu then? Because this _is_ tested and irqs _were_
> delivered. I thought we determined that pci_set_master() was called in
> pcim_enable_device() but I don't see that now. :-(
>
> What I do know is this works with Qemu. :-/

Without the addition of a pci_set_master() the kernel code,
I don't get any interrupts on my arm64 test setup. Only reason I realized the
call was missing. Possible there is something different on x86 I guess.

Jonathan

>
> Ira
>
> > I don't think there are any sets in flight
> > that would fix that.
> >
> > Jonathan
> >
> >
> > >
> > > ---
> > > Changes from V3:
> > > Adjust based on changes in patch 1
> > > Consolidate event setup into cxl_event_config()
> > > Consistently use cxl_event_* for function names
> > > Remove cxl_event_int_is_msi()
> > > Ensure DCD log is ignored in status
> > > Simplify event status loop logic
> > > Dan
> > > Fail driver load if the irq's are not allocated
> > > move cxl_event_config_msgnums() to pci.c
> > > s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > > s/devm_kmalloc/devm_kzalloc
> > > Fix up pci_alloc_irq_vectors() comment
> > > Pass pdev to cxl_alloc_irq_vectors()
> > > Check FW irq policy prior to configuration
> > > Jonathan
> > > Use FIELD_GET instead of manual masking
> > > ---
> > > drivers/cxl/cxl.h | 4 +
> > > drivers/cxl/cxlmem.h | 19 ++++
> > > drivers/cxl/cxlpci.h | 6 ++
> > > drivers/cxl/pci.c | 208 +++++++++++++++++++++++++++++++++++++++++--
> > > 4 files changed, 231 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index 5974d1082210..b3964149c77b 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> > > CXLDEV_EVENT_STATUS_FAIL | \
> > > CXLDEV_EVENT_STATUS_FATAL)
> > >
> > > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > > +#define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> > > +#define CXLDEV_EVENT_INT_MSGNUM_MASK GENMASK(7, 4)
> > > +
> > > /* CXL 2.0 8.2.8.4 Mailbox Registers */
> > > #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> > > #define CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> > > struct range dvsec_range[2];
> > > };
> > >
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > + CXL_INT_NONE = 0x00,
> > > + CXL_INT_MSI_MSIX = 0x01,
> > > + CXL_INT_FW = 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > + u8 info_settings;
> > > + u8 warn_settings;
> > > + u8 failure_settings;
> > > + u8 fatal_settings;
> > > +} __packed;
> > > +
> > > /**
> > > * struct cxl_event_state - Event log driver state
> > > *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > > CXL_MBOX_OP_RAW = CXL_MBOX_OP_INVALID,
> > > CXL_MBOX_OP_GET_EVENT_RECORD = 0x0100,
> > > CXL_MBOX_OP_CLEAR_EVENT_RECORD = 0x0101,
> > > + CXL_MBOX_OP_GET_EVT_INT_POLICY = 0x0102,
> > > + CXL_MBOX_OP_SET_EVT_INT_POLICY = 0x0103,
> > > CXL_MBOX_OP_GET_FW_INFO = 0x0200,
> > > CXL_MBOX_OP_ACTIVATE_FW = 0x0202,
> > > CXL_MBOX_OP_GET_SUPPORTED_LOGS = 0x0400,
> > > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > > index 77dbdb980b12..a8ea04f536ab 100644
> > > --- a/drivers/cxl/cxlpci.h
> > > +++ b/drivers/cxl/cxlpci.h
> > > @@ -53,6 +53,12 @@
> > > #define CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK GENMASK(15, 8)
> > > #define CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK GENMASK(31, 16)
> > >
> > > +/*
> > > + * NOTE: Currently all the functions which are enabled for CXL require their
> > > + * vectors to be in the first 16. Use this as the default max.
> > > + */
> > > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > > +
> > > /* Register Block Identifier (RBI) */
> > > enum cxl_regloc_type {
> > > CXL_REGLOC_RBI_EMPTY = 0,
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index a2d8382bc593..d42d87faddb8 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > > return 0;
> > > }
> > >
> > > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > > +{
> > > + int nvecs;
> > > +
> > > + /*
> > > + * CXL requires MSI/MSIX support.
> > > + *
> > > + * Additionally pci_alloc_irq_vectors() handles calling
> > > + * pci_free_irq_vectors() automatically despite not being called
> > > + * pcim_*. See pci_setup_msi_context().
> > > + */
> > > + nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > > + PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > > + if (nvecs < 1) {
> > > + dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > > + return -ENXIO;
> > > + }
> > > + return 0;
> > > +}
> > > +
> > > +struct cxl_dev_id {
> > > + struct cxl_dev_state *cxlds;
> > > +};
> > > +
> > > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > > +{
> > > + struct cxl_dev_id *dev_id = id;
> > > + struct cxl_dev_state *cxlds = dev_id->cxlds;
> > > + u32 status;
> > > +
> > > + do {
> > > + /*
> > > + * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > > + * ignore the reserved upper 32 bits
> > > + */
> > > + status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > + /* Ignore logs unknown to the driver */
> > > + status &= CXLDEV_EVENT_STATUS_ALL;
> > > + if (!status)
> > > + break;
> > > + cxl_mem_get_event_records(cxlds, status);
> > > + cond_resched();
> > > + } while (status);
> > > +
> > > + return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > > +{
> > > + struct device *dev = cxlds->dev;
> > > + struct pci_dev *pdev = to_pci_dev(dev);
> > > + struct cxl_dev_id *dev_id;
> > > + int irq;
> > > +
> > > + if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > > + return -ENXIO;
> > > +
> > > + /* dev_id must be globally unique and must contain the cxlds */
> > > + dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > > + if (!dev_id)
> > > + return -ENOMEM;
> > > + dev_id->cxlds = cxlds;
> > > +
> > > + irq = pci_irq_vector(pdev,
> > > + FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > > + if (irq < 0)
> > > + return irq;
> > > +
> > > + return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > > + IRQF_SHARED, NULL, dev_id);
> > > +}
> > > +
> > > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > > + struct cxl_event_interrupt_policy *policy)
> > > +{
> > > + struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > > + .opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > > + .payload_out = policy,
> > > + .size_out = sizeof(*policy),
> > > + };
> > > + int rc;
> > > +
> > > + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > + if (rc < 0)
> > > + dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > > + rc);
> > > +
> > > + return rc;
> > > +}
> > > +
> > > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > + struct cxl_event_interrupt_policy *policy)
> > > +{
> > > + struct cxl_mbox_cmd mbox_cmd;
> > > + int rc;
> > > +
> > > + policy->info_settings = CXL_INT_MSI_MSIX;
> > > + policy->warn_settings = CXL_INT_MSI_MSIX;
> > > + policy->failure_settings = CXL_INT_MSI_MSIX;
> > > + policy->fatal_settings = CXL_INT_MSI_MSIX;
> > > +
> > > + mbox_cmd = (struct cxl_mbox_cmd) {
> > > + .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > > + .payload_in = policy,
> > > + .size_in = sizeof(*policy),
> > > + };
> > > +
> > > + rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > + if (rc < 0) {
> > > + dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > > + rc);
> > > + return rc;
> > > + }
> > > +
> > > + /* Retrieve final interrupt settings */
> > > + return cxl_event_get_int_policy(cxlds, policy);
> > > +}
> > > +
> > > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > > +{
> > > + struct cxl_event_interrupt_policy policy;
> > > + int rc;
> > > +
> > > + rc = cxl_event_config_msgnums(cxlds, &policy);
> > > + if (rc)
> > > + return rc;
> > > +
> > > + rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > > + if (rc) {
> > > + dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > > + return rc;
> > > + }
> > > +
> > > + rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > > + if (rc) {
> > > + dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > > + return rc;
> > > + }
> > > +
> > > + rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > > + if (rc) {
> > > + dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > > + return rc;
> > > + }
> > > +
> > > + rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > > + if (rc) {
> > > + dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > > + return rc;
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static bool cxl_event_int_is_fw(u8 setting)
> > > +{
> > > + u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > > +
> > > + return mode == CXL_INT_FW;
> > > +}
> > > +
> > > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > > + struct cxl_dev_state *cxlds)
> > > +{
> > > + struct cxl_event_interrupt_policy policy;
> > > + int rc;
> > > +
> > > + /*
> > > + * When BIOS maintains CXL error reporting control, it will process
> > > + * event records. Only one agent can do so.
> > > + */
> > > + if (!host_bridge->native_cxl_error)
> > > + return 0;
> > > +
> > > + rc = cxl_event_get_int_policy(cxlds, &policy);
> > > + if (rc)
> > > + return rc;
> > > +
> > > + if (cxl_event_int_is_fw(policy.info_settings) ||
> > > + cxl_event_int_is_fw(policy.warn_settings) ||
> > > + cxl_event_int_is_fw(policy.failure_settings) ||
> > > + cxl_event_int_is_fw(policy.fatal_settings)) {
> > > + dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > > + return -EBUSY;
> > > + }
> > > +
> > > + rc = cxl_event_irqsetup(cxlds);
> > > + if (rc)
> > > + return rc;
> > > +
> > > + cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > {
> > > struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > if (rc)
> > > return rc;
> > >
> > > + rc = cxl_alloc_irq_vectors(pdev);
> > > + if (rc)
> > > + return rc;
> > > +
> > > cxlmd = devm_cxl_add_memdev(cxlds);
> > > if (IS_ERR(cxlmd))
> > > return PTR_ERR(cxlmd);
> > > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > if (rc)
> > > return rc;
> > >
> > > - /*
> > > - * When BIOS maintains CXL error reporting control, it will process
> > > - * event records. Only one agent can do so.
> > > - */
> > > - if (host_bridge->native_cxl_error)
> > > - cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > + rc = cxl_event_config(host_bridge, cxlds);
> > > + if (rc)
> > > + return rc;
> > >
> > > if (cxlds->regs.ras) {
> > > pci_enable_pcie_error_reporting(pdev);
> >
>