2024-02-29 07:14:25

by Ira Weiny

Subject: [PATCH 3/4] cxl/pci: Register for and process CPER events

If the firmware has configured CXL event support to be firmware-first,
the OS can process those events through CPER records. The CXL layer has
unique DPA-to-HPA knowledge and standard event trace parsing in place.

CPER records contain Bus, Device, Function information which can be used
to identify the PCI device which is sending the event.

Add a CXL CPER callback to process events through the CXL trace
subsystem.

Future patches will provide additional region information such as HPA.

Signed-off-by: Ira Weiny <[email protected]>

---
Changes:
[iweiny: Add back in after the revert in 6.8]
---
drivers/cxl/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 2ff361e756d6..6cf8336d1b33 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -974,6 +974,73 @@ static struct pci_driver cxl_pci_driver = {
},
};

-module_pci_driver(cxl_pci_driver);
+#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0)
+static void cxl_cper_event_call(enum cxl_event_type ev_type,
+				struct cxl_cper_event_rec *rec)
+{
+	struct cper_cxl_event_devid *device_id = &rec->hdr.device_id;
+	struct pci_dev *pdev __free(pci_dev_put) = NULL;
+	enum cxl_event_log_type log_type;
+	struct cxl_dev_state *cxlds;
+	unsigned int devfn;
+	u32 hdr_flags;
+
+	pr_debug("CPER event for device %u:%u:%u.%u\n",
+		 device_id->segment_num, device_id->bus_num,
+		 device_id->device_num, device_id->func_num);
+
+	devfn = PCI_DEVFN(device_id->device_num, device_id->func_num);
+	pdev = pci_get_domain_bus_and_slot(device_id->segment_num,
+					   device_id->bus_num, devfn);
+	if (!pdev) {
+		pr_err("CPER event device %u:%u:%u.%u not found\n",
+		       device_id->segment_num, device_id->bus_num,
+		       device_id->device_num, device_id->func_num);
+		return;
+	}
+
+	dev_dbg(&pdev->dev, "Found device %u:%u.%u\n", device_id->bus_num,
+		device_id->device_num, device_id->func_num);
+
+	guard(device)(&pdev->dev);
+	if (pdev->driver != &cxl_pci_driver)
+		return;
+
+	cxlds = pci_get_drvdata(pdev);
+	if (!cxlds)
+		return;
+
+	/* Fabricate a log type */
+	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
+	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
+
+	dev_dbg(&pdev->dev, "Tracing %d\n", ev_type);
+	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
+			       &uuid_null, &rec->event);
+}
+
+static int __init cxl_pci_driver_init(void)
+{
+	int rc;
+
+	rc = pci_register_driver(&cxl_pci_driver);
+	if (rc)
+		return rc;
+
+	rc = cxl_cper_register_callback(cxl_cper_event_call);
+	if (rc)
+		pci_unregister_driver(&cxl_pci_driver);
+
+	return rc;
+}
+
+static void __exit cxl_pci_driver_exit(void)
+{
+	cxl_cper_unregister_callback(cxl_cper_event_call);
+	pci_unregister_driver(&cxl_pci_driver);
+}
+
+module_init(cxl_pci_driver_init);
+module_exit(cxl_pci_driver_exit);
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);

--
2.43.0



2024-03-01 21:58:03

by Dan Williams

Subject: Re: [PATCH 3/4] cxl/pci: Register for and process CPER events

Ira Weiny wrote:
> If the firmware has configured CXL event support to be firmware-first,
> the OS can process those events through CPER records. The CXL layer has
> unique DPA-to-HPA knowledge and standard event trace parsing in place.
>
> CPER records contain Bus, Device, Function information which can be used
> to identify the PCI device which is sending the event.
>
> Add a CXL CPER callback to process events through the CXL trace
> subsystem.
>
> Future patches will provide additional region information such as HPA.
>
> Signed-off-by: Ira Weiny <[email protected]>
>
> ---
> Changes:
> [iweiny: Add back in after the revert in 6.8]
> ---
> drivers/cxl/pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 2ff361e756d6..6cf8336d1b33 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -974,6 +974,73 @@ static struct pci_driver cxl_pci_driver = {
> },
> };
>
> -module_pci_driver(cxl_pci_driver);
> +#define CXL_EVENT_HDR_FLAGS_REC_SEVERITY GENMASK(1, 0)
> +static void cxl_cper_event_call(enum cxl_event_type ev_type,
> +				struct cxl_cper_event_rec *rec)
> +{
> +	struct cper_cxl_event_devid *device_id = &rec->hdr.device_id;
> +	struct pci_dev *pdev __free(pci_dev_put) = NULL;
> +	enum cxl_event_log_type log_type;
> +	struct cxl_dev_state *cxlds;
> +	unsigned int devfn;
> +	u32 hdr_flags;
> +
> +	pr_debug("CPER event for device %u:%u:%u.%u\n",
> +		 device_id->segment_num, device_id->bus_num,
> +		 device_id->device_num, device_id->func_num);
> +
> +	devfn = PCI_DEVFN(device_id->device_num, device_id->func_num);
> +	pdev = pci_get_domain_bus_and_slot(device_id->segment_num,
> +					   device_id->bus_num, devfn);
> +	if (!pdev) {
> +		pr_err("CPER event device %u:%u:%u.%u not found\n",
> +		       device_id->segment_num, device_id->bus_num,
> +		       device_id->device_num, device_id->func_num);
> +		return;
> +	}
> +
> +	dev_dbg(&pdev->dev, "Found device %u:%u.%u\n", device_id->bus_num,
> +		device_id->device_num, device_id->func_num);

These print statements are excessive. The dev_dbg() already encodes the
device BDF into the device name. The pr_err() is not actionable and
somewhat redundant with the default cper_estatus_print_section() print.

I would just delete all of them.
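
To illustrate why the BDF arguments add nothing to a dev_dbg() line,
here is a minimal userspace sketch of how a PCI device name is formed
from segment, bus, and devfn. The macros are local stand-ins that mirror
the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC definitions, and bdf_to_name()
is a hypothetical helper written for this example, not a kernel API.
The "%04x:%02x:%02x" layout is assumed to match what the PCI core uses
when it names a device.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-ins mirroring the kernel's devfn packing macros. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical helper: format a device name the way it shows up as the
 * prefix of dev_dbg()/dev_err() output, e.g. "0000:3a:00.0".
 * Returns the snprintf() result (characters that would be written).
 */
static int bdf_to_name(char *buf, size_t len, unsigned int segment,
		       unsigned int bus, unsigned int devfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%u",
			segment, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```

Calling bdf_to_name(buf, sizeof(buf), 0, 0x3a, PCI_DEVFN(0, 0)) fills buf
with "0000:3a:00.0", which is the same identity the removed prints were
spelling out by hand.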