CXL 3.1 Section 3.1.1 states:
"A Function on a CXL device must not generate INTx messages if
that Function participates in CXL.cache protocol or CXL.mem
protocols."
The generic CXL memory driver only supports devices which use the
CXL.mem protocol. The current driver attempts to allocate MSI/MSI-X
vectors in anticipation of their need for mailbox interrupts or event
processing. However, the above requirement does not require a device to
support interrupts, only that they use MSI/MSI-X. For example, a device
may disable mailbox interrupts and either be configured for firmware
first or skip event processing and function.
Dave Larsen reported that the following Intel / Agilex card does not
support interrupts on function 0.
CXL: Intel Corporation Device 0ddb (rev 01) (prog-if 10 [CXL Memory Device (CXL 2.x)])
Rather than fail device probe if interrupts are not supported; flag that
irqs are not enabled and avoid features which require interrupts.
Emit messages appropriate for the situation to aid in debugging should
device behavior be unexpected due to a failure to allocate vectors.
Note that it is possible for a device to have host based event
processing through polling. However, the driver does not support
polling and it is not anticipated to be generally required. Leave that
functionality to a future patch if such a device comes along.
Reported-by: Dave Larsen <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Reviewed-by: Fan Ni <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
---
Changes in v2:
- [djbw: add reported by and info about the card.]
- [djbw: skip error reporting in the mailbox case.]
- [djbw: clean up event message]
- [iweiny: pick up tags]
- Link to v1: https://lore.kernel.org/r/[email protected]
---
drivers/cxl/pci.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 0155fb66b580..93df03f0622e 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -381,7 +381,7 @@ static int cxl_pci_mbox_send(struct cxl_memdev_state *mds,
return rc;
}
-static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
+static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds, bool irq_avail)
{
struct cxl_dev_state *cxlds = &mds->cxlds;
const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
@@ -440,7 +440,7 @@ static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds)
INIT_DELAYED_WORK(&mds->security.poll_dwork, cxl_mbox_sanitize_work);
/* background command interrupts are optional */
- if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ))
+ if (!(cap & CXLDEV_MBOX_CAP_BG_CMD_IRQ) || !irq_avail)
return 0;
msgnum = FIELD_GET(CXLDEV_MBOX_CAP_IRQ_MSGNUM_MASK, cap);
@@ -587,7 +587,7 @@ static int cxl_mem_alloc_event_buf(struct cxl_memdev_state *mds)
return devm_add_action_or_reset(mds->cxlds.dev, free_event_buf, buf);
}
-static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
+static bool cxl_alloc_irq_vectors(struct pci_dev *pdev)
{
int nvecs;
@@ -604,9 +604,9 @@ static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
PCI_IRQ_MSIX | PCI_IRQ_MSI);
if (nvecs < 1) {
dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
- return -ENXIO;
+ return false;
}
- return 0;
+ return true;
}
static irqreturn_t cxl_event_thread(int irq, void *id)
@@ -742,7 +742,7 @@ static bool cxl_event_int_is_fw(u8 setting)
}
static int cxl_event_config(struct pci_host_bridge *host_bridge,
- struct cxl_memdev_state *mds)
+ struct cxl_memdev_state *mds, bool irq_avail)
{
struct cxl_event_interrupt_policy policy;
int rc;
@@ -754,6 +754,11 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge,
if (!host_bridge->native_cxl_error)
return 0;
+ if (!irq_avail) {
+ dev_info(mds->cxlds.dev, "No interrupt support, disable event processing.\n");
+ return 0;
+ }
+
rc = cxl_mem_alloc_event_buf(mds);
if (rc)
return rc;
@@ -788,6 +793,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
struct cxl_register_map map;
struct cxl_memdev *cxlmd;
int i, rc, pmu_count;
+ bool irq_avail;
/*
* Double check the anonymous union trickery in struct cxl_regs
@@ -845,11 +851,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
else
dev_warn(&pdev->dev, "Media not active (%d)\n", rc);
- rc = cxl_alloc_irq_vectors(pdev);
- if (rc)
- return rc;
+ irq_avail = cxl_alloc_irq_vectors(pdev);
- rc = cxl_pci_setup_mailbox(mds);
+ rc = cxl_pci_setup_mailbox(mds, irq_avail);
if (rc)
return rc;
@@ -908,7 +912,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
}
}
- rc = cxl_event_config(host_bridge, mds);
+ rc = cxl_event_config(host_bridge, mds, irq_avail);
if (rc)
return rc;
---
base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
change-id: 20240108-dont-fail-irq-a96310368f0f
Best regards,
--
Ira Weiny <[email protected]>
On Wed, 17 Jan 2024, Ira Weiny wrote:
>CXL 3.1 Section 3.1.1 states:
>
> "A Function on a CXL device must not generate INTx messages if
> that Function participates in CXL.cache protocol or CXL.mem
> protocols."
>
>The generic CXL memory driver only supports devices which use the
>CXL.mem protocol. The current driver attempts to allocate MSI/MSI-X
>vectors in anticipation of their need for mailbox interrupts or event
>processing. However, the above requirement does not require a device to
>support interrupts, only that they use MSI/MSI-X. For example, a device
>may disable mailbox interrupts and either be configured for firmware
>first or skip event processing and function.
>
>Dave Larsen reported that the following Intel / Agilex card does not
>support interrupts on function 0.
>
> CXL: Intel Corporation Device 0ddb (rev 01) (prog-if 10 [CXL Memory Device (CXL 2.x)])
>
>Rather than fail device probe if interrupts are not supported; flag that
>irqs are not enabled and avoid features which require interrupts.
>Emit messages appropriate for the situation to aid in debugging should
>device behavior be unexpected due to a failure to allocate vectors.
>
>Note that it is possible for a device to have host based event
>processing through polling. However, the driver does not support
>polling and it is not anticipated to be generally required. Leave that
>functionality to a future patch if such a device comes along.
>
>Reported-by: Dave Larsen <[email protected]>
>Reviewed-by: Dave Jiang <[email protected]>
>Reviewed-by: Fan Ni <[email protected]>
>Signed-off-by: Ira Weiny <[email protected]>
Yes, I suspected this might come up at some point.
Reviewed-and-tested-by: Davidlohr Bueso <[email protected]>
Ira Weiny wrote:
> CXL 3.1 Section 3.1.1 states:
>
> "A Function on a CXL device must not generate INTx messages if
> that Function participates in CXL.cache protocol or CXL.mem
> protocols."
>
> The generic CXL memory driver only supports devices which use the
> CXL.mem protocol. The current driver attempts to allocate MSI/MSI-X
> vectors in anticipation of their need for mailbox interrupts or event
> processing. However, the above requirement does not require a device to
> support interrupts, only that they use MSI/MSI-X. For example, a device
> may disable mailbox interrupts and either be configured for firmware
> first or skip event processing and function.
>
> Dave Larsen reported that the following Intel / Agilex card does not
> support interrupts on function 0.
>
> CXL: Intel Corporation Device 0ddb (rev 01) (prog-if 10 [CXL Memory Device (CXL 2.x)])
>
> Rather than fail device probe if interrupts are not supported; flag that
> irqs are not enabled and avoid features which require interrupts.
> Emit messages appropriate for the situation to aid in debugging should
> device behavior be unexpected due to a failure to allocate vectors.
>
> Note that it is possible for a device to have host based event
> processing through polling. However, the driver does not support
> polling and it is not anticipated to be generally required. Leave that
> functionality to a future patch if such a device comes along.
>
> Reported-by: Dave Larsen <[email protected]>
> Reviewed-by: Dave Jiang <[email protected]>
> Reviewed-by: Fan Ni <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> Changes in v2:
> - [djbw: add reported by and info about the card.]
> - [djbw: skip error reporting in the mailbox case.]
> - [djbw: clean up event message]
> - [iweiny: pick up tags]
> - Link to v1: https://lore.kernel.org/r/[email protected]
> ---
> drivers/cxl/pci.c | 26 +++++++++++++++-----------
> 1 file changed, 15 insertions(+), 11 deletions(-)
Looks good, I consider this suitable as post -rc1 material. Will get it
queued once the merge window closes.