2023-09-05 16:22:11

by Terry Bowman

[permalink] [raw]
Subject: [PATCH v10 12/15] cxl/pci: Disable root port interrupts in RCH mode

The RCH root port contains root command AER registers that should not be
enabled.[1] Disable these to prevent root port interrupts.

[1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors

Signed-off-by: Terry Bowman <[email protected]>
Signed-off-by: Robert Richter <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
---
drivers/cxl/core/core.h | 6 ++++++
drivers/cxl/core/pci.c | 29 +++++++++++++++++++++++++++++
drivers/cxl/core/port.c | 3 +++
3 files changed, 38 insertions(+)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index f470ef5c0a6a..6b037030b936 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -87,4 +87,10 @@ enum cxl_poison_trace_type {
CXL_POISON_TRACE_CLEAR,
};

+#ifdef CONFIG_PCIEAER_CXL
+void cxl_disable_rch_root_ints(struct cxl_dport *dport);
+#else
+static inline void cxl_disable_rch_root_ints(struct cxl_dport *dport) { };
+#endif
+
#endif /* __CXL_CORE_H__ */
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 1c40270968b6..e306d3c9638b 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -819,6 +819,35 @@ static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
cxl_handle_rdport_ras(cxlds, dport);
}

+void cxl_disable_rch_root_ints(struct cxl_dport *dport)
+{
+ void __iomem *aer_base = dport->regs.dport_aer;
+ struct pci_host_bridge *bridge;
+ u32 aer_cmd_mask, aer_cmd;
+
+ if (!aer_base)
+ return;
+
+ bridge = to_pci_host_bridge(dport->dport_dev);
+
+ /*
+ * Disable RCH root port command interrupts.
+ * CXL 3.0 12.2.1.1 - RCH Downstream Port-detected Errors
+ *
+ * This sequence may not be necessary. CXL spec states disabling
+ * the root cmd register's interrupts is required. But, PCI spec
+ * shows these are disabled by default on reset.
+ */
+ if (bridge->native_cxl_error) {
+ aer_cmd_mask = (PCI_ERR_ROOT_CMD_COR_EN |
+ PCI_ERR_ROOT_CMD_NONFATAL_EN |
+ PCI_ERR_ROOT_CMD_FATAL_EN);
+ aer_cmd = readl(aer_base + PCI_ERR_ROOT_COMMAND);
+ aer_cmd &= ~aer_cmd_mask;
+ writel(aer_cmd, aer_base + PCI_ERR_ROOT_COMMAND);
+ }
+}
+
#else
static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds) { }
#endif
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 2a22a7ed4704..d195af72ed65 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,

cxl_dport_map_regs(dport);

+ if (dport->rch)
+ cxl_disable_rch_root_ints(dport);
+
cond_cxl_root_lock(port);
rc = add_dport(port, dport);
cond_cxl_root_unlock(port);
--
2.34.1


2023-09-16 03:27:07

by Dan Williams

[permalink] [raw]
Subject: RE: [PATCH v10 12/15] cxl/pci: Disable root port interrupts in RCH mode

Terry Bowman wrote:
> The RCH root port contains root command AER registers that should not be
> enabled.[1] Disable these to prevent root port interrupts.
>
> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
>
> Signed-off-by: Terry Bowman <[email protected]>
> Signed-off-by: Robert Richter <[email protected]>
> Reviewed-by: Jonathan Cameron <[email protected]>
> Reviewed-by: Dave Jiang <[email protected]>
[..]
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 2a22a7ed4704..d195af72ed65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
>
> cxl_dport_map_regs(dport);
>
> + if (dport->rch)
> + cxl_disable_rch_root_ints(dport);
> +

Similar to the comment about cxl_dport_map_regs() not being appropriate
in an enumeration routine, this also needs to move out of _add_dport. It
occurs to me that it should also be undone on driver detach just like
other device "enables".

2023-09-19 17:41:55

by Terry Bowman

[permalink] [raw]
Subject: Re: [PATCH v10 12/15] cxl/pci: Disable root port interrupts in RCH mode

Hi Dan,

I added comments below.

On 9/15/23 13:43, Dan Williams wrote:
> Terry Bowman wrote:
>> The RCH root port contains root command AER registers that should not be
>> enabled.[1] Disable these to prevent root port interrupts.
>>
>> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
>>
>> Signed-off-by: Terry Bowman <[email protected]>
>> Signed-off-by: Robert Richter <[email protected]>
>> Reviewed-by: Jonathan Cameron <[email protected]>
>> Reviewed-by: Dave Jiang <[email protected]>
> [..]
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 2a22a7ed4704..d195af72ed65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
>>
>> cxl_dport_map_regs(dport);
>>
>> + if (dport->rch)
>> + cxl_disable_rch_root_ints(dport);
>> +
>
> Similar to the comment about cxl_dport_map_regs() not being appropriate
> in an enumeration routine, this also needs to move out of _add_dport. It
> occurs to me that it should also be undone on driver detach just like
> other device "enables".

Ok. I will move out of enumeration.

Per the 'undo' request: This is a RCH downstream port (dport) with PCIe root port
capability. PCI spec states root port error reporting is disabled by default at
powerup. And SW does *not* enable the root port errors because the RCH dport is *not*
bound to a root port driver (missing BDF, etc). This mask is added to follow the
CXL spec precisely and if the rest of the system behaves as expected should not
be necessary.

I don't believe masking should be 'undone' in driver detach or elsewhere. Adding
the 'undo' masking would potentially introduce RCH dport root port interrupt
reporting which is incorrect for the RCH/RCD mode. Only CXL components (device,
uport, switch) may reside under the RCH dport and never want RCH dport reporting
root port errors. RCEC reports the root complex errors in RCH/RCD mode.

Regards,
Terry

2023-09-27 01:30:25

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v10 12/15] cxl/pci: Disable root port interrupts in RCH mode

Terry Bowman wrote:
> Hi Dan,
>
> I added comments below.
>
> On 9/15/23 13:43, Dan Williams wrote:
> > Terry Bowman wrote:
> >> The RCH root port contains root command AER registers that should not be
> >> enabled.[1] Disable these to prevent root port interrupts.
> >>
> >> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
> >>
> >> Signed-off-by: Terry Bowman <[email protected]>
> >> Signed-off-by: Robert Richter <[email protected]>
> >> Reviewed-by: Jonathan Cameron <[email protected]>
> >> Reviewed-by: Dave Jiang <[email protected]>
> > [..]
> >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> >> index 2a22a7ed4704..d195af72ed65 100644
> >> --- a/drivers/cxl/core/port.c
> >> +++ b/drivers/cxl/core/port.c
> >> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
> >>
> >> cxl_dport_map_regs(dport);
> >>
> >> + if (dport->rch)
> >> + cxl_disable_rch_root_ints(dport);
> >> +
> >
> > Similar to the comment about cxl_dport_map_regs() not being appropriate
> > in an enumeration routine, this also needs to move out of _add_dport. It
> > occurs to me that it should also be undone on driver detach just like
> > other device "enables".
>
> Ok. I will move out of enumeration.
>
> Per the 'undo' request: This is a RCH downstream port (dport) with PCIe root port
> capability. PCI spec states root port error reporting is disabled by default at
> powerup. And SW does *not* enable the root port errors because the RCH dport is *not*
> bound to a root port driver (missing BDF, etc). This mask is added to follow the
> CXL spec precisely and if the rest of the system behaves as expected should not
> be necessary.

Ah, got it perhaps add a comment to sanity check that the hardware is in
the per-spec state. Are you certain that even in firmware-first error
handling it is safe for the driver to unconditionally disable these
interrupts?

> I don't believe masking should be 'undone' in driver detach or elsewhere. Adding
> the 'undo' masking would potentially introduce RCH dport root port interrupt
> reporting which is incorrect for the RCH/RCD mode. Only CXL components (device,
> uport, switch) may reside under the RCH dport and never want RCH dport reporting
> root port errors. RCEC reports the root complex errors in RCH/RCD mode.

Ok, that also seems to suggest that even in the firmware-first case the
driver should make sure they are off per-spec.