2024-05-20 14:54:28

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 0/7] PCI: xilinx-nwl: Add phy support

Add phy subsystem support for the xilinx-nwl PCIe controller. This
series also includes several small fixes and improvements.

Changes in v3:
- Document phys property
- Expand off-by-one commit message

Changes in v2:
- Remove phy-names
- Add an example
- Get phys by index and not by name

Sean Anderson (7):
dt-bindings: pci: xilinx-nwl: Add phys
PCI: xilinx-nwl: Fix off-by-one in IRQ handler
PCI: xilinx-nwl: Fix register misspelling
PCI: xilinx-nwl: Rate-limit misc interrupt messages
PCI: xilinx-nwl: Clean up clock on probe failure/removal
PCI: xilinx-nwl: Add phy support
arm64: zynqmp: Add PCIe phys

.../bindings/pci/xlnx,nwl-pcie.yaml | 7 +
.../boot/dts/xilinx/zynqmp-zcu102-revA.dts | 1 +
drivers/pci/controller/pcie-xilinx-nwl.c | 122 ++++++++++++++----
3 files changed, 107 insertions(+), 23 deletions(-)

--
2.35.1.1320.gc452695387.dirty



2024-05-20 14:54:47

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 2/7] PCI: xilinx-nwl: Fix off-by-one in IRQ handler

MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit
2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
PCI_NUM_INTX irqs. So to enable INTA (aka hwirq 0) we should set bit 0.
Remove the subtraction of one. This fixes the following UBSAN error:

[ 5.037483] ================================================================================
[ 5.046260] UBSAN: shift-out-of-bounds in ../drivers/pci/controller/pcie-xilinx-nwl.c:389:11
[ 5.054983] shift exponent 18446744073709551615 is too large for 32-bit type 'int'
[ 5.062813] CPU: 1 PID: 61 Comm: kworker/u10:1 Not tainted 6.6.20+ #268
[ 5.070008] Hardware name: xlnx,zynqmp (DT)
[ 5.074348] Workqueue: events_unbound deferred_probe_work_func
[ 5.080410] Call trace:
[ 5.082958] dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
[ 5.086850] show_stack (arch/arm64/kernel/stacktrace.c:242)
[ 5.090292] dump_stack_lvl (lib/dump_stack.c:107)
[ 5.094095] dump_stack (lib/dump_stack.c:114)
[ 5.097540] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:218 lib/ubsan.c:387)
[ 5.103227] nwl_unmask_leg_irq (drivers/pci/controller/pcie-xilinx-nwl.c:389 (discriminator 1))
[ 5.107386] irq_enable (kernel/irq/internals.h:234 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345)
[ 5.110838] __irq_startup (kernel/irq/internals.h:239 kernel/irq/chip.c:180 kernel/irq/chip.c:250)
[ 5.114552] irq_startup (kernel/irq/chip.c:270)
[ 5.118266] __setup_irq (kernel/irq/manage.c:1800)
[ 5.121982] request_threaded_irq (kernel/irq/manage.c:2206)
[ 5.126412] pcie_pme_probe (include/linux/interrupt.h:168 drivers/pci/pcie/pme.c:348)
[ 5.130303] pcie_port_probe_service (drivers/pci/pcie/portdrv.c:528)
[ 5.134915] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
[ 5.138720] __driver_probe_device (drivers/base/dd.c:800)
[ 5.143236] driver_probe_device (drivers/base/dd.c:830)
[ 5.147571] __device_attach_driver (drivers/base/dd.c:959)
[ 5.152179] bus_for_each_drv (drivers/base/bus.c:457)
[ 5.156163] __device_attach (drivers/base/dd.c:1032)
[ 5.160147] device_initial_probe (drivers/base/dd.c:1080)
[ 5.164488] bus_probe_device (drivers/base/bus.c:532)
[ 5.168471] device_add (drivers/base/core.c:3638)
[ 5.172098] device_register (drivers/base/core.c:3714)
[ 5.175994] pcie_portdrv_probe (drivers/pci/pcie/portdrv.c:309 drivers/pci/pcie/portdrv.c:363 drivers/pci/pcie/portdrv.c:695)
[ 5.180338] pci_device_probe (drivers/pci/pci-driver.c:324 drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
[ 5.184410] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
[ 5.188213] __driver_probe_device (drivers/base/dd.c:800)
[ 5.192729] driver_probe_device (drivers/base/dd.c:830)
[ 5.197064] __device_attach_driver (drivers/base/dd.c:959)
[ 5.201672] bus_for_each_drv (drivers/base/bus.c:457)
[ 5.205657] __device_attach (drivers/base/dd.c:1032)
[ 5.209641] device_attach (drivers/base/dd.c:1074)
[ 5.213357] pci_bus_add_device (drivers/pci/bus.c:352)
[ 5.217518] pci_bus_add_devices (drivers/pci/bus.c:371 (discriminator 2))
[ 5.221774] pci_host_probe (drivers/pci/probe.c:3099)
[ 5.225581] nwl_pcie_probe (drivers/pci/controller/pcie-xilinx-nwl.c:938)
[ 5.229562] platform_probe (drivers/base/platform.c:1404)
[ 5.233367] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
[ 5.237169] __driver_probe_device (drivers/base/dd.c:800)
[ 5.241685] driver_probe_device (drivers/base/dd.c:830)
[ 5.246020] __device_attach_driver (drivers/base/dd.c:959)
[ 5.250628] bus_for_each_drv (drivers/base/bus.c:457)
[ 5.254612] __device_attach (drivers/base/dd.c:1032)
[ 5.258596] device_initial_probe (drivers/base/dd.c:1080)
[ 5.262938] bus_probe_device (drivers/base/bus.c:532)
[ 5.266920] deferred_probe_work_func (drivers/base/dd.c:124)
[ 5.271619] process_one_work (arch/arm64/include/asm/jump_label.h:21 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2632)
[ 5.275788] worker_thread (kernel/workqueue.c:2694 (discriminator 2) kernel/workqueue.c:2781 (discriminator 2))
[ 5.279686] kthread (kernel/kthread.c:388)
[ 5.283048] ret_from_fork (arch/arm64/kernel/entry.S:862)
[ 5.286765] ================================================================================

Fixes: 9a181e1093af ("PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts")
Cc: <[email protected]>
Signed-off-by: Sean Anderson <[email protected]>
---

Changes in v3:
- Expand commit message

drivers/pci/controller/pcie-xilinx-nwl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
index 0408f4d612b5..437927e3bcca 100644
--- a/drivers/pci/controller/pcie-xilinx-nwl.c
+++ b/drivers/pci/controller/pcie-xilinx-nwl.c
@@ -371,7 +371,7 @@ static void nwl_mask_intx_irq(struct irq_data *data)
u32 mask;
u32 val;

- mask = 1 << (data->hwirq - 1);
+ mask = 1 << data->hwirq;
raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
nwl_bridge_writel(pcie, (val & (~mask)), MSGF_LEG_MASK);
@@ -385,7 +385,7 @@ static void nwl_unmask_intx_irq(struct irq_data *data)
u32 mask;
u32 val;

- mask = 1 << (data->hwirq - 1);
+ mask = 1 << data->hwirq;
raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
nwl_bridge_writel(pcie, (val | mask), MSGF_LEG_MASK);
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:54:56

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 3/7] PCI: xilinx-nwl: Fix register misspelling

MSIC -> MISC

Fixes: c2a7ff18edcd ("PCI: xilinx-nwl: Expand error logging")
Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v1)

drivers/pci/controller/pcie-xilinx-nwl.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
index 437927e3bcca..ce881baac6d8 100644
--- a/drivers/pci/controller/pcie-xilinx-nwl.c
+++ b/drivers/pci/controller/pcie-xilinx-nwl.c
@@ -80,8 +80,8 @@
#define MSGF_MISC_SR_NON_FATAL_DEV BIT(22)
#define MSGF_MISC_SR_FATAL_DEV BIT(23)
#define MSGF_MISC_SR_LINK_DOWN BIT(24)
-#define MSGF_MSIC_SR_LINK_AUTO_BWIDTH BIT(25)
-#define MSGF_MSIC_SR_LINK_BWIDTH BIT(26)
+#define MSGF_MISC_SR_LINK_AUTO_BWIDTH BIT(25)
+#define MSGF_MISC_SR_LINK_BWIDTH BIT(26)

#define MSGF_MISC_SR_MASKALL (MSGF_MISC_SR_RXMSG_AVAIL | \
MSGF_MISC_SR_RXMSG_OVER | \
@@ -96,8 +96,8 @@
MSGF_MISC_SR_NON_FATAL_DEV | \
MSGF_MISC_SR_FATAL_DEV | \
MSGF_MISC_SR_LINK_DOWN | \
- MSGF_MSIC_SR_LINK_AUTO_BWIDTH | \
- MSGF_MSIC_SR_LINK_BWIDTH)
+ MSGF_MISC_SR_LINK_AUTO_BWIDTH | \
+ MSGF_MISC_SR_LINK_BWIDTH)

/* Legacy interrupt status mask bits */
#define MSGF_LEG_SR_INTA BIT(0)
@@ -299,10 +299,10 @@ static irqreturn_t nwl_pcie_misc_handler(int irq, void *data)
if (misc_stat & MSGF_MISC_SR_FATAL_DEV)
dev_err(dev, "Fatal Error Detected\n");

- if (misc_stat & MSGF_MSIC_SR_LINK_AUTO_BWIDTH)
+ if (misc_stat & MSGF_MISC_SR_LINK_AUTO_BWIDTH)
dev_info(dev, "Link Autonomous Bandwidth Management Status bit set\n");

- if (misc_stat & MSGF_MSIC_SR_LINK_BWIDTH)
+ if (misc_stat & MSGF_MISC_SR_LINK_BWIDTH)
dev_info(dev, "Link Bandwidth Management Status bit set\n");

/* Clear misc interrupt status */
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:54:58

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 1/7] dt-bindings: pci: xilinx-nwl: Add phys

Add phys properties so Linux can power-on/configure the GTR
transcievers.

Signed-off-by: Sean Anderson <[email protected]>
---

Changes in v3:
- Document phys property

Changes in v2:
- Remove phy-names
- Add an example

Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
index 426f90a47f35..cc50795d170b 100644
--- a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
+++ b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
@@ -61,6 +61,11 @@ properties:
interrupt-map:
maxItems: 4

+ phys:
+ minItems: 1
+ maxItems: 4
+ description: One phy per logical lane, in order
+
power-domains:
maxItems: 1

@@ -110,6 +115,7 @@ examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/phy/phy.h>
#include <dt-bindings/power/xlnx-zynqmp-power.h>
soc {
#address-cells = <2>;
@@ -138,6 +144,7 @@ examples:
<0x0 0x0 0x0 0x3 &pcie_intc 0x3>,
<0x0 0x0 0x0 0x4 &pcie_intc 0x4>;
msi-parent = <&nwl_pcie>;
+ phys = <&psgtr 0 PHY_TYPE_PCIE 0 0>;
power-domains = <&zynqmp_firmware PD_PCIE>;
iommus = <&smmu 0x4d0>;
pcie_intc: legacy-interrupt-controller {
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:55:50

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock on probe failure/removal

Make sure we turn off the clock on probe failure and device removal.

Fixes: de0a01f52966 ("PCI: xilinx-nwl: Enable the clock through CCF")
Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v1)

drivers/pci/controller/pcie-xilinx-nwl.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
index c0a60cebdb2e..424cc5a1b4d1 100644
--- a/drivers/pci/controller/pcie-xilinx-nwl.c
+++ b/drivers/pci/controller/pcie-xilinx-nwl.c
@@ -779,6 +779,7 @@ static int nwl_pcie_probe(struct platform_device *pdev)
return -ENODEV;

pcie = pci_host_bridge_priv(bridge);
+ platform_set_drvdata(pdev, pcie);

pcie->dev = dev;

@@ -801,13 +802,13 @@ static int nwl_pcie_probe(struct platform_device *pdev)
err = nwl_pcie_bridge_init(pcie);
if (err) {
dev_err(dev, "HW Initialization failed\n");
- return err;
+ goto err_clk;
}

err = nwl_pcie_init_irq_domain(pcie);
if (err) {
dev_err(dev, "Failed creating IRQ Domain\n");
- return err;
+ goto err_clk;
}

bridge->sysdata = pcie;
@@ -817,11 +818,23 @@ static int nwl_pcie_probe(struct platform_device *pdev)
err = nwl_pcie_enable_msi(pcie);
if (err < 0) {
dev_err(dev, "failed to enable MSI support: %d\n", err);
- return err;
+ goto err_clk;
}
}

- return pci_host_probe(bridge);
+ err = pci_host_probe(bridge);
+
+err_clk:
+ if (err)
+ clk_disable_unprepare(pcie->clk);
+ return err;
+}
+
+static void nwl_pcie_remove(struct platform_device *pdev)
+{
+ struct nwl_pcie *pcie = platform_get_drvdata(pdev);
+
+ clk_disable_unprepare(pcie->clk);
}

static struct platform_driver nwl_pcie_driver = {
@@ -831,5 +844,6 @@ static struct platform_driver nwl_pcie_driver = {
.of_match_table = nwl_pcie_of_match,
},
.probe = nwl_pcie_probe,
+ .remove_new = nwl_pcie_remove,
};
builtin_platform_driver(nwl_pcie_driver);
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:56:09

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 4/7] PCI: xilinx-nwl: Rate-limit misc interrupt messages

The conditions logged by the misc interrupt can occur repeatedly and
continuously. Avoid rendering the console unusable by rate-limiting
these messages.

Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v1)

drivers/pci/controller/pcie-xilinx-nwl.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
index ce881baac6d8..c0a60cebdb2e 100644
--- a/drivers/pci/controller/pcie-xilinx-nwl.c
+++ b/drivers/pci/controller/pcie-xilinx-nwl.c
@@ -267,37 +267,37 @@ static irqreturn_t nwl_pcie_misc_handler(int irq, void *data)
return IRQ_NONE;

if (misc_stat & MSGF_MISC_SR_RXMSG_OVER)
- dev_err(dev, "Received Message FIFO Overflow\n");
+ dev_err_ratelimited(dev, "Received Message FIFO Overflow\n");

if (misc_stat & MSGF_MISC_SR_SLAVE_ERR)
- dev_err(dev, "Slave error\n");
+ dev_err_ratelimited(dev, "Slave error\n");

if (misc_stat & MSGF_MISC_SR_MASTER_ERR)
- dev_err(dev, "Master error\n");
+ dev_err_ratelimited(dev, "Master error\n");

if (misc_stat & MSGF_MISC_SR_I_ADDR_ERR)
- dev_err(dev, "In Misc Ingress address translation error\n");
+ dev_err_ratelimited(dev, "In Misc Ingress address translation error\n");

if (misc_stat & MSGF_MISC_SR_E_ADDR_ERR)
- dev_err(dev, "In Misc Egress address translation error\n");
+ dev_err_ratelimited(dev, "In Misc Egress address translation error\n");

if (misc_stat & MSGF_MISC_SR_FATAL_AER)
- dev_err(dev, "Fatal Error in AER Capability\n");
+ dev_err_ratelimited(dev, "Fatal Error in AER Capability\n");

if (misc_stat & MSGF_MISC_SR_NON_FATAL_AER)
- dev_err(dev, "Non-Fatal Error in AER Capability\n");
+ dev_err_ratelimited(dev, "Non-Fatal Error in AER Capability\n");

if (misc_stat & MSGF_MISC_SR_CORR_AER)
- dev_err(dev, "Correctable Error in AER Capability\n");
+ dev_err_ratelimited(dev, "Correctable Error in AER Capability\n");

if (misc_stat & MSGF_MISC_SR_UR_DETECT)
- dev_err(dev, "Unsupported request Detected\n");
+ dev_err_ratelimited(dev, "Unsupported request Detected\n");

if (misc_stat & MSGF_MISC_SR_NON_FATAL_DEV)
- dev_err(dev, "Non-Fatal Error Detected\n");
+ dev_err_ratelimited(dev, "Non-Fatal Error Detected\n");

if (misc_stat & MSGF_MISC_SR_FATAL_DEV)
- dev_err(dev, "Fatal Error Detected\n");
+ dev_err_ratelimited(dev, "Fatal Error Detected\n");

if (misc_stat & MSGF_MISC_SR_LINK_AUTO_BWIDTH)
dev_info(dev, "Link Autonomous Bandwidth Management Status bit set\n");
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:56:10

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 7/7] arm64: zynqmp: Add PCIe phys

Add PCIe phy bindings for the ZCU102.

Signed-off-by: Sean Anderson <[email protected]>
Tested-by: [email protected]
---

(no changes since v2)

Changes in v2:
- Remove phy-names

arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revA.dts | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revA.dts b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revA.dts
index ad8f23a0ec67..d2175f3dd099 100644
--- a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revA.dts
+++ b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revA.dts
@@ -941,6 +941,7 @@ conf-pull-none {

&pcie {
status = "okay";
+ phys = <&psgtr 0 PHY_TYPE_PCIE 0 0>;
};

&psgtr {
--
2.35.1.1320.gc452695387.dirty


2024-05-20 14:57:11

by Sean Anderson

[permalink] [raw]
Subject: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

Add support for enabling/disabling PCIe phys. We can't really do
anything about failures in the disable/remove path, so just warn.

Signed-off-by: Sean Anderson <[email protected]>
---

(no changes since v2)

Changes in v2:
- Get phys by index and not by name

drivers/pci/controller/pcie-xilinx-nwl.c | 68 ++++++++++++++++++++++--
1 file changed, 65 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
index 424cc5a1b4d1..d32cf4247836 100644
--- a/drivers/pci/controller/pcie-xilinx-nwl.c
+++ b/drivers/pci/controller/pcie-xilinx-nwl.c
@@ -19,6 +19,7 @@
#include <linux/of_platform.h>
#include <linux/pci.h>
#include <linux/pci-ecam.h>
+#include <linux/phy/phy.h>
#include <linux/platform_device.h>
#include <linux/irqchip/chained_irq.h>

@@ -157,6 +158,7 @@ struct nwl_pcie {
void __iomem *breg_base;
void __iomem *pcireg_base;
void __iomem *ecam_base;
+ struct phy *phy[4];
phys_addr_t phys_breg_base; /* Physical Bridge Register Base */
phys_addr_t phys_pcie_reg_base; /* Physical PCIe Controller Base */
phys_addr_t phys_ecam_base; /* Physical Configuration Base */
@@ -521,6 +523,43 @@ static int nwl_pcie_init_msi_irq_domain(struct nwl_pcie *pcie)
return 0;
}

+static int nwl_pcie_phy_enable(struct nwl_pcie *pcie)
+{
+ int i, ret;
+
+ for (i = 0; i < ARRAY_SIZE(pcie->phy); i++) {
+ ret = phy_init(pcie->phy[i]);
+ if (ret)
+ goto err;
+
+ ret = phy_power_on(pcie->phy[i]);
+ if (ret) {
+ WARN_ON(phy_exit(pcie->phy[i]));
+ goto err;
+ }
+ }
+
+ return 0;
+
+err:
+ while (--i) {
+ WARN_ON(phy_power_off(pcie->phy[i]));
+ WARN_ON(phy_exit(pcie->phy[i]));
+ }
+
+ return ret;
+}
+
+static void nwl_pcie_phy_disable(struct nwl_pcie *pcie)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(pcie->phy); i++) {
+ WARN_ON(phy_power_off(pcie->phy[i]));
+ WARN_ON(phy_exit(pcie->phy[i]));
+ }
+}
+
static int nwl_pcie_init_irq_domain(struct nwl_pcie *pcie)
{
struct device *dev = pcie->dev;
@@ -732,6 +771,7 @@ static int nwl_pcie_parse_dt(struct nwl_pcie *pcie,
{
struct device *dev = pcie->dev;
struct resource *res;
+ int i;

res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "breg");
pcie->breg_base = devm_ioremap_resource(dev, res);
@@ -759,6 +799,18 @@ static int nwl_pcie_parse_dt(struct nwl_pcie *pcie,
irq_set_chained_handler_and_data(pcie->irq_intx,
nwl_pcie_leg_handler, pcie);

+
+ for (i = 0; i < ARRAY_SIZE(pcie->phy); i++) {
+ pcie->phy[i] = devm_of_phy_get_by_index(dev, dev->of_node, i);
+ if (PTR_ERR(pcie->phy[i]) == -ENODEV) {
+ pcie->phy[i] = NULL;
+ break;
+ }
+
+ if (IS_ERR(pcie->phy[i]))
+ return PTR_ERR(pcie->phy[i]);
+ }
+
return 0;
}

@@ -799,16 +851,22 @@ static int nwl_pcie_probe(struct platform_device *pdev)
return err;
}

+ err = nwl_pcie_phy_enable(pcie);
+ if (err) {
+ dev_err(dev, "could not enable PHYs\n");
+ goto err_clk;
+ }
+
err = nwl_pcie_bridge_init(pcie);
if (err) {
dev_err(dev, "HW Initialization failed\n");
- goto err_clk;
+ goto err_phy;
}

err = nwl_pcie_init_irq_domain(pcie);
if (err) {
dev_err(dev, "Failed creating IRQ Domain\n");
- goto err_clk;
+ goto err_phy;
}

bridge->sysdata = pcie;
@@ -818,12 +876,15 @@ static int nwl_pcie_probe(struct platform_device *pdev)
err = nwl_pcie_enable_msi(pcie);
if (err < 0) {
dev_err(dev, "failed to enable MSI support: %d\n", err);
- goto err_clk;
+ goto err_phy;
}
}

err = pci_host_probe(bridge);

+err_phy:
+ if (err)
+ nwl_pcie_phy_disable(pcie);
err_clk:
if (err)
clk_disable_unprepare(pcie->clk);
@@ -834,6 +895,7 @@ static void nwl_pcie_remove(struct platform_device *pdev)
{
struct nwl_pcie *pcie = platform_get_drvdata(pdev);

+ nwl_pcie_phy_disable(pcie);
clk_disable_unprepare(pcie->clk);
}

--
2.35.1.1320.gc452695387.dirty


2024-05-22 14:50:29

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] dt-bindings: pci: xilinx-nwl: Add phys


On Mon, 20 May 2024 10:53:56 -0400, Sean Anderson wrote:
> Add phys properties so Linux can power-on/configure the GTR
> transcievers.
>
> Signed-off-by: Sean Anderson <[email protected]>
> ---
>
> Changes in v3:
> - Document phys property
>
> Changes in v2:
> - Remove phy-names
> - Add an example
>
> Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml | 7 +++++++
> 1 file changed, 7 insertions(+)
>

Reviewed-by: Rob Herring (Arm) <[email protected]>


2024-05-22 22:28:48

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] PCI: xilinx-nwl: Fix off-by-one in IRQ handler

On Mon, May 20, 2024 at 10:53:57AM -0400, Sean Anderson wrote:
> MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit
> 2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
> PCI_NUM_INTX irqs. So to enable INTA (aka hwirq 0) we should set bit 0.
> Remove the subtraction of one. This fixes the following UBSAN error:

Thanks for these details!

I guess UBSAN == "undefined behavior sanitizer", right? That sounds
like an easy way to find this but not the way users are likely to find
it.

I assume users would notice spurious and missing interrupts, e.g.,
a driver that tried to enable INTB would have actually enabled INTA,
so we'd see spurious INTA interrupts and the driver would never see
the INTB it expected.

And a driver that tried to enable INTA would never see that interrupt,
and we might not set any bit in MSGF_LEG_MASK?

I think the normal way people would trip over this, i.e., spurious and
missing INTx interrupts, is the important thing to mention here.

> [ 5.037483] ================================================================================
> [ 5.046260] UBSAN: shift-out-of-bounds in ../drivers/pci/controller/pcie-xilinx-nwl.c:389:11
> [ 5.054983] shift exponent 18446744073709551615 is too large for 32-bit type 'int'
> [ 5.062813] CPU: 1 PID: 61 Comm: kworker/u10:1 Not tainted 6.6.20+ #268
> [ 5.070008] Hardware name: xlnx,zynqmp (DT)
> [ 5.074348] Workqueue: events_unbound deferred_probe_work_func
> [ 5.080410] Call trace:
> [ 5.082958] dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
> [ 5.086850] show_stack (arch/arm64/kernel/stacktrace.c:242)
> [ 5.090292] dump_stack_lvl (lib/dump_stack.c:107)
> [ 5.094095] dump_stack (lib/dump_stack.c:114)
> [ 5.097540] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:218 lib/ubsan.c:387)
> [ 5.103227] nwl_unmask_leg_irq (drivers/pci/controller/pcie-xilinx-nwl.c:389 (discriminator 1))
> [ 5.107386] irq_enable (kernel/irq/internals.h:234 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345)
> [ 5.110838] __irq_startup (kernel/irq/internals.h:239 kernel/irq/chip.c:180 kernel/irq/chip.c:250)
> [ 5.114552] irq_startup (kernel/irq/chip.c:270)
> [ 5.118266] __setup_irq (kernel/irq/manage.c:1800)
> [ 5.121982] request_threaded_irq (kernel/irq/manage.c:2206)
> [ 5.126412] pcie_pme_probe (include/linux/interrupt.h:168 drivers/pci/pcie/pme.c:348)

The rest of the stacktrace below is not relevant and could be omitted.
The timestamps don't add useful information either.

> [ 5.130303] pcie_port_probe_service (drivers/pci/pcie/portdrv.c:528)
> [ 5.134915] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
> [ 5.138720] __driver_probe_device (drivers/base/dd.c:800)
> [ 5.143236] driver_probe_device (drivers/base/dd.c:830)
> [ 5.147571] __device_attach_driver (drivers/base/dd.c:959)
> [ 5.152179] bus_for_each_drv (drivers/base/bus.c:457)
> [ 5.156163] __device_attach (drivers/base/dd.c:1032)
> [ 5.160147] device_initial_probe (drivers/base/dd.c:1080)
> [ 5.164488] bus_probe_device (drivers/base/bus.c:532)
> [ 5.168471] device_add (drivers/base/core.c:3638)
> [ 5.172098] device_register (drivers/base/core.c:3714)
> [ 5.175994] pcie_portdrv_probe (drivers/pci/pcie/portdrv.c:309 drivers/pci/pcie/portdrv.c:363 drivers/pci/pcie/portdrv.c:695)
> [ 5.180338] pci_device_probe (drivers/pci/pci-driver.c:324 drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
> [ 5.184410] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
> [ 5.188213] __driver_probe_device (drivers/base/dd.c:800)
> [ 5.192729] driver_probe_device (drivers/base/dd.c:830)
> [ 5.197064] __device_attach_driver (drivers/base/dd.c:959)
> [ 5.201672] bus_for_each_drv (drivers/base/bus.c:457)
> [ 5.205657] __device_attach (drivers/base/dd.c:1032)
> [ 5.209641] device_attach (drivers/base/dd.c:1074)
> [ 5.213357] pci_bus_add_device (drivers/pci/bus.c:352)
> [ 5.217518] pci_bus_add_devices (drivers/pci/bus.c:371 (discriminator 2))
> [ 5.221774] pci_host_probe (drivers/pci/probe.c:3099)
> [ 5.225581] nwl_pcie_probe (drivers/pci/controller/pcie-xilinx-nwl.c:938)
> [ 5.229562] platform_probe (drivers/base/platform.c:1404)
> [ 5.233367] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
> [ 5.237169] __driver_probe_device (drivers/base/dd.c:800)
> [ 5.241685] driver_probe_device (drivers/base/dd.c:830)
> [ 5.246020] __device_attach_driver (drivers/base/dd.c:959)
> [ 5.250628] bus_for_each_drv (drivers/base/bus.c:457)
> [ 5.254612] __device_attach (drivers/base/dd.c:1032)
> [ 5.258596] device_initial_probe (drivers/base/dd.c:1080)
> [ 5.262938] bus_probe_device (drivers/base/bus.c:532)
> [ 5.266920] deferred_probe_work_func (drivers/base/dd.c:124)
> [ 5.271619] process_one_work (arch/arm64/include/asm/jump_label.h:21 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2632)
> [ 5.275788] worker_thread (kernel/workqueue.c:2694 (discriminator 2) kernel/workqueue.c:2781 (discriminator 2))
> [ 5.279686] kthread (kernel/kthread.c:388)
> [ 5.283048] ret_from_fork (arch/arm64/kernel/entry.S:862)
> [ 5.286765] ================================================================================
>
> Fixes: 9a181e1093af ("PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts")
> Cc: <[email protected]>
> Signed-off-by: Sean Anderson <[email protected]>
> ---
>
> Changes in v3:
> - Expand commit message
>
> drivers/pci/controller/pcie-xilinx-nwl.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
> index 0408f4d612b5..437927e3bcca 100644
> --- a/drivers/pci/controller/pcie-xilinx-nwl.c
> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
> @@ -371,7 +371,7 @@ static void nwl_mask_intx_irq(struct irq_data *data)
> u32 mask;
> u32 val;
>
> - mask = 1 << (data->hwirq - 1);
> + mask = 1 << data->hwirq;
> raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
> val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
> nwl_bridge_writel(pcie, (val & (~mask)), MSGF_LEG_MASK);
> @@ -385,7 +385,7 @@ static void nwl_unmask_intx_irq(struct irq_data *data)
> u32 mask;
> u32 val;
>
> - mask = 1 << (data->hwirq - 1);
> + mask = 1 << data->hwirq;
> raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
> val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
> nwl_bridge_writel(pcie, (val | mask), MSGF_LEG_MASK);
> --
> 2.35.1.1320.gc452695387.dirty
>

2024-05-22 22:46:09

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] dt-bindings: pci: xilinx-nwl: Add phys

On Mon, May 20, 2024 at 10:53:56AM -0400, Sean Anderson wrote:
> Add phys properties so Linux can power-on/configure the GTR
> transcievers.

s/transcievers/transceivers/

Possibly s/phys/PHYs/ in subject, commit log, DT description to avoid
confusion with "phys" (short for generic "physical"). Or maybe even
just "PHY properties"?

What does "GTR" mean? Possibly expand that?

> Signed-off-by: Sean Anderson <[email protected]>
> ---
>
> Changes in v3:
> - Document phys property
>
> Changes in v2:
> - Remove phy-names
> - Add an example
>
> Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
> index 426f90a47f35..cc50795d170b 100644
> --- a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
> +++ b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
> @@ -61,6 +61,11 @@ properties:
> interrupt-map:
> maxItems: 4
>
> + phys:
> + minItems: 1
> + maxItems: 4
> + description: One phy per logical lane, in order
> +
> power-domains:
> maxItems: 1
>
> @@ -110,6 +115,7 @@ examples:
> - |
> #include <dt-bindings/interrupt-controller/arm-gic.h>
> #include <dt-bindings/interrupt-controller/irq.h>
> + #include <dt-bindings/phy/phy.h>
> #include <dt-bindings/power/xlnx-zynqmp-power.h>
> soc {
> #address-cells = <2>;
> @@ -138,6 +144,7 @@ examples:
> <0x0 0x0 0x0 0x3 &pcie_intc 0x3>,
> <0x0 0x0 0x0 0x4 &pcie_intc 0x4>;
> msi-parent = <&nwl_pcie>;
> + phys = <&psgtr 0 PHY_TYPE_PCIE 0 0>;
> power-domains = <&zynqmp_firmware PD_PCIE>;
> iommus = <&smmu 0x4d0>;
> pcie_intc: legacy-interrupt-controller {
> --
> 2.35.1.1320.gc452695387.dirty
>

2024-05-23 15:20:01

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] dt-bindings: pci: xilinx-nwl: Add phys

On 5/22/24 18:28, Bjorn Helgaas wrote:
> On Mon, May 20, 2024 at 10:53:56AM -0400, Sean Anderson wrote:
>> Add phys properties so Linux can power-on/configure the GTR
>> transcievers.
>
> s/transcievers/transceivers/

I always forget the spelling is backwards on this one

> Possibly s/phys/PHYs/ in subject, commit log, DT description to avoid
> confusion with "phys" (short for generic "physical"). Or maybe even
> just "PHY properties"?

Well, this is the name for the property...

> What does "GTR" mean? Possibly expand that?

It's "xlnx,zynqmp-psgtr-v1.1". These are the available transceivers on
the ZynqMP, which is the only SoC this device is present on. I had hoped
this would be clear by calling them "GTR transcievers"...

--Sean

>> Signed-off-by: Sean Anderson <[email protected]>
>> ---
>>
>> Changes in v3:
>> - Document phys property
>>
>> Changes in v2:
>> - Remove phy-names
>> - Add an example
>>
>> Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
>> index 426f90a47f35..cc50795d170b 100644
>> --- a/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
>> +++ b/Documentation/devicetree/bindings/pci/xlnx,nwl-pcie.yaml
>> @@ -61,6 +61,11 @@ properties:
>> interrupt-map:
>> maxItems: 4
>>
>> + phys:
>> + minItems: 1
>> + maxItems: 4
>> + description: One phy per logical lane, in order
>> +
>> power-domains:
>> maxItems: 1
>>
>> @@ -110,6 +115,7 @@ examples:
>> - |
>> #include <dt-bindings/interrupt-controller/arm-gic.h>
>> #include <dt-bindings/interrupt-controller/irq.h>
>> + #include <dt-bindings/phy/phy.h>
>> #include <dt-bindings/power/xlnx-zynqmp-power.h>
>> soc {
>> #address-cells = <2>;
>> @@ -138,6 +144,7 @@ examples:
>> <0x0 0x0 0x0 0x3 &pcie_intc 0x3>,
>> <0x0 0x0 0x0 0x4 &pcie_intc 0x4>;
>> msi-parent = <&nwl_pcie>;
>> + phys = <&psgtr 0 PHY_TYPE_PCIE 0 0>;
>> power-domains = <&zynqmp_firmware PD_PCIE>;
>> iommus = <&smmu 0x4d0>;
>> pcie_intc: legacy-interrupt-controller {
>> --
>> 2.35.1.1320.gc452695387.dirty
>>


2024-05-23 15:23:52

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] PCI: xilinx-nwl: Fix off-by-one in IRQ handler

On 5/22/24 18:28, Bjorn Helgaas wrote:
> On Mon, May 20, 2024 at 10:53:57AM -0400, Sean Anderson wrote:
>> MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit
>> 2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
>> PCI_NUM_INTX irqs. So to enable INTA (aka hwirq 0) we should set bit 0.
>> Remove the subtraction of one. This fixes the following UBSAN error:
>
> Thanks for these details!
>
> I guess UBSAN == "undefined behavior sanitizer", right? That sounds
> like an easy way to find this but not the way users are likely to find
> it.

It's pretty likely they will find it this way, since I found it this way
and no one else had ;)

> I assume users would notice spurious and missing interrupts, e.g.,
> a driver that tried to enable INTB would have actually enabled INTA,
> so we'd see spurious INTA interrupts and the driver would never see
> the INTB it expected.
>
> And a driver that tried to enable INTA would never see that interrupt,
> and we might not set any bit in MSGF_LEG_MASK?

And yes, this would manifest as INTx interrupts being broken.

> I think the normal way people would trip over this, i.e., spurious and
> missing INTx interrupts, is the important thing to mention here.
>
>> [ 5.037483] ================================================================================
>> [ 5.046260] UBSAN: shift-out-of-bounds in ../drivers/pci/controller/pcie-xilinx-nwl.c:389:11
>> [ 5.054983] shift exponent 18446744073709551615 is too large for 32-bit type 'int'
>> [ 5.062813] CPU: 1 PID: 61 Comm: kworker/u10:1 Not tainted 6.6.20+ #268
>> [ 5.070008] Hardware name: xlnx,zynqmp (DT)
>> [ 5.074348] Workqueue: events_unbound deferred_probe_work_func
>> [ 5.080410] Call trace:
>> [ 5.082958] dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
>> [ 5.086850] show_stack (arch/arm64/kernel/stacktrace.c:242)
>> [ 5.090292] dump_stack_lvl (lib/dump_stack.c:107)
>> [ 5.094095] dump_stack (lib/dump_stack.c:114)
>> [ 5.097540] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:218 lib/ubsan.c:387)
>> [ 5.103227] nwl_unmask_leg_irq (drivers/pci/controller/pcie-xilinx-nwl.c:389 (discriminator 1))
>> [ 5.107386] irq_enable (kernel/irq/internals.h:234 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345)
>> [ 5.110838] __irq_startup (kernel/irq/internals.h:239 kernel/irq/chip.c:180 kernel/irq/chip.c:250)
>> [ 5.114552] irq_startup (kernel/irq/chip.c:270)
>> [ 5.118266] __setup_irq (kernel/irq/manage.c:1800)
>> [ 5.121982] request_threaded_irq (kernel/irq/manage.c:2206)
>> [ 5.126412] pcie_pme_probe (include/linux/interrupt.h:168 drivers/pci/pcie/pme.c:348)
>
> The rest of the stacktrace below is not relevant and could be omitted.
> The timestamps don't add useful information either.

OK

--Sean

>> [ 5.130303] pcie_port_probe_service (drivers/pci/pcie/portdrv.c:528)
>> [ 5.134915] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> [ 5.138720] __driver_probe_device (drivers/base/dd.c:800)
>> [ 5.143236] driver_probe_device (drivers/base/dd.c:830)
>> [ 5.147571] __device_attach_driver (drivers/base/dd.c:959)
>> [ 5.152179] bus_for_each_drv (drivers/base/bus.c:457)
>> [ 5.156163] __device_attach (drivers/base/dd.c:1032)
>> [ 5.160147] device_initial_probe (drivers/base/dd.c:1080)
>> [ 5.164488] bus_probe_device (drivers/base/bus.c:532)
>> [ 5.168471] device_add (drivers/base/core.c:3638)
>> [ 5.172098] device_register (drivers/base/core.c:3714)
>> [ 5.175994] pcie_portdrv_probe (drivers/pci/pcie/portdrv.c:309 drivers/pci/pcie/portdrv.c:363 drivers/pci/pcie/portdrv.c:695)
>> [ 5.180338] pci_device_probe (drivers/pci/pci-driver.c:324 drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>> [ 5.184410] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> [ 5.188213] __driver_probe_device (drivers/base/dd.c:800)
>> [ 5.192729] driver_probe_device (drivers/base/dd.c:830)
>> [ 5.197064] __device_attach_driver (drivers/base/dd.c:959)
>> [ 5.201672] bus_for_each_drv (drivers/base/bus.c:457)
>> [ 5.205657] __device_attach (drivers/base/dd.c:1032)
>> [ 5.209641] device_attach (drivers/base/dd.c:1074)
>> [ 5.213357] pci_bus_add_device (drivers/pci/bus.c:352)
>> [ 5.217518] pci_bus_add_devices (drivers/pci/bus.c:371 (discriminator 2))
>> [ 5.221774] pci_host_probe (drivers/pci/probe.c:3099)
>> [ 5.225581] nwl_pcie_probe (drivers/pci/controller/pcie-xilinx-nwl.c:938)
>> [ 5.229562] platform_probe (drivers/base/platform.c:1404)
>> [ 5.233367] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> [ 5.237169] __driver_probe_device (drivers/base/dd.c:800)
>> [ 5.241685] driver_probe_device (drivers/base/dd.c:830)
>> [ 5.246020] __device_attach_driver (drivers/base/dd.c:959)
>> [ 5.250628] bus_for_each_drv (drivers/base/bus.c:457)
>> [ 5.254612] __device_attach (drivers/base/dd.c:1032)
>> [ 5.258596] device_initial_probe (drivers/base/dd.c:1080)
>> [ 5.262938] bus_probe_device (drivers/base/bus.c:532)
>> [ 5.266920] deferred_probe_work_func (drivers/base/dd.c:124)
>> [ 5.271619] process_one_work (arch/arm64/include/asm/jump_label.h:21 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2632)
>> [ 5.275788] worker_thread (kernel/workqueue.c:2694 (discriminator 2) kernel/workqueue.c:2781 (discriminator 2))
>> [ 5.279686] kthread (kernel/kthread.c:388)
>> [ 5.283048] ret_from_fork (arch/arm64/kernel/entry.S:862)
>> [ 5.286765] ================================================================================
>>
>> Fixes: 9a181e1093af ("PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts")
>> Cc: <[email protected]>
>> Signed-off-by: Sean Anderson <[email protected]>
>> ---
>>
>> Changes in v3:
>> - Expand commit message
>>
>> drivers/pci/controller/pcie-xilinx-nwl.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
>> index 0408f4d612b5..437927e3bcca 100644
>> --- a/drivers/pci/controller/pcie-xilinx-nwl.c
>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
>> @@ -371,7 +371,7 @@ static void nwl_mask_intx_irq(struct irq_data *data)
>> u32 mask;
>> u32 val;
>>
>> - mask = 1 << (data->hwirq - 1);
>> + mask = 1 << data->hwirq;
>> raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
>> val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
>> nwl_bridge_writel(pcie, (val & (~mask)), MSGF_LEG_MASK);
>> @@ -385,7 +385,7 @@ static void nwl_unmask_intx_irq(struct irq_data *data)
>> u32 mask;
>> u32 val;
>>
>> - mask = 1 << (data->hwirq - 1);
>> + mask = 1 << data->hwirq;
>> raw_spin_lock_irqsave(&pcie->leg_mask_lock, flags);
>> val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
>> nwl_bridge_writel(pcie, (val | mask), MSGF_LEG_MASK);
>> --
>> 2.35.1.1320.gc452695387.dirty
>>


2024-05-23 19:18:57

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock on probe failure/removal

> Make sure we turn off the clock on probe failure and device removal.

> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c

> @@ -817,11 +818,23 @@ static int nwl_pcie_probe(struct platform_device *pdev)
> err = nwl_pcie_enable_msi(pcie);
> if (err < 0) {
> dev_err(dev, "failed to enable MSI support: %d\n", err);
> - return err;
> + goto err_clk;
> }
> }
>
> - return pci_host_probe(bridge);
> + err = pci_host_probe(bridge);
> +
> +err_clk:
> + if (err)
> + clk_disable_unprepare(pcie->clk);

I suggest to use the label “disable_unprepare_clock” directly before this function call
(in the if branch) so that a duplicate check would be avoided after some error cases.

Regards,
Markus

2024-05-23 19:22:04

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock on probe failure/removal

On 5/23/24 15:18, Markus Elfring wrote:
>> Make sure we turn off the clock on probe failure and device removal.
> …
>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
> …
>> @@ -817,11 +818,23 @@ static int nwl_pcie_probe(struct platform_device *pdev)
>> err = nwl_pcie_enable_msi(pcie);
>> if (err < 0) {
>> dev_err(dev, "failed to enable MSI support: %d\n", err);
>> - return err;
>> + goto err_clk;
>> }
>> }
>>
>> - return pci_host_probe(bridge);
>> + err = pci_host_probe(bridge);
>> +
>> +err_clk:
>> + if (err)
>> + clk_disable_unprepare(pcie->clk);
>
> I suggest to use the label “disable_unprepare_clock” directly before this function call
> (in the if branch) so that a duplicate check would be avoided after some error cases.

Well if you want to avoid a check, we can just do

err = pci_host_probe(bridge);
if (!err)
return 0;

err_clk:
...

--Sean

2024-05-23 20:12:01

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock on probe failure/removal

>> …
>>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
>> …
>>> @@ -817,11 +818,23 @@ static int nwl_pcie_probe(struct platform_device *pdev)
>>> err = nwl_pcie_enable_msi(pcie);
>>> if (err < 0) {
>>> dev_err(dev, "failed to enable MSI support: %d\n", err);
>>> - return err;
>>> + goto err_clk;
>>> }
>>> }
>>>
>>> - return pci_host_probe(bridge);
>>> + err = pci_host_probe(bridge);
>>> +
>>> +err_clk:
>>> + if (err)
>>> + clk_disable_unprepare(pcie->clk);
>>
>> I suggest to use the label “disable_unprepare_clock” directly before this function call
>> (in the if branch) so that a duplicate check would be avoided after some error cases.
>
> Well if you want to avoid a check, we can just do
>
> err = pci_host_probe(bridge);
> if (!err)
> return 0;
>
> err_clk:
> ...

This design variant can also be reasonable.

Do any preferences matter here for label name selections?

Regards,
Markus

2024-05-23 20:19:14

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock on probe failure/removal

On 5/23/24 16:11, Markus Elfring wrote:
>>> …
>>>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
>>> …
>>>> @@ -817,11 +818,23 @@ static int nwl_pcie_probe(struct platform_device *pdev)
>>>> err = nwl_pcie_enable_msi(pcie);
>>>> if (err < 0) {
>>>> dev_err(dev, "failed to enable MSI support: %d\n", err);
>>>> - return err;
>>>> + goto err_clk;
>>>> }
>>>> }
>>>>
>>>> - return pci_host_probe(bridge);
>>>> + err = pci_host_probe(bridge);
>>>> +
>>>> +err_clk:
>>>> + if (err)
>>>> + clk_disable_unprepare(pcie->clk);
>>>
>>> I suggest to use the label “disable_unprepare_clock” directly before this function call
>>> (in the if branch) so that a duplicate check would be avoided after some error cases.
>>
>> Well if you want to avoid a check, we can just do
>>
>> err = pci_host_probe(bridge);
>> if (!err)
>> return 0;
>>
>> err_clk:
>> ...
>
> This design variant can also be reasonable.
>
> Do any preferences matter here for label name selections?

Personally, I prefer to use labels named after what they're cleaning up and not what they're doing.

--Sean

2024-05-24 08:17:39

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

> Add support for enabling/disabling PCIe phys. We can't really do
> anything about failures in the disable/remove path, so just warn.

> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c

> @@ -818,12 +876,15 @@ static int nwl_pcie_probe(struct platform_device *pdev)
> err = nwl_pcie_enable_msi(pcie);
> if (err < 0) {
> dev_err(dev, "failed to enable MSI support: %d\n", err);
> - goto err_clk;
> + goto err_phy;
> }
> }
>
> err = pci_host_probe(bridge);
>
> +err_phy:
> + if (err)
> + nwl_pcie_phy_disable(pcie);
> err_clk:
> if (err)
> clk_disable_unprepare(pcie->clk);

I got the impression that some source code adjustments should be performed
in another separate update step for this function implementation.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9#n81

You propose to extend the exception handling here.
Does such information indicate a need for another tag “Fixes”?

How do you think about to increase the usage of a corresponding goto chain?
https://wiki.sei.cmu.edu/confluence/display/c/MEM12-C.+Consider+using+a+goto+chain+when+leaving+a+function+on+error+when+using+and+releasing+resources

Would you become more interested in the application of scope-based resource management?

Regards,
Markus

2024-05-24 14:38:42

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

On 5/24/24 04:16, Markus Elfring wrote:
>> Add support for enabling/disabling PCIe phys. We can't really do
>> anything about failures in the disable/remove path, so just warn.
> …
>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
> …
>> @@ -818,12 +876,15 @@ static int nwl_pcie_probe(struct platform_device *pdev)
>> err = nwl_pcie_enable_msi(pcie);
>> if (err < 0) {
>> dev_err(dev, "failed to enable MSI support: %d\n", err);
>> - goto err_clk;
>> + goto err_phy;
>> }
>> }
>>
>> err = pci_host_probe(bridge);
>>
>> +err_phy:
>> + if (err)
>> + nwl_pcie_phy_disable(pcie);
>> err_clk:
>> if (err)
>> clk_disable_unprepare(pcie->clk);
>
> I got the impression that some source code adjustments should be performed
> in another separate update step for this function implementation.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9#n81
>
> You propose to extend the exception handling here.
> Does such information indicate a need for another tag “Fixes”?

Huh? I am only disabling what I enabled...

--Sean

2024-05-24 14:56:48

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] PCI: xilinx-nwl: Fix off-by-one in IRQ handler

On Thu, May 23, 2024 at 11:21:52AM -0400, Sean Anderson wrote:
> On 5/22/24 18:28, Bjorn Helgaas wrote:
> > On Mon, May 20, 2024 at 10:53:57AM -0400, Sean Anderson wrote:
> >> MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit
> >> 2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
> >> PCI_NUM_INTX irqs. So to enable INTA (aka hwirq 0) we should set bit 0.
> >> Remove the subtraction of one. This fixes the following UBSAN error:
> >
> > Thanks for these details!
> >
> > I guess UBSAN == "undefined behavior sanitizer", right? That sounds
> > like an easy way to find this but not the way users are likely to find
> > it.
>
> It's pretty likely they will find it this way, since I found it this way
> and no one else had ;)
>
> > I assume users would notice spurious and missing interrupts, e.g.,
> > a driver that tried to enable INTB would have actually enabled INTA,
> > so we'd see spurious INTA interrupts and the driver would never see
> > the INTB it expected.
> >
> > And a driver that tried to enable INTA would never see that interrupt,
> > and we might not set any bit in MSGF_LEG_MASK?
>
> And yes, this would manifest as INTx interrupts being broken.
>

It's so weird that it's been broken for seven years and no one reported
it. :/

regards,
dan carpenter


2024-05-24 14:59:31

by Dan Carpenter

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

On Mon, May 20, 2024 at 10:54:01AM -0400, Sean Anderson wrote:
> +static int nwl_pcie_phy_enable(struct nwl_pcie *pcie)
> +{
> + int i, ret;
> +
> + for (i = 0; i < ARRAY_SIZE(pcie->phy); i++) {
> + ret = phy_init(pcie->phy[i]);
> + if (ret)
> + goto err;
> +
> + ret = phy_power_on(pcie->phy[i]);
> + if (ret) {
> + WARN_ON(phy_exit(pcie->phy[i]));
> + goto err;
> + }
> + }
> +
> + return 0;
> +
> +err:
> + while (--i) {

This doesn't work. If we fail on the first iteration, then it will
lead to an array underflow. It should be while (--i >= 0) or
while (i--). I prefer the first, but the second format works for people
who use unsigned iterators.

> + WARN_ON(phy_power_off(pcie->phy[i]));
> + WARN_ON(phy_exit(pcie->phy[i]));
> + }
> +
> + return ret;
> +}

regards,
dan carpenter


2024-05-24 15:03:20

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] PCI: xilinx-nwl: Fix off-by-one in IRQ handler

On 5/24/24 10:56, Dan Carpenter wrote:
> On Thu, May 23, 2024 at 11:21:52AM -0400, Sean Anderson wrote:
>> On 5/22/24 18:28, Bjorn Helgaas wrote:
>> > On Mon, May 20, 2024 at 10:53:57AM -0400, Sean Anderson wrote:
>> >> MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit
>> >> 2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register
>> >> PCI_NUM_INTX irqs. So to enable INTA (aka hwirq 0) we should set bit 0.
>> >> Remove the subtraction of one. This fixes the following UBSAN error:
>> >
>> > Thanks for these details!
>> >
>> > I guess UBSAN == "undefined behavior sanitizer", right? That sounds
>> > like an easy way to find this but not the way users are likely to find
>> > it.
>>
>> It's pretty likely they will find it this way, since I found it this way
>> and no one else had ;)
>>
>> > I assume users would notice spurious and missing interrupts, e.g.,
>> > a driver that tried to enable INTB would have actually enabled INTA,
>> > so we'd see spurious INTA interrupts and the driver would never see
>> > the INTB it expected.
>> >
>> > And a driver that tried to enable INTA would never see that interrupt,
>> > and we might not set any bit in MSGF_LEG_MASK?
>>
>> And yes, this would manifest as INTx interrupts being broken.
>>
>
> It's so weird that it's been broken for seven years and no one reported
> it. :/

If I had to guess it's because most PCIe hardware uses MSIs. Unless you
plugged in a PCI bridge there's almost no reason to use INTx at all.

--Sean

2024-05-24 15:25:07

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

On 5/24/24 10:59, Dan Carpenter wrote:
> On Mon, May 20, 2024 at 10:54:01AM -0400, Sean Anderson wrote:
>> +static int nwl_pcie_phy_enable(struct nwl_pcie *pcie)
>> +{
>> + int i, ret;
>> +
>> + for (i = 0; i < ARRAY_SIZE(pcie->phy); i++) {
>> + ret = phy_init(pcie->phy[i]);
>> + if (ret)
>> + goto err;
>> +
>> + ret = phy_power_on(pcie->phy[i]);
>> + if (ret) {
>> + WARN_ON(phy_exit(pcie->phy[i]));
>> + goto err;
>> + }
>> + }
>> +
>> + return 0;
>> +
>> +err:
>> + while (--i) {
>
> This doesn't work. If we fail on the first iteration, then it will
> lead to an array underflow. It should be while (--i >= 0) or
> while (i--). I prefer the first, but the second format works for people
> who use unsigned iterators.

Thanks, will fix.

--Sean

>> + WARN_ON(phy_power_off(pcie->phy[i]));
>> + WARN_ON(phy_exit(pcie->phy[i]));
>> + }
>> +
>> + return ret;
>> +}
>
> regards,
> dan carpenter
>


2024-05-24 15:37:50

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] PCI: xilinx-nwl: Add phy support

>> …
>>> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
>> …
>>> @@ -818,12 +876,15 @@ static int nwl_pcie_probe(struct platform_device *pdev)
>>> err = nwl_pcie_enable_msi(pcie);
>>> if (err < 0) {
>>> dev_err(dev, "failed to enable MSI support: %d\n", err);
>>> - goto err_clk;
>>> + goto err_phy;
>>> }
>>> }
>>>
>>> err = pci_host_probe(bridge);
>>>
>>> +err_phy:
>>> + if (err)
>>> + nwl_pcie_phy_disable(pcie);
>>> err_clk:
>>> if (err)
>>> clk_disable_unprepare(pcie->clk);
>>
>> I got the impression that some source code adjustments should be performed
>> in another separate update step for this function implementation.
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9#n81
>>
>> You propose to extend the exception handling here.
>> Does such information indicate a need for another tag “Fixes”?
>
> Huh? I am only disabling what I enabled...

* Was a resource deactivation accidentally missing in a previous release
of this software component?

* Can repeated checks be avoided a bit more by a design approach which we tried
to clarify for the update step “[PATCH v3 5/7] PCI: xilinx-nwl: Clean up clock
on probe failure/removal”?
https://lore.kernel.org/lkml/[email protected]/


Regards,
Markus