2021-05-06 15:35:22

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 00/42] PCI: aardvark: Various driver fixes

This patch series fixes various issues in pci-aardvark.c driver
(PCIe controller on Marvell Armada 3700 SoC) used on Espressobin
and Turris Mox.

First patch fixes kernel panic (or TF-A panic depending on used
firmware) during execution of PIO transfer and I would suggest to
include this fix into v5.13 release to prevent future kernel panics.

Other patches then fixes PIO issues, PCIe link training, initialization
of PCIe controller, accessing PCIe Root Bridge/Port registers, handling
of interrupts (including ERR and PME), setup of Multi-MSI interrupts,
adding support for masking MSI interrupts, adding support for more than
32 MSI interrupts with MSI-X support, and adding support for AER.

Note that there are still some unresolved issues with PCIe on A3720.
I asked Marvell for PCIe documentation or explanations but I have not
received anything (yet).

Patch "Fix checking for PIO status" is taken from the Marvell github fork
and adapted for inclusion into mainline kernel:
https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/84444d22023c

Whole patch series is available also in my git branch pci-aardvark:
https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark

If you have Espressobin, Turris Mox or other A3720 board with PCIe
please test this patch series with your favourite PCIe card if
everything is working fine.

Evan Wang (1):
PCI: aardvark: Fix checking for PIO status

Pali Rohár (39):
PCI: aardvark: Fix kernel panic during PIO transfer
PCI: aardvark: Fix checking for PIO Non-posted Request
PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO
response
PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
PCI: aardvark: Fix reporting CRS Software Visibility on emulated
bridge
PCI: aardvark: Fix link training
PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
PCI: aardvark: Fix PCIe Max Payload Size setting
PCI: aardvark: Implement workaround for the readback value of VEND_ID
PCI: aardvark: Do not touch status bits of masked interrupts in
interrupt handler
PCI: aardvark: Check for virq mapping when processing INTx IRQ
PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts
PCI: aardvark: Don't mask irq when mapping
PCI: aardvark: Change name of INTx irq_chip to advk-INT
PCI: aardvark: Remove unneeded goto
PCI: aardvark: Fix support for MSI interrupts
PCI: aardvark: Correctly clear and unmask all MSI interrupts
PCI: aardvark: Fix setting MSI address
PCI: aardvark: Add support for more than 32 MSI interrupts
PCI: aardvark: Add support for masking MSI interrupts
PCI: aardvark: Enable MSI-X support
PCI: aardvark: Fix support for ERR interrupt on emulated bridge
PCI: aardvark: Fix support for PME on emulated bridge
PCI: aardvark: Fix support for PME requester on emulated bridge
PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on
emulated bridge
PCI: aardvark: Disable bus mastering and mask all interrupts when
unbinding driver
PCI: aardvark: Free config space for emulated root bridge when
unbinding driver to fix memory leak
PCI: aardvark: Reset PCIe card and disable PHY when unbinding driver
PCI: aardvark: Rewrite irq code to chained irq handler
PCI: aardvark: Use separate INTA interrupt for emulated root bridge
PCI: pci-bridge-emul: Add description for class_revision field
PCI: pci-bridge-emul: Add definitions for missing capabilities
registers
PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
registers on emulated bridge
PCI: aardvark: Add support for PCI_BRIDGE_CTL_BUS_RESET on emulated
bridge
PCI: aardvark: Replace custom PCIE_CORE_ERR_CAPCTL_* macros by
linux/pci_regs.h macros
PCI: aardvark: Replace custom PCIE_CORE_INT_* macros by linux
PCI_INTERRUPT_* values
PCI: aardvark: Cleanup some register macros
PCI: aardvark: Add comments for OB_WIN_ENABLE and ADDR_WIN_DISABLE
PCI: aardvark: Add support for Advanced Error Reporting registers on
emulated bridge

Russell King (2):
PCI: pci-bridge-emul: re-arrange register tests
PCI: pci-bridge-emul: add support for PCIe extended capabilities

drivers/pci/controller/pci-aardvark.c | 980 +++++++++++++++++++-------
drivers/pci/pci-bridge-emul.c | 149 ++--
drivers/pci/pci-bridge-emul.h | 17 +-
include/uapi/linux/pci_regs.h | 6 +
4 files changed, 850 insertions(+), 302 deletions(-)

--
2.20.1


2021-05-06 15:35:52

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 12/42] PCI: aardvark: Check for virq mapping when processing INTx IRQ

It is possible that we receive spurious INTx interrupt. So add needed check
before calling generic_handle_irq() function.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 362faddae935..e7089db11f79 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1106,7 +1106,10 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
PCIE_ISR1_REG);

virq = irq_find_mapping(pcie->irq_domain, i);
- generic_handle_irq(virq);
+ if (virq)
+ generic_handle_irq(virq);
+ else
+ dev_err(&pcie->pdev->dev, "unexpected INT%c IRQ\n", (char)i+'A');
}
}

--
2.20.1

2021-05-06 15:35:52

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 03/42] PCI: aardvark: Fix checking for PIO status

From: Evan Wang <[email protected]>

There is an issue that when PCIe switch is connected to an Armada 3700
board, there will be lots of warnings about PIO errors when reading the
config space. According to Aardvark PIO read and write sequence in HW
specification, the current way to check PIO status has the following
issues:

1) For PIO read operation, it reports the error message, which should be
avoided according to HW specification.

2) For PIO read and write operations, it only checks PIO operation complete
status, which is not enough, and error status should also be checked.

This patch aligns the code with Aardvark PIO read and write sequence in HW
specification on PIO status check and fix the warnings when reading config
space.

This patch also returns Completion Retry Status value when the read request
timeout, to give the caller a chance to send the request again instead of
failing.

Signed-off-by: Evan Wang <[email protected]>
Reviewed-by: Victor Gu <[email protected]>
Tested-by: Victor Gu <[email protected]>
[pali: Return CRS also after timeout]
Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected] # b1bd5714472c ("PCI: aardvark: Indicate error in 'val' when config read fails")
---
drivers/pci/controller/pci-aardvark.c | 93 +++++++++++++++++++++++----
1 file changed, 80 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 2f8380a1f84f..a37ba86f1b2d 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -58,6 +58,7 @@
#define PIO_COMPLETION_STATUS_CRS 2
#define PIO_COMPLETION_STATUS_CA 4
#define PIO_NON_POSTED_REQ BIT(10)
+#define PIO_ERR_STATUS BIT(11)
#define PIO_ADDR_LS (PIO_BASE_ADDR + 0x8)
#define PIO_ADDR_MS (PIO_BASE_ADDR + 0xc)
#define PIO_WR_DATA (PIO_BASE_ADDR + 0x10)
@@ -176,6 +177,9 @@

#define MSI_IRQ_NUM 32

+#define CFG_RD_UR_VAL 0xffffffff
+#define CFG_RD_CRS_VAL 0xffff0001
+
struct advk_pcie {
struct platform_device *pdev;
void __iomem *base;
@@ -461,7 +465,7 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG);
}

-static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
+static int advk_pcie_check_pio_status(struct advk_pcie *pcie, u32 *val)
{
struct device *dev = &pcie->pdev->dev;
u32 reg;
@@ -472,15 +476,50 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
status = (reg & PIO_COMPLETION_STATUS_MASK) >>
PIO_COMPLETION_STATUS_SHIFT;

- if (!status)
- return;
-
+ /*
+ * According to HW spec, the PIO status check sequence as below:
+ * 1) even if COMPLETION_STATUS(bit9:7) indicates successful,
+ * it still needs to check Error Status(bit11), only when this bit
+ * indicates no error happen, the operation is successful.
+ * 2) value Unsupported Request(1) of COMPLETION_STATUS(bit9:7) only
+ * means a PIO write error, and for PIO read it is successful with
+ * a read value of 0xFFFFFFFF.
+ * 3) value Completion Retry Status(CRS) of COMPLETION_STATUS(bit9:7)
+ * only means a PIO write error, and for PIO read it is successful
+ * with a read value of 0xFFFF0001.
+ * 4) value Completer Abort (CA) of COMPLETION_STATUS(bit9:7) means
+ * error for both PIO read and PIO write operation.
+ * 5) other errors are indicated as 'unknown'.
+ */
switch (status) {
+ case PIO_COMPLETION_STATUS_OK:
+ if (reg & PIO_ERR_STATUS) {
+ strcomp_status = "COMP_ERR";
+ break;
+ }
+ /* Get the read result */
+ if (val)
+ *val = advk_readl(pcie, PIO_RD_DATA);
+ /* No error */
+ strcomp_status = NULL;
+ break;
case PIO_COMPLETION_STATUS_UR:
- strcomp_status = "UR";
+ if (val) {
+ /* For reading, UR is not an error status */
+ *val = CFG_RD_UR_VAL;
+ strcomp_status = NULL;
+ } else {
+ strcomp_status = "UR";
+ }
break;
case PIO_COMPLETION_STATUS_CRS:
- strcomp_status = "CRS";
+ if (val) {
+ /* For reading, CRS is not an error status */
+ *val = CFG_RD_CRS_VAL;
+ strcomp_status = NULL;
+ } else {
+ strcomp_status = "CRS";
+ }
break;
case PIO_COMPLETION_STATUS_CA:
strcomp_status = "CA";
@@ -490,6 +529,9 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
break;
}

+ if (!strcomp_status)
+ return 0;
+
if (reg & PIO_NON_POSTED_REQ)
str_posted = "Non-posted";
else
@@ -497,6 +539,8 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)

dev_err(dev, "%s PIO Response Status: %s, %#x @ %#x\n",
str_posted, strcomp_status, reg, advk_readl(pcie, PIO_ADDR_LS));
+
+ return -EFAULT;
}

static int advk_pcie_wait_pio(struct advk_pcie *pcie)
@@ -703,8 +747,17 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
size, val);

if (advk_pcie_pio_is_running(pcie)) {
- *val = 0xffffffff;
- return PCIBIOS_SET_FAILED;
+ /*
+ * For PCI_VENDOR_ID register, return Completion Retry Status
+ * so caller tries to issue the request again insted of failing
+ */
+ if (where == PCI_VENDOR_ID) {
+ *val = CFG_RD_CRS_VAL;
+ return PCIBIOS_SUCCESSFUL;
+ } else {
+ *val = 0xffffffff;
+ return PCIBIOS_SET_FAILED;
+ }
}

/* Program the control register */
@@ -729,15 +782,27 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
advk_writel(pcie, 1, PIO_START);

ret = advk_pcie_wait_pio(pcie);
+ if (ret < 0) {
+ /*
+ * For PCI_VENDOR_ID register, return Completion Retry Status
+ * so caller tries to issue the request again instead of failing
+ */
+ if (where == PCI_VENDOR_ID) {
+ *val = CFG_RD_CRS_VAL;
+ return PCIBIOS_SUCCESSFUL;
+ } else {
+ *val = 0xffffffff;
+ return PCIBIOS_SET_FAILED;
+ }
+ }
+
+ /* Check PIO status and get the read result */
+ ret = advk_pcie_check_pio_status(pcie, val);
if (ret < 0) {
*val = 0xffffffff;
return PCIBIOS_SET_FAILED;
}

- advk_pcie_check_pio_status(pcie);
-
- /* Get the read result */
- *val = advk_readl(pcie, PIO_RD_DATA);
if (size == 1)
*val = (*val >> (8 * (where & 3))) & 0xff;
else if (size == 2)
@@ -801,7 +866,9 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
if (ret < 0)
return PCIBIOS_SET_FAILED;

- advk_pcie_check_pio_status(pcie);
+ ret = advk_pcie_check_pio_status(pcie, NULL);
+ if (ret < 0)
+ return PCIBIOS_SET_FAILED;

return PCIBIOS_SUCCESSFUL;
}
--
2.20.1

2021-05-06 15:36:04

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 15/42] PCI: aardvark: Change name of INTx irq_chip to advk-INT

This name is visible in /proc/interrupts file and for better reading it
should have at most 8 characters. Also there is no need to allocate this
name dynamically, since there is only one PCIe controller on Armada 37xx.
This aligns with how the MSI irq_chip in this driver names it's interrupt
("advk-MSI").

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 08f1157e1c5e..c50421af9d06 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1022,14 +1022,7 @@ static int advk_pcie_init_irq_domain(struct advk_pcie *pcie)
}

irq_chip = &pcie->irq_chip;
-
- irq_chip->name = devm_kasprintf(dev, GFP_KERNEL, "%s-irq",
- dev_name(dev));
- if (!irq_chip->name) {
- ret = -ENOMEM;
- goto out_put_node;
- }
-
+ irq_chip->name = "advk-INT";
irq_chip->irq_mask = advk_pcie_irq_mask;
irq_chip->irq_unmask = advk_pcie_irq_unmask;

--
2.20.1

2021-05-06 15:36:19

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 13/42] PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts

Callback for irq_mask_ack is the same as for irq_mask. As there is no
special handling for irq_ack, there is no need to define irq_mask_ack too.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index e7089db11f79..2aced8c9ae9f 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1032,7 +1032,6 @@ static int advk_pcie_init_irq_domain(struct advk_pcie *pcie)
}

irq_chip->irq_mask = advk_pcie_irq_mask;
- irq_chip->irq_mask_ack = advk_pcie_irq_mask;
irq_chip->irq_unmask = advk_pcie_irq_unmask;

pcie->irq_domain =
--
2.20.1

2021-05-06 15:36:49

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 18/42] PCI: aardvark: Correctly clear and unmask all MSI interrupts

Define a new macro PCIE_MSI_ALL_MASK and use it for masking, unmasking and
clearing all MSI interrupts.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 498810c00b6d..5e0243b2c473 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -117,6 +117,7 @@
#define PCIE_MSI_ADDR_HIGH_REG (CONTROL_BASE_ADDR + 0x54)
#define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
#define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
+#define PCIE_MSI_ALL_MASK GENMASK(31, 0)
#define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
#define PCIE_MSI_DATA_MASK GENMASK(15, 0)

@@ -386,19 +387,22 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);

/* Clear all interrupts */
+ advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_STATUS_REG);
advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_REG);
advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_REG);
advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);

/* Disable All ISR0/1 Sources */
- reg = PCIE_ISR0_ALL_MASK;
- reg &= ~PCIE_ISR0_MSI_INT_PENDING;
- advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
-
+ advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_MASK_REG);
advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);

/* Unmask all MSIs */
- advk_writel(pcie, 0, PCIE_MSI_MASK_REG);
+ advk_writel(pcie, ~(u32)PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);
+
+ /* Unmask summary MSI interrupt */
+ reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
+ reg &= ~PCIE_ISR0_MSI_INT_PENDING;
+ advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);

/* Enable summary interrupt for GIC SPI source */
reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
@@ -1049,7 +1053,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)

msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
- msi_status = msi_val & ~msi_mask;
+ msi_status = msi_val & ((~msi_mask) & PCIE_MSI_ALL_MASK);

for (msi_idx = 0; msi_idx < MSI_IRQ_NUM; msi_idx++) {
if (!(BIT(msi_idx) & msi_status))
--
2.20.1

2021-05-06 15:36:50

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 10/42] PCI: aardvark: Implement workaround for the readback value of VEND_ID

Marvell Armada 3700 Functional Errata, Guidelines, and Restrictions
document describes in erratum 4.1 PCIe value of vendor ID (Ref #: 243):

The readback value of VEND_ID (RD0070000h [15:0]) is 1B4Bh, while it
should read 11ABh.

The firmware can write the correct value, 11ABh, through VEND_ID
(RD0076044h [15:0]).

Implement this workaround in aardvark driver for both PCI vendor id and PCI
subsystem vendor id.

This change affects and fixes PCI vendor id of emulated PCIe root bridge.
After this change emulated PCIe root bridge has correct vendor id.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 873efd79fffb..cd4b427d7692 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -126,6 +126,7 @@
#define LTSSM_MASK 0x3f
#define LTSSM_L0 0x10
#define RC_BAR_CONFIG 0x300
+#define VENDOR_ID_REG (LMI_BASE_ADDR + 0x44)

/* PCIe core controller registers */
#define CTRL_CORE_BASE_ADDR 0x18000
@@ -340,6 +341,16 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg |= (IS_RC_MSK << IS_RC_SHIFT);
advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

+ /*
+ * Replace incorrect PCI vendor id value 0x1b4b by correct value 0x11ab.
+ * VENDOR_ID_REG contains vendor id in low 16 bits and subsystem vendor
+ * id in high 16 bits. Updating this register changes readback value of
+ * read-only vendor id bits in PCIE_CORE_DEV_ID_REG register. Workaround
+ * for erratum 4.1: "The value of device and vendor ID is incorrect".
+ */
+ reg = (PCI_VENDOR_ID_MARVELL << 16) | PCI_VENDOR_ID_MARVELL;
+ advk_writel(pcie, reg, VENDOR_ID_REG);
+
/* Set Advanced Error Capabilities and Control PF0 register */
reg = PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX |
PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN |
--
2.20.1

2021-05-06 15:37:47

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 08/42] PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros

Define a macro PCI_EXP_DEVCTL_PAYLOAD_* for every possible Max Payload
Size in linux/pci_regs.h, in the same style as PCI_EXP_DEVCTL_READRQ_*.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
include/uapi/linux/pci_regs.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8235e7..ff6ccbc6efe9 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -504,6 +504,12 @@
#define PCI_EXP_DEVCTL_URRE 0x0008 /* Unsupported Request Reporting En. */
#define PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed ordering */
#define PCI_EXP_DEVCTL_PAYLOAD 0x00e0 /* Max_Payload_Size */
+#define PCI_EXP_DEVCTL_PAYLOAD_128B 0x0000 /* 128 Bytes */
+#define PCI_EXP_DEVCTL_PAYLOAD_256B 0x0020 /* 256 Bytes */
+#define PCI_EXP_DEVCTL_PAYLOAD_512B 0x0040 /* 512 Bytes */
+#define PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */
+#define PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */
+#define PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00a0 /* 4096 Bytes */
#define PCI_EXP_DEVCTL_EXT_TAG 0x0100 /* Extended Tag Field Enable */
#define PCI_EXP_DEVCTL_PHANTOM 0x0200 /* Phantom Functions Enable */
#define PCI_EXP_DEVCTL_AUX_PME 0x0400 /* Auxiliary Power PM Enable */
--
2.20.1

2021-05-06 15:38:11

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 11/42] PCI: aardvark: Do not touch status bits of masked interrupts in interrupt handler

It is incorrect to clear status bits of masked interrupts.

The aardvark driver clears all status interrupt bits when no unmasked
status bit was set. When some unmasked bit was set then masked bits were
not cleared. Fix this so that masked bits are never cleared.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index cd4b427d7692..362faddae935 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1090,11 +1090,8 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);

- if (!isr0_status && !isr1_status) {
- advk_writel(pcie, isr0_val, PCIE_ISR0_REG);
- advk_writel(pcie, isr1_val, PCIE_ISR1_REG);
+ if (!isr0_status && !isr1_status)
return;
- }

/* Process MSI interrupts */
if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
--
2.20.1

2021-05-06 15:38:28

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 20/42] PCI: aardvark: Add support for more than 32 MSI interrupts

Aardvark HW can handle MSI interrupt with any 16-bit number. Received MSI
interrupt number is visible in PCIE_MSI_PAYLOAD_REG register after clearing
corresponding bit in PCIE_MSI_STATUS_REG register.

The first 32 interrupt numbers are currently stored in linear map in MSI
inner domain. Store the rest in dynamic radix tree for space efficiency.

Free interrupt numbers (available for MSI inner domain allocation) for the
first 32 interrupts are currently stored in a bitmap. For the rest,
introduce a linked list of allocated regions.

In the most common scenario there is only one PCIe card connected on boards
with Armada 3720 SoC. Since in Multi-MSI mode the PCIe device can use at
most 32 interrupts, all these interrupts are allocated in the linear map of
MSI inner domain and marked as used in the bitmap.

For less common scenarios with PCIe devices with multiple functions or with
a PCIe Bridge with packet switches with more connected PCIe devices more
than 32 interrupts are requested. In this case, store each interrupt range
from each interrupt request into the linked list as one node. In the worst
case every PCIe function will occupy one node in this linked list.

This change allows to use all 32 Multi-MSI interrupts on every connected
PCIe card on the Turris Mox router with Mox G module.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 71 ++++++++++++++++++++++++---
1 file changed, 64 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 199015215779..d74e84b0e689 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -178,11 +178,18 @@
#define RETRAIN_WAIT_MAX_RETRIES 10
#define RETRAIN_WAIT_USLEEP_US 2000

-#define MSI_IRQ_NUM 32
+#define MSI_IRQ_LINEAR_COUNT 32
+#define MSI_IRQ_TOTAL_COUNT 65536

#define CFG_RD_UR_VAL 0xffffffff
#define CFG_RD_CRS_VAL 0xffff0001

+struct advk_msi_range {
+ struct list_head list;
+ u16 first;
+ u16 count;
+};
+
struct advk_pcie {
struct platform_device *pdev;
void __iomem *base;
@@ -193,7 +200,8 @@ struct advk_pcie {
struct irq_chip msi_bottom_irq_chip;
struct irq_chip msi_irq_chip;
struct msi_domain_info msi_domain_info;
- DECLARE_BITMAP(msi_used, MSI_IRQ_NUM);
+ DECLARE_BITMAP(msi_used_linear, MSI_IRQ_LINEAR_COUNT);
+ struct list_head msi_used_radix;
struct mutex msi_used_lock;
int link_gen;
struct pci_bridge_emul bridge;
@@ -885,12 +893,44 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
unsigned int nr_irqs, void *args)
{
struct advk_pcie *pcie = domain->host_data;
+ struct advk_msi_range *msi_range, *msi_range_prev, *msi_range_next;
+ unsigned int first, count, last;
int hwirq, i;

mutex_lock(&pcie->msi_used_lock);
- hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
+
+ /* First few used interrupt numbers are marked in bitmap (the most common) */
+ hwirq = bitmap_find_free_region(pcie->msi_used_linear, MSI_IRQ_LINEAR_COUNT,
order_base_2(nr_irqs));
+
+ /* And rest used interrupt numbers are stored in linked list as ranges */
+ if (hwirq < 0) {
+ count = 1 << order_base_2(nr_irqs);
+ msi_range_prev = list_entry(&pcie->msi_used_radix, typeof(*msi_range), list);
+ do {
+ msi_range_next = list_next_entry(msi_range_prev, list);
+ last = list_entry_is_head(msi_range_next, &pcie->msi_used_radix, list)
+ ? MSI_IRQ_TOTAL_COUNT : msi_range_next->first;
+ first = list_entry_is_head(msi_range_prev, &pcie->msi_used_radix, list)
+ ? MSI_IRQ_LINEAR_COUNT : round_up(msi_range_prev->first +
+ msi_range_prev->count, count);
+ if (first + count > last) {
+ msi_range_prev = msi_range_next;
+ continue;
+ }
+ msi_range = kzalloc(sizeof(*msi_range), GFP_KERNEL);
+ if (msi_range) {
+ hwirq = first;
+ msi_range->first = first;
+ msi_range->count = count;
+ list_add(&msi_range->list, &msi_range_prev->list);
+ }
+ break;
+ } while (!list_entry_is_head(msi_range_next, &pcie->msi_used_radix, list));
+ }
+
mutex_unlock(&pcie->msi_used_lock);
+
if (hwirq < 0)
return -ENOSPC;

@@ -908,9 +948,20 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
{
struct irq_data *d = irq_domain_get_irq_data(domain, virq);
struct advk_pcie *pcie = domain->host_data;
+ struct advk_msi_range *msi_range;

mutex_lock(&pcie->msi_used_lock);
- bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
+ if (d->hwirq < MSI_IRQ_LINEAR_COUNT) {
+ bitmap_release_region(pcie->msi_used_linear, d->hwirq, order_base_2(nr_irqs));
+ } else {
+ list_for_each_entry(msi_range, &pcie->msi_used_radix, list) {
+ if (msi_range->first != d->hwirq)
+ continue;
+ list_del(&msi_range->list);
+ kfree(msi_range);
+ break;
+ }
+ }
mutex_unlock(&pcie->msi_used_lock);
}

@@ -967,6 +1018,7 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
struct msi_domain_info *msi_di;

mutex_init(&pcie->msi_used_lock);
+ INIT_LIST_HEAD(&pcie->msi_used_radix);

bottom_ic = &pcie->msi_bottom_irq_chip;

@@ -982,9 +1034,14 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
MSI_FLAG_MULTI_PCI_MSI;
msi_di->chip = msi_ic;

+ /*
+ * Aardvark HW can handle MSI interrupt with any 16bit number.
+ * For optimization first few interrupts are allocated in linear map
+ * (which is common scenario) and rest are allocated in radix tree.
+ */
pcie->msi_inner_domain =
- irq_domain_add_linear(NULL, MSI_IRQ_NUM,
- &advk_msi_domain_ops, pcie);
+ __irq_domain_add(NULL, MSI_IRQ_LINEAR_COUNT, MSI_IRQ_TOTAL_COUNT, 0,
+ &advk_msi_domain_ops, pcie);
if (!pcie->msi_inner_domain)
return -ENOMEM;

@@ -1052,7 +1109,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
msi_status = msi_val & ((~msi_mask) & PCIE_MSI_ALL_MASK);

- for (msi_idx = 0; msi_idx < MSI_IRQ_NUM; msi_idx++) {
+ for (msi_idx = 0; msi_idx < BITS_PER_TYPE(msi_status); msi_idx++) {
if (!(BIT(msi_idx) & msi_status))
continue;

--
2.20.1

2021-05-06 15:39:22

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 41/42] PCI: pci-bridge-emul: add support for PCIe extended capabilities

From: Russell King <[email protected]>

Add support for PCIe extended capabilities, which we just redirect to the
emulating driver.

Signed-off-by: Russell King <[email protected]>
---
drivers/pci/pci-bridge-emul.c | 52 +++++++++++++++++++++++++----------
drivers/pci/pci-bridge-emul.h | 15 ++++++++++
2 files changed, 53 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index 63959e4b188a..236036fdeaa2 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -385,10 +385,16 @@ int pci_bridge_emul_conf_read(struct pci_bridge_emul *bridge, int where,
read_op = bridge->ops->read_pcie;
cfgspace = (__le32 *) &bridge->pcie_conf;
behavior = bridge->pcie_cap_regs_behavior;
- } else {
- /* Beyond our PCIe space */
+ } else if (reg < PCI_CFG_SPACE_SIZE) {
+ /* Rest of PCI space not implemented */
*value = 0;
return PCIBIOS_SUCCESSFUL;
+ } else {
+ /* PCIe extended capability space */
+ reg -= PCI_CFG_SPACE_SIZE;
+ read_op = bridge->ops->read_ext;
+ cfgspace = NULL;
+ behavior = NULL;
}

if (read_op)
@@ -396,15 +402,20 @@ int pci_bridge_emul_conf_read(struct pci_bridge_emul *bridge, int where,
else
ret = PCI_BRIDGE_EMUL_NOT_HANDLED;

- if (ret == PCI_BRIDGE_EMUL_NOT_HANDLED)
- *value = le32_to_cpu(cfgspace[reg / 4]);
+ if (ret == PCI_BRIDGE_EMUL_NOT_HANDLED) {
+ if (cfgspace)
+ *value = le32_to_cpu(cfgspace[reg / 4]);
+ else
+ *value = 0;
+ }

/*
* Make sure we never return any reserved bit with a value
* different from 0.
*/
- *value &= behavior[reg / 4].ro | behavior[reg / 4].rw |
- behavior[reg / 4].w1c;
+ if (behavior)
+ *value &= behavior[reg / 4].ro | behavior[reg / 4].rw |
+ behavior[reg / 4].w1c;

if (size == 1)
*value = (*value >> (8 * (where & 3))) & 0xff;
@@ -450,8 +461,15 @@ int pci_bridge_emul_conf_write(struct pci_bridge_emul *bridge, int where,
write_op = bridge->ops->write_pcie;
cfgspace = (__le32 *) &bridge->pcie_conf;
behavior = bridge->pcie_cap_regs_behavior;
- } else {
+ } else if (reg < PCI_CFG_SPACE_SIZE) {
+ /* Rest of PCI space not implemented */
return PCIBIOS_SUCCESSFUL;
+ } else {
+ /* PCIe extended capability space */
+ reg -= PCI_CFG_SPACE_SIZE;
+ write_op = bridge->ops->write_ext;
+ cfgspace = NULL;
+ behavior = NULL;
}

shift = (where & 0x3) * 8;
@@ -465,16 +483,22 @@ int pci_bridge_emul_conf_write(struct pci_bridge_emul *bridge, int where,
else
return PCIBIOS_BAD_REGISTER_NUMBER;

- /* Keep all bits, except the RW bits */
- new = old & (~mask | ~behavior[reg / 4].rw);
+ if (behavior) {
+ /* Keep all bits, except the RW bits */
+ new = old & (~mask | ~behavior[reg / 4].rw);

- /* Update the value of the RW bits */
- new |= (value << shift) & (behavior[reg / 4].rw & mask);
+ /* Update the value of the RW bits */
+ new |= (value << shift) & (behavior[reg / 4].rw & mask);

- /* Clear the W1C bits */
- new &= ~((value << shift) & (behavior[reg / 4].w1c & mask));
+ /* Clear the W1C bits */
+ new &= ~((value << shift) & (behavior[reg / 4].w1c & mask));
+ } else {
+ new = old & ~mask;
+ new |= (value << shift) & mask;
+ }

- cfgspace[reg / 4] = cpu_to_le32(new);
+ if (cfgspace)
+ cfgspace[reg / 4] = cpu_to_le32(new);

if (write_op)
write_op(bridge, reg, old, new, mask);
diff --git a/drivers/pci/pci-bridge-emul.h b/drivers/pci/pci-bridge-emul.h
index 49bbd37ee318..2552ab660b08 100644
--- a/drivers/pci/pci-bridge-emul.h
+++ b/drivers/pci/pci-bridge-emul.h
@@ -90,6 +90,14 @@ struct pci_bridge_emul_ops {
*/
pci_bridge_emul_read_status_t (*read_pcie)(struct pci_bridge_emul *bridge,
int reg, u32 *value);
+
+ /*
+ * Same as ->read_base(), except it is for reading from the
+ * PCIe extended capability configuration space.
+ */
+ pci_bridge_emul_read_status_t (*read_ext)(struct pci_bridge_emul *bridge,
+ int reg, u32 *value);
+
/*
* Called when writing to the regular PCI bridge configuration
* space. old is the current value, new is the new value being
@@ -105,6 +113,13 @@ struct pci_bridge_emul_ops {
*/
void (*write_pcie)(struct pci_bridge_emul *bridge, int reg,
u32 old, u32 new, u32 mask);
+
+ /*
+ * Same as ->write_base(), except it is for writing from the
+ * PCIe extended capability configuration space.
+ */
+ void (*write_ext)(struct pci_bridge_emul *bridge, int reg,
+ u32 old, u32 new, u32 mask);
};

struct pci_bridge_reg_behavior;
--
2.20.1

2021-05-06 15:39:25

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 09/42] PCI: aardvark: Fix PCIe Max Payload Size setting

Change PCIe Max Payload Size setting in PCIe Device Control register to 512
bytes to align with PCIe Link Initialization sequence as defined in Marvell
Armada 3700 Functional Specification. According to the specification,
maximal Max Payload Size supported by this device is 512 bytes.

Without this kernel prints suspicious line:

pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 16384, max 512)

With this change it changes to:

pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512)

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index edeacd95297e..873efd79fffb 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -351,8 +351,9 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);
reg &= ~PCI_EXP_DEVCTL_RELAX_EN;
reg &= ~PCI_EXP_DEVCTL_NOSNOOP_EN;
+ reg &= ~PCI_EXP_DEVCTL_PAYLOAD;
reg &= ~PCI_EXP_DEVCTL_READRQ;
- reg |= PCI_EXP_DEVCTL_PAYLOAD; /* Set max payload size */
+ reg |= PCI_EXP_DEVCTL_PAYLOAD_512B;
reg |= PCI_EXP_DEVCTL_READRQ_512B;
advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);

--
2.20.1

2021-05-06 15:39:37

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 39/42] PCI: aardvark: Add comments for OB_WIN_ENABLE and ADDR_WIN_DISABLE

Add a comment with explanation of these two bits. These comments are taken
from U-Boot 2020.10 PCI aardvark driver.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 3c18e139b095..ac3ee48e69d7 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -428,11 +428,24 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
advk_writel(pcie, reg, HOST_CTRL_INT_MASK_REG);

+ /*
+ * Enable AXI address window location generation:
+ * When it is enabled, the default outbound window
+ * configurations (Default User Field: 0xD0074CFC)
+ * are used to transparent address translation for
+ * the outbound transactions. Thus, PCIe address
+ * windows are not required.
+ */
reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG);
reg |= PCIE_CORE_CTRL2_OB_WIN_ENABLE;
advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);

- /* Bypass the address window mapping for PIO */
+ /*
+ * Bypass the address window mapping for PIO:
+ * Since PIO access already contains all required
+ * info over AXI interface by PIO registers, the
+ * address window is not required.
+ */
reg = advk_readl(pcie, PIO_CTRL);
reg |= PIO_CTRL_ADDR_WIN_DISABLE;
advk_writel(pcie, reg, PIO_CTRL);
--
2.20.1

2021-05-06 15:40:38

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 38/42] PCI: aardvark: Cleanup some register macros

Define SPEED_GEN_* macros with correct PCIE_GEN_SEL_SHIFT.

Simplify macro for setting root complex mode (use BIT instead of
MSK+SHIFT).

Rename PCIE_MSG_PM_PME_MASK to PCIE_ISR0_MSG_PM_PME to match existing
naming convention, rename PCIE_ISR0_MSI_INT_PENDING to PCIE_ISR0_MSI_INT as
it is used for both interrupt mask and pending bit.

Change the code which disables strict ordering by doing it explicitly
(instead of not specifying *_STRICT_ORDER_ENABLE macro, specify bitwise
negation of that macro).

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 34 +++++++++++++--------------
1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 2258b9ae1084..3c18e139b095 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -63,11 +63,10 @@
#define PCIE_CORE_CTRL0_REG (CONTROL_BASE_ADDR + 0x0)
#define PCIE_GEN_SEL_MSK 0x3
#define PCIE_GEN_SEL_SHIFT 0x0
-#define SPEED_GEN_1 0
-#define SPEED_GEN_2 1
-#define SPEED_GEN_3 2
-#define IS_RC_MSK 1
-#define IS_RC_SHIFT 2
+#define SPEED_GEN_1 (0 << PCIE_GEN_SEL_SHIFT)
+#define SPEED_GEN_2 (1 << PCIE_GEN_SEL_SHIFT)
+#define SPEED_GEN_3 (2 << PCIE_GEN_SEL_SHIFT)
+#define IS_RC BIT(2)
#define LANE_CNT_MSK 0x18
#define LANE_CNT_SHIFT 0x3
#define LANE_COUNT_1 (0 << LANE_CNT_SHIFT)
@@ -91,15 +90,15 @@
#define PCIE_CORE_REF_CLK_TX_ENABLE BIT(1)
#define PCIE_MSG_LOG_REG (CONTROL_BASE_ADDR + 0x30)
#define PCIE_ISR0_REG (CONTROL_BASE_ADDR + 0x40)
-#define PCIE_MSG_PM_PME_MASK BIT(7)
#define PCIE_ISR0_MASK_REG (CONTROL_BASE_ADDR + 0x44)
-#define PCIE_ISR0_MSI_INT_PENDING BIT(24)
+#define PCIE_ISR0_MSG_PM_PME BIT(7)
#define PCIE_ISR0_CORR_ERR BIT(11)
#define PCIE_ISR0_NFAT_ERR BIT(12)
#define PCIE_ISR0_FAT_ERR BIT(13)
#define PCIE_ISR0_INTX_ASSERT(val) BIT(16 + (val))
#define PCIE_ISR0_INTX_DEASSERT(val) BIT(20 + (val))
-#define PCIE_ISR0_ALL_MASK GENMASK(26, 0)
+#define PCIE_ISR0_MSI_INT BIT(24)
+#define PCIE_ISR0_ALL_MASK GENMASK(26, 0)
#define PCIE_ISR1_REG (CONTROL_BASE_ADDR + 0x48)
#define PCIE_ISR1_MASK_REG (CONTROL_BASE_ADDR + 0x4C)
#define PCIE_ISR1_POWER_STATE_CHANGE BIT(4)
@@ -345,7 +344,7 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)

/* Set PCI global control register to RC mode */
reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
- reg |= (IS_RC_MSK << IS_RC_SHIFT);
+ reg |= IS_RC;
advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

/*
@@ -379,8 +378,8 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);

/* Program PCIe Control 2 to disable strict ordering */
- reg = PCIE_CORE_CTRL2_RESERVED |
- PCIE_CORE_CTRL2_TD_ENABLE;
+ reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG);
+ reg &= ~PCIE_CORE_CTRL2_STRICT_ORDER_ENABLE;
advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);

/* Set lane X1 */
@@ -412,7 +411,7 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)

/* Unmask summary MSI interrupt */
reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
- reg &= ~PCIE_ISR0_MSI_INT_PENDING;
+ reg &= ~PCIE_ISR0_MSI_INT;
advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);

/* Unmask bits for ERR interrupt */
@@ -422,7 +421,7 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)

/* Unmask PME interrupt for processing of PME requester */
reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
- reg &= ~PCIE_MSG_PM_PME_MASK;
+ reg &= ~PCIE_ISR0_MSG_PM_PME;
advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);

/* Enable summary interrupt for GIC SPI source */
@@ -1265,8 +1264,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
generic_handle_irq(virq);
}

- advk_writel(pcie, PCIE_ISR0_MSI_INT_PENDING,
- PCIE_ISR0_REG);
+ advk_writel(pcie, PCIE_ISR0_MSI_INT, PCIE_ISR0_REG);
}

static void advk_pcie_handle_int(struct advk_pcie *pcie)
@@ -1290,8 +1288,8 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
return;

/* Process PME interrupt as the first one to do not miss PME requester id */
- if (isr0_status & PCIE_MSG_PM_PME_MASK) {
- advk_writel(pcie, PCIE_MSG_PM_PME_MASK, PCIE_ISR0_REG);
+ if (isr0_status & PCIE_ISR0_MSG_PM_PME) {
+ advk_writel(pcie, PCIE_ISR0_MSG_PM_PME, PCIE_ISR0_REG);
/*
* PCIE_MSG_LOG_REG contains the last inbound message,
* so store requester id only when PME was not asserted yet.
@@ -1326,7 +1324,7 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
}

/* Process MSI interrupts */
- if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
+ if (isr0_status & PCIE_ISR0_MSI_INT)
advk_pcie_handle_msi(pcie);

/* Process legacy interrupts */
--
2.20.1

2021-05-06 15:40:52

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 42/42] PCI: aardvark: Add support for Advanced Error Reporting registers on emulated bridge

PCI aardvark hardware supports access to Advanced Error Reporting
configuration registers of PCIe core via PCIE_CORE_PCIERR_CAP.

Export them via emulated software root bridge through the new .read_ext and
.write_ext emulated bridge callbacks.

Note that in Advanced Error Reporting Capability header, the offset to the
next Extended Capability header is set, but it is not documented in Armada
3700 Functional Specification. As this change adds support only for
Advanced Error Reporting, explicitly clear PCI_EXT_CAP_NEXT bits in AER
capability header.

After this change, pcieport driver correctly detects AER support and allows
PCIe AER driver to start receiving ERR interrupts. It prints into dmesg:

[ 4.358401] pcieport 0000:00:00.0: AER: enabled with IRQ 52

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 74 +++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index ac3ee48e69d7..8914af62ccc3 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -683,11 +683,85 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
}
}

+static pci_bridge_emul_read_status_t
+advk_pci_bridge_emul_ext_conf_read(struct pci_bridge_emul *bridge,
+ int reg, u32 *value)
+{
+ struct advk_pcie *pcie = bridge->data;
+
+ switch (reg) {
+ case 0:
+ *value = advk_readl(pcie, PCIE_CORE_PCIERR_CAP + reg);
+ /*
+ * Clear PCI_EXT_CAP_NEXT bits as they are set to 0x150 offset.
+ * Armada 3700 Functional Specification does not contain any
+ * documentation about registers at that address, so explicitly
+ * mark Advanced Error Reporting Capability header as the end of
+ * Extended Capabilities.
+ */
+ *value &= 0x000fffff;
+ return PCI_BRIDGE_EMUL_HANDLED;
+
+ case PCI_ERR_UNCOR_STATUS:
+ case PCI_ERR_UNCOR_MASK:
+ case PCI_ERR_UNCOR_SEVER:
+ case PCI_ERR_COR_STATUS:
+ case PCI_ERR_COR_MASK:
+ case PCI_ERR_CAP:
+ case PCI_ERR_HEADER_LOG+0:
+ case PCI_ERR_HEADER_LOG+4:
+ case PCI_ERR_HEADER_LOG+8:
+ case PCI_ERR_HEADER_LOG+12:
+ case PCI_ERR_ROOT_COMMAND:
+ case PCI_ERR_ROOT_STATUS:
+ case PCI_ERR_ROOT_ERR_SRC:
+ *value = advk_readl(pcie, PCIE_CORE_PCIERR_CAP + reg);
+ return PCI_BRIDGE_EMUL_HANDLED;
+
+ default:
+ return PCI_BRIDGE_EMUL_NOT_HANDLED;
+ }
+}
+
+static void
+advk_pci_bridge_emul_ext_conf_write(struct pci_bridge_emul *bridge,
+ int reg, u32 old, u32 new, u32 mask)
+{
+ struct advk_pcie *pcie = bridge->data;
+
+ switch (reg) {
+ /* These are W1C registers, so clear other bits */
+ case PCI_ERR_UNCOR_STATUS:
+ case PCI_ERR_COR_STATUS:
+ case PCI_ERR_ROOT_STATUS:
+ new &= mask;
+ fallthrough;
+
+ case PCI_ERR_UNCOR_MASK:
+ case PCI_ERR_UNCOR_SEVER:
+ case PCI_ERR_COR_MASK:
+ case PCI_ERR_CAP:
+ case PCI_ERR_HEADER_LOG+0:
+ case PCI_ERR_HEADER_LOG+4:
+ case PCI_ERR_HEADER_LOG+8:
+ case PCI_ERR_HEADER_LOG+12:
+ case PCI_ERR_ROOT_COMMAND:
+ case PCI_ERR_ROOT_ERR_SRC:
+ advk_writel(pcie, new, PCIE_CORE_PCIERR_CAP + reg);
+ break;
+
+ default:
+ break;
+ }
+}
+
static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = {
.read_base = advk_pci_bridge_emul_base_conf_read,
.write_base = advk_pci_bridge_emul_base_conf_write,
.read_pcie = advk_pci_bridge_emul_pcie_conf_read,
.write_pcie = advk_pci_bridge_emul_pcie_conf_write,
+ .read_ext = advk_pci_bridge_emul_ext_conf_read,
+ .write_ext = advk_pci_bridge_emul_ext_conf_write,
};

/*
--
2.20.1

2021-05-06 19:20:14

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 01/42] PCI: aardvark: Fix kernel panic during PIO transfer

Trying to start a new PIO transfer by writing value 0 in PIO_START register
when previous transfer has not yet completed (which is indicated by value 1
in PIO_START) causes an External Abort on CPU, which results in kernel
panic:

SError Interrupt on CPU0, code 0xbf000002 -- SError
Kernel panic - not syncing: Asynchronous SError Interrupt

To prevent kernel panic, it is required to reject a new PIO transfer when
previous one has not finished yet.

If previous PIO transfer is not finished yet, the kernel may issue a new
PIO request only if the previous PIO transfer timed out.

In the past the root cause of this issue was incorrectly identified (as it
often happens during link retraining or after link down event) and special
hack was implemented in Trusted Firmware to catch all SError events in EL3,
to ignore errors with code 0xbf000002 and not forwarding any other errors
to kernel and instead throw panic from EL3 Trusted Firmware handler.

Links to discussion and patches about this issue:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
https://lore.kernel.org/linux-pci/[email protected]/
https://lore.kernel.org/linux-pci/[email protected]/
https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541

But the real cause was the fact that during link retraning or after link
down event the PIO transfer may take longer time, up to the 1.44s until it
times out. This increased probability that a new PIO transfer would be
issued by kernel while previous one has not finished yet.

After applying this change into the kernel, it is possible to revert the
mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected] # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
---
drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++-----
1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 051b48bd7985..e3f5e7ab7606 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -514,7 +514,7 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
udelay(PIO_RETRY_DELAY);
}

- dev_err(dev, "config read/write timed out\n");
+ dev_err(dev, "PIO read/write transfer time out\n");
return -ETIMEDOUT;
}

@@ -657,6 +657,35 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
return true;
}

+static bool advk_pcie_pio_is_running(struct advk_pcie *pcie)
+{
+ struct device *dev = &pcie->pdev->dev;
+
+ /*
+ * Trying to start a new PIO transfer when previous has not completed
+ * cause External Abort on CPU which results in kernel panic:
+ *
+ * SError Interrupt on CPU0, code 0xbf000002 -- SError
+ * Kernel panic - not syncing: Asynchronous SError Interrupt
+ *
+ * Functions advk_pcie_rd_conf() and advk_pcie_wr_conf() are protected
+ * by raw_spin_lock_irqsave() at pci_lock_config() level to prevent
+ * concurrent calls at the same time. But because PIO transfer may take
+ * about 1.5s when link is down or card is disconnected, it means that
+ * advk_pcie_wait_pio() does not always have to wait for completion.
+ *
+ * Some versions of ARM Trusted Firmware handles this External Abort at
+ * EL3 level and mask it to prevent kernel panic. Relevant TF-A commit:
+ * https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
+ */
+ if (advk_readl(pcie, PIO_START)) {
+ dev_err(dev, "Previous PIO read/write transfer is still running\n");
+ return true;
+ }
+
+ return false;
+}
+
static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
int where, int size, u32 *val)
{
@@ -673,9 +702,10 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
return pci_bridge_emul_conf_read(&pcie->bridge, where,
size, val);

- /* Start PIO */
- advk_writel(pcie, 0, PIO_START);
- advk_writel(pcie, 1, PIO_ISR);
+ if (advk_pcie_pio_is_running(pcie)) {
+ *val = 0xffffffff;
+ return PCIBIOS_SET_FAILED;
+ }

/* Program the control register */
reg = advk_readl(pcie, PIO_CTRL);
@@ -694,7 +724,8 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
/* Program the data strobe */
advk_writel(pcie, 0xf, PIO_WR_DATA_STRB);

- /* Start the transfer */
+ /* Clear PIO DONE ISR and start the transfer */
+ advk_writel(pcie, 1, PIO_ISR);
advk_writel(pcie, 1, PIO_START);

ret = advk_pcie_wait_pio(pcie);
@@ -734,9 +765,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
if (where % size)
return PCIBIOS_SET_FAILED;

- /* Start PIO */
- advk_writel(pcie, 0, PIO_START);
- advk_writel(pcie, 1, PIO_ISR);
+ if (advk_pcie_pio_is_running(pcie))
+ return PCIBIOS_SET_FAILED;

/* Program the control register */
reg = advk_readl(pcie, PIO_CTRL);
@@ -763,7 +793,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
/* Program the data strobe */
advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB);

- /* Start the transfer */
+ /* Clear PIO DONE ISR and start the transfer */
+ advk_writel(pcie, 1, PIO_ISR);
advk_writel(pcie, 1, PIO_START);

ret = advk_pcie_wait_pio(pcie);
--
2.20.1

2021-05-06 19:20:18

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 02/42] PCI: aardvark: Fix checking for PIO Non-posted Request

PIO_NON_POSTED_REQ for PIO_STAT register is incorrectly defined. Bit 10 in
register PIO_STAT indicates the response is to a non-posted request.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index e3f5e7ab7606..2f8380a1f84f 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -57,7 +57,7 @@
#define PIO_COMPLETION_STATUS_UR 1
#define PIO_COMPLETION_STATUS_CRS 2
#define PIO_COMPLETION_STATUS_CA 4
-#define PIO_NON_POSTED_REQ BIT(0)
+#define PIO_NON_POSTED_REQ BIT(10)
#define PIO_ADDR_LS (PIO_BASE_ADDR + 0x8)
#define PIO_ADDR_MS (PIO_BASE_ADDR + 0xc)
#define PIO_WR_DATA (PIO_BASE_ADDR + 0x10)
--
2.20.1

2021-05-06 19:20:23

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 05/42] PCI: pci-bridge-emul: Add PCIe Root Capabilities Register

This is 16-bit register at offset 0x1E. Rename current 'rsvd' struct member
to 'rootcap'.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
Cc: [email protected] # e0d9d30b7354 ("PCI: pci-bridge-emul: Fix big-endian support")
---
drivers/pci/pci-bridge-emul.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-bridge-emul.h b/drivers/pci/pci-bridge-emul.h
index b31883022a8e..49bbd37ee318 100644
--- a/drivers/pci/pci-bridge-emul.h
+++ b/drivers/pci/pci-bridge-emul.h
@@ -54,7 +54,7 @@ struct pci_bridge_emul_pcie_conf {
__le16 slotctl;
__le16 slotsta;
__le16 rootctl;
- __le16 rsvd;
+ __le16 rootcap;
__le32 rootsta;
__le32 devcap2;
__le16 devctl2;
--
2.20.1

2021-05-06 19:20:33

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 16/42] PCI: aardvark: Remove unneeded goto

Simplify advk_pcie_init_irq_domain() function.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index c50421af9d06..366d7480bc1b 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1013,7 +1013,6 @@ static int advk_pcie_init_irq_domain(struct advk_pcie *pcie)
struct device_node *node = dev->of_node;
struct device_node *pcie_intc_node;
struct irq_chip *irq_chip;
- int ret = 0;

pcie_intc_node = of_get_next_child(node, NULL);
if (!pcie_intc_node) {
@@ -1029,15 +1028,15 @@ static int advk_pcie_init_irq_domain(struct advk_pcie *pcie)
pcie->irq_domain =
irq_domain_add_linear(pcie_intc_node, PCI_NUM_INTX,
&advk_pcie_irq_domain_ops, pcie);
+
+ of_node_put(pcie_intc_node);
+
if (!pcie->irq_domain) {
dev_err(dev, "Failed to get a INTx IRQ domain\n");
- ret = -ENOMEM;
- goto out_put_node;
+ return -ENOMEM;
}

-out_put_node:
- of_node_put(pcie_intc_node);
- return ret;
+ return 0;
}

static void advk_pcie_remove_irq_domain(struct advk_pcie *pcie)
--
2.20.1

2021-05-06 19:20:39

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 07/42] PCI: aardvark: Fix link training

Fix multiple link training issues in aardvark driver. The main reason of
these issues was misunderstanding of what certain registers do, since their
names and comments were misleading: before commit 96be36dbffac ("PCI:
aardvark: Replace custom macros by standard linux/pci_regs.h macros"), the
pci-aardvark.c driver used custom macros for accessing standard PCIe Root
Bridge registers, and misleading comments did not help to understand what
the code was really doing.

After doing more tests and experiments I've come to the conclusion that the
SPEED_GEN register in aardvark sets the PCIe revision / generation
compliance and forces maximal link speed. Both GEN3 and GEN2 values set the
read-only PCI_EXP_FLAGS_VERS bits (PCIe capabilities version of Root
Bridge) to value 2, while GEN1 value sets PCI_EXP_FLAGS_VERS to 1, which
matches with PCI Express specifications revision 3, 2 and 1, respectively.
Changing SPEED_GEN also sets the read-only bits PCI_EXP_LNKCAP_SLS and
PCI_EXP_LNKCAP2_SLS to corresponding speed.

Note that PCI Express rev 1 specification does not define PCI_EXP_LNKCAP2
and PCI_EXP_LNKCTL2 registers and when SPEED_GEN is set to GEN1 (which also
sets PCI_EXP_FLAGS_VERS set to 1), lspci cannot access PCI_EXP_LNKCAP2 and
PCI_EXP_LNKCTL2 registers.

Changing PCIe link speed can be done via PCI_EXP_LNKCTL2_TLS bits of
PCI_EXP_LNKCTL2 register. Armada 3700 Functional Specifications says that
the default value of PCI_EXP_LNKCTL2_TLS is based on SPEED_GEN value, but
tests showed that the default value is always 8.0 GT/s, independently of
speed set by SPEED_GEN. So after setting SPEED_GEN, we must also set value
in PCI_EXP_LNKCTL2 register via PCI_EXP_LNKCTL2_TLS bits.

Triggering PCI_EXP_LNKCTL_RL bit immediately after setting LINK_TRAINING_EN
bit actually doesn't do anything. Tests have shown that a delay is needed
after enabling LINK_TRAINING_EN bit. As triggering PCI_EXP_LNKCTL_RL
currently does nothing, remove it.

Commit 43fc679ced18 ("PCI: aardvark: Improve link training") introduced
code which sets SPEED_GEN register based on negotiated link speed from
PCI_EXP_LNKSTA_CLS bits of PCI_EXP_LNKSTA register. This code was added to
fix detection of Compex WLE900VX (Atheros QCA9880) WiFi GEN1 PCIe cards, as
otherwise these cards were "invisible" on PCIe bus (probably because they
crashed). But apparently more people reported the same issues with these
cards also with other PCIe controllers [1] and I was able to reproduce this
issue also with other "noname" WiFi cards based on Atheros QCA9890 chip
(with the same PCI vendor/device ids as Atheros QCA9880). So this is not an
issue in aardvark but rather issue in Atheros QCA98xx chips. Also, this
issue only exists if the kernel is compiled with PCIe ASPM support, and a
generic workaround for this is to change PCIe Bridge to 2.5 GT/s link speed
via PCI_EXP_LNKCTL2_TLS_2_5GT bits in PCI_EXP_LNKCTL2 register [2], before
triggering PCI_EXP_LNKCTL_RL bit. This workaround also works when SPEED_GEN
is set to value GEN2 (5 GT/s). So remove this hack completely in the
aardvark driver and always set SPEED_GEN to value from 'max-link-speed' DT
property. Fix for Atheros QCA98xx chips is handled separately by patch [2].

These two things (code for triggering PCI_EXP_LNKCTL_RL bit and changing
SPEED_GEN value) also explain why commit 6964494582f5 ("PCI: aardvark:
Train link immediately after enabling training") somehow fixed detection of
those problematic Compex cards with Atheros chips: if triggering link
retraining (via PCI_EXP_LNKCTL_RL bit) was done immediately after enabling
link training (via LINK_TRAINING_EN), it did nothing. If there was a
specific delay, aardvark HW already initialized PCIe link and therefore
triggering link retraining caused the above issue. Compex cards triggered
link down event and disappeared from the PCIe bus.

Commit f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready before
training link") added 100ms sleep before calling 'Start link training'
command and explained that it is a requirement of PCI Express
specification. But the code after this 100ms sleep was not doing 'Start
link training', rather it triggered PCI_EXP_LNKCTL_RL bit via PCIe Root
Bridge to put link into Recovery state.

The required delay after fundamental reset is already done in function
advk_pcie_wait_for_link() which also check when PCIe link is up.

So after removing the code which triggers PCI_EXP_LNKCTL_RL bit on PCIe
Root Bridge, there is no need to wait 100ms again. Remove the extra
msleep() call and update comment about the delay required by the PCI
Express specification.

According to Marvell Armada 3700 Functional Specifications, Link training
should be enabled via aardvark register LINK_TRAINING_EN after selecting
PCIe generation and x1 lane. There is no need to disable it prior resetting
card via PERST# signal. This disabling code was introduced in commit
5169a9851daa ("PCI: aardvark: Issue PERST via GPIO") as a workaround for
some Atheros cards. It turns out that this also is Atheros specific issue
and affects any PCIe controller, not only aardvark. Moreover this Atheros
issue was triggered by juggling with PCI_EXP_LNKCTL_RL, LINK_TRAINING_EN
and SPEED_GEN bits interleaved with sleeps. Now, after removing triggering
PCI_EXP_LNKCTL_RL, there is no need to explicitly disable LINK_TRAINING_EN
bit. So remove this code too. The problematic Compex cards described in
previous git commits are correctly detected in advk_pcie_train_link()
function even after applying all these changes.

Note that with this patch, and also prior this patch, some NVMe disks which
support PCIe GEN3 with 8 GT/s speed are negotiated only at the lowest link
speed 2.5 GT/s, independently of SPEED_GEN value. After manually triggering
PCI_EXP_LNKCTL_RL bit (e.g. from userspace via setpci), these NVMe disks
change link speed to 5 GT/s when SPEED_GEN was configured to GEN2. This
issue first needs to be properly investigated. I will send a fix in the
future.

On the other hand, some other GEN2 PCIe cards with 5 GT/s speed are
autonomously by HW autonegotiated at full 5 GT/s speed without need of any
software interaction.

Armada 3700 Functional Specifications describes the following steps for
link training: set SPEED_GEN to GEN2, enable LINK_TRAINING_EN, poll until
link training is complete, trigger PCI_EXP_LNKCTL_RL, poll until signal
rate is 5 GT/s, poll until link training is complete, enable ASPM L0s.

The requirement for triggering PCI_EXP_LNKCTL_RL can be explained by the
need to achieve 5 GT/s speed (as changing link speed is done by throw to
recovery state entered by PCI_EXP_LNKCTL_RL) or maybe as a part of enabling
ASPM L0s (but in this case ASPM L0s should have been enabled prior
PCI_EXP_LNKCTL_RL).

It is unknown why the original pci-aardvark.c driver was triggering
PCI_EXP_LNKCTL_RL bit before waiting for the link to be up. This does not
align with neither PCIe base specifications nor with Armada 3700 Functional
Specification. (Note that in older versions of aardvark, this bit was
called incorrectly PCIE_CORE_LINK_TRAINING, so this may be the reason.)

It is also unknown why Armada 3700 Functional Specification says that it is
needed to trigger PCI_EXP_LNKCTL_RL for GEN2 mode, as according to PCIe
base specification 5 GT/s speed negotiation is supposed to be entirely
autonomous, even if initial speed is 2.5 GT/s.

[1] - https://lore.kernel.org/linux-pci/[email protected]/
[2] - https://lore.kernel.org/linux-pci/[email protected]/

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected] # f4c7d053d7f7 ("PCI: aardvark: Wait for endpoint to be ready before training link")
Cc: [email protected] # 6964494582f5 ("PCI: aardvark: Train link immediately after enabling training")
Cc: [email protected] # 43fc679ced18 ("PCI: aardvark: Improve link training")
Cc: [email protected] # 5169a9851daa ("PCI: aardvark: Issue PERST via GPIO")
Cc: [email protected] # 96be36dbffac ("PCI: aardvark: Replace custom macros by standard linux/pci_regs.h macros")
Cc: [email protected] # d0c6a3475b03 ("PCI: aardvark: Move PCIe reset card code to advk_pcie_train_link()")
Cc: [email protected] # 1d1cd163d0de ("PCI: aardvark: Update comment about disabling link training")
---
drivers/pci/controller/pci-aardvark.c | 117 ++++++++------------------
1 file changed, 34 insertions(+), 83 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index e297ec9ec390..edeacd95297e 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -209,11 +209,6 @@ static inline u32 advk_readl(struct advk_pcie *pcie, u64 reg)
return readl(pcie->base + reg);
}

-static inline u16 advk_read16(struct advk_pcie *pcie, u64 reg)
-{
- return advk_readl(pcie, (reg & ~0x3)) >> ((reg & 0x3) * 8);
-}
-
static int advk_pcie_link_up(struct advk_pcie *pcie)
{
u32 val, ltssm_state;
@@ -251,23 +246,9 @@ static void advk_pcie_wait_for_retrain(struct advk_pcie *pcie)

static void advk_pcie_issue_perst(struct advk_pcie *pcie)
{
- u32 reg;
-
if (!pcie->reset_gpio)
return;

- /*
- * As required by PCI Express spec (PCI Express Base Specification, REV.
- * 4.0 PCI Express, February 19 2014, 6.6.1 Conventional Reset) a delay
- * for at least 100ms after de-asserting PERST# signal is needed before
- * link training is enabled. So ensure that link training is disabled
- * prior de-asserting PERST# signal to fulfill that PCI Express spec
- * requirement.
- */
- reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
- reg &= ~LINK_TRAINING_EN;
- advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
-
/* 10ms delay is needed for some cards */
dev_info(&pcie->pdev->dev, "issuing PERST via reset GPIO for 10ms\n");
gpiod_set_value_cansleep(pcie->reset_gpio, 1);
@@ -275,53 +256,46 @@ static void advk_pcie_issue_perst(struct advk_pcie *pcie)
gpiod_set_value_cansleep(pcie->reset_gpio, 0);
}

-static int advk_pcie_train_at_gen(struct advk_pcie *pcie, int gen)
+static void advk_pcie_train_link(struct advk_pcie *pcie)
{
- int ret, neg_gen;
+ struct device *dev = &pcie->pdev->dev;
u32 reg;
+ int ret;

- /* Setup link speed */
+ /*
+ * Setup PCIe rev / gen compliance based on device tree property
+ * 'max-link-speed' which also forces maximal link speed.
+ */
reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
reg &= ~PCIE_GEN_SEL_MSK;
- if (gen == 3)
+ if (pcie->link_gen == 3)
reg |= SPEED_GEN_3;
- else if (gen == 2)
+ else if (pcie->link_gen == 2)
reg |= SPEED_GEN_2;
else
reg |= SPEED_GEN_1;
advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

/*
- * Enable link training. This is not needed in every call to this
- * function, just once suffices, but it does not break anything either.
+ * Set maximal link speed value also into PCIe Link Control 2 register.
+ * Armada 3700 Functional Specification says that default value is based
+ * on SPEED_GEN but tests showed that default value is always 8.0 GT/s.
*/
+ reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL2);
+ reg &= ~PCI_EXP_LNKCTL2_TLS;
+ if (pcie->link_gen == 3)
+ reg |= PCI_EXP_LNKCTL2_TLS_8_0GT;
+ else if (pcie->link_gen == 2)
+ reg |= PCI_EXP_LNKCTL2_TLS_5_0GT;
+ else
+ reg |= PCI_EXP_LNKCTL2_TLS_2_5GT;
+ advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL2);
+
+ /* Enable link training after selecting PCIe generation */
reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
reg |= LINK_TRAINING_EN;
advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

- /*
- * Start link training immediately after enabling it.
- * This solves problems for some buggy cards.
- */
- reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL);
- reg |= PCI_EXP_LNKCTL_RL;
- advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKCTL);
-
- ret = advk_pcie_wait_for_link(pcie);
- if (ret)
- return ret;
-
- reg = advk_read16(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_LNKSTA);
- neg_gen = reg & PCI_EXP_LNKSTA_CLS;
-
- return neg_gen;
-}
-
-static void advk_pcie_train_link(struct advk_pcie *pcie)
-{
- struct device *dev = &pcie->pdev->dev;
- int neg_gen = -1, gen;
-
/*
* Reset PCIe card via PERST# signal. Some cards are not detected
* during link training when they are in some non-initial state.
@@ -332,41 +306,18 @@ static void advk_pcie_train_link(struct advk_pcie *pcie)
* PERST# signal could have been asserted by pinctrl subsystem before
* probe() callback has been called or issued explicitly by reset gpio
* function advk_pcie_issue_perst(), making the endpoint going into
- * fundamental reset. As required by PCI Express spec a delay for at
- * least 100ms after such a reset before link training is needed.
- */
- msleep(PCI_PM_D3COLD_WAIT);
-
- /*
- * Try link training at link gen specified by device tree property
- * 'max-link-speed'. If this fails, iteratively train at lower gen.
- */
- for (gen = pcie->link_gen; gen > 0; --gen) {
- neg_gen = advk_pcie_train_at_gen(pcie, gen);
- if (neg_gen > 0)
- break;
- }
-
- if (neg_gen < 0)
- goto err;
-
- /*
- * After successful training if negotiated gen is lower than requested,
- * train again on negotiated gen. This solves some stability issues for
- * some buggy gen1 cards.
+ * fundamental reset. As required by PCI Express spec (PCI Express
+ * Base Specification, REV. 4.0 PCI Express, February 19 2014, 6.6.1
+ * Conventional Reset) a delay for at least 100ms after such a reset
+ * before sending a Configuration Request to the device is needed.
+ * So wait until PCIe link is up. Function advk_pcie_wait_for_link()
+ * waits for link at least 900ms.
*/
- if (neg_gen < gen) {
- gen = neg_gen;
- neg_gen = advk_pcie_train_at_gen(pcie, gen);
- }
-
- if (neg_gen == gen) {
- dev_info(dev, "link up at gen %i\n", gen);
- return;
- }
-
-err:
- dev_err(dev, "link never came up\n");
+ ret = advk_pcie_wait_for_link(pcie);
+ if (ret < 0)
+ dev_err(dev, "link never came up\n");
+ else
+ dev_info(dev, "link up\n");
}

static void advk_pcie_setup_hw(struct advk_pcie *pcie)
--
2.20.1

2021-05-06 19:21:04

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 14/42] PCI: aardvark: Don't mask irq when mapping

By default, all Legacy INTx interrupts are masked, so there is no need to
mask this interrupt during irq_map callback.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 2aced8c9ae9f..08f1157e1c5e 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -940,7 +940,6 @@ static int advk_pcie_irq_map(struct irq_domain *h,
{
struct advk_pcie *pcie = h->host_data;

- advk_pcie_irq_mask(irq_get_irq_data(virq));
irq_set_status_flags(virq, IRQ_LEVEL);
irq_set_chip_and_handler(virq, &pcie->irq_chip,
handle_level_irq);
--
2.20.1

2021-05-06 19:21:04

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 06/42] PCI: aardvark: Fix reporting CRS Software Visibility on emulated bridge

CRS Software Visibility is supported and always enabled by PIO.
Correctly report this information via emulated root bridge.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 3f3c72927afb..e297ec9ec390 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -578,6 +578,8 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
case PCI_EXP_RTCTL: {
u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
*value = (val & PCIE_MSG_PM_PME_MASK) ? 0 : PCI_EXP_RTCTL_PMEIE;
+ *value |= PCI_EXP_RTCTL_CRSSVE;
+ *value |= PCI_EXP_RTCAP_CRSVIS << 16;
return PCI_BRIDGE_EMUL_HANDLED;
}

@@ -659,6 +661,7 @@ static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = {
static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
{
struct pci_bridge_emul *bridge = &pcie->bridge;
+ int ret;

bridge->conf.vendor =
cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff);
@@ -682,7 +685,16 @@ static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
bridge->data = pcie;
bridge->ops = &advk_pci_bridge_emul_ops;

- return pci_bridge_emul_init(bridge, 0);
+ /* PCIe config space can be initialized after pci_bridge_emul_init() */
+ ret = pci_bridge_emul_init(bridge, 0);
+ if (ret < 0)
+ return ret;
+
+ /* Completion Retry Status is supported and always enabled by PIO */
+ bridge->pcie_conf.rootctl = cpu_to_le16(PCI_EXP_RTCTL_CRSSVE);
+ bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS);
+
+ return 0;
}

static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
--
2.20.1

2021-05-06 19:21:25

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 25/42] PCI: aardvark: Fix support for PME requester on emulated bridge

PME requester id is stored in the PCIE_MSG_LOG_REG register, which contains
the last inbound message. So when new inbound message is received by HW
(including non-PM), the content in PCIE_MSG_LOG_REG register is replaced by
a new value.

PCIe specification mandates that subsequent PMEs are kept pending until the
PME Status Register bit is cleared by software by writing a 1b.

Enable aardvark PME interrupt unconditionally by unmasking it and read PME
requester id from PCIE_MSG_LOG_REG register to emulated bridge config space
immediately after receiving interrupt.

Support for masking/unmasking PME interrupt on emulated bridge via
PCI_EXP_RTCTL_PMEIE bit is now implemented only in emulated bridge config
space, to ensure that we do not miss any aardvark PME interrupt.

Reading of PCI_EXP_RTCAP and PCI_EXP_RTSTA registers is simplified as final
value is now always stored into emulated bridge config space by the
interrupt handler, so there is no need to implement support for these
registers in read_pcie callback.

Clearing of W1C bit PCI_EXP_RTSTA_PME is now also simplified as it is done
by pci-bridge-emul.c code for emulated bridge config space. So there is no
need to implement support for clearing this bit in write_pcie callback.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected] # c0f05a6ab525 ("PCI: aardvark: Fix PCI_EXP_RTCTL register configuration")
---
drivers/pci/controller/pci-aardvark.c | 81 ++++++++++++---------------
1 file changed, 36 insertions(+), 45 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index fac48797d922..6c860e67e5a2 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -424,6 +424,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg &= ~(PCIE_ISR0_FAT_ERR | PCIE_ISR0_NFAT_ERR | PCIE_ISR0_CORR_ERR);
advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);

+ /* Unmask PME interrupt for processing of PME requester */
+ reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
+ reg &= ~PCIE_MSG_PM_PME_MASK;
+ advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
+
/* Enable summary interrupt for GIC SPI source */
reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
advk_writel(pcie, reg, HOST_CTRL_INT_MASK_REG);
@@ -557,30 +562,17 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
{
struct advk_pcie *pcie = bridge->data;

+ /*
+ * PCI_EXP_RTCTL and PCI_EXP_RTSTA registers are fully supported
+ * but their values are stored only in emulated config space buffer.
+ * So there is no need to handle them in read_pcie callback.
+ */

switch (reg) {
case PCI_EXP_SLTCTL:
*value = PCI_EXP_SLTSTA_PDS << 16;
return PCI_BRIDGE_EMUL_HANDLED;

- case PCI_EXP_RTCTL: {
- u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
- *value = (val & PCIE_MSG_PM_PME_MASK) ? 0 : PCI_EXP_RTCTL_PMEIE;
- *value |= PCI_EXP_RTCTL_CRSSVE;
- *value |= PCI_EXP_RTCAP_CRSVIS << 16;
- return PCI_BRIDGE_EMUL_HANDLED;
- }
-
- case PCI_EXP_RTSTA: {
- u32 isr0 = advk_readl(pcie, PCIE_ISR0_REG);
- u32 msglog = advk_readl(pcie, PCIE_MSG_LOG_REG);
- u32 val = msglog >> 16;
- if (isr0 & PCIE_MSG_PM_PME_MASK)
- val |= PCI_EXP_RTSTA_PME;
- *value = val;
- return PCI_BRIDGE_EMUL_HANDLED;
- }
-
case PCI_EXP_LNKCTL: {
/* u32 contains both PCI_EXP_LNKCTL and PCI_EXP_LNKSTA */
u32 val = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg) &
@@ -609,6 +601,12 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
{
struct advk_pcie *pcie = bridge->data;

+ /*
+ * PCI_EXP_RTCTL and PCI_EXP_RTSTA registers are fully supported
+ * but their values are stored only in emulated config space buffer.
+ * So there is no need to handle them in write_pcie callback.
+ */
+
switch (reg) {
case PCI_EXP_DEVCTL:
advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
@@ -620,23 +618,6 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
advk_pcie_wait_for_retrain(pcie);
break;

- case PCI_EXP_RTCTL:
- /* Only mask/unmask PME interrupt */
- if (mask & PCI_EXP_RTCTL_PMEIE) {
- u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
- if ((new & PCI_EXP_RTCTL_PMEIE) == 0)
- val |= PCIE_MSG_PM_PME_MASK;
- else
- val &= ~PCIE_MSG_PM_PME_MASK;
- advk_writel(pcie, val, PCIE_ISR0_MASK_REG);
- }
- break;
-
- case PCI_EXP_RTSTA:
- if (new & PCI_EXP_RTSTA_PME)
- advk_writel(pcie, PCIE_MSG_PM_PME_MASK, PCIE_ISR0_REG);
- break;
-
default:
break;
}
@@ -1224,19 +1205,29 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
if (!isr0_status && !isr1_status)
return;

- /* Process PME interrupt */
+ /* Process PME interrupt as the first one to do not miss PME requester id */
if (isr0_status & PCIE_MSG_PM_PME_MASK) {
+ advk_writel(pcie, PCIE_MSG_PM_PME_MASK, PCIE_ISR0_REG);
/*
- * Do not clear PME interrupt bit in ISR0, it is cleared by IRQ
- * receiver by writing to the PCI_EXP_RTSTA register of emulated
- * root bridge. Aardvark HW returns zero for PCI_EXP_FLAGS_IRQ,
- * so use PCIe interrupt 0.
+ * PCIE_MSG_LOG_REG contains the last inbound message,
+ * so store requester id only when PME was not asserted yet.
+ * Also do not trigger PME interrupt when PME is still asserted.
*/
- virq = irq_find_mapping(pcie->irq_domain, 0);
- if (virq)
- generic_handle_irq(virq);
- else
- dev_err(&pcie->pdev->dev, "unexpected PME IRQ\n");
+ if (!(le32_to_cpu(pcie->bridge.pcie_conf.rootsta) & PCI_EXP_RTSTA_PME)) {
+ u32 requester = advk_readl(pcie, PCIE_MSG_LOG_REG) >> 16;
+ pcie->bridge.pcie_conf.rootsta = cpu_to_le32(requester | PCI_EXP_RTSTA_PME);
+ /*
+ * Trigger PME interrupt only in case when PMEIE bit in Root Control is set.
+ * Aardvark HW returns zero for PCI_EXP_FLAGS_IRQ, so use PCIe interrupt 0.
+ */
+ if (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) & PCI_EXP_RTCTL_PMEIE) {
+ virq = irq_find_mapping(pcie->irq_domain, 0);
+ if (virq)
+ generic_handle_irq(virq);
+ else
+ dev_err(&pcie->pdev->dev, "unexpected PME IRQ\n");
+ }
+ }
}

/* Process ERR interrupt */
--
2.20.1

2021-05-06 19:21:29

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 26/42] PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on emulated bridge

From very vague, ambiguous and incomplete information from Marvell we
deduced that the 32-bit aardvark register 0x4 (PCIE_CORE_CMD_STATUS_REG),
which is not documented in Armada 3700 Functional Specifications for PCIe
Root Complex mode, should control two 16-bit PCIe registers: Command
Register and Status Registers of virtual PCIe Root Bridge.

This means that bit 2 controls bus mastering and forwarding of memory and
I/O requests in the upstream direction. According to PCI specifications
bits [0:2] of Command Register, this should be by default disabled on
reset. So explicitly disable these bits at early beginning of aardvark
initialization.

Also remove code which unconditionally enables all 3 bits and let kernel
code (via pci_set_master() function) to handle bus mastering of Root PCIe
Bridge via emulated PCI_COMMAND on emulated bridge.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected] # b2a56469d550 ("PCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access")
---
drivers/pci/controller/pci-aardvark.c | 54 +++++++++++++++++++--------
1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 6c860e67e5a2..92f93ec48d6b 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -31,9 +31,6 @@
/* PCIe core registers */
#define PCIE_CORE_DEV_ID_REG 0x0
#define PCIE_CORE_CMD_STATUS_REG 0x4
-#define PCIE_CORE_CMD_IO_ACCESS_EN BIT(0)
-#define PCIE_CORE_CMD_MEM_ACCESS_EN BIT(1)
-#define PCIE_CORE_CMD_MEM_IO_REQ_EN BIT(2)
#define PCIE_CORE_DEV_REV_REG 0x8
#define PCIE_CORE_PCIEXP_CAP 0xc0
#define PCIE_CORE_ERR_CAPCTL_REG 0x118
@@ -365,6 +362,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg = (PCI_VENDOR_ID_MARVELL << 16) | PCI_VENDOR_ID_MARVELL;
advk_writel(pcie, reg, VENDOR_ID_REG);

+ /* Disable Root Bridge I/O space, memory space and bus mastering */
+ reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
+ reg &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
+ advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG);
+
/* Set Advanced Error Capabilities and Control PF0 register */
reg = PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX |
PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN |
@@ -443,19 +445,6 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
advk_writel(pcie, reg, PIO_CTRL);

advk_pcie_train_link(pcie);
-
- /*
- * FIXME: The following register update is suspicious. This register is
- * applicable only when the PCI controller is configured for Endpoint
- * mode, not as a Root Complex. But apparently when this code is
- * removed, some cards stop working. This should be investigated and
- * a comment explaining this should be put here.
- */
- reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
- reg |= PCIE_CORE_CMD_MEM_ACCESS_EN |
- PCIE_CORE_CMD_IO_ACCESS_EN |
- PCIE_CORE_CMD_MEM_IO_REQ_EN;
- advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG);
}

static int advk_pcie_check_pio_status(struct advk_pcie *pcie, u32 *val)
@@ -555,6 +544,37 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
return -ETIMEDOUT;
}

+static pci_bridge_emul_read_status_t
+advk_pci_bridge_emul_base_conf_read(struct pci_bridge_emul *bridge,
+ int reg, u32 *value)
+{
+ struct advk_pcie *pcie = bridge->data;
+
+ switch (reg) {
+ case PCI_COMMAND:
+ *value = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
+ return PCI_BRIDGE_EMUL_HANDLED;
+
+ default:
+ return PCI_BRIDGE_EMUL_NOT_HANDLED;
+ }
+}
+
+static void
+advk_pci_bridge_emul_base_conf_write(struct pci_bridge_emul *bridge,
+ int reg, u32 old, u32 new, u32 mask)
+{
+ struct advk_pcie *pcie = bridge->data;
+
+ switch (reg) {
+ case PCI_COMMAND:
+ advk_writel(pcie, new, PCIE_CORE_CMD_STATUS_REG);
+ break;
+
+ default:
+ break;
+ }
+}

static pci_bridge_emul_read_status_t
advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
@@ -624,6 +644,8 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
}

static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = {
+ .read_base = advk_pci_bridge_emul_base_conf_read,
+ .write_base = advk_pci_bridge_emul_base_conf_write,
.read_pcie = advk_pci_bridge_emul_pcie_conf_read,
.write_pcie = advk_pci_bridge_emul_pcie_conf_write,
};
--
2.20.1

2021-05-06 19:22:01

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 29/42] PCI: aardvark: Reset PCIe card and disable PHY when unbinding driver

When unbinding driver, assert PERST# signal which prepares PCIe card for
power down. Then disable link training and PHY.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module")
---
drivers/pci/controller/pci-aardvark.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 4b531675db81..b1e6a8a839e0 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1494,6 +1494,18 @@ static int advk_pcie_remove(struct platform_device *pdev)
/* Free config space for emulated root bridge */
pci_bridge_emul_cleanup(&pcie->bridge);

+ /* Assert PERST# signal which prepares PCIe card for power down */
+ if (pcie->reset_gpio)
+ gpiod_set_value_cansleep(pcie->reset_gpio, 1);
+
+ /* Disable link training */
+ val = advk_readl(pcie, PCIE_CORE_CTRL0_REG);
+ val &= ~LINK_TRAINING_EN;
+ advk_writel(pcie, val, PCIE_CORE_CTRL0_REG);
+
+ /* Disable phy */
+ advk_pcie_disable_phy(pcie);
+
return 0;
}

--
2.20.1

2021-05-06 19:22:03

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 19/42] PCI: aardvark: Fix setting MSI address

MSI address for receiving MSI interrupts needs to be correctly set before
enabling processing of MSI interrupts.

Move code for setting PCIE_MSI_ADDR_LOW_REG and PCIE_MSI_ADDR_HIGH_REG
registers with MSI address from advk_pcie_init_msi_irq_domain() function to
advk_pcie_setup_hw() function before enabling PCIE_CORE_CTRL2_MSI_ENABLE.

As part of this change, also remove unused variable msi_msg, which was used
only for MSI doorbell address. MSI address can be any address which does
not conflict with PCI space. So change it to the address of the main struct
advk_pcie.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
---
drivers/pci/controller/pci-aardvark.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 5e0243b2c473..199015215779 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -195,7 +195,6 @@ struct advk_pcie {
struct msi_domain_info msi_domain_info;
DECLARE_BITMAP(msi_used, MSI_IRQ_NUM);
struct mutex msi_used_lock;
- u16 msi_msg;
int link_gen;
struct pci_bridge_emul bridge;
struct gpio_desc *reset_gpio;
@@ -325,6 +324,7 @@ static void advk_pcie_train_link(struct advk_pcie *pcie)

static void advk_pcie_setup_hw(struct advk_pcie *pcie)
{
+ phys_addr_t msi_addr;
u32 reg;

/* Enable TX */
@@ -381,6 +381,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg |= LANE_COUNT_1;
advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

+ /* Set MSI address */
+ msi_addr = virt_to_phys(pcie);
+ advk_writel(pcie, lower_32_bits(msi_addr), PCIE_MSI_ADDR_LOW_REG);
+ advk_writel(pcie, upper_32_bits(msi_addr), PCIE_MSI_ADDR_HIGH_REG);
+
/* Enable MSI */
reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG);
reg |= PCIE_CORE_CTRL2_MSI_ENABLE;
@@ -862,10 +867,10 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
struct msi_msg *msg)
{
struct advk_pcie *pcie = irq_data_get_irq_chip_data(data);
- phys_addr_t msi_msg = virt_to_phys(&pcie->msi_msg);
+ phys_addr_t msi_addr = virt_to_phys(pcie);

- msg->address_lo = lower_32_bits(msi_msg);
- msg->address_hi = upper_32_bits(msi_msg);
+ msg->address_lo = lower_32_bits(msi_addr);
+ msg->address_hi = upper_32_bits(msi_addr);
msg->data = data->hwirq;
}

@@ -960,7 +965,6 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
struct device_node *node = dev->of_node;
struct irq_chip *bottom_ic, *msi_ic;
struct msi_domain_info *msi_di;
- phys_addr_t msi_msg_phys;

mutex_init(&pcie->msi_used_lock);

@@ -978,13 +982,6 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
MSI_FLAG_MULTI_PCI_MSI;
msi_di->chip = msi_ic;

- msi_msg_phys = virt_to_phys(&pcie->msi_msg);
-
- advk_writel(pcie, lower_32_bits(msi_msg_phys),
- PCIE_MSI_ADDR_LOW_REG);
- advk_writel(pcie, upper_32_bits(msi_msg_phys),
- PCIE_MSI_ADDR_HIGH_REG);
-
pcie->msi_inner_domain =
irq_domain_add_linear(NULL, MSI_IRQ_NUM,
&advk_msi_domain_ops, pcie);
--
2.20.1

2021-05-06 19:22:23

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 23/42] PCI: aardvark: Fix support for ERR interrupt on emulated bridge

ERR interrupt is triggered when corresponding bit is unmasked in both ISR0
and PCI_EXP_DEVCTL registers. Unmasking ERR bits in PCI_EXP_DEVCTL register
is not enough. This means that currently the ERR interrupt is never
triggered.

Unmask ERR bits in ISR0 register at driver probe time. ERR interrupt is not
triggered until ERR bits are unmasked also in PCI_EXP_DEVCTL register,
which is done by AER driver. So it is safe to unconditionally unmask all
ERR bits in aardvark probe.

Aardvark HW sets PCI_ERR_ROOT_AER_IRQ to zero and when corresponding bits
in ISR0 and PCI_EXP_DEVCTL are enabled, the HW triggers a generic interrupt
on GIC. Chain this interrupt to PCIe interrupt 0 with generic_handle_irq()
to allow processing of ERR interrupts.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected]
---
drivers/pci/controller/pci-aardvark.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 8a5133226e41..2ea58ba10a97 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -104,6 +104,9 @@
#define PCIE_MSG_PM_PME_MASK BIT(7)
#define PCIE_ISR0_MASK_REG (CONTROL_BASE_ADDR + 0x44)
#define PCIE_ISR0_MSI_INT_PENDING BIT(24)
+#define PCIE_ISR0_CORR_ERR BIT(11)
+#define PCIE_ISR0_NFAT_ERR BIT(12)
+#define PCIE_ISR0_FAT_ERR BIT(13)
#define PCIE_ISR0_INTX_ASSERT(val) BIT(16 + (val))
#define PCIE_ISR0_INTX_DEASSERT(val) BIT(20 + (val))
#define PCIE_ISR0_ALL_MASK GENMASK(26, 0)
@@ -416,6 +419,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg &= ~PCIE_ISR0_MSI_INT_PENDING;
advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);

+ /* Unmask bits for ERR interrupt */
+ reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
+ reg &= ~(PCIE_ISR0_FAT_ERR | PCIE_ISR0_NFAT_ERR | PCIE_ISR0_CORR_ERR);
+ advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
+
/* Enable summary interrupt for GIC SPI source */
reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
advk_writel(pcie, reg, HOST_CTRL_INT_MASK_REG);
@@ -1195,6 +1203,7 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
{
u32 isr0_val, isr0_mask, isr0_status;
u32 isr1_val, isr1_mask, isr1_status;
+ u32 err_bits;
int i, virq;

isr0_val = advk_readl(pcie, PCIE_ISR0_REG);
@@ -1205,9 +1214,22 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
isr1_mask = advk_readl(pcie, PCIE_ISR1_MASK_REG);
isr1_status = isr1_val & ((~isr1_mask) & PCIE_ISR1_ALL_MASK);

+ err_bits = isr0_status & (PCIE_ISR0_FAT_ERR | PCIE_ISR0_NFAT_ERR | PCIE_ISR0_CORR_ERR);
+
if (!isr0_status && !isr1_status)
return;

+ /* Process ERR interrupt */
+ if (err_bits) {
+ advk_writel(pcie, err_bits, PCIE_ISR0_REG);
+ /* Aardvark HW returns zero for PCI_ERR_ROOT_AER_IRQ, so use PCIe interrupt 0 */
+ virq = irq_find_mapping(pcie->irq_domain, 0);
+ if (virq)
+ generic_handle_irq(virq);
+ else
+ dev_err(&pcie->pdev->dev, "unexpected ERR IRQ\n");
+ }
+
/* Process MSI interrupts */
if (isr0_status & PCIE_ISR0_MSI_INT_PENDING)
advk_pcie_handle_msi(pcie);
--
2.20.1

2021-05-06 19:22:28

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 28/42] PCI: aardvark: Free config space for emulated root bridge when unbinding driver to fix memory leak

Do it after disabling and masking all interrupts, since aardvark interrupt
handler accesses config space of emulated root bridge.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module")
---
drivers/pci/controller/pci-aardvark.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 28ddffce1bec..4b531675db81 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1491,6 +1491,9 @@ static int advk_pcie_remove(struct platform_device *pdev)
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);

+ /* Free config space for emulated root bridge */
+ pci_bridge_emul_cleanup(&pcie->bridge);
+
return 0;
}

--
2.20.1

2021-05-06 19:22:47

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 30/42] PCI: aardvark: Rewrite irq code to chained irq handler

advk_pcie_irq_handler() reads irq status bits and calls other functions
based on which bits are set. These function then reads its own irq status
bits and calls other aardvark functions based on these bits. Finally
generic_handle_irq() with translated linux irq numbers are called.

Rewrite the code to use irq_set_chained_handler_and_data() handler with
chained_irq_enter() and chained_irq_exit() processing instead of using
devm_request_irq().

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 45 ++++++++++++++-------------
1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index b1e6a8a839e0..f2ed276b7e18 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -193,6 +193,7 @@ struct advk_msi_range {
struct advk_pcie {
struct platform_device *pdev;
void __iomem *base;
+ int irq;
struct irq_domain *irq_domain;
struct irq_chip irq_chip;
struct irq_domain *msi_domain;
@@ -1283,21 +1284,24 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
}
}

-static irqreturn_t advk_pcie_irq_handler(int irq, void *arg)
+static void advk_pcie_irq_handler(struct irq_desc *desc)
{
- struct advk_pcie *pcie = arg;
- u32 status;
+ struct advk_pcie *pcie = irq_desc_get_handler_data(desc);
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ u32 val, mask, status;

- status = advk_readl(pcie, HOST_CTRL_INT_STATUS_REG);
- if (!(status & PCIE_IRQ_CORE_INT))
- return IRQ_NONE;
+ chained_irq_enter(chip, desc);

- advk_pcie_handle_int(pcie);
+ val = advk_readl(pcie, HOST_CTRL_INT_STATUS_REG);
+ mask = advk_readl(pcie, HOST_CTRL_INT_MASK_REG);
+ status = val & ((~mask) & PCIE_IRQ_ALL_MASK);

- /* Clear interrupt */
- advk_writel(pcie, PCIE_IRQ_CORE_INT, HOST_CTRL_INT_STATUS_REG);
+ if (status & PCIE_IRQ_CORE_INT) {
+ advk_pcie_handle_int(pcie);
+ advk_writel(pcie, PCIE_IRQ_CORE_INT, HOST_CTRL_INT_STATUS_REG);
+ }

- return IRQ_HANDLED;
+ chained_irq_exit(chip, desc);
}

static void __maybe_unused advk_pcie_disable_phy(struct advk_pcie *pcie)
@@ -1363,7 +1367,7 @@ static int advk_pcie_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct advk_pcie *pcie;
struct pci_host_bridge *bridge;
- int ret, irq;
+ int ret;

bridge = devm_pci_alloc_host_bridge(dev, sizeof(struct advk_pcie));
if (!bridge)
@@ -1377,17 +1381,9 @@ static int advk_pcie_probe(struct platform_device *pdev)
if (IS_ERR(pcie->base))
return PTR_ERR(pcie->base);

- irq = platform_get_irq(pdev, 0);
- if (irq < 0)
- return irq;
-
- ret = devm_request_irq(dev, irq, advk_pcie_irq_handler,
- IRQF_SHARED | IRQF_NO_THREAD, "advk-pcie",
- pcie);
- if (ret) {
- dev_err(dev, "Failed to register interrupt\n");
- return ret;
- }
+ pcie->irq = platform_get_irq(pdev, 0);
+ if (pcie->irq < 0)
+ return pcie->irq;

pcie->reset_gpio = devm_gpiod_get_from_of_node(dev, dev->of_node,
"reset-gpios", 0,
@@ -1436,6 +1432,8 @@ static int advk_pcie_probe(struct platform_device *pdev)
return ret;
}

+ irq_set_chained_handler_and_data(pcie->irq, advk_pcie_irq_handler, pcie);
+
bridge->sysdata = pcie;
bridge->ops = &advk_pcie_ops;

@@ -1443,6 +1441,7 @@ static int advk_pcie_probe(struct platform_device *pdev)
if (ret < 0) {
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);
+ irq_set_chained_handler_and_data(pcie->irq, NULL, NULL);
return ret;
}

@@ -1491,6 +1490,8 @@ static int advk_pcie_remove(struct platform_device *pdev)
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);

+ irq_set_chained_handler_and_data(pcie->irq, NULL, NULL);
+
/* Free config space for emulated root bridge */
pci_bridge_emul_cleanup(&pcie->bridge);

--
2.20.1

2021-05-06 19:23:08

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 32/42] PCI: pci-bridge-emul: Add description for class_revision field

Make it clear why there is bitshift by 16 and rewrite the code to align
with the new description.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/pci-bridge-emul.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index fdaf86a888b7..2b6ea84f25af 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -265,7 +265,11 @@ int pci_bridge_emul_init(struct pci_bridge_emul *bridge,
{
BUILD_BUG_ON(sizeof(bridge->conf) != PCI_BRIDGE_CONF_END);

- bridge->conf.class_revision |= cpu_to_le32(PCI_CLASS_BRIDGE_PCI << 16);
+ /*
+ * class_revision: Class is high 24bits and revision is low 8bit
+ * Class for PCI Bridge Normal Decode has 24bit value (PCI_CLASS_BRIDGE_PCI << 8)
+ */
+ bridge->conf.class_revision |= cpu_to_le32((PCI_CLASS_BRIDGE_PCI << 8) << 8);
bridge->conf.header_type = PCI_HEADER_TYPE_BRIDGE;
bridge->conf.cache_line_size = 0x10;
bridge->conf.status = cpu_to_le16(PCI_STATUS_CAP_LIST);
--
2.20.1

2021-05-06 19:23:17

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 33/42] PCI: pci-bridge-emul: Add definitions for missing capabilities registers

pci-bridge-emul driver already allocates buffer for capabilities up to the
PCI_EXP_SLTSTA2 register, but does not define bit access behavior for these
registers. Fix it by adding missing definitions.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
---
drivers/pci/pci-bridge-emul.c | 38 +++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index 2b6ea84f25af..5f8398f8d039 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -251,6 +251,44 @@ struct pci_bridge_reg_behavior pcie_cap_regs_behavior[PCI_CAP_PCIE_SIZEOF / 4] =
.ro = GENMASK(15, 0) | PCI_EXP_RTSTA_PENDING,
.w1c = PCI_EXP_RTSTA_PME,
},
+
+ [PCI_EXP_DEVCAP2 / 4] = {
+ /* Device capabilities 2 register has reserved bits [30:24] and [17:16]. */
+ .ro = BIT(31) | GENMASK(23, 18) | GENMASK(15, 0),
+ },
+
+ [PCI_EXP_DEVCTL2 / 4] = {
+ /*
+ * Device control 2 register is RW but has reserved bits [12:11].
+ *
+ * Device status 2 register is reserved.
+ */
+ .rw = GENMASK(15, 13) | GENMASK(10, 0),
+ },
+
+ [PCI_EXP_LNKCAP2 / 4] = {
+ /* Link capabilities 2 register has reserved bits [30:23] and 0. */
+ .ro = BIT(31) | GENMASK(22, 1),
+ },
+
+ [PCI_EXP_LNKCTL2 / 4] = {
+ /*
+ * Link control 2 register is RW.
+ *
+ * Link status 2 register has bits 5, 10, 15 W1C; bit 11 reserved and others are RO.
+ */
+ .rw = GENMASK(15, 0),
+ .w1c = (BIT(15) | BIT(10) | BIT(5)) << 16,
+ .ro = (GENMASK(14, 12) | GENMASK(9, 6) | GENMASK(4, 0)) << 16,
+ },
+
+ [PCI_EXP_SLTCAP2 / 4] = {
+ /* Slot capabilities 2 register is reserved. */
+ },
+
+ [PCI_EXP_SLTCTL2 / 4] = {
+ /* Both Slot control 2 and Slot status 2 registers are reserved. */
+ },
};

/*
--
2.20.1

2021-05-06 19:23:22

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 37/42] PCI: aardvark: Replace custom PCIE_CORE_INT_* macros by linux PCI_INTERRUPT_* values

Header file linux/pci.h defines enum pci_interrupt_pin with corresponding
interrupt PCI_INTERRUPT_* values.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index d99462d99ed8..2258b9ae1084 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -34,10 +34,7 @@
#define PCIE_CORE_DEV_REV_REG 0x8
#define PCIE_CORE_PCIEXP_CAP 0xc0
#define PCIE_CORE_PCIERR_CAP 0x100
-#define PCIE_CORE_INT_A_ASSERT_ENABLE 1
-#define PCIE_CORE_INT_B_ASSERT_ENABLE 2
-#define PCIE_CORE_INT_C_ASSERT_ENABLE 3
-#define PCIE_CORE_INT_D_ASSERT_ENABLE 4
+
/* PIO registers base address and register offsets */
#define PIO_BASE_ADDR 0x4000
#define PIO_CTRL (PIO_BASE_ADDR + 0x0)
@@ -706,7 +703,7 @@ static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
bridge->conf.pref_mem_limit = cpu_to_le16(PCI_PREF_RANGE_TYPE_64);

/* Support interrupt A for MSI feature */
- bridge->conf.intpin = PCIE_CORE_INT_A_ASSERT_ENABLE;
+ bridge->conf.intpin = PCI_INTERRUPT_INTA;

bridge->has_pcie = true;
bridge->data = pcie;
--
2.20.1

2021-05-06 19:23:55

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 35/42] PCI: aardvark: Add support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge

PCI aardvark hardware supports PCIe Hot Reset via PCIE_CORE_CTRL1_REG
register. Use it for implementing PCI_BRIDGE_CTL_BUS_RESET bit of
PCI_BRIDGE_CONTROL register on emulated bridge.

With this change the function pci_reset_secondary_bus() starts working and
can reset connected PCIe card. Also custom userspace script [1] which uses
setpci can trigger PCIe Hot Reset and reset the card manually.

[1] - https://alexforencich.com/wiki/en/pcie/hot-reset-linux

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
---
drivers/pci/controller/pci-aardvark.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 13bbc0b5134d..d8fb43604154 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -558,6 +558,22 @@ advk_pci_bridge_emul_base_conf_read(struct pci_bridge_emul *bridge,
*value = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
return PCI_BRIDGE_EMUL_HANDLED;

+ case PCI_INTERRUPT_LINE: {
+ /*
+ * From the whole 32bit register we support propagating to HW
+ * only one bit: PCI_BRIDGE_CTL_BUS_RESET. Other bits are
+ * retrieved only from emulated config space buffer.
+ */
+ __le32 *cfgspace = (__le32 *)&bridge->conf;
+ u32 val = le32_to_cpu(cfgspace[PCI_INTERRUPT_LINE / 4]);
+ if (advk_readl(pcie, PCIE_CORE_CTRL1_REG) & HOT_RESET_GEN)
+ val |= PCI_BRIDGE_CTL_BUS_RESET << 16;
+ else
+ val &= ~(PCI_BRIDGE_CTL_BUS_RESET << 16);
+ *value = val;
+ return PCI_BRIDGE_EMUL_HANDLED;
+ }
+
default:
return PCI_BRIDGE_EMUL_NOT_HANDLED;
}
@@ -574,6 +590,17 @@ advk_pci_bridge_emul_base_conf_write(struct pci_bridge_emul *bridge,
advk_writel(pcie, new, PCIE_CORE_CMD_STATUS_REG);
break;

+ case PCI_INTERRUPT_LINE:
+ if (mask & (PCI_BRIDGE_CTL_BUS_RESET << 16)) {
+ u32 val = advk_readl(pcie, PCIE_CORE_CTRL1_REG);
+ if (new & (PCI_BRIDGE_CTL_BUS_RESET << 16))
+ val |= HOT_RESET_GEN;
+ else
+ val &= ~HOT_RESET_GEN;
+ advk_writel(pcie, val, PCIE_CORE_CTRL1_REG);
+ }
+ break;
+
default:
break;
}
--
2.20.1

2021-05-06 19:24:14

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 36/42] PCI: aardvark: Replace custom PCIE_CORE_ERR_CAPCTL_* macros by linux/pci_regs.h macros

Advanced Error Reporting Capability registers start at aardvark offset
0x100.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index d8fb43604154..d99462d99ed8 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -33,11 +33,7 @@
#define PCIE_CORE_CMD_STATUS_REG 0x4
#define PCIE_CORE_DEV_REV_REG 0x8
#define PCIE_CORE_PCIEXP_CAP 0xc0
-#define PCIE_CORE_ERR_CAPCTL_REG 0x118
-#define PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX BIT(5)
-#define PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN BIT(6)
-#define PCIE_CORE_ERR_CAPCTL_ECRC_CHCK BIT(7)
-#define PCIE_CORE_ERR_CAPCTL_ECRC_CHCK_RCV BIT(8)
+#define PCIE_CORE_PCIERR_CAP 0x100
#define PCIE_CORE_INT_A_ASSERT_ENABLE 1
#define PCIE_CORE_INT_B_ASSERT_ENABLE 2
#define PCIE_CORE_INT_C_ASSERT_ENABLE 3
@@ -370,12 +366,10 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
reg &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG);

- /* Set Advanced Error Capabilities and Control PF0 register */
- reg = PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX |
- PCIE_CORE_ERR_CAPCTL_ECRC_CHK_TX_EN |
- PCIE_CORE_ERR_CAPCTL_ECRC_CHCK |
- PCIE_CORE_ERR_CAPCTL_ECRC_CHCK_RCV;
- advk_writel(pcie, reg, PCIE_CORE_ERR_CAPCTL_REG);
+ /* Enable generation and checking of ECRC on Root Bridge */
+ reg = advk_readl(pcie, PCIE_CORE_PCIERR_CAP + PCI_ERR_CAP);
+ reg |= PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE;
+ advk_writel(pcie, reg, PCIE_CORE_PCIERR_CAP + PCI_ERR_CAP);

/* Set PCIe Device Control register */
reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);
--
2.20.1

2021-05-06 23:08:31

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
function) should return zero on success. Returning non-zero value indicates
failure. Fix return value of this function as in many cases it now returns
failure while allocating IRQs.

Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
be properly aligned, otherwise endpoint devices send MSI interrupt with
incorrect numbers. Fix this issue by using function bitmap_find_free_region()
instead of bitmap_find_next_zero_area().

To ensure that aligned MSI interrupt numbers are used by endpoint devices,
we cannot use Linux virtual irq numbers (as they are random and not
properly aligned). So use hwirq numbers allocated by the function
bitmap_find_free_region(), which are aligned. This needs an update in
advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
numbers.

Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
register, not only lower 8 bits. Fix reading content of this register.

This change fixes receiving MSI interrupts on Armada 3720 boards and allows
using NVMe disks which use Multi-MSI feature with 3 interrupts.

Without this change, NVMe disks just freeze booting Linux on Armada 3720
boards as linux nvme-core.c driver is waiting 60s for an interrupt.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
---
drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 366d7480bc1b..498810c00b6d 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -118,6 +118,7 @@
#define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
#define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
#define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
+#define PCIE_MSI_DATA_MASK GENMASK(15, 0)

/* LMI registers base address and register offsets */
#define LMI_BASE_ADDR 0x6000
@@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,

msg->address_lo = lower_32_bits(msi_msg);
msg->address_hi = upper_32_bits(msi_msg);
- msg->data = data->irq;
+ msg->data = data->hwirq;
}

static int advk_msi_set_affinity(struct irq_data *irq_data,
@@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
int hwirq, i;

mutex_lock(&pcie->msi_used_lock);
- hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
- 0, nr_irqs, 0);
- if (hwirq >= MSI_IRQ_NUM) {
- mutex_unlock(&pcie->msi_used_lock);
- return -ENOSPC;
- }
-
- bitmap_set(pcie->msi_used, hwirq, nr_irqs);
+ hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
+ order_base_2(nr_irqs));
mutex_unlock(&pcie->msi_used_lock);
+ if (hwirq < 0)
+ return -ENOSPC;

for (i = 0; i < nr_irqs; i++)
irq_domain_set_info(domain, virq + i, hwirq + i,
@@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
domain->host_data, handle_simple_irq,
NULL, NULL);

- return hwirq;
+ return 0;
}

static void advk_msi_irq_domain_free(struct irq_domain *domain,
@@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
struct advk_pcie *pcie = domain->host_data;

mutex_lock(&pcie->msi_used_lock);
- bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
+ bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
mutex_unlock(&pcie->msi_used_lock);
}

@@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
{
u32 msi_val, msi_mask, msi_status, msi_idx;
u16 msi_data;
+ int virq;

msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
@@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
if (!(BIT(msi_idx) & msi_status))
continue;

+ /*
+ * msi_idx contains bits [4:0] of the msi_data and msi_data
+ * contains 16bit MSI interrupt number from MSI inner domain
+ */
advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
- msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
- generic_handle_irq(msi_data);
+ msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
+ virq = irq_find_mapping(pcie->msi_inner_domain, msi_data);
+ if (virq)
+ generic_handle_irq(virq);
+ else
+ dev_err(&pcie->pdev->dev, "unexpected MSI 0x%04hx\n", msi_data);
}

advk_writel(pcie, PCIE_ISR0_MSI_INT_PENDING,
--
2.20.1

2021-05-06 23:08:31

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 21/42] PCI: aardvark: Add support for masking MSI interrupts

Aardvark HW does not support masking individual MSI interrupts. It supports
masking only whole equivalence classes which consist of MSI interrupts with
same lower 5 bits. So mask a whole equivalence class only if all interrupts
in this class are masked.

For each equivalence class store a reference counter to indicate how many
unmasked interrupts are in this class. Use this counter to decide when this
class of MSI interrupts can be masked.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 77 ++++++++++++++++++++++++---
1 file changed, 69 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index d74e84b0e689..376f0666becc 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -202,6 +202,7 @@ struct advk_pcie {
struct msi_domain_info msi_domain_info;
DECLARE_BITMAP(msi_used_linear, MSI_IRQ_LINEAR_COUNT);
struct list_head msi_used_radix;
+ u16 msi_used_ec_refcnt[32];
struct mutex msi_used_lock;
int link_gen;
struct pci_bridge_emul bridge;
@@ -405,12 +406,10 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_REG);
advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);

- /* Disable All ISR0/1 Sources */
+ /* Disable All ISR0/1 and MSI Sources */
advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_MASK_REG);
advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);
-
- /* Unmask all MSIs */
- advk_writel(pcie, ~(u32)PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);
+ advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);

/* Unmask summary MSI interrupt */
reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
@@ -888,6 +887,54 @@ static int advk_msi_set_affinity(struct irq_data *irq_data,
return -EINVAL;
}

+static void advk_msi_irq_mask(struct irq_data *d)
+{
+ struct advk_pcie *pcie = d->domain->host_data;
+ irq_hw_number_t hwirq = irqd_to_hwirq(d);
+ u8 idx = hwirq & 31;
+ u32 mask;
+
+ /*
+ * Aardvark HW does not support masking individual MSI interrupts. It
+ * supports masking only whole equivalence class idx which consist of
+ * MSI interrupts with same low 5 bits. So mask equivalence class idx
+ * only in case there is no used (unmasked) interrupt in this class.
+ */
+
+ if (--pcie->msi_used_ec_refcnt[idx] > 0)
+ return;
+
+ mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
+ mask |= BIT(idx);
+ advk_writel(pcie, mask, PCIE_MSI_MASK_REG);
+}
+
+static void advk_msi_irq_unmask(struct irq_data *d)
+{
+ struct advk_pcie *pcie = d->domain->host_data;
+ irq_hw_number_t hwirq = irqd_to_hwirq(d);
+ u8 idx = hwirq & 31;
+ u32 mask;
+
+ pcie->msi_used_ec_refcnt[idx]++;
+
+ mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
+ mask &= ~BIT(idx);
+ advk_writel(pcie, mask, PCIE_MSI_MASK_REG);
+}
+
+static void advk_msi_top_irq_mask(struct irq_data *d)
+{
+ pci_msi_mask_irq(d);
+ irq_chip_mask_parent(d);
+}
+
+static void advk_msi_top_irq_unmask(struct irq_data *d)
+{
+ pci_msi_unmask_irq(d);
+ irq_chip_unmask_parent(d);
+}
+
static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
unsigned int virq,
unsigned int nr_irqs, void *args)
@@ -1025,9 +1072,13 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
bottom_ic->name = "MSI";
bottom_ic->irq_compose_msi_msg = advk_msi_irq_compose_msi_msg;
bottom_ic->irq_set_affinity = advk_msi_set_affinity;
+ bottom_ic->irq_mask = advk_msi_irq_mask;
+ bottom_ic->irq_unmask = advk_msi_irq_unmask;

msi_ic = &pcie->msi_irq_chip;
msi_ic->name = "advk-MSI";
+ msi_ic->irq_mask = advk_msi_top_irq_mask;
+ msi_ic->irq_unmask = advk_msi_top_irq_unmask;

msi_di = &pcie->msi_domain_info;
msi_di->flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
@@ -1101,8 +1152,10 @@ static void advk_pcie_remove_irq_domain(struct advk_pcie *pcie)

static void advk_pcie_handle_msi(struct advk_pcie *pcie)
{
- u32 msi_val, msi_mask, msi_status, msi_idx;
+ struct irq_data *irq_data;
+ u32 msi_val, msi_mask, msi_status;
u16 msi_data;
+ u8 msi_idx;
int virq;

msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
@@ -1119,11 +1172,19 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
*/
advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
+
+ /*
+ * Aardvark HW does not support masking individual MSI interrupts.
+ * So call generic_handle_irq() only in case kernel has not masked
+ * received MSI interrupt.
+ */
virq = irq_find_mapping(pcie->msi_inner_domain, msi_data);
- if (virq)
- generic_handle_irq(virq);
- else
+ irq_data = virq ? irq_get_irq_data(virq) : NULL;
+
+ if (!irq_data)
dev_err(&pcie->pdev->dev, "unexpected MSI 0x%04hx\n", msi_data);
+ else if (!irqd_irq_masked(irq_data))
+ generic_handle_irq(virq);
}

advk_writel(pcie, PCIE_ISR0_MSI_INT_PENDING,
--
2.20.1

2021-05-06 23:08:49

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 22/42] PCI: aardvark: Enable MSI-X support

According to PCI 3.0 specification, sending both MSI and MSI-X interrupts
is done by DWORD memory write operation to doorbell message address. The
write operation for MSI has zero upper 16 bits and the MSI interrupt number
in the lower 16 bits. The write operation for MSI-X contains a 32-bit value
from MSI-X table.

This means that when MSI-X uses only a 16-bit value, the MSI-X message can
be captured by aardvark MSI doorbell address processing. Therefore aardvark
HW should also support MSI-X interrupts, despite Armada 3700 Functional
Specification not mentioning anything about how to configure PCIe for MSI-X
interrupts. (Note that the specification says that MSI-X is supported, it
just does not say explicitly how to enable it.)

Since pci-aardvark.c driver now supports receiving MSI interrupt with any
16-bit number, enable also MSI-X support.

Testing proved that kernel can correctly receive MSI-X interrupts from PCIe
cards which supports both MSI and MSI-X interrupts.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 376f0666becc..8a5133226e41 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1082,7 +1082,7 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)

msi_di = &pcie->msi_domain_info;
msi_di->flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
- MSI_FLAG_MULTI_PCI_MSI;
+ MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX;
msi_di->chip = msi_ic;

/*
--
2.20.1

2021-05-06 23:08:58

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 24/42] PCI: aardvark: Fix support for PME on emulated bridge

The emulated bridge returns incorrect value for PCI_EXP_RTSTA register
during readout in advk_pci_bridge_emul_pcie_conf_read() function. Fix it by
setting correct bit PCI_EXP_RTSTA_PME based on PCIE_MSG_PM_PME_MASK.

Currently enabling PCI_EXP_RTSTA_PME bit in PCI_EXP_RTCTL register does
nothing. This is because PCIe PME driver expects to receive PCIe interrupt
defined in PCI_EXP_FLAGS_IRQ register. But aardvark hardware does not
trigger PCIe INTx/MSI interrupt for PME event, rather it triggers custom
aardvark interrupt which this driver is not processing yet.

Fix this issue by handling PME interrupt in advk_pcie_handle_int() and
chaining it to PCIe interrupt 0 with generic_handle_irq() (since aardvark
hardware sets PCI_EXP_FLAGS_IRQ to zero). With this change PCIe PME driver
finally starts receiving PME interrupt.

To optimize advk_pci_bridge_emul_pcie_conf_write() code, touch
PCIE_ISR0_REG and PCIE_ISR0_MASK_REG registers only when it is really
needed, when processing PCI_EXP_RTCTL_PMEIE and PCI_EXP_RTSTA_PME bits.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Cc: [email protected] # c0f05a6ab525 ("PCI: aardvark: Fix PCI_EXP_RTCTL register configuration")
---
drivers/pci/controller/pci-aardvark.c | 40 ++++++++++++++++++++-------
1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 2ea58ba10a97..fac48797d922 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -574,7 +574,10 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
case PCI_EXP_RTSTA: {
u32 isr0 = advk_readl(pcie, PCIE_ISR0_REG);
u32 msglog = advk_readl(pcie, PCIE_MSG_LOG_REG);
- *value = (isr0 & PCIE_MSG_PM_PME_MASK) << 16 | (msglog >> 16);
+ u32 val = msglog >> 16;
+ if (isr0 & PCIE_MSG_PM_PME_MASK)
+ val |= PCI_EXP_RTSTA_PME;
+ *value = val;
return PCI_BRIDGE_EMUL_HANDLED;
}

@@ -617,19 +620,21 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
advk_pcie_wait_for_retrain(pcie);
break;

- case PCI_EXP_RTCTL: {
+ case PCI_EXP_RTCTL:
/* Only mask/unmask PME interrupt */
- u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG) &
- ~PCIE_MSG_PM_PME_MASK;
- if ((new & PCI_EXP_RTCTL_PMEIE) == 0)
- val |= PCIE_MSG_PM_PME_MASK;
- advk_writel(pcie, val, PCIE_ISR0_MASK_REG);
+ if (mask & PCI_EXP_RTCTL_PMEIE) {
+ u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
+ if ((new & PCI_EXP_RTCTL_PMEIE) == 0)
+ val |= PCIE_MSG_PM_PME_MASK;
+ else
+ val &= ~PCIE_MSG_PM_PME_MASK;
+ advk_writel(pcie, val, PCIE_ISR0_MASK_REG);
+ }
break;
- }

case PCI_EXP_RTSTA:
- new = (new & PCI_EXP_RTSTA_PME) >> 9;
- advk_writel(pcie, new, PCIE_ISR0_REG);
+ if (new & PCI_EXP_RTSTA_PME)
+ advk_writel(pcie, PCIE_MSG_PM_PME_MASK, PCIE_ISR0_REG);
break;

default:
@@ -1219,6 +1224,21 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
if (!isr0_status && !isr1_status)
return;

+ /* Process PME interrupt */
+ if (isr0_status & PCIE_MSG_PM_PME_MASK) {
+ /*
+ * Do not clear PME interrupt bit in ISR0, it is cleared by IRQ
+ * receiver by writing to the PCI_EXP_RTSTA register of emulated
+ * root bridge. Aardvark HW returns zero for PCI_EXP_FLAGS_IRQ,
+ * so use PCIe interrupt 0.
+ */
+ virq = irq_find_mapping(pcie->irq_domain, 0);
+ if (virq)
+ generic_handle_irq(virq);
+ else
+ dev_err(&pcie->pdev->dev, "unexpected PME IRQ\n");
+ }
+
/* Process ERR interrupt */
if (err_bits) {
advk_writel(pcie, err_bits, PCIE_ISR0_REG);
--
2.20.1

2021-05-06 23:09:05

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 27/42] PCI: aardvark: Disable bus mastering and mask all interrupts when unbinding driver

Ensure that after unbinding aardvark driver, PCIe cards are not be able to
forward memory and I/O requests in the upstream direction and also that no
interrupt can be triggered.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 526a76991b7b ("PCI: aardvark: Implement driver 'remove' function and allow to build it as module")
---
drivers/pci/controller/pci-aardvark.c | 29 +++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 92f93ec48d6b..28ddffce1bec 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1453,12 +1453,41 @@ static int advk_pcie_remove(struct platform_device *pdev)
{
struct advk_pcie *pcie = platform_get_drvdata(pdev);
struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
+ u32 val;

+ /* Remove PCI bus with all devices */
pci_lock_rescan_remove();
pci_stop_root_bus(bridge->bus);
pci_remove_root_bus(bridge->bus);
pci_unlock_rescan_remove();

+ /* Disable Root Bridge I/O space, memory space and bus mastering */
+ val = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
+ val &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
+ advk_writel(pcie, val, PCIE_CORE_CMD_STATUS_REG);
+
+ /* Disable MSI */
+ val = advk_readl(pcie, PCIE_CORE_CTRL2_REG);
+ val &= ~PCIE_CORE_CTRL2_MSI_ENABLE;
+ advk_writel(pcie, val, PCIE_CORE_CTRL2_REG);
+
+ /* Clear MSI address */
+ advk_writel(pcie, 0, PCIE_MSI_ADDR_LOW_REG);
+ advk_writel(pcie, 0, PCIE_MSI_ADDR_HIGH_REG);
+
+ /* Mask all interrupts */
+ advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);
+ advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_MASK_REG);
+ advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);
+ advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_MASK_REG);
+
+ /* Clear all interrupts */
+ advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_STATUS_REG);
+ advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_REG);
+ advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_REG);
+ advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);
+
+ /* Remove IRQ domains */
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);

--
2.20.1

2021-05-06 23:10:34

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 34/42] PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers on emulated bridge

PCI aardvark hardware supports access to DEVCAP2, DEVCTL2, LNKCAP2 and
LNKCTL2 configuration registers of PCIe core via PCIE_CORE_PCIEXP_CAP.
Export them via emulated software root bridge.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
---
drivers/pci/controller/pci-aardvark.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index e724d05a61a8..13bbc0b5134d 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -610,8 +610,13 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
case PCI_EXP_DEVCAP:
case PCI_EXP_DEVCTL:
case PCI_EXP_LNKCAP:
+ case PCI_EXP_DEVCAP2:
+ case PCI_EXP_DEVCTL2:
+ case PCI_EXP_LNKCAP2:
+ case PCI_EXP_LNKCTL2:
*value = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + reg);
return PCI_BRIDGE_EMUL_HANDLED;
+
default:
return PCI_BRIDGE_EMUL_NOT_HANDLED;
}
@@ -631,16 +636,18 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
*/

switch (reg) {
- case PCI_EXP_DEVCTL:
- advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
- break;
-
case PCI_EXP_LNKCTL:
advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
if (new & PCI_EXP_LNKCTL_RL)
advk_pcie_wait_for_retrain(pcie);
break;

+ case PCI_EXP_DEVCTL:
+ case PCI_EXP_DEVCTL2:
+ case PCI_EXP_LNKCTL2:
+ advk_writel(pcie, new, PCIE_CORE_PCIEXP_CAP + reg);
+ break;
+
default:
break;
}
--
2.20.1

2021-05-06 23:10:43

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 31/42] PCI: aardvark: Use separate INTA interrupt for emulated root bridge

Emulated root bridge currently provides only one Legacy INTA interrupt
which is used for reporting PCIe PME and ERR events and handled by kernel
PCIe PME and AER drivers.

Aardvark HW reports these PME and ERR events separately, so there is no
need to mix real INTA interrupt and emulated INTA interrupt for PCIe PME
and AER drivers.

Register a new advk-EMU irq chip and a new irq domain for emulated root
bridge and use this new separate irq domain for providing INTA interrupt
from emulated root bridge for PME and ERR events.

The real INTA interrupt from real devices is now separate.

A custom map_irq callback function on PCI host bridge structure is used to
allocate IRQ mapping for emulated root bridge from new irq domain. Original
callback of_irq_parse_and_map_pci() is used for all other devices as before.

Signed-off-by: Pali Rohár <[email protected]>
Reviewed-by: Marek Behún <[email protected]>
---
drivers/pci/controller/pci-aardvark.c | 66 ++++++++++++++++++++++++++-
1 file changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index f2ed276b7e18..e724d05a61a8 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -194,6 +194,8 @@ struct advk_pcie {
struct platform_device *pdev;
void __iomem *base;
int irq;
+ struct irq_domain *emul_irq_domain;
+ struct irq_chip emul_irq_chip;
struct irq_domain *irq_domain;
struct irq_chip irq_chip;
struct irq_domain *msi_domain;
@@ -1074,6 +1076,22 @@ static const struct irq_domain_ops advk_pcie_irq_domain_ops = {
.xlate = irq_domain_xlate_onecell,
};

+static int advk_pcie_emul_irq_map(struct irq_domain *h,
+ unsigned int virq, irq_hw_number_t hwirq)
+{
+ struct advk_pcie *pcie = h->host_data;
+
+ irq_set_chip_and_handler(virq, &pcie->emul_irq_chip, handle_simple_irq);
+ irq_set_chip_data(virq, pcie);
+
+ return 0;
+}
+
+static const struct irq_domain_ops advk_pcie_emul_irq_domain_ops = {
+ .map = advk_pcie_emul_irq_map,
+ .xlate = irq_domain_xlate_onecell,
+};
+
static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
{
struct device *dev = &pcie->pdev->dev;
@@ -1167,6 +1185,24 @@ static void advk_pcie_remove_irq_domain(struct advk_pcie *pcie)
irq_domain_remove(pcie->irq_domain);
}

+static int advk_pcie_init_emul_irq_domain(struct advk_pcie *pcie)
+{
+ pcie->emul_irq_chip.name = "advk-EMU";
+ pcie->emul_irq_domain = irq_domain_add_linear(NULL, 1,
+ &advk_pcie_emul_irq_domain_ops, pcie);
+ if (!pcie->emul_irq_domain) {
+ dev_err(&pcie->pdev->dev, "Failed to add emul IRQ domain\n");
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void advk_pcie_remove_emul_irq_domain(struct advk_pcie *pcie)
+{
+ irq_domain_remove(pcie->emul_irq_domain);
+}
+
static void advk_pcie_handle_msi(struct advk_pcie *pcie)
{
struct irq_data *irq_data;
@@ -1244,7 +1280,7 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
* Aardvark HW returns zero for PCI_EXP_FLAGS_IRQ, so use PCIe interrupt 0.
*/
if (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) & PCI_EXP_RTCTL_PMEIE) {
- virq = irq_find_mapping(pcie->irq_domain, 0);
+ virq = irq_find_mapping(pcie->emul_irq_domain, 0);
if (virq)
generic_handle_irq(virq);
else
@@ -1257,7 +1293,7 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
if (err_bits) {
advk_writel(pcie, err_bits, PCIE_ISR0_REG);
/* Aardvark HW returns zero for PCI_ERR_ROOT_AER_IRQ, so use PCIe interrupt 0 */
- virq = irq_find_mapping(pcie->irq_domain, 0);
+ virq = irq_find_mapping(pcie->emul_irq_domain, 0);
if (virq)
generic_handle_irq(virq);
else
@@ -1304,6 +1340,21 @@ static void advk_pcie_irq_handler(struct irq_desc *desc)
chained_irq_exit(chip, desc);
}

+static int advk_pcie_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
+{
+ struct advk_pcie *pcie = dev->bus->sysdata;
+
+ /*
+ * Emulated root bridge has itw own emulated irq chip and irq domain.
+ * Variable pin is the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD) and
+ * hwirq for irq_create_mapping() is indexed from zero.
+ */
+ if (pci_is_root_bus(dev->bus))
+ return irq_create_mapping(pcie->emul_irq_domain, pin-1);
+ else
+ return of_irq_parse_and_map_pci(dev, slot, pin);
+}
+
static void __maybe_unused advk_pcie_disable_phy(struct advk_pcie *pcie)
{
phy_power_off(pcie->phy);
@@ -1432,13 +1483,23 @@ static int advk_pcie_probe(struct platform_device *pdev)
return ret;
}

+ ret = advk_pcie_init_emul_irq_domain(pcie);
+ if (ret) {
+ dev_err(dev, "Failed to initialize irq\n");
+ advk_pcie_remove_irq_domain(pcie);
+ advk_pcie_remove_msi_irq_domain(pcie);
+ return ret;
+ }
+
irq_set_chained_handler_and_data(pcie->irq, advk_pcie_irq_handler, pcie);

bridge->sysdata = pcie;
bridge->ops = &advk_pcie_ops;
+ bridge->map_irq = advk_pcie_map_irq;

ret = pci_host_probe(bridge);
if (ret < 0) {
+ advk_pcie_remove_emul_irq_domain(pcie);
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);
irq_set_chained_handler_and_data(pcie->irq, NULL, NULL);
@@ -1487,6 +1548,7 @@ static int advk_pcie_remove(struct platform_device *pdev)
advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);

/* Remove IRQ domains */
+ advk_pcie_remove_emul_irq_domain(pcie);
advk_pcie_remove_msi_irq_domain(pcie);
advk_pcie_remove_irq_domain(pcie);

--
2.20.1

2021-05-06 23:11:10

by Pali Rohár

[permalink] [raw]
Subject: [PATCH 40/42] PCI: pci-bridge-emul: re-arrange register tests

From: Russell King <[email protected]>

Re-arrange the tests for which sets of registers are being accessed so that
it is easier to add further regions later. No functional change.

Signed-off-by: Russell King <[email protected]>
[pali: Fix reading old value in pci_bridge_emul_conf_write]
Signed-off-by: Pali Rohár <[email protected]>
---
drivers/pci/pci-bridge-emul.c | 61 ++++++++++++++++++-----------------
1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index 5f8398f8d039..63959e4b188a 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -370,25 +370,25 @@ int pci_bridge_emul_conf_read(struct pci_bridge_emul *bridge, int where,
__le32 *cfgspace;
const struct pci_bridge_reg_behavior *behavior;

- if (bridge->has_pcie && reg >= PCI_CAP_PCIE_END) {
- *value = 0;
- return PCIBIOS_SUCCESSFUL;
- }
-
- if (!bridge->has_pcie && reg >= PCI_BRIDGE_CONF_END) {
+ if (reg < PCI_BRIDGE_CONF_END) {
+ /* Emulated PCI space */
+ read_op = bridge->ops->read_base;
+ cfgspace = (__le32 *) &bridge->conf;
+ behavior = bridge->pci_regs_behavior;
+ } else if (!bridge->has_pcie) {
+ /* PCIe space is not implemented, and no PCI capabilities */
*value = 0;
return PCIBIOS_SUCCESSFUL;
- }
-
- if (bridge->has_pcie && reg >= PCI_CAP_PCIE_START) {
+ } else if (reg < PCI_CAP_PCIE_END) {
+ /* Our emulated PCIe capability */
reg -= PCI_CAP_PCIE_START;
read_op = bridge->ops->read_pcie;
cfgspace = (__le32 *) &bridge->pcie_conf;
behavior = bridge->pcie_cap_regs_behavior;
} else {
- read_op = bridge->ops->read_base;
- cfgspace = (__le32 *) &bridge->conf;
- behavior = bridge->pci_regs_behavior;
+ /* Beyond our PCIe space */
+ *value = 0;
+ return PCIBIOS_SUCCESSFUL;
}

if (read_op)
@@ -432,11 +432,27 @@ int pci_bridge_emul_conf_write(struct pci_bridge_emul *bridge, int where,
__le32 *cfgspace;
const struct pci_bridge_reg_behavior *behavior;

- if (bridge->has_pcie && reg >= PCI_CAP_PCIE_END)
- return PCIBIOS_SUCCESSFUL;
+ ret = pci_bridge_emul_conf_read(bridge, reg, 4, &old);
+ if (ret != PCIBIOS_SUCCESSFUL)
+ return ret;

- if (!bridge->has_pcie && reg >= PCI_BRIDGE_CONF_END)
+ if (reg < PCI_BRIDGE_CONF_END) {
+ /* Emulated PCI space */
+ write_op = bridge->ops->write_base;
+ cfgspace = (__le32 *) &bridge->conf;
+ behavior = bridge->pci_regs_behavior;
+ } else if (!bridge->has_pcie) {
+ /* PCIe space is not implemented, and no PCI capabilities */
return PCIBIOS_SUCCESSFUL;
+ } else if (reg < PCI_CAP_PCIE_END) {
+ /* Our emulated PCIe capability */
+ reg -= PCI_CAP_PCIE_START;
+ write_op = bridge->ops->write_pcie;
+ cfgspace = (__le32 *) &bridge->pcie_conf;
+ behavior = bridge->pcie_cap_regs_behavior;
+ } else {
+ return PCIBIOS_SUCCESSFUL;
+ }

shift = (where & 0x3) * 8;

@@ -449,21 +465,6 @@ int pci_bridge_emul_conf_write(struct pci_bridge_emul *bridge, int where,
else
return PCIBIOS_BAD_REGISTER_NUMBER;

- ret = pci_bridge_emul_conf_read(bridge, reg, 4, &old);
- if (ret != PCIBIOS_SUCCESSFUL)
- return ret;
-
- if (bridge->has_pcie && reg >= PCI_CAP_PCIE_START) {
- reg -= PCI_CAP_PCIE_START;
- write_op = bridge->ops->write_pcie;
- cfgspace = (__le32 *) &bridge->pcie_conf;
- behavior = bridge->pcie_cap_regs_behavior;
- } else {
- write_op = bridge->ops->write_base;
- cfgspace = (__le32 *) &bridge->conf;
- behavior = bridge->pci_regs_behavior;
- }
-
/* Keep all bits, except the RW bits */
new = old & (~mask | ~behavior[reg / 4].rw);

--
2.20.1

2021-05-07 03:46:41

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 05/42] PCI: pci-bridge-emul: Add PCIe Root Capabilities Register

On Thu, May 06, 2021 at 05:31:16PM +0200, Pali Roh?r wrote:
> This is 16-bit register at offset 0x1E. Rename current 'rsvd' struct member
> to 'rootcap'.

"The 16-bit Root Capabilities register is at offset 0x1e in the PCIe
Capability."

Please make the commit log complete in itself. In some contexts, the
subject line is not visible at the same time. It's fine to repeat the
subject in the commit log.

> Signed-off-by: Pali Roh?r <[email protected]>
> Reviewed-by: Marek Beh?n <[email protected]>
> Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
> Cc: [email protected] # e0d9d30b7354 ("PCI: pci-bridge-emul: Fix big-endian support")

I'm not sure how people would deal with *two* SHA1s.

This patch adds functionality, so it's not really fixing a bug in
23a5fba4d941. I see that e0d9d30b7354 came along later and did
"s/u16 rsvd/__le16 rsvd/".

But it seems like a lot to expect for distros and stable kernel
maintainers to interpret this.

Personally I think I would omit both Fixes: and the stable tag since
these two patches (05 and 06) are just adding functionality.

> ---
> drivers/pci/pci-bridge-emul.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci-bridge-emul.h b/drivers/pci/pci-bridge-emul.h
> index b31883022a8e..49bbd37ee318 100644
> --- a/drivers/pci/pci-bridge-emul.h
> +++ b/drivers/pci/pci-bridge-emul.h
> @@ -54,7 +54,7 @@ struct pci_bridge_emul_pcie_conf {
> __le16 slotctl;
> __le16 slotsta;
> __le16 rootctl;
> - __le16 rsvd;
> + __le16 rootcap;
> __le32 rootsta;
> __le32 devcap2;
> __le16 devctl2;
> --
> 2.20.1
>

2021-05-07 09:55:00

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 15/42] PCI: aardvark: Change name of INTx irq_chip to advk-INT

On Thu, 06 May 2021 16:31:26 +0100,
Pali Rohár <[email protected]> wrote:
>
> This name is visible in /proc/interrupts file and for better reading it
> should have at most 8 characters. Also there is no need to allocate this
> name dynamically, since there is only one PCIe controller on Armada 37xx.
> This aligns with how the MSI irq_chip in this driver names it's interrupt
> ("advk-MSI").

And *because* the name is visible in /proc/interrupts, it has become
an ABI, and cannot be changed anymore.

We had the exact same issue with Tegra this merge window as I
accidentally changed "Tegra" to "tegra", resulting in userspace
programs failing find stuff in /proc/interrupts.

Please keep the name as is, no matter how ugly it is.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 10:02:31

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 14/42] PCI: aardvark: Don't mask irq when mapping

On Thu, 06 May 2021 16:31:25 +0100,
Pali Rohár <[email protected]> wrote:
>
> By default, all Legacy INTx interrupts are masked, so there is no need to
> mask this interrupt during irq_map callback.

What guarantees that they are actually masked? I would actually assume
that the HW is in an unknown state at boot time.

Thanks,

M.

>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> ---
> drivers/pci/controller/pci-aardvark.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 2aced8c9ae9f..08f1157e1c5e 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -940,7 +940,6 @@ static int advk_pcie_irq_map(struct irq_domain *h,
> {
> struct advk_pcie *pcie = h->host_data;
>
> - advk_pcie_irq_mask(irq_get_irq_data(virq));
> irq_set_status_flags(virq, IRQ_LEVEL);
> irq_set_chip_and_handler(virq, &pcie->irq_chip,
> handle_level_irq);
> --
> 2.20.1
>
>

--
Without deviation from the norm, progress is not possible.

2021-05-07 10:02:32

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 12/42] PCI: aardvark: Check for virq mapping when processing INTx IRQ

On Thu, 06 May 2021 16:31:23 +0100,
Pali Rohár <[email protected]> wrote:
>
> It is possible that we receive spurious INTx interrupt. So add needed check
> before calling generic_handle_irq() function.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> Cc: [email protected]
> ---
> drivers/pci/controller/pci-aardvark.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 362faddae935..e7089db11f79 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -1106,7 +1106,10 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
> PCIE_ISR1_REG);
>
> virq = irq_find_mapping(pcie->irq_domain, i);
> - generic_handle_irq(virq);
> + if (virq)
> + generic_handle_irq(virq);
> + else
> + dev_err(&pcie->pdev->dev, "unexpected INT%c IRQ\n", (char)i+'A');

Please don't scream like this. This is the best way to get into a DoS
situation if you interrupt rate is high enough. At least rate-limit
it.

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 12:25:03

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 14/42] PCI: aardvark: Don't mask irq when mapping

On Friday 07 May 2021 10:20:39 Marc Zyngier wrote:
> On Thu, 06 May 2021 16:31:25 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > By default, all Legacy INTx interrupts are masked, so there is no need to
> > mask this interrupt during irq_map callback.
>
> What guarantees that they are actually masked? I would actually assume
> that the HW is in an unknown state at boot time.

Function advk_pcie_setup_hw() during driver probing mask all INTx interrupts:

advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);

Individual INTx interrupts can be unmasked by PCIE_ISR1_INTX_ASSERT(val).
Macro PCIE_ISR1_ALL_MASK contains all bits from PCIE_ISR1_INTX_ASSERT(val).

> Thanks,
>
> M.
>
> >
> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > ---
> > drivers/pci/controller/pci-aardvark.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 2aced8c9ae9f..08f1157e1c5e 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -940,7 +940,6 @@ static int advk_pcie_irq_map(struct irq_domain *h,
> > {
> > struct advk_pcie *pcie = h->host_data;
> >
> > - advk_pcie_irq_mask(irq_get_irq_data(virq));
> > irq_set_status_flags(virq, IRQ_LEVEL);
> > irq_set_chip_and_handler(virq, &pcie->irq_chip,
> > handle_level_irq);
> > --
> > 2.20.1
> >
> >
>
> --
> Without deviation from the norm, progress is not possible.

2021-05-07 12:48:01

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 13/42] PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts

On Thu, 06 May 2021 16:31:24 +0100,
Pali Rohár <[email protected]> wrote:
>
> Callback for irq_mask_ack is the same as for irq_mask. As there is no
> special handling for irq_ack, there is no need to define irq_mask_ack too.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> ---
> drivers/pci/controller/pci-aardvark.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index e7089db11f79..2aced8c9ae9f 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -1032,7 +1032,6 @@ static int advk_pcie_init_irq_domain(struct advk_pcie *pcie)
> }
>
> irq_chip->irq_mask = advk_pcie_irq_mask;
> - irq_chip->irq_mask_ack = advk_pcie_irq_mask;
> irq_chip->irq_unmask = advk_pcie_irq_unmask;
>
> pcie->irq_domain =

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 13:35:39

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

On Thu, 06 May 2021 16:31:28 +0100,
Pali Rohár <[email protected]> wrote:
>
> MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
> function) should return zero on success. Returning non-zero value indicates
> failure. Fix return value of this function as in many cases it now returns
> failure while allocating IRQs.
>
> Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
> set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
> be properly aligned, otherwise endpoint devices send MSI interrupt with
> incorrect numbers. Fix this issue by using function bitmap_find_free_region()
> instead of bitmap_find_next_zero_area().
>
> To ensure that aligned MSI interrupt numbers are used by endpoint devices,
> we cannot use Linux virtual irq numbers (as they are random and not
> properly aligned). So use hwirq numbers allocated by the function
> bitmap_find_free_region(), which are aligned. This needs an update in
> advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
> proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
> numbers.
>
> Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
> register, not only lower 8 bits. Fix reading content of this register.
>
> This change fixes receiving MSI interrupts on Armada 3720 boards and allows
> using NVMe disks which use Multi-MSI feature with 3 interrupts.
>
> Without this change, NVMe disks just freeze booting Linux on Armada 3720
> boards as linux nvme-core.c driver is waiting 60s for an interrupt.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> ---
> drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
> 1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 366d7480bc1b..498810c00b6d 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -118,6 +118,7 @@
> #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> +#define PCIE_MSI_DATA_MASK GENMASK(15, 0)

See my comment below about this addition.

> /* LMI registers base address and register offsets */
> #define LMI_BASE_ADDR 0x6000
> @@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
>
> msg->address_lo = lower_32_bits(msi_msg);
> msg->address_hi = upper_32_bits(msi_msg);
> - msg->data = data->irq;
> + msg->data = data->hwirq;
> }
>
> static int advk_msi_set_affinity(struct irq_data *irq_data,
> @@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> int hwirq, i;
>
> mutex_lock(&pcie->msi_used_lock);
> - hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
> - 0, nr_irqs, 0);
> - if (hwirq >= MSI_IRQ_NUM) {
> - mutex_unlock(&pcie->msi_used_lock);
> - return -ENOSPC;
> - }
> -
> - bitmap_set(pcie->msi_used, hwirq, nr_irqs);
> + hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> + order_base_2(nr_irqs));
> mutex_unlock(&pcie->msi_used_lock);
> + if (hwirq < 0)
> + return -ENOSPC;
>
> for (i = 0; i < nr_irqs; i++)
> irq_domain_set_info(domain, virq + i, hwirq + i,
> @@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> domain->host_data, handle_simple_irq,
> NULL, NULL);
>
> - return hwirq;
> + return 0;
> }
>
> static void advk_msi_irq_domain_free(struct irq_domain *domain,
> @@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> struct advk_pcie *pcie = domain->host_data;
>
> mutex_lock(&pcie->msi_used_lock);
> - bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
> + bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> mutex_unlock(&pcie->msi_used_lock);
> }
>
> @@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> {
> u32 msi_val, msi_mask, msi_status, msi_idx;
> u16 msi_data;
> + int virq;
>
> msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> @@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> if (!(BIT(msi_idx) & msi_status))
> continue;
>
> + /*
> + * msi_idx contains bits [4:0] of the msi_data and msi_data
> + * contains 16bit MSI interrupt number from MSI inner domain
> + */
> advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
> - msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
> - generic_handle_irq(msi_data);
> + msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;

Can this be moved to a separate patch? It seems like this patch should
only focus on correctly dealing with the irq/hwirq issues.

> + virq = irq_find_mapping(pcie->msi_inner_domain, msi_data);
> + if (virq)
> + generic_handle_irq(virq);
> + else
> + dev_err(&pcie->pdev->dev, "unexpected MSI 0x%04hx\n", msi_data);

Same concern about the unmitigated screaming.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 13:50:35

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 18/42] PCI: aardvark: Correctly clear and unmask all MSI interrupts

On Thu, 06 May 2021 16:31:29 +0100,
Pali Rohár <[email protected]> wrote:
>
> Define a new macro PCIE_MSI_ALL_MASK and use it for masking, unmasking and
> clearing all MSI interrupts.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> Cc: [email protected]
> ---
> drivers/pci/controller/pci-aardvark.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 498810c00b6d..5e0243b2c473 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -117,6 +117,7 @@
> #define PCIE_MSI_ADDR_HIGH_REG (CONTROL_BASE_ADDR + 0x54)
> #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> +#define PCIE_MSI_ALL_MASK GENMASK(31, 0)
> #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> #define PCIE_MSI_DATA_MASK GENMASK(15, 0)
>
> @@ -386,19 +387,22 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);
>
> /* Clear all interrupts */
> + advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_STATUS_REG);
> advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_REG);
> advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_REG);
> advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);
>
> /* Disable All ISR0/1 Sources */
> - reg = PCIE_ISR0_ALL_MASK;
> - reg &= ~PCIE_ISR0_MSI_INT_PENDING;
> - advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
> -
> + advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_MASK_REG);
> advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);
>
> /* Unmask all MSIs */
> - advk_writel(pcie, 0, PCIE_MSI_MASK_REG);
> + advk_writel(pcie, ~(u32)PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);

I really wonder why you'd unmask all MSIs. Yes, the current code does
that already, but I'd expect MSIs to be individually unmasked as they
get enabled by the core code.

Thanks,

M.

> +
> + /* Unmask summary MSI interrupt */
> + reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
> + reg &= ~PCIE_ISR0_MSI_INT_PENDING;
> + advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
>
> /* Enable summary interrupt for GIC SPI source */
> reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
> @@ -1049,7 +1053,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
>
> msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> - msi_status = msi_val & ~msi_mask;
> + msi_status = msi_val & ((~msi_mask) & PCIE_MSI_ALL_MASK);
>
> for (msi_idx = 0; msi_idx < MSI_IRQ_NUM; msi_idx++) {
> if (!(BIT(msi_idx) & msi_status))
> --
> 2.20.1
>
>

--
Without deviation from the norm, progress is not possible.

2021-05-07 13:52:16

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 18/42] PCI: aardvark: Correctly clear and unmask all MSI interrupts

On Friday 07 May 2021 11:19:48 Marc Zyngier wrote:
> On Thu, 06 May 2021 16:31:29 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > Define a new macro PCIE_MSI_ALL_MASK and use it for masking, unmasking and
> > clearing all MSI interrupts.
> >
> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > Cc: [email protected]
> > ---
> > drivers/pci/controller/pci-aardvark.c | 16 ++++++++++------
> > 1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 498810c00b6d..5e0243b2c473 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -117,6 +117,7 @@
> > #define PCIE_MSI_ADDR_HIGH_REG (CONTROL_BASE_ADDR + 0x54)
> > #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> > #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> > +#define PCIE_MSI_ALL_MASK GENMASK(31, 0)
> > #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> > #define PCIE_MSI_DATA_MASK GENMASK(15, 0)
> >
> > @@ -386,19 +387,22 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> > advk_writel(pcie, reg, PCIE_CORE_CTRL2_REG);
> >
> > /* Clear all interrupts */
> > + advk_writel(pcie, PCIE_MSI_ALL_MASK, PCIE_MSI_STATUS_REG);
> > advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_REG);
> > advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_REG);
> > advk_writel(pcie, PCIE_IRQ_ALL_MASK, HOST_CTRL_INT_STATUS_REG);
> >
> > /* Disable All ISR0/1 Sources */
> > - reg = PCIE_ISR0_ALL_MASK;
> > - reg &= ~PCIE_ISR0_MSI_INT_PENDING;
> > - advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
> > -
> > + advk_writel(pcie, PCIE_ISR0_ALL_MASK, PCIE_ISR0_MASK_REG);
> > advk_writel(pcie, PCIE_ISR1_ALL_MASK, PCIE_ISR1_MASK_REG);
> >
> > /* Unmask all MSIs */
> > - advk_writel(pcie, 0, PCIE_MSI_MASK_REG);
> > + advk_writel(pcie, ~(u32)PCIE_MSI_ALL_MASK, PCIE_MSI_MASK_REG);
>
> I really wonder why you'd unmask all MSIs. Yes, the current code does
> that already, but I'd expect MSIs to be individually unmasked as they
> get enabled by the core code.

Individual unmasking is implemented in later patches in this patch series.

> Thanks,
>
> M.
>
> > +
> > + /* Unmask summary MSI interrupt */
> > + reg = advk_readl(pcie, PCIE_ISR0_MASK_REG);
> > + reg &= ~PCIE_ISR0_MSI_INT_PENDING;
> > + advk_writel(pcie, reg, PCIE_ISR0_MASK_REG);
> >
> > /* Enable summary interrupt for GIC SPI source */
> > reg = PCIE_IRQ_ALL_MASK & (~PCIE_IRQ_ENABLE_INTS_MASK);
> > @@ -1049,7 +1053,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> >
> > msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> > msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> > - msi_status = msi_val & ~msi_mask;
> > + msi_status = msi_val & ((~msi_mask) & PCIE_MSI_ALL_MASK);
> >
> > for (msi_idx = 0; msi_idx < MSI_IRQ_NUM; msi_idx++) {
> > if (!(BIT(msi_idx) & msi_status))
> > --
> > 2.20.1
> >
> >
>
> --
> Without deviation from the norm, progress is not possible.

2021-05-07 14:02:45

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 19/42] PCI: aardvark: Fix setting MSI address

On Thu, 06 May 2021 16:31:30 +0100,
Pali Rohár <[email protected]> wrote:
>
> MSI address for receiving MSI interrupts needs to be correctly set before
> enabling processing of MSI interrupts.
>
> Move code for setting PCIE_MSI_ADDR_LOW_REG and PCIE_MSI_ADDR_HIGH_REG
> registers with MSI address from advk_pcie_init_msi_irq_domain() function to
> advk_pcie_setup_hw() function before enabling PCIE_CORE_CTRL2_MSI_ENABLE.
>
> As part of this change, also remove unused variable msi_msg, which was used
> only for MSI doorbell address. MSI address can be any address which does
> not conflict with PCI space.

Not quite. It can be any address that cannot be used to *DMA* to.

> So change it to the address of the main struct advk_pcie.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> ---
> drivers/pci/controller/pci-aardvark.c | 21 +++++++++------------
> 1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 5e0243b2c473..199015215779 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -195,7 +195,6 @@ struct advk_pcie {
> struct msi_domain_info msi_domain_info;
> DECLARE_BITMAP(msi_used, MSI_IRQ_NUM);
> struct mutex msi_used_lock;
> - u16 msi_msg;
> int link_gen;
> struct pci_bridge_emul bridge;
> struct gpio_desc *reset_gpio;
> @@ -325,6 +324,7 @@ static void advk_pcie_train_link(struct advk_pcie *pcie)
>
> static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> {
> + phys_addr_t msi_addr;
> u32 reg;
>
> /* Enable TX */
> @@ -381,6 +381,11 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
> reg |= LANE_COUNT_1;
> advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
>
> + /* Set MSI address */
> + msi_addr = virt_to_phys(pcie);
> + advk_writel(pcie, lower_32_bits(msi_addr), PCIE_MSI_ADDR_LOW_REG);
> + advk_writel(pcie, upper_32_bits(msi_addr), PCIE_MSI_ADDR_HIGH_REG);
> +
> /* Enable MSI */
> reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG);
> reg |= PCIE_CORE_CTRL2_MSI_ENABLE;
> @@ -862,10 +867,10 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
> struct msi_msg *msg)
> {
> struct advk_pcie *pcie = irq_data_get_irq_chip_data(data);
> - phys_addr_t msi_msg = virt_to_phys(&pcie->msi_msg);
> + phys_addr_t msi_addr = virt_to_phys(pcie);
>
> - msg->address_lo = lower_32_bits(msi_msg);
> - msg->address_hi = upper_32_bits(msi_msg);
> + msg->address_lo = lower_32_bits(msi_addr);
> + msg->address_hi = upper_32_bits(msi_addr);
> msg->data = data->hwirq;
> }
>
> @@ -960,7 +965,6 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
> struct device_node *node = dev->of_node;
> struct irq_chip *bottom_ic, *msi_ic;
> struct msi_domain_info *msi_di;
> - phys_addr_t msi_msg_phys;
>
> mutex_init(&pcie->msi_used_lock);
>
> @@ -978,13 +982,6 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
> MSI_FLAG_MULTI_PCI_MSI;
> msi_di->chip = msi_ic;
>
> - msi_msg_phys = virt_to_phys(&pcie->msi_msg);
> -
> - advk_writel(pcie, lower_32_bits(msi_msg_phys),
> - PCIE_MSI_ADDR_LOW_REG);
> - advk_writel(pcie, upper_32_bits(msi_msg_phys),
> - PCIE_MSI_ADDR_HIGH_REG);
> -
> pcie->msi_inner_domain =
> irq_domain_add_linear(NULL, MSI_IRQ_NUM,
> &advk_msi_domain_ops, pcie);

Otherwise,

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 17:24:51

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

On Friday 07 May 2021 11:16:58 Marc Zyngier wrote:
> On Thu, 06 May 2021 16:31:28 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
> > function) should return zero on success. Returning non-zero value indicates
> > failure. Fix return value of this function as in many cases it now returns
> > failure while allocating IRQs.
> >
> > Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
> > set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
> > be properly aligned, otherwise endpoint devices send MSI interrupt with
> > incorrect numbers. Fix this issue by using function bitmap_find_free_region()
> > instead of bitmap_find_next_zero_area().
> >
> > To ensure that aligned MSI interrupt numbers are used by endpoint devices,
> > we cannot use Linux virtual irq numbers (as they are random and not
> > properly aligned). So use hwirq numbers allocated by the function
> > bitmap_find_free_region(), which are aligned. This needs an update in
> > advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
> > proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
> > numbers.
> >
> > Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
> > register, not only lower 8 bits. Fix reading content of this register.
> >
> > This change fixes receiving MSI interrupts on Armada 3720 boards and allows
> > using NVMe disks which use Multi-MSI feature with 3 interrupts.
> >
> > Without this change, NVMe disks just freeze booting Linux on Armada 3720
> > boards as linux nvme-core.c driver is waiting 60s for an interrupt.
> >
> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> > ---
> > drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
> > 1 file changed, 19 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 366d7480bc1b..498810c00b6d 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -118,6 +118,7 @@
> > #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> > #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> > #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> > +#define PCIE_MSI_DATA_MASK GENMASK(15, 0)
>
> See my comment below about this addition.
>
> > /* LMI registers base address and register offsets */
> > #define LMI_BASE_ADDR 0x6000
> > @@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
> >
> > msg->address_lo = lower_32_bits(msi_msg);
> > msg->address_hi = upper_32_bits(msi_msg);
> > - msg->data = data->irq;
> > + msg->data = data->hwirq;
> > }
> >
> > static int advk_msi_set_affinity(struct irq_data *irq_data,
> > @@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > int hwirq, i;
> >
> > mutex_lock(&pcie->msi_used_lock);
> > - hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
> > - 0, nr_irqs, 0);
> > - if (hwirq >= MSI_IRQ_NUM) {
> > - mutex_unlock(&pcie->msi_used_lock);
> > - return -ENOSPC;
> > - }
> > -
> > - bitmap_set(pcie->msi_used, hwirq, nr_irqs);
> > + hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> > + order_base_2(nr_irqs));
> > mutex_unlock(&pcie->msi_used_lock);
> > + if (hwirq < 0)
> > + return -ENOSPC;
> >
> > for (i = 0; i < nr_irqs; i++)
> > irq_domain_set_info(domain, virq + i, hwirq + i,
> > @@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > domain->host_data, handle_simple_irq,
> > NULL, NULL);
> >
> > - return hwirq;
> > + return 0;
> > }
> >
> > static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > @@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > struct advk_pcie *pcie = domain->host_data;
> >
> > mutex_lock(&pcie->msi_used_lock);
> > - bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
> > + bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> > mutex_unlock(&pcie->msi_used_lock);
> > }
> >
> > @@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > {
> > u32 msi_val, msi_mask, msi_status, msi_idx;
> > u16 msi_data;
> > + int virq;
> >
> > msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> > msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> > @@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > if (!(BIT(msi_idx) & msi_status))
> > continue;
> >
> > + /*
> > + * msi_idx contains bits [4:0] of the msi_data and msi_data
> > + * contains 16bit MSI interrupt number from MSI inner domain
> > + */
> > advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
> > - msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
> > - generic_handle_irq(msi_data);
> > + msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
>
> Can this be moved to a separate patch? It seems like this patch should
> only focus on correctly dealing with the irq/hwirq issues.

Well, hwirq is read from PCIE_MSI_PAYLOAD_REG register and it is 16-bit.
That is why I included this change in this patch, to fix also reading
IRQ number, not only setting IRQ number.

> > + virq = irq_find_mapping(pcie->msi_inner_domain, msi_data);
> > + if (virq)
> > + generic_handle_irq(virq);
> > + else
> > + dev_err(&pcie->pdev->dev, "unexpected MSI 0x%04hx\n", msi_data);
>
> Same concern about the unmitigated screaming.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

2021-05-07 17:29:06

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 06/42] PCI: aardvark: Fix reporting CRS Software Visibility on emulated bridge

On Friday 07 May 2021 08:03:07 Bjorn Helgaas wrote:
> On Thu, May 06, 2021 at 05:31:17PM +0200, Pali Rohár wrote:
> > CRS Software Visibility is supported and always enabled by PIO.
> > Correctly report this information via emulated root bridge.
>
> Maybe spell out "Configuration Request Retry Status (CRS) Software
> Visibility" once.
>
> I'm guessing the aardvark hardware spec is proprietary,

Seems so. I have never seen aardvark hardware or software spec. I have
already asked Marvell more times if they can provide me documentation,
but I have not received anything (yet).

I have access to A3720 documentation which is available on Marvell
Customer portal and it contains section about PCIe on A3720. It is
incomplete and mostly buggy.

So most code is written by trial & error method and simply observe how
this hardware behaves. Lot of things I can just guess how should work.
See e.g. long description in patch 7/42 Fix link training.

> but can you at
> least include a reference to the section that says CRSVIS is
> supported and CRSSVE is enabled?

There is nothing :-(

> What is PIO? I assume this is something other than "programmed I/O"?

I'm not sure what PIO abbreviation means. As I saw that sometimes people
refers to PCIe I/O, sometimes (generic) programmed I/O and sometimes
Posted I/O.

I was probably not accurate in all commit messages as some of these
patches I have written months ago and some just few days ago.

I will try to explain what PIO means in this driver and context of A3720
PCIe. I understand it as a way how to send PCIe messages/packets from
A3720 PCIe controller without need to map PCIe address space. A3720 HW
provides registers into which you fill information what kind of PCIe
message you want to send (e.g. config read/write or memory read/write
transactions, INTx or Error or Slot Power or Vendor message) and then by
another register you trigger sending of this message and waiting until
register indicates that operation finished.

> I'd like the commit log to say something about the effect of this
> change, i.e., why are we doing it?
>
> For one thing, I expect lspci will now show "RootCtl: ... CRSVisible+"
> and "RootCap: CRSVisible+".

Exactly.

> With PCI_EXP_RTCAP_CRSVIS set, pci_enable_crs() should now try to set
> PCI_EXP_RTCTL_CRSSVE (which I think is a no-op since
> advk_pci_bridge_emul_pcie_conf_write() doesn't do anything with
> PCI_EXP_RTCTL_CRSSVE).

aardvark hook for emul bridge does nothing for PCI_EXP_RTCTL_CRSSVE.

> So AFAICT this has zero effect on the kernel. Possibly we *should*
> base some kernel behavior on whether PCI_EXP_RTCTL_CRSSVE is set, but
> I don't think we do today.

I understood CRSSVE bit that when it is enabled then root complex starts
returning CRS when trying to read device+vendor id from device which is
not ready yet. And if CRSSVE is not enabled then root complex should
re-issue config request until completes with non-CRS status and do not
return CRS.

And because I have not found any way how to reconfigure aardvark to auto
re-issue config request (which is when CRSSVE is not enabled) started by
PIO, I included this patch which sets CRSSVE bit in emulated bridge to
match expected behavior.

If my understanding is incorrect here, please let me know and I update
and fix this driver.

> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
> > Cc: [email protected]
>
> Again, I think this just adds functionality and doesn't fix something
> that used to be broken.
>
> Per [1], patches for the stable kernel should be for serious issues
> like an oops, hang, data corruption, etc. I know stable kernel
> maintainers pick up all sorts of other stuff, but that's up to them.
> I try to limit stable tags to reduce the risk of regressing.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v5.11

Ok, no problem. I can omit either CC, Fixes or both tags.

> > ---
> > drivers/pci/controller/pci-aardvark.c | 14 +++++++++++++-
> > 1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 3f3c72927afb..e297ec9ec390 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -578,6 +578,8 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
> > case PCI_EXP_RTCTL: {
> > u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
> > *value = (val & PCIE_MSG_PM_PME_MASK) ? 0 : PCI_EXP_RTCTL_PMEIE;
> > + *value |= PCI_EXP_RTCTL_CRSSVE;
> > + *value |= PCI_EXP_RTCAP_CRSVIS << 16;
> > return PCI_BRIDGE_EMUL_HANDLED;
> > }
> >
> > @@ -659,6 +661,7 @@ static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = {
> > static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
> > {
> > struct pci_bridge_emul *bridge = &pcie->bridge;
> > + int ret;
> >
> > bridge->conf.vendor =
> > cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff);
> > @@ -682,7 +685,16 @@ static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
> > bridge->data = pcie;
> > bridge->ops = &advk_pci_bridge_emul_ops;
> >
> > - return pci_bridge_emul_init(bridge, 0);
> > + /* PCIe config space can be initialized after pci_bridge_emul_init() */
> > + ret = pci_bridge_emul_init(bridge, 0);
> > + if (ret < 0)
> > + return ret;
> > +
> > + /* Completion Retry Status is supported and always enabled by PIO */
>
> "CRS Software Visibility", not "Completion Retry Status".
>
> The CRSSVE bit is *supposed* to be RW, per spec. Is it RO on this
> hardware?

As I wrote above, I did not find a way how to turn off CRS.

A3720 spec does not mention any register which could map to PCIe Root
Capabilities Register.

> > + bridge->pcie_conf.rootctl = cpu_to_le16(PCI_EXP_RTCTL_CRSSVE);
> > + bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS);
> > +
> > + return 0;
> > }
> >
> > static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
> > --
> > 2.20.1
> >

2021-05-07 17:33:50

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 06/42] PCI: aardvark: Fix reporting CRS Software Visibility on emulated bridge

On Friday 07 May 2021 08:03:07 Bjorn Helgaas wrote:
> I'm guessing the aardvark hardware spec is proprietary

FYI these are only two publicly accessible Marvell A37xx documents which
do not require login to Marvell Customer portal:

http://www.marvell.com/content/dam/marvell/en/public-collateral/embedded-processors/marvell-embedded-processors-armada-37xx-product-brief-2016-01.pdf
http://www.marvell.com/content/dam/marvell/en/public-collateral/embedded-processors/marvell-embedded-processors-armada-37xx-hardware-specifications-2019-09.pdf

And for pci-aardvark.c driver they are mostly useless.

Funny part is that second document has PCIe Reset section which
mention existence of some bits in some registers but these
bits/registers are not documented in any other documentation...

2021-05-07 17:36:09

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

On Fri, 07 May 2021 15:44:20 +0100,
Pali Rohár <[email protected]> wrote:
>
> On Friday 07 May 2021 11:16:58 Marc Zyngier wrote:
> > On Thu, 06 May 2021 16:31:28 +0100,
> > Pali Rohár <[email protected]> wrote:
> > >
> > > MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
> > > function) should return zero on success. Returning non-zero value indicates
> > > failure. Fix return value of this function as in many cases it now returns
> > > failure while allocating IRQs.
> > >
> > > Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
> > > set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
> > > be properly aligned, otherwise endpoint devices send MSI interrupt with
> > > incorrect numbers. Fix this issue by using function bitmap_find_free_region()
> > > instead of bitmap_find_next_zero_area().
> > >
> > > To ensure that aligned MSI interrupt numbers are used by endpoint devices,
> > > we cannot use Linux virtual irq numbers (as they are random and not
> > > properly aligned). So use hwirq numbers allocated by the function
> > > bitmap_find_free_region(), which are aligned. This needs an update in
> > > advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
> > > proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
> > > numbers.
> > >
> > > Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
> > > register, not only lower 8 bits. Fix reading content of this register.
> > >
> > > This change fixes receiving MSI interrupts on Armada 3720 boards and allows
> > > using NVMe disks which use Multi-MSI feature with 3 interrupts.
> > >
> > > Without this change, NVMe disks just freeze booting Linux on Armada 3720
> > > boards as linux nvme-core.c driver is waiting 60s for an interrupt.
> > >
> > > Signed-off-by: Pali Rohár <[email protected]>
> > > Reviewed-by: Marek Behún <[email protected]>
> > > Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> > > ---
> > > drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
> > > 1 file changed, 19 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > > index 366d7480bc1b..498810c00b6d 100644
> > > --- a/drivers/pci/controller/pci-aardvark.c
> > > +++ b/drivers/pci/controller/pci-aardvark.c
> > > @@ -118,6 +118,7 @@
> > > #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> > > #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> > > #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> > > +#define PCIE_MSI_DATA_MASK GENMASK(15, 0)
> >
> > See my comment below about this addition.
> >
> > > /* LMI registers base address and register offsets */
> > > #define LMI_BASE_ADDR 0x6000
> > > @@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
> > >
> > > msg->address_lo = lower_32_bits(msi_msg);
> > > msg->address_hi = upper_32_bits(msi_msg);
> > > - msg->data = data->irq;
> > > + msg->data = data->hwirq;
> > > }
> > >
> > > static int advk_msi_set_affinity(struct irq_data *irq_data,
> > > @@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > int hwirq, i;
> > >
> > > mutex_lock(&pcie->msi_used_lock);
> > > - hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
> > > - 0, nr_irqs, 0);
> > > - if (hwirq >= MSI_IRQ_NUM) {
> > > - mutex_unlock(&pcie->msi_used_lock);
> > > - return -ENOSPC;
> > > - }
> > > -
> > > - bitmap_set(pcie->msi_used, hwirq, nr_irqs);
> > > + hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> > > + order_base_2(nr_irqs));
> > > mutex_unlock(&pcie->msi_used_lock);
> > > + if (hwirq < 0)
> > > + return -ENOSPC;
> > >
> > > for (i = 0; i < nr_irqs; i++)
> > > irq_domain_set_info(domain, virq + i, hwirq + i,
> > > @@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > domain->host_data, handle_simple_irq,
> > > NULL, NULL);
> > >
> > > - return hwirq;
> > > + return 0;
> > > }
> > >
> > > static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > @@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > struct advk_pcie *pcie = domain->host_data;
> > >
> > > mutex_lock(&pcie->msi_used_lock);
> > > - bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
> > > + bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> > > mutex_unlock(&pcie->msi_used_lock);
> > > }
> > >
> > > @@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > {
> > > u32 msi_val, msi_mask, msi_status, msi_idx;
> > > u16 msi_data;
> > > + int virq;
> > >
> > > msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> > > msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> > > @@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > if (!(BIT(msi_idx) & msi_status))
> > > continue;
> > >
> > > + /*
> > > + * msi_idx contains bits [4:0] of the msi_data and msi_data
> > > + * contains 16bit MSI interrupt number from MSI inner domain
> > > + */
> > > advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
> > > - msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
> > > - generic_handle_irq(msi_data);
> > > + msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
> >
> > Can this be moved to a separate patch? It seems like this patch should
> > only focus on correctly dealing with the irq/hwirq issues.
>
> Well, hwirq is read from PCIE_MSI_PAYLOAD_REG register and it is 16-bit.
> That is why I included this change in this patch, to fix also reading
> IRQ number, not only setting IRQ number.

But this irq number still is a 5 bit quantity at this stage, and the
support for more than 32 MSIs only come in 3 patches later.

So this doesn't fix anything in this patch, and should be moved to
patch 20.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-05-07 17:38:08

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 05/42] PCI: pci-bridge-emul: Add PCIe Root Capabilities Register

On Fri, May 07, 2021 at 04:40:15PM +0200, Pali Roh?r wrote:
> On Thursday 06 May 2021 18:10:09 Bjorn Helgaas wrote:
> > On Thu, May 06, 2021 at 05:31:16PM +0200, Pali Roh?r wrote:

> > > Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
> > > Cc: [email protected] # e0d9d30b7354 ("PCI: pci-bridge-emul: Fix big-endian support")
> >
> > I'm not sure how people would deal with *two* SHA1s.
>
> I guess that this is fine per stable document as it mention such example:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

Oh, I see what you mean: the SHA1 on the stable tag is a prerequisite.
I had forgotten that, but it makes sense. Thanks!

> > This patch adds functionality, so it's not really fixing a bug in
> > 23a5fba4d941.
>
> I'm not sure what is the correct meaning of Fixes tag. I included it to
> easily determinate in which commit was introduced member name "rsvd"
> which should have been named "rootcap".

IMO, the point of "Fixes: X" is to say that "if your kernel includes
commit X, you should consider also including this patch because it
fixes a problem with X."

In this case, the fact that your kernel includes 23a5fba4d941 is not
an indication that you should include this patch because this patch
doesn't fix a defect in 23a5fba4d941.

> Submitting patches document is not fully clear for me as I understood it
> that Fixes and CC:stable are two different things. E.g. it mention
> "Attaching a Fixes: tag does not subvert ... the requirement to Cc:
> [email protected] on all stable patch candidates." which I
> understood that patch for backporting needs to have Cc:stable:
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html

Yes, I think you're right. "Fixes:" by itself just indicates that
this fixes a defect in a previous patch. But the fix may not rise to
the level of being considered for stable, because historically we only
wanted small, critical fixes for stable kernels.

I think this patch should omit both Fixes: and the stable tag.

If a future patch that depends on this one is a candidate for stable,
*that patch* should have a stable tag with this patch as a
prerequisite.

Useful tidbit I found recently -- a URL like this:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.12#n554

is more specific and won't become stale as the doc changes. (If you
follow that URL five years later, the doc will have changed so the
v5.12 version will be stale, but at least the URL still makes sense in
the context of this conversation.)

Bjorn

2021-05-07 17:50:05

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 06/42] PCI: aardvark: Fix reporting CRS Software Visibility on emulated bridge

On Thu, May 06, 2021 at 05:31:17PM +0200, Pali Roh?r wrote:
> CRS Software Visibility is supported and always enabled by PIO.
> Correctly report this information via emulated root bridge.

Maybe spell out "Configuration Request Retry Status (CRS) Software
Visibility" once.

I'm guessing the aardvark hardware spec is proprietary, but can you at
least include a reference to the section that says CRSVIS is
supported and CRSSVE is enabled?

What is PIO? I assume this is something other than "programmed I/O"?

I'd like the commit log to say something about the effect of this
change, i.e., why are we doing it?

For one thing, I expect lspci will now show "RootCtl: ... CRSVisible+"
and "RootCap: CRSVisible+".

With PCI_EXP_RTCAP_CRSVIS set, pci_enable_crs() should now try to set
PCI_EXP_RTCTL_CRSSVE (which I think is a no-op since
advk_pci_bridge_emul_pcie_conf_write() doesn't do anything with
PCI_EXP_RTCTL_CRSSVE).

So AFAICT this has zero effect on the kernel. Possibly we *should*
base some kernel behavior on whether PCI_EXP_RTCTL_CRSSVE is set, but
I don't think we do today.

> Signed-off-by: Pali Roh?r <[email protected]>
> Reviewed-by: Marek Beh?n <[email protected]>
> Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space")
> Cc: [email protected]

Again, I think this just adds functionality and doesn't fix something
that used to be broken.

Per [1], patches for the stable kernel should be for serious issues
like an oops, hang, data corruption, etc. I know stable kernel
maintainers pick up all sorts of other stuff, but that's up to them.
I try to limit stable tags to reduce the risk of regressing.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v5.11

> ---
> drivers/pci/controller/pci-aardvark.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 3f3c72927afb..e297ec9ec390 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -578,6 +578,8 @@ advk_pci_bridge_emul_pcie_conf_read(struct pci_bridge_emul *bridge,
> case PCI_EXP_RTCTL: {
> u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG);
> *value = (val & PCIE_MSG_PM_PME_MASK) ? 0 : PCI_EXP_RTCTL_PMEIE;
> + *value |= PCI_EXP_RTCTL_CRSSVE;
> + *value |= PCI_EXP_RTCAP_CRSVIS << 16;
> return PCI_BRIDGE_EMUL_HANDLED;
> }
>
> @@ -659,6 +661,7 @@ static struct pci_bridge_emul_ops advk_pci_bridge_emul_ops = {
> static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
> {
> struct pci_bridge_emul *bridge = &pcie->bridge;
> + int ret;
>
> bridge->conf.vendor =
> cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff);
> @@ -682,7 +685,16 @@ static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
> bridge->data = pcie;
> bridge->ops = &advk_pci_bridge_emul_ops;
>
> - return pci_bridge_emul_init(bridge, 0);
> + /* PCIe config space can be initialized after pci_bridge_emul_init() */
> + ret = pci_bridge_emul_init(bridge, 0);
> + if (ret < 0)
> + return ret;
> +
> + /* Completion Retry Status is supported and always enabled by PIO */

"CRS Software Visibility", not "Completion Retry Status".

The CRSSVE bit is *supposed* to be RW, per spec. Is it RO on this
hardware?

> + bridge->pcie_conf.rootctl = cpu_to_le16(PCI_EXP_RTCTL_CRSSVE);
> + bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS);
> +
> + return 0;
> }
>
> static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
> --
> 2.20.1
>

2021-05-07 17:52:32

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 05/42] PCI: pci-bridge-emul: Add PCIe Root Capabilities Register

On Thursday 06 May 2021 18:10:09 Bjorn Helgaas wrote:
> On Thu, May 06, 2021 at 05:31:16PM +0200, Pali Rohár wrote:
> > This is 16-bit register at offset 0x1E. Rename current 'rsvd' struct member
> > to 'rootcap'.
>
> "The 16-bit Root Capabilities register is at offset 0x1e in the PCIe
> Capability."
>
> Please make the commit log complete in itself. In some contexts, the
> subject line is not visible at the same time. It's fine to repeat the
> subject in the commit log.
>
> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > Fixes: 23a5fba4d941 ("PCI: Introduce PCI bridge emulated config space common logic")
> > Cc: [email protected] # e0d9d30b7354 ("PCI: pci-bridge-emul: Fix big-endian support")
>
> I'm not sure how people would deal with *two* SHA1s.

I guess that this is fine per stable document as it mention such example:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

I have already in past sent patches with Fixes:hash1 and CC:stable/hash2
and were taken correctly.

> This patch adds functionality, so it's not really fixing a bug in
> 23a5fba4d941.

I'm not sure what is the correct meaning of Fixes tag. I included it to
easily determinate in which commit was introduced member name "rsvd"
which should have been named "rootcap".

Submitting patches document is not fully clear for me as I understood it
that Fixes and CC:stable are two different things. E.g. it mention
"Attaching a Fixes: tag does not subvert ... the requirement to Cc:
[email protected] on all stable patch candidates." which I
understood that patch for backporting needs to have Cc:stable:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html

But I will change it as needed. Just I did not know what is "the correct
way".

> I see that e0d9d30b7354 came along later and did
> "s/u16 rsvd/__le16 rsvd/".
>
> But it seems like a lot to expect for distros and stable kernel
> maintainers to interpret this.
>
> Personally I think I would omit both Fixes: and the stable tag since
> these two patches (05 and 06) are just adding functionality.
>
> > ---
> > drivers/pci/pci-bridge-emul.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pci-bridge-emul.h b/drivers/pci/pci-bridge-emul.h
> > index b31883022a8e..49bbd37ee318 100644
> > --- a/drivers/pci/pci-bridge-emul.h
> > +++ b/drivers/pci/pci-bridge-emul.h
> > @@ -54,7 +54,7 @@ struct pci_bridge_emul_pcie_conf {
> > __le16 slotctl;
> > __le16 slotsta;
> > __le16 rootctl;
> > - __le16 rsvd;
> > + __le16 rootcap;
> > __le32 rootsta;
> > __le32 devcap2;
> > __le16 devctl2;
> > --
> > 2.20.1
> >

2021-05-19 19:21:16

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 01/42] PCI: aardvark: Fix kernel panic during PIO transfer

On Thursday 06 May 2021 17:31:12 Pali Rohár wrote:
> Trying to start a new PIO transfer by writing value 0 in PIO_START register
> when previous transfer has not yet completed (which is indicated by value 1
> in PIO_START) causes an External Abort on CPU, which results in kernel
> panic:
>
> SError Interrupt on CPU0, code 0xbf000002 -- SError
> Kernel panic - not syncing: Asynchronous SError Interrupt
>
> To prevent kernel panic, it is required to reject a new PIO transfer when
> previous one has not finished yet.
>
> If previous PIO transfer is not finished yet, the kernel may issue a new
> PIO request only if the previous PIO transfer timed out.
>
> In the past the root cause of this issue was incorrectly identified (as it
> often happens during link retraining or after link down event) and special
> hack was implemented in Trusted Firmware to catch all SError events in EL3,
> to ignore errors with code 0xbf000002 and not forwarding any other errors
> to kernel and instead throw panic from EL3 Trusted Firmware handler.
>
> Links to discussion and patches about this issue:
> https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
> https://lore.kernel.org/linux-pci/[email protected]/
> https://lore.kernel.org/linux-pci/[email protected]/
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541
>
> But the real cause was the fact that during link retraning or after link
> down event the PIO transfer may take longer time, up to the 1.44s until it
> times out. This increased probability that a new PIO transfer would be
> issued by kernel while previous one has not finished yet.
>
> After applying this change into the kernel, it is possible to revert the
> mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> Cc: [email protected] # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")

Hello! Could you please review at least this patch? It is fixing kernel
panic and to prevent future kernel crashes I would really suggest to
merge this one patch into 5.13 queue.

> ---
> drivers/pci/controller/pci-aardvark.c | 49 ++++++++++++++++++++++-----
> 1 file changed, 40 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 051b48bd7985..e3f5e7ab7606 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -514,7 +514,7 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
> udelay(PIO_RETRY_DELAY);
> }
>
> - dev_err(dev, "config read/write timed out\n");
> + dev_err(dev, "PIO read/write transfer time out\n");
> return -ETIMEDOUT;
> }
>
> @@ -657,6 +657,35 @@ static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
> return true;
> }
>
> +static bool advk_pcie_pio_is_running(struct advk_pcie *pcie)
> +{
> + struct device *dev = &pcie->pdev->dev;
> +
> + /*
> + * Trying to start a new PIO transfer when previous has not completed
> + * cause External Abort on CPU which results in kernel panic:
> + *
> + * SError Interrupt on CPU0, code 0xbf000002 -- SError
> + * Kernel panic - not syncing: Asynchronous SError Interrupt
> + *
> + * Functions advk_pcie_rd_conf() and advk_pcie_wr_conf() are protected
> + * by raw_spin_lock_irqsave() at pci_lock_config() level to prevent
> + * concurrent calls at the same time. But because PIO transfer may take
> + * about 1.5s when link is down or card is disconnected, it means that
> + * advk_pcie_wait_pio() does not always have to wait for completion.
> + *
> + * Some versions of ARM Trusted Firmware handles this External Abort at
> + * EL3 level and mask it to prevent kernel panic. Relevant TF-A commit:
> + * https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
> + */
> + if (advk_readl(pcie, PIO_START)) {
> + dev_err(dev, "Previous PIO read/write transfer is still running\n");
> + return true;
> + }
> +
> + return false;
> +}
> +
> static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
> int where, int size, u32 *val)
> {
> @@ -673,9 +702,10 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
> return pci_bridge_emul_conf_read(&pcie->bridge, where,
> size, val);
>
> - /* Start PIO */
> - advk_writel(pcie, 0, PIO_START);
> - advk_writel(pcie, 1, PIO_ISR);
> + if (advk_pcie_pio_is_running(pcie)) {
> + *val = 0xffffffff;
> + return PCIBIOS_SET_FAILED;
> + }
>
> /* Program the control register */
> reg = advk_readl(pcie, PIO_CTRL);
> @@ -694,7 +724,8 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
> /* Program the data strobe */
> advk_writel(pcie, 0xf, PIO_WR_DATA_STRB);
>
> - /* Start the transfer */
> + /* Clear PIO DONE ISR and start the transfer */
> + advk_writel(pcie, 1, PIO_ISR);
> advk_writel(pcie, 1, PIO_START);
>
> ret = advk_pcie_wait_pio(pcie);
> @@ -734,9 +765,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
> if (where % size)
> return PCIBIOS_SET_FAILED;
>
> - /* Start PIO */
> - advk_writel(pcie, 0, PIO_START);
> - advk_writel(pcie, 1, PIO_ISR);
> + if (advk_pcie_pio_is_running(pcie))
> + return PCIBIOS_SET_FAILED;
>
> /* Program the control register */
> reg = advk_readl(pcie, PIO_CTRL);
> @@ -763,7 +793,8 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
> /* Program the data strobe */
> advk_writel(pcie, data_strobe, PIO_WR_DATA_STRB);
>
> - /* Start the transfer */
> + /* Clear PIO DONE ISR and start the transfer */
> + advk_writel(pcie, 1, PIO_ISR);
> advk_writel(pcie, 1, PIO_START);
>
> ret = advk_pcie_wait_pio(pcie);
> --
> 2.20.1
>

2021-05-24 14:39:31

by Marek Behún

[permalink] [raw]
Subject: Re: [PATCH 15/42] PCI: aardvark: Change name of INTx irq_chip to advk-INT

On Fri, 07 May 2021 10:08:18 +0100
Marc Zyngier <[email protected]> wrote:

> On Thu, 06 May 2021 16:31:26 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > This name is visible in /proc/interrupts file and for better reading it
> > should have at most 8 characters. Also there is no need to allocate this
> > name dynamically, since there is only one PCIe controller on Armada 37xx.
> > This aligns with how the MSI irq_chip in this driver names it's interrupt
> > ("advk-MSI").
>
> And *because* the name is visible in /proc/interrupts, it has become
> an ABI, and cannot be changed anymore.
>
> We had the exact same issue with Tegra this merge window as I
> accidentally changed "Tegra" to "tegra", resulting in userspace
> programs failing find stuff in /proc/interrupts.
>
> Please keep the name as is, no matter how ugly it is.

Hmm, I am 99% sure that for the A3720 platform this ABI change would not
affect anybody. And it does make the driver's irq names confusing.
Can't we really do anything here?

Note that there were suggestions from some people to completely remove
this driver due to the many problems it has which Pali is trying to
solve. But if the driver was removed and then later introduced again
without these problems, the new version would use the "advk-INT" IRQ
name...

Marek

2021-05-24 15:24:33

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 15/42] PCI: aardvark: Change name of INTx irq_chip to advk-INT

On Mon, 24 May 2021 15:36:51 +0100,
Marek Behún <[email protected]> wrote:
>
> On Fri, 07 May 2021 10:08:18 +0100
> Marc Zyngier <[email protected]> wrote:
>
> > On Thu, 06 May 2021 16:31:26 +0100,
> > Pali Rohár <[email protected]> wrote:
> > >
> > > This name is visible in /proc/interrupts file and for better reading it
> > > should have at most 8 characters. Also there is no need to allocate this
> > > name dynamically, since there is only one PCIe controller on Armada 37xx.
> > > This aligns with how the MSI irq_chip in this driver names it's interrupt
> > > ("advk-MSI").
> >
> > And *because* the name is visible in /proc/interrupts, it has become
> > an ABI, and cannot be changed anymore.
> >
> > We had the exact same issue with Tegra this merge window as I
> > accidentally changed "Tegra" to "tegra", resulting in userspace
> > programs failing find stuff in /proc/interrupts.
> >
> > Please keep the name as is, no matter how ugly it is.
>
> Hmm, I am 99% sure that for the A3720 platform this ABI change would not
> affect anybody. And it does make the driver's irq names confusing.
> Can't we really do anything here?

No, this is final. Show anything in /proc/*, maintain it forever. We
already went there with the bogomips crap showing up in
/proc/cpuinfo. There is no way you can know what userspace does, and
the best course of action is not to change things for some dubious
value of "nicer" or "less confusing".

> Note that there were suggestions from some people to completely remove
> this driver due to the many problems it has which Pali is trying to
> solve. But if the driver was removed and then later introduced again
> without these problems, the new version would use the "advk-INT" IRQ
> name...

No, you would have to keep the *exact same output*. Userspace doesn't
know about drivers, and expect things in /proc to be stable.

Frankly, there are more important things to do than to worry about the
shape of /proc/interrupts. And if we could change it, I'd simply get
rid of it (you really should look at it on a system that has ~200
CPUs...).

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-06-03 15:20:35

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [PATCH 00/42] PCI: aardvark: Various driver fixes

On Thu, May 06, 2021 at 05:31:11PM +0200, Pali Roh?r wrote:
> This patch series fixes various issues in pci-aardvark.c driver
> (PCIe controller on Marvell Armada 3700 SoC) used on Espressobin
> and Turris Mox.
>
> First patch fixes kernel panic (or TF-A panic depending on used
> firmware) during execution of PIO transfer and I would suggest to
> include this fix into v5.13 release to prevent future kernel panics.
>
> Other patches then fixes PIO issues, PCIe link training, initialization
> of PCIe controller, accessing PCIe Root Bridge/Port registers, handling
> of interrupts (including ERR and PME), setup of Multi-MSI interrupts,
> adding support for masking MSI interrupts, adding support for more than
> 32 MSI interrupts with MSI-X support, and adding support for AER.
>
> Note that there are still some unresolved issues with PCIe on A3720.
> I asked Marvell for PCIe documentation or explanations but I have not
> received anything (yet).
>
> Patch "Fix checking for PIO status" is taken from the Marvell github fork
> and adapted for inclusion into mainline kernel:
> https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/84444d22023c
>
> Whole patch series is available also in my git branch pci-aardvark:
> https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark
>
> If you have Espressobin, Turris Mox or other A3720 board with PCIe
> please test this patch series with your favourite PCIe card if
> everything is working fine.
>
> Evan Wang (1):
> PCI: aardvark: Fix checking for PIO status
>
> Pali Roh?r (39):
> PCI: aardvark: Fix kernel panic during PIO transfer
> PCI: aardvark: Fix checking for PIO Non-posted Request
> PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO
> response
> PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
> PCI: aardvark: Fix reporting CRS Software Visibility on emulated
> bridge
> PCI: aardvark: Fix link training
> PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
> PCI: aardvark: Fix PCIe Max Payload Size setting
> PCI: aardvark: Implement workaround for the readback value of VEND_ID
> PCI: aardvark: Do not touch status bits of masked interrupts in
> interrupt handler
> PCI: aardvark: Check for virq mapping when processing INTx IRQ
> PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts
> PCI: aardvark: Don't mask irq when mapping
> PCI: aardvark: Change name of INTx irq_chip to advk-INT
> PCI: aardvark: Remove unneeded goto
> PCI: aardvark: Fix support for MSI interrupts
> PCI: aardvark: Correctly clear and unmask all MSI interrupts
> PCI: aardvark: Fix setting MSI address
> PCI: aardvark: Add support for more than 32 MSI interrupts
> PCI: aardvark: Add support for masking MSI interrupts
> PCI: aardvark: Enable MSI-X support
> PCI: aardvark: Fix support for ERR interrupt on emulated bridge
> PCI: aardvark: Fix support for PME on emulated bridge
> PCI: aardvark: Fix support for PME requester on emulated bridge
> PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on
> emulated bridge
> PCI: aardvark: Disable bus mastering and mask all interrupts when
> unbinding driver
> PCI: aardvark: Free config space for emulated root bridge when
> unbinding driver to fix memory leak
> PCI: aardvark: Reset PCIe card and disable PHY when unbinding driver
> PCI: aardvark: Rewrite irq code to chained irq handler
> PCI: aardvark: Use separate INTA interrupt for emulated root bridge
> PCI: pci-bridge-emul: Add description for class_revision field
> PCI: pci-bridge-emul: Add definitions for missing capabilities
> registers
> PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
> registers on emulated bridge
> PCI: aardvark: Add support for PCI_BRIDGE_CTL_BUS_RESET on emulated
> bridge
> PCI: aardvark: Replace custom PCIE_CORE_ERR_CAPCTL_* macros by
> linux/pci_regs.h macros
> PCI: aardvark: Replace custom PCIE_CORE_INT_* macros by linux
> PCI_INTERRUPT_* values
> PCI: aardvark: Cleanup some register macros
> PCI: aardvark: Add comments for OB_WIN_ENABLE and ADDR_WIN_DISABLE
> PCI: aardvark: Add support for Advanced Error Reporting registers on
> emulated bridge
>
> Russell King (2):
> PCI: pci-bridge-emul: re-arrange register tests
> PCI: pci-bridge-emul: add support for PCIe extended capabilities
>
> drivers/pci/controller/pci-aardvark.c | 980 +++++++++++++++++++-------
> drivers/pci/pci-bridge-emul.c | 149 ++--
> drivers/pci/pci-bridge-emul.h | 17 +-
> include/uapi/linux/pci_regs.h | 6 +
> 4 files changed, 850 insertions(+), 302 deletions(-)

May I ask you please to split this series in smaller sets so that
it is easier to merge ?

Let's start with the more urgent fixes that don't involve rework (or
you have not received change requests for since they are simple).

Thanks,
Lorenzo

2021-06-03 17:05:02

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 00/42] PCI: aardvark: Various driver fixes

On Thursday 03 June 2021 16:16:05 Lorenzo Pieralisi wrote:
> May I ask you please to split this series in smaller sets so that
> it is easier to merge ?

No problem!

> Let's start with the more urgent fixes that don't involve rework (or
> you have not received change requests for since they are simple).

Ok, I will do it.

> Thanks,
> Lorenzo

2021-06-03 18:07:10

by Simon Glass

[permalink] [raw]
Subject: Re: [PATCH 00/42] PCI: aardvark: Various driver fixes

Hi,

On Thu, 3 Jun 2021 at 09:17, Lorenzo Pieralisi
<[email protected]> wrote:
>
> On Thu, May 06, 2021 at 05:31:11PM +0200, Pali Rohár wrote:
> > This patch series fixes various issues in pci-aardvark.c driver
> > (PCIe controller on Marvell Armada 3700 SoC) used on Espressobin
> > and Turris Mox.
> >
> > First patch fixes kernel panic (or TF-A panic depending on used
> > firmware) during execution of PIO transfer and I would suggest to
> > include this fix into v5.13 release to prevent future kernel panics.
> >
> > Other patches then fixes PIO issues, PCIe link training, initialization
> > of PCIe controller, accessing PCIe Root Bridge/Port registers, handling
> > of interrupts (including ERR and PME), setup of Multi-MSI interrupts,
> > adding support for masking MSI interrupts, adding support for more than
> > 32 MSI interrupts with MSI-X support, and adding support for AER.
> >
> > Note that there are still some unresolved issues with PCIe on A3720.
> > I asked Marvell for PCIe documentation or explanations but I have not
> > received anything (yet).
> >
> > Patch "Fix checking for PIO status" is taken from the Marvell github fork
> > and adapted for inclusion into mainline kernel:
> > https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/84444d22023c
> >
> > Whole patch series is available also in my git branch pci-aardvark:
> > https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark
> >
> > If you have Espressobin, Turris Mox or other A3720 board with PCIe
> > please test this patch series with your favourite PCIe card if
> > everything is working fine.
> >
> > Evan Wang (1):
> > PCI: aardvark: Fix checking for PIO status
> >
> > Pali Rohár (39):
> > PCI: aardvark: Fix kernel panic during PIO transfer
> > PCI: aardvark: Fix checking for PIO Non-posted Request
> > PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO
> > response
> > PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
> > PCI: aardvark: Fix reporting CRS Software Visibility on emulated
> > bridge
> > PCI: aardvark: Fix link training
> > PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
> > PCI: aardvark: Fix PCIe Max Payload Size setting
> > PCI: aardvark: Implement workaround for the readback value of VEND_ID
> > PCI: aardvark: Do not touch status bits of masked interrupts in
> > interrupt handler
> > PCI: aardvark: Check for virq mapping when processing INTx IRQ
> > PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts
> > PCI: aardvark: Don't mask irq when mapping
> > PCI: aardvark: Change name of INTx irq_chip to advk-INT
> > PCI: aardvark: Remove unneeded goto
> > PCI: aardvark: Fix support for MSI interrupts
> > PCI: aardvark: Correctly clear and unmask all MSI interrupts
> > PCI: aardvark: Fix setting MSI address
> > PCI: aardvark: Add support for more than 32 MSI interrupts
> > PCI: aardvark: Add support for masking MSI interrupts
> > PCI: aardvark: Enable MSI-X support
> > PCI: aardvark: Fix support for ERR interrupt on emulated bridge
> > PCI: aardvark: Fix support for PME on emulated bridge
> > PCI: aardvark: Fix support for PME requester on emulated bridge
> > PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on
> > emulated bridge
> > PCI: aardvark: Disable bus mastering and mask all interrupts when
> > unbinding driver
> > PCI: aardvark: Free config space for emulated root bridge when
> > unbinding driver to fix memory leak
> > PCI: aardvark: Reset PCIe card and disable PHY when unbinding driver
> > PCI: aardvark: Rewrite irq code to chained irq handler
> > PCI: aardvark: Use separate INTA interrupt for emulated root bridge
> > PCI: pci-bridge-emul: Add description for class_revision field
> > PCI: pci-bridge-emul: Add definitions for missing capabilities
> > registers
> > PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
> > registers on emulated bridge
> > PCI: aardvark: Add support for PCI_BRIDGE_CTL_BUS_RESET on emulated
> > bridge
> > PCI: aardvark: Replace custom PCIE_CORE_ERR_CAPCTL_* macros by
> > linux/pci_regs.h macros
> > PCI: aardvark: Replace custom PCIE_CORE_INT_* macros by linux
> > PCI_INTERRUPT_* values
> > PCI: aardvark: Cleanup some register macros
> > PCI: aardvark: Add comments for OB_WIN_ENABLE and ADDR_WIN_DISABLE
> > PCI: aardvark: Add support for Advanced Error Reporting registers on
> > emulated bridge
> >
> > Russell King (2):
> > PCI: pci-bridge-emul: re-arrange register tests
> > PCI: pci-bridge-emul: add support for PCIe extended capabilities
> >
> > drivers/pci/controller/pci-aardvark.c | 980 +++++++++++++++++++-------
> > drivers/pci/pci-bridge-emul.c | 149 ++--
> > drivers/pci/pci-bridge-emul.h | 17 +-
> > include/uapi/linux/pci_regs.h | 6 +
> > 4 files changed, 850 insertions(+), 302 deletions(-)
>
> May I ask you please to split this series in smaller sets so that
> it is easier to merge ?
>
> Let's start with the more urgent fixes that don't involve rework (or
> you have not received change requests for since they are simple).

Is this due to the difficulty / time for reviewing them?

I feel it is a bit tough to ask someone to split a series, when it is
already nicely split into patches. Reordering to put the important
things first seems reasonable. Otherwise perhaps people can just
review what they can?

Regards,
Simon

2021-06-03 18:22:55

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 00/42] PCI: aardvark: Various driver fixes

On Thursday 03 June 2021 12:02:28 Simon Glass wrote:
> Reordering to put the important things first seems reasonable.

This is already done. Patches are in "logical" order and the first patch
is the most important.

2021-06-04 14:08:38

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [PATCH 00/42] PCI: aardvark: Various driver fixes

On Thu, Jun 03, 2021 at 12:02:28PM -0600, Simon Glass wrote:
> Hi,
>
> On Thu, 3 Jun 2021 at 09:17, Lorenzo Pieralisi
> <[email protected]> wrote:
> >
> > On Thu, May 06, 2021 at 05:31:11PM +0200, Pali Roh?r wrote:
> > > This patch series fixes various issues in pci-aardvark.c driver
> > > (PCIe controller on Marvell Armada 3700 SoC) used on Espressobin
> > > and Turris Mox.
> > >
> > > First patch fixes kernel panic (or TF-A panic depending on used
> > > firmware) during execution of PIO transfer and I would suggest to
> > > include this fix into v5.13 release to prevent future kernel panics.
> > >
> > > Other patches then fixes PIO issues, PCIe link training, initialization
> > > of PCIe controller, accessing PCIe Root Bridge/Port registers, handling
> > > of interrupts (including ERR and PME), setup of Multi-MSI interrupts,
> > > adding support for masking MSI interrupts, adding support for more than
> > > 32 MSI interrupts with MSI-X support, and adding support for AER.
> > >
> > > Note that there are still some unresolved issues with PCIe on A3720.
> > > I asked Marvell for PCIe documentation or explanations but I have not
> > > received anything (yet).
> > >
> > > Patch "Fix checking for PIO status" is taken from the Marvell github fork
> > > and adapted for inclusion into mainline kernel:
> > > https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/84444d22023c
> > >
> > > Whole patch series is available also in my git branch pci-aardvark:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=pci-aardvark
> > >
> > > If you have Espressobin, Turris Mox or other A3720 board with PCIe
> > > please test this patch series with your favourite PCIe card if
> > > everything is working fine.
> > >
> > > Evan Wang (1):
> > > PCI: aardvark: Fix checking for PIO status
> > >
> > > Pali Roh?r (39):
> > > PCI: aardvark: Fix kernel panic during PIO transfer
> > > PCI: aardvark: Fix checking for PIO Non-posted Request
> > > PCI: aardvark: Increase polling delay to 1.5s while waiting for PIO
> > > response
> > > PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
> > > PCI: aardvark: Fix reporting CRS Software Visibility on emulated
> > > bridge
> > > PCI: aardvark: Fix link training
> > > PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros
> > > PCI: aardvark: Fix PCIe Max Payload Size setting
> > > PCI: aardvark: Implement workaround for the readback value of VEND_ID
> > > PCI: aardvark: Do not touch status bits of masked interrupts in
> > > interrupt handler
> > > PCI: aardvark: Check for virq mapping when processing INTx IRQ
> > > PCI: aardvark: Remove irq_mask_ack callback for INTx interrupts
> > > PCI: aardvark: Don't mask irq when mapping
> > > PCI: aardvark: Change name of INTx irq_chip to advk-INT
> > > PCI: aardvark: Remove unneeded goto
> > > PCI: aardvark: Fix support for MSI interrupts
> > > PCI: aardvark: Correctly clear and unmask all MSI interrupts
> > > PCI: aardvark: Fix setting MSI address
> > > PCI: aardvark: Add support for more than 32 MSI interrupts
> > > PCI: aardvark: Add support for masking MSI interrupts
> > > PCI: aardvark: Enable MSI-X support
> > > PCI: aardvark: Fix support for ERR interrupt on emulated bridge
> > > PCI: aardvark: Fix support for PME on emulated bridge
> > > PCI: aardvark: Fix support for PME requester on emulated bridge
> > > PCI: aardvark: Fix support for bus mastering and PCI_COMMAND on
> > > emulated bridge
> > > PCI: aardvark: Disable bus mastering and mask all interrupts when
> > > unbinding driver
> > > PCI: aardvark: Free config space for emulated root bridge when
> > > unbinding driver to fix memory leak
> > > PCI: aardvark: Reset PCIe card and disable PHY when unbinding driver
> > > PCI: aardvark: Rewrite irq code to chained irq handler
> > > PCI: aardvark: Use separate INTA interrupt for emulated root bridge
> > > PCI: pci-bridge-emul: Add description for class_revision field
> > > PCI: pci-bridge-emul: Add definitions for missing capabilities
> > > registers
> > > PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2
> > > registers on emulated bridge
> > > PCI: aardvark: Add support for PCI_BRIDGE_CTL_BUS_RESET on emulated
> > > bridge
> > > PCI: aardvark: Replace custom PCIE_CORE_ERR_CAPCTL_* macros by
> > > linux/pci_regs.h macros
> > > PCI: aardvark: Replace custom PCIE_CORE_INT_* macros by linux
> > > PCI_INTERRUPT_* values
> > > PCI: aardvark: Cleanup some register macros
> > > PCI: aardvark: Add comments for OB_WIN_ENABLE and ADDR_WIN_DISABLE
> > > PCI: aardvark: Add support for Advanced Error Reporting registers on
> > > emulated bridge
> > >
> > > Russell King (2):
> > > PCI: pci-bridge-emul: re-arrange register tests
> > > PCI: pci-bridge-emul: add support for PCIe extended capabilities
> > >
> > > drivers/pci/controller/pci-aardvark.c | 980 +++++++++++++++++++-------
> > > drivers/pci/pci-bridge-emul.c | 149 ++--
> > > drivers/pci/pci-bridge-emul.h | 17 +-
> > > include/uapi/linux/pci_regs.h | 6 +
> > > 4 files changed, 850 insertions(+), 302 deletions(-)
> >
> > May I ask you please to split this series in smaller sets so that
> > it is easier to merge ?
> >
> > Let's start with the more urgent fixes that don't involve rework (or
> > you have not received change requests for since they are simple).
>
> Is this due to the difficulty / time for reviewing them?

Yes.

> I feel it is a bit tough to ask someone to split a series, when it is
> already nicely split into patches. Reordering to put the important
> things first seems reasonable. Otherwise perhaps people can just
> review what they can?

Pali maintains this code, what I do is reviewing the bits of code that
are common across all PCI host controllers; it would help me to have
patches that affect common functionality (even if the changes are only
in the aardvark code) and patches that are aardvark specific to be
split, to speed up the merge.

It is easier to deal with small series but if this is just a series
containing various fixes we can leave it as it is and we go through
it one-by-one.

Lorenzo

2021-06-04 16:05:16

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

On Friday 07 May 2021 17:24:25 Marc Zyngier wrote:
> On Fri, 07 May 2021 15:44:20 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > On Friday 07 May 2021 11:16:58 Marc Zyngier wrote:
> > > On Thu, 06 May 2021 16:31:28 +0100,
> > > Pali Rohár <[email protected]> wrote:
> > > >
> > > > MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
> > > > function) should return zero on success. Returning non-zero value indicates
> > > > failure. Fix return value of this function as in many cases it now returns
> > > > failure while allocating IRQs.
> > > >
> > > > Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
> > > > set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
> > > > be properly aligned, otherwise endpoint devices send MSI interrupt with
> > > > incorrect numbers. Fix this issue by using function bitmap_find_free_region()
> > > > instead of bitmap_find_next_zero_area().
> > > >
> > > > To ensure that aligned MSI interrupt numbers are used by endpoint devices,
> > > > we cannot use Linux virtual irq numbers (as they are random and not
> > > > properly aligned). So use hwirq numbers allocated by the function
> > > > bitmap_find_free_region(), which are aligned. This needs an update in
> > > > advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
> > > > proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
> > > > numbers.
> > > >
> > > > Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
> > > > register, not only lower 8 bits. Fix reading content of this register.
> > > >
> > > > This change fixes receiving MSI interrupts on Armada 3720 boards and allows
> > > > using NVMe disks which use Multi-MSI feature with 3 interrupts.
> > > >
> > > > Without this change, NVMe disks just freeze booting Linux on Armada 3720
> > > > boards as linux nvme-core.c driver is waiting 60s for an interrupt.
> > > >
> > > > Signed-off-by: Pali Rohár <[email protected]>
> > > > Reviewed-by: Marek Behún <[email protected]>
> > > > Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> > > > ---
> > > > drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
> > > > 1 file changed, 19 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > > > index 366d7480bc1b..498810c00b6d 100644
> > > > --- a/drivers/pci/controller/pci-aardvark.c
> > > > +++ b/drivers/pci/controller/pci-aardvark.c
> > > > @@ -118,6 +118,7 @@
> > > > #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> > > > #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> > > > #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> > > > +#define PCIE_MSI_DATA_MASK GENMASK(15, 0)
> > >
> > > See my comment below about this addition.
> > >
> > > > /* LMI registers base address and register offsets */
> > > > #define LMI_BASE_ADDR 0x6000
> > > > @@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
> > > >
> > > > msg->address_lo = lower_32_bits(msi_msg);
> > > > msg->address_hi = upper_32_bits(msi_msg);
> > > > - msg->data = data->irq;
> > > > + msg->data = data->hwirq;
> > > > }
> > > >
> > > > static int advk_msi_set_affinity(struct irq_data *irq_data,
> > > > @@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > > int hwirq, i;
> > > >
> > > > mutex_lock(&pcie->msi_used_lock);
> > > > - hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
> > > > - 0, nr_irqs, 0);
> > > > - if (hwirq >= MSI_IRQ_NUM) {
> > > > - mutex_unlock(&pcie->msi_used_lock);
> > > > - return -ENOSPC;
> > > > - }
> > > > -
> > > > - bitmap_set(pcie->msi_used, hwirq, nr_irqs);
> > > > + hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> > > > + order_base_2(nr_irqs));
> > > > mutex_unlock(&pcie->msi_used_lock);
> > > > + if (hwirq < 0)
> > > > + return -ENOSPC;
> > > >
> > > > for (i = 0; i < nr_irqs; i++)
> > > > irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > @@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > > domain->host_data, handle_simple_irq,
> > > > NULL, NULL);
> > > >
> > > > - return hwirq;
> > > > + return 0;
> > > > }
> > > >
> > > > static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > > @@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > > struct advk_pcie *pcie = domain->host_data;
> > > >
> > > > mutex_lock(&pcie->msi_used_lock);
> > > > - bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
> > > > + bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> > > > mutex_unlock(&pcie->msi_used_lock);
> > > > }
> > > >
> > > > @@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > > {
> > > > u32 msi_val, msi_mask, msi_status, msi_idx;
> > > > u16 msi_data;
> > > > + int virq;
> > > >
> > > > msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> > > > msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> > > > @@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > > if (!(BIT(msi_idx) & msi_status))
> > > > continue;
> > > >
> > > > + /*
> > > > + * msi_idx contains bits [4:0] of the msi_data and msi_data
> > > > + * contains 16bit MSI interrupt number from MSI inner domain
> > > > + */
> > > > advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
> > > > - msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
> > > > - generic_handle_irq(msi_data);
> > > > + msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
> > >
> > > Can this be moved to a separate patch? It seems like this patch should
> > > only focus on correctly dealing with the irq/hwirq issues.
> >
> > Well, hwirq is read from PCIE_MSI_PAYLOAD_REG register and it is 16-bit.
> > That is why I included this change in this patch, to fix also reading
> > IRQ number, not only setting IRQ number.
>
> But this irq number still is a 5 bit quantity at this stage, and the

Yes, it should be 5 bit number. And in case wrongly programmed PCIe card
sends interrupt with "incorrect number" then A3720 PCIe controller
"should not try" to map this 16-bit unknown MSI interrupt number to
something in 5-bit domain (by setting upper bits to zero) and trying to
deliver this invalid interrupt via some existing virq.

Interrupt number of received MSI is stored in low 16 bits in
PCIE_MSI_PAYLOAD_REG register and you should use / validate whole
number, not just few bits from it.

> support for more than 32 MSIs only come in 3 patches later.
>
> So this doesn't fix anything in this patch, and should be moved to
> patch 20.
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

2021-06-04 16:27:30

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 17/42] PCI: aardvark: Fix support for MSI interrupts

On Fri, 04 Jun 2021 17:02:46 +0100,
Pali Rohár <[email protected]> wrote:
>
> On Friday 07 May 2021 17:24:25 Marc Zyngier wrote:
> > On Fri, 07 May 2021 15:44:20 +0100,
> > Pali Rohár <[email protected]> wrote:
> > >
> > > On Friday 07 May 2021 11:16:58 Marc Zyngier wrote:
> > > > On Thu, 06 May 2021 16:31:28 +0100,
> > > > Pali Rohár <[email protected]> wrote:
> > > > >
> > > > > MSI domain callback .alloc (implemented by advk_msi_irq_domain_alloc()
> > > > > function) should return zero on success. Returning non-zero value indicates
> > > > > failure. Fix return value of this function as in many cases it now returns
> > > > > failure while allocating IRQs.
> > > > >
> > > > > Aardvark hardware supports Multi-MSI and MSI_FLAG_MULTI_PCI_MSI is already
> > > > > set. But when allocating MSI interrupt numbers for Multi-MSI, they need to
> > > > > be properly aligned, otherwise endpoint devices send MSI interrupt with
> > > > > incorrect numbers. Fix this issue by using function bitmap_find_free_region()
> > > > > instead of bitmap_find_next_zero_area().
> > > > >
> > > > > To ensure that aligned MSI interrupt numbers are used by endpoint devices,
> > > > > we cannot use Linux virtual irq numbers (as they are random and not
> > > > > properly aligned). So use hwirq numbers allocated by the function
> > > > > bitmap_find_free_region(), which are aligned. This needs an update in
> > > > > advk_msi_irq_compose_msi_msg() and advk_pcie_handle_msi() functions to do
> > > > > proper mapping between Linux virtual irq numbers and hwirq MSI inner domain
> > > > > numbers.
> > > > >
> > > > > Also the whole 16-bit MSI number is stored in the PCIE_MSI_PAYLOAD_REG
> > > > > register, not only lower 8 bits. Fix reading content of this register.
> > > > >
> > > > > This change fixes receiving MSI interrupts on Armada 3720 boards and allows
> > > > > using NVMe disks which use Multi-MSI feature with 3 interrupts.
> > > > >
> > > > > Without this change, NVMe disks just freeze booting Linux on Armada 3720
> > > > > boards as linux nvme-core.c driver is waiting 60s for an interrupt.
> > > > >
> > > > > Signed-off-by: Pali Rohár <[email protected]>
> > > > > Reviewed-by: Marek Behún <[email protected]>
> > > > > Cc: [email protected] # f21a8b1b6837 ("PCI: aardvark: Move to MSI handling using generic MSI support")
> > > > > ---
> > > > > drivers/pci/controller/pci-aardvark.c | 32 ++++++++++++++++-----------
> > > > > 1 file changed, 19 insertions(+), 13 deletions(-)
> > > > >
> > > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > > > > index 366d7480bc1b..498810c00b6d 100644
> > > > > --- a/drivers/pci/controller/pci-aardvark.c
> > > > > +++ b/drivers/pci/controller/pci-aardvark.c
> > > > > @@ -118,6 +118,7 @@
> > > > > #define PCIE_MSI_STATUS_REG (CONTROL_BASE_ADDR + 0x58)
> > > > > #define PCIE_MSI_MASK_REG (CONTROL_BASE_ADDR + 0x5C)
> > > > > #define PCIE_MSI_PAYLOAD_REG (CONTROL_BASE_ADDR + 0x9C)
> > > > > +#define PCIE_MSI_DATA_MASK GENMASK(15, 0)
> > > >
> > > > See my comment below about this addition.
> > > >
> > > > > /* LMI registers base address and register offsets */
> > > > > #define LMI_BASE_ADDR 0x6000
> > > > > @@ -861,7 +862,7 @@ static void advk_msi_irq_compose_msi_msg(struct irq_data *data,
> > > > >
> > > > > msg->address_lo = lower_32_bits(msi_msg);
> > > > > msg->address_hi = upper_32_bits(msi_msg);
> > > > > - msg->data = data->irq;
> > > > > + msg->data = data->hwirq;
> > > > > }
> > > > >
> > > > > static int advk_msi_set_affinity(struct irq_data *irq_data,
> > > > > @@ -878,15 +879,11 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > > > int hwirq, i;
> > > > >
> > > > > mutex_lock(&pcie->msi_used_lock);
> > > > > - hwirq = bitmap_find_next_zero_area(pcie->msi_used, MSI_IRQ_NUM,
> > > > > - 0, nr_irqs, 0);
> > > > > - if (hwirq >= MSI_IRQ_NUM) {
> > > > > - mutex_unlock(&pcie->msi_used_lock);
> > > > > - return -ENOSPC;
> > > > > - }
> > > > > -
> > > > > - bitmap_set(pcie->msi_used, hwirq, nr_irqs);
> > > > > + hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> > > > > + order_base_2(nr_irqs));
> > > > > mutex_unlock(&pcie->msi_used_lock);
> > > > > + if (hwirq < 0)
> > > > > + return -ENOSPC;
> > > > >
> > > > > for (i = 0; i < nr_irqs; i++)
> > > > > irq_domain_set_info(domain, virq + i, hwirq + i,
> > > > > @@ -894,7 +891,7 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> > > > > domain->host_data, handle_simple_irq,
> > > > > NULL, NULL);
> > > > >
> > > > > - return hwirq;
> > > > > + return 0;
> > > > > }
> > > > >
> > > > > static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > > > @@ -904,7 +901,7 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> > > > > struct advk_pcie *pcie = domain->host_data;
> > > > >
> > > > > mutex_lock(&pcie->msi_used_lock);
> > > > > - bitmap_clear(pcie->msi_used, d->hwirq, nr_irqs);
> > > > > + bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> > > > > mutex_unlock(&pcie->msi_used_lock);
> > > > > }
> > > > >
> > > > > @@ -1048,6 +1045,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > > > {
> > > > > u32 msi_val, msi_mask, msi_status, msi_idx;
> > > > > u16 msi_data;
> > > > > + int virq;
> > > > >
> > > > > msi_mask = advk_readl(pcie, PCIE_MSI_MASK_REG);
> > > > > msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> > > > > @@ -1057,9 +1055,17 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> > > > > if (!(BIT(msi_idx) & msi_status))
> > > > > continue;
> > > > >
> > > > > + /*
> > > > > + * msi_idx contains bits [4:0] of the msi_data and msi_data
> > > > > + * contains 16bit MSI interrupt number from MSI inner domain
> > > > > + */
> > > > > advk_writel(pcie, BIT(msi_idx), PCIE_MSI_STATUS_REG);
> > > > > - msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & 0xFF;
> > > > > - generic_handle_irq(msi_data);
> > > > > + msi_data = advk_readl(pcie, PCIE_MSI_PAYLOAD_REG) & PCIE_MSI_DATA_MASK;
> > > >
> > > > Can this be moved to a separate patch? It seems like this patch should
> > > > only focus on correctly dealing with the irq/hwirq issues.
> > >
> > > Well, hwirq is read from PCIE_MSI_PAYLOAD_REG register and it is 16-bit.
> > > That is why I included this change in this patch, to fix also reading
> > > IRQ number, not only setting IRQ number.
> >
> > But this irq number still is a 5 bit quantity at this stage, and the
>
> Yes, it should be 5 bit number. And in case wrongly programmed PCIe card
> sends interrupt with "incorrect number" then A3720 PCIe controller

How? This driver is in control of what gets programmed.

> "should not try" to map this 16-bit unknown MSI interrupt number to
> something in 5-bit domain (by setting upper bits to zero) and trying to
> deliver this invalid interrupt via some existing virq.
>
> Interrupt number of received MSI is stored in low 16 bits in
> PCIE_MSI_PAYLOAD_REG register and you should use / validate whole
> number, not just few bits from it.

Meh. i still maintain that this isn't a logical split for this patch,
but at this stage I don't care (my 3720 is in the recycling pile).

M.

--
Without deviation from the norm, progress is not possible.

2021-06-04 16:28:14

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 12/42] PCI: aardvark: Check for virq mapping when processing INTx IRQ

On Friday 07 May 2021 10:15:39 Marc Zyngier wrote:
> On Thu, 06 May 2021 16:31:23 +0100,
> Pali Rohár <[email protected]> wrote:
> >
> > It is possible that we receive spurious INTx interrupt. So add needed check
> > before calling generic_handle_irq() function.
> >
> > Signed-off-by: Pali Rohár <[email protected]>
> > Reviewed-by: Marek Behún <[email protected]>
> > Cc: [email protected]
> > ---
> > drivers/pci/controller/pci-aardvark.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index 362faddae935..e7089db11f79 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -1106,7 +1106,10 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > PCIE_ISR1_REG);
> >
> > virq = irq_find_mapping(pcie->irq_domain, i);
> > - generic_handle_irq(virq);
> > + if (virq)
> > + generic_handle_irq(virq);
> > + else
> > + dev_err(&pcie->pdev->dev, "unexpected INT%c IRQ\n", (char)i+'A');
>
> Please don't scream like this. This is the best way to get into a DoS
> situation if you interrupt rate is high enough. At least rate-limit
> it.

Ok, I will fix it!

Just to note that this code pattern is used also in other drivers.
So other drivers should fixed too...

2021-06-04 16:31:55

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 12/42] PCI: aardvark: Check for virq mapping when processing INTx IRQ

On Fri, 04 Jun 2021 17:24:51 +0100,
Pali Rohár <[email protected]> wrote:
>
> On Friday 07 May 2021 10:15:39 Marc Zyngier wrote:
> > On Thu, 06 May 2021 16:31:23 +0100,
> > Pali Rohár <[email protected]> wrote:
> > >
> > > It is possible that we receive spurious INTx interrupt. So add needed check
> > > before calling generic_handle_irq() function.
> > >
> > > Signed-off-by: Pali Rohár <[email protected]>
> > > Reviewed-by: Marek Behún <[email protected]>
> > > Cc: [email protected]
> > > ---
> > > drivers/pci/controller/pci-aardvark.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > > index 362faddae935..e7089db11f79 100644
> > > --- a/drivers/pci/controller/pci-aardvark.c
> > > +++ b/drivers/pci/controller/pci-aardvark.c
> > > @@ -1106,7 +1106,10 @@ static void advk_pcie_handle_int(struct advk_pcie *pcie)
> > > PCIE_ISR1_REG);
> > >
> > > virq = irq_find_mapping(pcie->irq_domain, i);
> > > - generic_handle_irq(virq);
> > > + if (virq)
> > > + generic_handle_irq(virq);
> > > + else
> > > + dev_err(&pcie->pdev->dev, "unexpected INT%c IRQ\n", (char)i+'A');
> >
> > Please don't scream like this. This is the best way to get into a DoS
> > situation if you interrupt rate is high enough. At least rate-limit
> > it.
>
> Ok, I will fix it!
>
> Just to note that this code pattern is used also in other drivers.
> So other drivers should fixed too...

"We should fix the kernel" is a common theme. Please go ahead.

M.

--
Without deviation from the norm, progress is not possible.

2021-07-02 21:38:30

by Pali Rohár

[permalink] [raw]
Subject: Re: [PATCH 20/42] PCI: aardvark: Add support for more than 32 MSI interrupts

FYI this one patch does not work. Please drop it.

On Thursday 06 May 2021 17:31:31 Pali Rohár wrote:
> Aardvark HW can handle MSI interrupt with any 16-bit number. Received MSI
> interrupt number is visible in PCIE_MSI_PAYLOAD_REG register after clearing
> corresponding bit in PCIE_MSI_STATUS_REG register.

After doing heavy load testing I figured out that PCIE_MSI_PAYLOAD_REG
register is either buggy or unusable or there is missing some other
configuration as it contains in most cases just content of the last
received memory write operation to MSI doorbell register. And also even
after clearing corresponding bit in PCIE_MSI_STATUS_REG register.

Therefore we should avoid usage of PCIE_MSI_PAYLOAD_REG register
completely. Which implies that it is needed to use only corresponding
bit from PCIE_MSI_STATUS_REG register and so only 32 MSI interrupts are
supported.

I will address removal of PCIE_MSI_STATUS_REG register in other followup
patch.

> The first 32 interrupt numbers are currently stored in linear map in MSI
> inner domain. Store the rest in dynamic radix tree for space efficiency.
>
> Free interrupt numbers (available for MSI inner domain allocation) for the
> first 32 interrupts are currently stored in a bitmap. For the rest,
> introduce a linked list of allocated regions.
>
> In the most common scenario there is only one PCIe card connected on boards
> with Armada 3720 SoC. Since in Multi-MSI mode the PCIe device can use at
> most 32 interrupts, all these interrupts are allocated in the linear map of
> MSI inner domain and marked as used in the bitmap.
>
> For less common scenarios with PCIe devices with multiple functions or with
> a PCIe Bridge with packet switches with more connected PCIe devices more
> than 32 interrupts are requested. In this case, store each interrupt range
> from each interrupt request into the linked list as one node. In the worst
> case every PCIe function will occupy one node in this linked list.
>
> This change allows to use all 32 Multi-MSI interrupts on every connected
> PCIe card on the Turris Mox router with Mox G module.
>
> Signed-off-by: Pali Rohár <[email protected]>
> Reviewed-by: Marek Behún <[email protected]>
> ---
> drivers/pci/controller/pci-aardvark.c | 71 ++++++++++++++++++++++++---
> 1 file changed, 64 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 199015215779..d74e84b0e689 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -178,11 +178,18 @@
> #define RETRAIN_WAIT_MAX_RETRIES 10
> #define RETRAIN_WAIT_USLEEP_US 2000
>
> -#define MSI_IRQ_NUM 32
> +#define MSI_IRQ_LINEAR_COUNT 32
> +#define MSI_IRQ_TOTAL_COUNT 65536
>
> #define CFG_RD_UR_VAL 0xffffffff
> #define CFG_RD_CRS_VAL 0xffff0001
>
> +struct advk_msi_range {
> + struct list_head list;
> + u16 first;
> + u16 count;
> +};
> +
> struct advk_pcie {
> struct platform_device *pdev;
> void __iomem *base;
> @@ -193,7 +200,8 @@ struct advk_pcie {
> struct irq_chip msi_bottom_irq_chip;
> struct irq_chip msi_irq_chip;
> struct msi_domain_info msi_domain_info;
> - DECLARE_BITMAP(msi_used, MSI_IRQ_NUM);
> + DECLARE_BITMAP(msi_used_linear, MSI_IRQ_LINEAR_COUNT);
> + struct list_head msi_used_radix;
> struct mutex msi_used_lock;
> int link_gen;
> struct pci_bridge_emul bridge;
> @@ -885,12 +893,44 @@ static int advk_msi_irq_domain_alloc(struct irq_domain *domain,
> unsigned int nr_irqs, void *args)
> {
> struct advk_pcie *pcie = domain->host_data;
> + struct advk_msi_range *msi_range, *msi_range_prev, *msi_range_next;
> + unsigned int first, count, last;
> int hwirq, i;
>
> mutex_lock(&pcie->msi_used_lock);
> - hwirq = bitmap_find_free_region(pcie->msi_used, MSI_IRQ_NUM,
> +
> + /* First few used interrupt numbers are marked in bitmap (the most common) */
> + hwirq = bitmap_find_free_region(pcie->msi_used_linear, MSI_IRQ_LINEAR_COUNT,
> order_base_2(nr_irqs));
> +
> + /* And rest used interrupt numbers are stored in linked list as ranges */
> + if (hwirq < 0) {
> + count = 1 << order_base_2(nr_irqs);
> + msi_range_prev = list_entry(&pcie->msi_used_radix, typeof(*msi_range), list);
> + do {
> + msi_range_next = list_next_entry(msi_range_prev, list);
> + last = list_entry_is_head(msi_range_next, &pcie->msi_used_radix, list)
> + ? MSI_IRQ_TOTAL_COUNT : msi_range_next->first;
> + first = list_entry_is_head(msi_range_prev, &pcie->msi_used_radix, list)
> + ? MSI_IRQ_LINEAR_COUNT : round_up(msi_range_prev->first +
> + msi_range_prev->count, count);
> + if (first + count > last) {
> + msi_range_prev = msi_range_next;
> + continue;
> + }
> + msi_range = kzalloc(sizeof(*msi_range), GFP_KERNEL);
> + if (msi_range) {
> + hwirq = first;
> + msi_range->first = first;
> + msi_range->count = count;
> + list_add(&msi_range->list, &msi_range_prev->list);
> + }
> + break;
> + } while (!list_entry_is_head(msi_range_next, &pcie->msi_used_radix, list));
> + }
> +
> mutex_unlock(&pcie->msi_used_lock);
> +
> if (hwirq < 0)
> return -ENOSPC;
>
> @@ -908,9 +948,20 @@ static void advk_msi_irq_domain_free(struct irq_domain *domain,
> {
> struct irq_data *d = irq_domain_get_irq_data(domain, virq);
> struct advk_pcie *pcie = domain->host_data;
> + struct advk_msi_range *msi_range;
>
> mutex_lock(&pcie->msi_used_lock);
> - bitmap_release_region(pcie->msi_used, d->hwirq, order_base_2(nr_irqs));
> + if (d->hwirq < MSI_IRQ_LINEAR_COUNT) {
> + bitmap_release_region(pcie->msi_used_linear, d->hwirq, order_base_2(nr_irqs));
> + } else {
> + list_for_each_entry(msi_range, &pcie->msi_used_radix, list) {
> + if (msi_range->first != d->hwirq)
> + continue;
> + list_del(&msi_range->list);
> + kfree(msi_range);
> + break;
> + }
> + }
> mutex_unlock(&pcie->msi_used_lock);
> }
>
> @@ -967,6 +1018,7 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
> struct msi_domain_info *msi_di;
>
> mutex_init(&pcie->msi_used_lock);
> + INIT_LIST_HEAD(&pcie->msi_used_radix);
>
> bottom_ic = &pcie->msi_bottom_irq_chip;
>
> @@ -982,9 +1034,14 @@ static int advk_pcie_init_msi_irq_domain(struct advk_pcie *pcie)
> MSI_FLAG_MULTI_PCI_MSI;
> msi_di->chip = msi_ic;
>
> + /*
> + * Aardvark HW can handle MSI interrupt with any 16bit number.
> + * For optimization first few interrupts are allocated in linear map
> + * (which is common scenario) and rest are allocated in radix tree.
> + */
> pcie->msi_inner_domain =
> - irq_domain_add_linear(NULL, MSI_IRQ_NUM,
> - &advk_msi_domain_ops, pcie);
> + __irq_domain_add(NULL, MSI_IRQ_LINEAR_COUNT, MSI_IRQ_TOTAL_COUNT, 0,
> + &advk_msi_domain_ops, pcie);
> if (!pcie->msi_inner_domain)
> return -ENOMEM;
>
> @@ -1052,7 +1109,7 @@ static void advk_pcie_handle_msi(struct advk_pcie *pcie)
> msi_val = advk_readl(pcie, PCIE_MSI_STATUS_REG);
> msi_status = msi_val & ((~msi_mask) & PCIE_MSI_ALL_MASK);
>
> - for (msi_idx = 0; msi_idx < MSI_IRQ_NUM; msi_idx++) {
> + for (msi_idx = 0; msi_idx < BITS_PER_TYPE(msi_status); msi_idx++) {
> if (!(BIT(msi_idx) & msi_status))
> continue;
>
> --
> 2.20.1
>