2023-02-14 14:10:46

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

This is a series of patches that fixes the PCIe endpoint controller driver
for the Rockchip RK3399 SoC. The driver was introduced in
cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
The original driver had issues and would not allow for the RK3399 to
operate in PCIe endpoint mode correctly. This patch series fixes that so
that the PCIe core controller of the RK3399 SoC can now act as a PCIe
endpoint. This is v2 of the patch series and addresses the concerns that
were raised during the review of the first version.

Thank you in advance for reviewing these changes and hopefully
getting this merged. Having a functional PCIe endpoint controller
driver for the RK3399 would allow to develop further PCIe endpoint
functions through the Linux PCIe endpoint framework using this SoC.

Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
did not work.

Summary of problems with the driver :

* Missing dtsi entry
* Could not update Device ID (DID)
* The endpoint could not be configured by a host computer because the
endpoint kept sending Configuration Request Retry Status (CRS) messages
* The kernel would sometimes hang on probe due to access to registers in
a clock domain of which the PLLs were not locked
* The memory window mapping and address translation mechanism had
conflicting mappings and did not follow the technical reference manual
as to how the address translation should be done
* Legacy IRQs were not generated by the endpoint
* Message Signaled interrupts (MSI) were not generated by the endpoint

The problems have been addressed and validated through tests (see below).

Summary of changes :

This patch series is composed of 9 patches that do the following :
* Remove writes to unused registers in the PCIe core register space.
The registers that were written to is marked "unused" and read
only in the technical reference manual of the RK3399 SoC.
* Write PCI Device ID (DID) to correct register, the DID was written to
a read only register and therefore would not update the DID.
* Assert PCI Configuration Enable bit after probe so that it would stop
sending Configuration Request Retry Status (CRS) messages to the
host once configured, without this the host would retry until
timeout and cancel the PCI configuration.
* Add poll and timeout to wait for PHY PLLs to be locked, this
is the only patch that also applies to the root complex function
of the PCIe core controller, without this the kernel would
sometimes access registers in the PHY PLL clock domain when the PLLs
were not yet locked and the system would hang. This was hackily solved
in other non mainline patches (e.g., in armbian) with a "msleep()"
that was added after PHY PLL configuration but without realizing
why it was needed. A poll with timeout seems like a sane approach.
* Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
in "disabled" status by default, so unless it is explicitly enabled
it will not conflict with the PCIe root complex controller entry.
Developers that will enable it would know that the root complex function
then must be disabled, this can be done in the board level DTS.
* Fix window mapping and address translation for endpoint. The window
mapping and address translation did not follow the technical reference
manual and a single memory region was used which resulted in conflicting
address translations for memory allocated in that region. The current
patch allows to allocate up to 32 memory windows with 1MB pages.
* Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
were not sent by the device because their generation did not follow the
instructions in the technical reference manual. They now work.
* Use u32 variable to access 32-bit registers, u16 variables were used to
access and manipulate data of 32-bit registers, this would lead to
overflows e.g., when left shifting more than 16 bits.
* Add parameter check for RK3399 PCIe endpoint core set_msi(), return
-EINVAL when incompatible parameters are passed.

Validation on real hardware:

This patch series has been tested with kernel 6.0.19 (and 5.19)
on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single computer
board connected to a host computer through PCIe x1 and x4. The PCIe
endpoint test function driver was loaded on the SoC and the PCIe endpoint
test driver was loaded on the host computer. The following tests were
executed through this setup :

* enumeration of the PCIe endpoint device (lspci)
lspci -vvv
* validation of PCI header and capabilities
setpci and lspci -xxxx
* device was recognized by host computer dans PCIe endpoint test driver
was loaded
lspci -v states "Kernel modules: pci_endpoint_test"
* tested the BARs 0-5
sudo /usr/bin/pcitest -b 0
...
sudo /usr/bin/pcitest -b 5
* tested legacy interrupt through the test driver
sudo /usr/bin/pcitest -i 0
sudo /usr/bin/pcitest -l
* tested MSI interrupt through the test driver
sudo /usr/bin/pcitest -i 1
sudo /usr/bin/pcitest -m 1
* tested read/write to and from host through the test driver with checksum
sudo /usr/bin/pcitest -r -s 1024
sudo /usr/bin/pcitest -w -s 1024
* tested read/write with DMA enabled (all read/write tests also did IRQ)
sudo /usr/bin/pcitest -r -d -s 8192
sudo /usr/bin/pcitest -w -d -s 8192

Commands used on the SoC to launch the endpoint function (configfs) :

modprobe -i pci-epf-test
mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
/sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start

Note: to enable the endpoint controller on the board the file :
arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
Was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
to "okay". This is not submitted as a patch because most users
will use the PCIe core controller in host (root complex) mode
rather than endpoint mode.

I have tested and confirmed all basic functionality required for the
endpoint with the test driver and tools. With the previous state of
the driver the device would not even be enumerated by the host
computer (mainly because of CRS messages being sent back to the root
complex) and tests would not pass (driver would not even be loaded
because DID was not set correctly) and then only the BAR test would
pass. Now all tests pass as stated above.

Best regards
Rick

Rick Wertenbroek (9):
PCI: rockchip: Remove writes to unused registers
PCI: rockchip: Write PCI Device ID to correct register
PCI: rockchip: Assert PCI Configuration Enable bit after probe
PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core
PCI: rockchip: Fix window mapping and address translation for endpoint
PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
PCI: rockchip: Use u32 variable to access 32-bit registers
PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core
set_msi()

arch/arm64/boot/dts/rockchip/rk3399.dtsi | 23 ++++
drivers/pci/controller/pcie-rockchip-ep.c | 143 ++++++++++------------
drivers/pci/controller/pcie-rockchip.c | 16 +++
drivers/pci/controller/pcie-rockchip.h | 36 ++++--
4 files changed, 128 insertions(+), 90 deletions(-)

--
2.25.1



2023-02-14 14:10:48

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

Remove write accesses to registers that are marked "unused" (and
therefore read-only) in the technical reference manual (TRM)
(see RK3399 TRM 17.6.8.1)

Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 10 ----------
1 file changed, 10 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index d1a200b93..d5c477020 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -61,10 +61,6 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
- rockchip_pcie_write(rockchip, 0,
- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(region));
- rockchip_pcie_write(rockchip, 0,
- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(region));
}

static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
@@ -114,12 +110,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
addr1 = upper_32_bits(cpu_addr);
}
-
- /* CPU bus address region */
- rockchip_pcie_write(rockchip, addr0,
- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r));
- rockchip_pcie_write(rockchip, addr1,
- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r));
}

static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
--
2.25.1


2023-02-14 14:10:50

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 2/9] PCI: rockchip: Write PCI Device ID to correct register

Write PCI Device ID (DID) to the correct register. The Device ID was not
updated through the correct register. Device ID was written to a read-only
register and therefore did not work. The Device ID is now set through the
correct register. This is documented in the RK3399 TRM section 17.6.6.1.1

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 6 ++++--
drivers/pci/controller/pcie-rockchip.h | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index d5c477020..9b835377b 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -115,6 +115,7 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
struct pci_epf_header *hdr)
{
+ u32 reg;
struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
struct rockchip_pcie *rockchip = &ep->rockchip;

@@ -127,8 +128,9 @@ static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
PCIE_CORE_CONFIG_VENDOR);
}

- rockchip_pcie_write(rockchip, hdr->deviceid << 16,
- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + PCI_VENDOR_ID);
+ reg = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_DID_VID);
+ reg = (reg & 0xFFFF) | (hdr->deviceid << 16);
+ rockchip_pcie_write(rockchip, reg, PCIE_EP_CONFIG_DID_VID);

rockchip_pcie_write(rockchip,
hdr->revid |
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
index 32c3a859c..51a123e5c 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -133,6 +133,8 @@
#define PCIE_RC_RP_ATS_BASE 0x400000
#define PCIE_RC_CONFIG_NORMAL_BASE 0x800000
#define PCIE_RC_CONFIG_BASE 0xa00000
+#define PCIE_EP_CONFIG_BASE 0xa00000
+#define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
#define PCIE_RC_CONFIG_RID_CCR (PCIE_RC_CONFIG_BASE + 0x08)
#define PCIE_RC_CONFIG_DCR (PCIE_RC_CONFIG_BASE + 0xc4)
#define PCIE_RC_CONFIG_DCR_CSPL_SHIFT 18
--
2.25.1


2023-02-14 14:11:00

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 3/9] PCI: rockchip: Assert PCI Configuration Enable bit after probe

Assert PCI Configuration Enable bit after probe. When this bit is left to
0 in the endpoint mode, the RK3399 PCIe endpoint core will generate
configuration request retry status (CRS) messages back to the root complex.
Assert this bit after probe to allow the RK3399 PCIe endpoint core to reply
to configuration requests from the root complex.
This is documented in section 17.5.8.1.2 of the RK3399 TRM.

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index 9b835377b..4c84e403e 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -623,6 +623,8 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)

ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;

+ rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE, PCIE_CLIENT_CONFIG);
+
return 0;
err_epc_mem_exit:
pci_epc_mem_exit(epc);
--
2.25.1


2023-02-14 14:11:02

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 5/9] arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core

Add dtsi entry for RK3399 PCIe endpoint core in the device tree.
The status is "disabled" by default, so it will not be loaded unless
explicitly chosen to. The RK3399 PCIe endpoit core should be enabled
with the RK3399 PCIe root complex disabled because the RK3399 PCIe
controller can only work one mode at the time, either in "root complex"
mode or in "endpoint" mode.

Signed-off-by: Rick Wertenbroek <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 9d5b0e8c9..8cc5a1ee2 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -265,6 +265,29 @@ pcie0_intc: interrupt-controller {
};
};

+ pcie0_ep: pcie-ep@f8000000 {
+ compatible = "rockchip,rk3399-pcie-ep";
+ rockchip,max-outbound-regions = <32>;
+ clocks = <&cru ACLK_PCIE>, <&cru ACLK_PERF_PCIE>,
+ <&cru PCLK_PCIE>, <&cru SCLK_PCIE_PM>;
+ clock-names = "aclk", "aclk-perf",
+ "hclk", "pm";
+ max-functions = /bits/ 8 <8>;
+ num-lanes = <4>;
+ reg = <0x0 0xfd000000 0x0 0x1000000>, <0x0 0xfa000000 0x0 0x2000000>;
+ reg-names = "apb-base", "mem-base";
+ resets = <&cru SRST_PCIE_CORE>, <&cru SRST_PCIE_MGMT>,
+ <&cru SRST_PCIE_MGMT_STICKY>, <&cru SRST_PCIE_PIPE> ,
+ <&cru SRST_PCIE_PM>, <&cru SRST_P_PCIE>, <&cru SRST_A_PCIE>;
+ reset-names = "core", "mgmt", "mgmt-sticky", "pipe",
+ "pm", "pclk", "aclk";
+ phys = <&pcie_phy 0>, <&pcie_phy 1>, <&pcie_phy 2>, <&pcie_phy 3>;
+ phy-names = "pcie-phy-0", "pcie-phy-1", "pcie-phy-2", "pcie-phy-3";
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie_clkreqnb_cpm>;
+ status = "disabled";
+ };
+
gmac: ethernet@fe300000 {
compatible = "rockchip,rk3399-gmac";
reg = <0x0 0xfe300000 0x0 0x10000>;
--
2.25.1


2023-02-14 14:11:04

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 4/9] PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked

The RK3399 PCIe controller should wait until the PHY PLLs are locked.
Add poll and timeout to wait for PHY PLLs to be locked. If they cannot
be locked generate error message and jump to error handler. Accessing
registers in the PHY clock domain when PLLs are not locked causes hang
The PHY PLLs status is checked through a side channel register.
This is documented in the TRM section 17.5.8.1 "PCIe Initialization
Sequence".

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip.c | 16 ++++++++++++++++
drivers/pci/controller/pcie-rockchip.h | 2 ++
2 files changed, 18 insertions(+)

diff --git a/drivers/pci/controller/pcie-rockchip.c b/drivers/pci/controller/pcie-rockchip.c
index 990a00e08..5f2e2dd5d 100644
--- a/drivers/pci/controller/pcie-rockchip.c
+++ b/drivers/pci/controller/pcie-rockchip.c
@@ -14,6 +14,7 @@
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/gpio/consumer.h>
+#include <linux/iopoll.h>
#include <linux/of_pci.h>
#include <linux/phy/phy.h>
#include <linux/platform_device.h>
@@ -153,6 +154,12 @@ int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
}
EXPORT_SYMBOL_GPL(rockchip_pcie_parse_dt);

+#define rockchip_pcie_read_addr(addr) rockchip_pcie_read(rockchip, addr)
+/* 100 ms max wait time for PHY PLLs to lock */
+#define RK_PHY_PLL_LOCK_TIMEOUT_US 100000
+/* Sleep should be less than 20ms */
+#define RK_PHY_PLL_LOCK_SLEEP_US 1000
+
int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
{
struct device *dev = rockchip->dev;
@@ -254,6 +261,15 @@ int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
}
}

+ err = readx_poll_timeout(rockchip_pcie_read_addr, PCIE_CLIENT_SIDE_BAND_STATUS,
+ regs, !(regs & PCIE_CLIENT_PHY_ST), RK_PHY_PLL_LOCK_SLEEP_US,
+ RK_PHY_PLL_LOCK_TIMEOUT_US);
+
+ if (err) {
+ dev_err(dev, "PHY PLLs could not lock, %d\n", err);
+ goto err_power_off_phy;
+ }
+
/*
* Please don't reorder the deassert sequence of the following
* four reset pins.
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
index 51a123e5c..f3a5ff1cf 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -38,6 +38,8 @@
#define PCIE_CLIENT_MODE_EP HIWORD_UPDATE(0x0040, 0)
#define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0)
#define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080)
+#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
+#define PCIE_CLIENT_PHY_ST BIT(12)
#define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c)
#define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0)
#define PCIE_CLIENT_DEBUG_LTSSM_L1 0x18
--
2.25.1


2023-02-14 14:11:11

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 6/9] PCI: rockchip: Fix window mapping and address translation for endpoint

The RK3399 PCI endpoint core has 33 windows for PCIe space, now in the
driver up to 32 fixed size (1M) windows are used and pages are allocated
and mapped accordingly. The driver first used a single window and allocated
space inside which caused translation issues (between CPU space and PCI
space) because a window can only have a single translation at a given
time, which if multiple pages are allocated inside will cause conflicts.
Now each window is a single region of 1M which will always guarantee that
the translation is not in conflict.

Set the translation register addresses for physical function. As documented
in the technical reference manual (TRM) section 17.5.5 "PCIe Address
Translation" and section 17.6.8 "Address Translation Registers Description"

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 67 ++++++++++++-----------
drivers/pci/controller/pcie-rockchip.h | 25 +++++----
2 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index 4c84e403e..cbc281a6a 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -76,11 +76,17 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
if (num_pass_bits < 8)
num_pass_bits = 8;

- cpu_addr -= rockchip->mem_res->start;
- addr0 = ((is_nor_msg ? 0x10 : (num_pass_bits - 1)) &
- PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
- (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
- addr1 = upper_32_bits(is_nor_msg ? cpu_addr : pci_addr);
+ if (is_nor_msg) {
+ dev_warn(rockchip->dev, "NOR MSG\n");
+ cpu_addr -= rockchip->mem_res->start;
+ addr0 = (0x10 & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
+ (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
+ addr1 = upper_32_bits(cpu_addr);
+ } else {
+ addr0 = (num_pass_bits & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
+ (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
+ addr1 = upper_32_bits(pci_addr);
+ }
desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;
desc1 = 0;

@@ -103,12 +109,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
rockchip_pcie_write(rockchip, desc1,
ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
-
- addr0 =
- ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
- (lower_32_bits(cpu_addr) &
- PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
- addr1 = upper_32_bits(cpu_addr);
}
}

@@ -256,15 +256,7 @@ static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn,
struct rockchip_pcie *pcie = &ep->rockchip;
u32 r;

- r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG);
- /*
- * Region 0 is reserved for configuration space and shouldn't
- * be used elsewhere per TRM, so leave it out.
- */
- if (r >= ep->max_regions - 1) {
- dev_err(&epc->dev, "no free outbound region\n");
- return -EINVAL;
- }
+ r = (addr >> ilog2(SZ_1M)) & 0x1f;

rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, AXI_WRAPPER_MEM_WRITE, addr,
pci_addr, size);
@@ -282,15 +274,11 @@ static void rockchip_pcie_ep_unmap_addr(struct pci_epc *epc, u8 fn, u8 vfn,
struct rockchip_pcie *rockchip = &ep->rockchip;
u32 r;

- for (r = 0; r < ep->max_regions - 1; r++)
+ for (r = 0; r < ep->max_regions; r++)
if (ep->ob_addr[r] == addr)
break;

- /*
- * Region 0 is reserved for configuration space and shouldn't
- * be used elsewhere per TRM, so leave it out.
- */
- if (r == ep->max_regions - 1)
+ if (r == ep->max_regions)
return;

rockchip_pcie_clear_ep_ob_atu(rockchip, r);
@@ -411,6 +399,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
u16 flags, mme, data, data_mask;
u8 msi_count;
u64 pci_addr, pci_addr_mask = 0xff;
+ u32 r;

/* Check MSI enable bit */
flags = rockchip_pcie_read(&ep->rockchip,
@@ -444,12 +433,12 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
ROCKCHIP_PCIE_EP_MSI_CTRL_REG +
PCI_MSI_ADDRESS_LO);
- pci_addr &= GENMASK_ULL(63, 2);

/* Set the outbound region if needed. */
if (unlikely(ep->irq_pci_addr != (pci_addr & ~pci_addr_mask) ||
ep->irq_pci_fn != fn)) {
- rockchip_pcie_prog_ep_ob_atu(rockchip, fn, ep->max_regions - 1,
+ r = (ep->irq_phys_addr >> ilog2(SZ_1M)) & 0x1f;
+ rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
AXI_WRAPPER_MEM_WRITE,
ep->irq_phys_addr,
pci_addr & ~pci_addr_mask,
@@ -539,6 +528,8 @@ static int rockchip_pcie_parse_ep_dt(struct rockchip_pcie *rockchip,
if (err < 0 || ep->max_regions > MAX_REGION_LIMIT)
ep->max_regions = MAX_REGION_LIMIT;

+ ep->ob_region_map = 0;
+
err = of_property_read_u8(dev->of_node, "max-functions",
&ep->epc->max_functions);
if (err < 0)
@@ -559,7 +550,10 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
struct rockchip_pcie *rockchip;
struct pci_epc *epc;
size_t max_regions;
+ struct pci_epc_mem_window *windows = NULL;
int err;
+ u32 cfg;
+ int i;

ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL);
if (!ep)
@@ -606,15 +600,26 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
/* Only enable function 0 by default */
rockchip_pcie_write(rockchip, BIT(0), PCIE_CORE_PHY_FUNC_CFG);

- err = pci_epc_mem_init(epc, rockchip->mem_res->start,
- resource_size(rockchip->mem_res), PAGE_SIZE);
+ windows = devm_kcalloc(dev, ep->max_regions, sizeof(struct pci_epc_mem_window), GFP_KERNEL);
+ if (!windows) {
+ err = -ENOMEM;
+ goto err_uninit_port;
+ }
+ for (i = 0; i < ep->max_regions; i++) {
+ windows[i].phys_base = rockchip->mem_res->start + (SZ_1M * i);
+ windows[i].size = SZ_1M;
+ windows[i].page_size = SZ_1M;
+ }
+ err = pci_epc_multi_mem_init(epc, windows, ep->max_regions);
+ devm_kfree(dev, windows);
+
if (err < 0) {
dev_err(dev, "failed to initialize the memory space\n");
goto err_uninit_port;
}

ep->irq_cpu_addr = pci_epc_mem_alloc_addr(epc, &ep->irq_phys_addr,
- SZ_128K);
+ SZ_1M);
if (!ep->irq_cpu_addr) {
dev_err(dev, "failed to reserve memory space for MSI\n");
err = -ENOMEM;
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
index f3a5ff1cf..72e427a0f 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -134,6 +134,7 @@

#define PCIE_RC_RP_ATS_BASE 0x400000
#define PCIE_RC_CONFIG_NORMAL_BASE 0x800000
+#define PCIE_EP_PF_CONFIG_REGS_BASE 0x800000
#define PCIE_RC_CONFIG_BASE 0xa00000
#define PCIE_EP_CONFIG_BASE 0xa00000
#define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
@@ -228,13 +229,14 @@
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
#define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
-#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12))
+#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \
+ (PCIE_EP_PF_CONFIG_REGS_BASE + (((fn) << 12) & GENMASK(19, 12)))
+#define ROCKCHIP_PCIE_EP_VIRT_FUNC_BASE(fn) \
+ (PCIE_EP_PF_CONFIG_REGS_BASE + 0x10000 + (((fn) << 12) & GENMASK(19, 12)))
#define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \
- (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008)
+ (PCIE_CORE_AXI_CONF_BASE + 0x0828 + (fn) * 0x0040 + (bar) * 0x0008)
#define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar) \
- (PCIE_RC_RP_ATS_BASE + 0x0844 + (fn) * 0x0040 + (bar) * 0x0008)
-#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
- (PCIE_RC_RP_ATS_BASE + 0x0000 + ((r) & 0x1f) * 0x0020)
+ (PCIE_CORE_AXI_CONF_BASE + 0x082c + (fn) * 0x0040 + (bar) * 0x0008)
#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN_MASK GENMASK(19, 12)
#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN(devfn) \
(((devfn) << 12) & \
@@ -242,20 +244,19 @@
#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK GENMASK(27, 20)
#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS(bus) \
(((bus) << 20) & ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK)
+#define PCIE_RC_EP_ATR_OB_REGIONS_1_32 (PCIE_CORE_AXI_CONF_BASE + 0x0020)
+#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
+ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0000 + ((r) & 0x1f) * 0x0020)
#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r) \
- (PCIE_RC_RP_ATS_BASE + 0x0004 + ((r) & 0x1f) * 0x0020)
+ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0004 + ((r) & 0x1f) * 0x0020)
#define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_HARDCODED_RID BIT(23)
#define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK GENMASK(31, 24)
#define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(devfn) \
(((devfn) << 24) & ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK)
#define ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r) \
- (PCIE_RC_RP_ATS_BASE + 0x0008 + ((r) & 0x1f) * 0x0020)
+ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0008 + ((r) & 0x1f) * 0x0020)
#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \
- (PCIE_RC_RP_ATS_BASE + 0x000c + ((r) & 0x1f) * 0x0020)
-#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r) \
- (PCIE_RC_RP_ATS_BASE + 0x0018 + ((r) & 0x1f) * 0x0020)
-#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r) \
- (PCIE_RC_RP_ATS_BASE + 0x001c + ((r) & 0x1f) * 0x0020)
+ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x000c + ((r) & 0x1f) * 0x0020)

#define ROCKCHIP_PCIE_CORE_EP_FUNC_BAR_CFG0(fn) \
(PCIE_CORE_CTRL_MGMT_BASE + 0x0240 + (fn) * 0x0008)
--
2.25.1


2023-02-14 14:11:18

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 7/9] PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core

Fix legacy IRQ generation for RK3399 PCIe endpoint core according to
the technical reference manual (TRM). Assert and deassert legacy
interrupt (INTx) through the legacy interrupt control register
("PCIE_CLIENT_LEGACY_INT_CTRL") instead of manually generating a PCIe
message. The generation of the legacy interrupt was tested and validated
with the PCIe endpoint test driver.

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 38 +++++------------------
drivers/pci/controller/pcie-rockchip.h | 6 ++++
2 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index cbc281a6a..ca5b363ba 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -328,45 +328,23 @@ static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep, u8 fn,
u8 intx, bool is_asserted)
{
struct rockchip_pcie *rockchip = &ep->rockchip;
- u32 r = ep->max_regions - 1;
- u32 offset;
- u32 status;
- u8 msg_code;
-
- if (unlikely(ep->irq_pci_addr != ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR ||
- ep->irq_pci_fn != fn)) {
- rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
- AXI_WRAPPER_NOR_MSG,
- ep->irq_phys_addr, 0, 0);
- ep->irq_pci_addr = ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR;
- ep->irq_pci_fn = fn;
- }

intx &= 3;
if (is_asserted) {
ep->irq_pending |= BIT(intx);
- msg_code = ROCKCHIP_PCIE_MSG_CODE_ASSERT_INTA + intx;
} else {
ep->irq_pending &= ~BIT(intx);
- msg_code = ROCKCHIP_PCIE_MSG_CODE_DEASSERT_INTA + intx;
}

- status = rockchip_pcie_read(rockchip,
- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
- ROCKCHIP_PCIE_EP_CMD_STATUS);
- status &= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
-
- if ((status != 0) ^ (ep->irq_pending != 0)) {
- status ^= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
- rockchip_pcie_write(rockchip, status,
- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
- ROCKCHIP_PCIE_EP_CMD_STATUS);
+ if (is_asserted) {
+ rockchip_pcie_write(rockchip,
+ PCIE_CLIENT_INT_IN_ASSERT | PCIE_CLIENT_INT_PEND_ST_PEND,
+ PCIE_CLIENT_LEGACY_INT_CTRL);
+ } else {
+ rockchip_pcie_write(rockchip,
+ PCIE_CLIENT_INT_IN_DEASSERT | PCIE_CLIENT_INT_PEND_ST_NORMAL,
+ PCIE_CLIENT_LEGACY_INT_CTRL);
}
-
- offset =
- ROCKCHIP_PCIE_MSG_ROUTING(ROCKCHIP_PCIE_MSG_ROUTING_LOCAL_INTX) |
- ROCKCHIP_PCIE_MSG_CODE(msg_code) | ROCKCHIP_PCIE_MSG_NO_DATA;
- writel(0, ep->irq_cpu_addr + offset);
}

static int rockchip_pcie_ep_send_legacy_irq(struct rockchip_pcie_ep *ep, u8 fn,
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
index 72e427a0f..e90c2a2b8 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -39,6 +39,12 @@
#define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0)
#define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080)
#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
+#define PCIE_CLIENT_LEGACY_INT_CTRL (PCIE_CLIENT_BASE + 0x0c)
+#define PCIE_CLIENT_INT_IN_ASSERT HIWORD_UPDATE_BIT(0x0002)
+#define PCIE_CLIENT_INT_IN_DEASSERT HIWORD_UPDATE(0x0002, 0)
+#define PCIE_CLIENT_INT_PEND_ST_PEND HIWORD_UPDATE_BIT(0x0001)
+#define PCIE_CLIENT_INT_PEND_ST_NORMAL HIWORD_UPDATE(0x0001, 0)
+#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
#define PCIE_CLIENT_PHY_ST BIT(12)
#define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c)
#define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0)
--
2.25.1


2023-02-14 14:11:22

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers

Previously u16 variables were used to access 32-bit registers, this
resulted in not all of the data being read from the registers. Also
the left shift of more than 16-bits would result in moving data out
of the variable. Use u32 variables to access 32-bit registers

Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Cc: [email protected]
Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
drivers/pci/controller/pcie-rockchip.h | 1 +
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index ca5b363ba..b7865a94e 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -292,15 +292,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
{
struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
struct rockchip_pcie *rockchip = &ep->rockchip;
- u16 flags;
+ u32 flags;

flags = rockchip_pcie_read(rockchip,
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
flags |=
- ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
- PCI_MSI_FLAGS_64BIT;
+ (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
+ (PCI_MSI_FLAGS_64BIT << ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET);
flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP;
rockchip_pcie_write(rockchip, flags,
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
@@ -312,7 +312,7 @@ static int rockchip_pcie_ep_get_msi(struct pci_epc *epc, u8 fn, u8 vfn)
{
struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
struct rockchip_pcie *rockchip = &ep->rockchip;
- u16 flags;
+ u32 flags;

flags = rockchip_pcie_read(rockchip,
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
@@ -374,7 +374,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
u8 interrupt_num)
{
struct rockchip_pcie *rockchip = &ep->rockchip;
- u16 flags, mme, data, data_mask;
+ u32 flags, mme, data, data_mask;
u8 msi_count;
u64 pci_addr, pci_addr_mask = 0xff;
u32 r;
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
index e90c2a2b8..11dbf53cd 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -227,6 +227,7 @@
#define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
#define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
+#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
--
2.25.1


2023-02-14 14:11:29

by Rick Wertenbroek

[permalink] [raw]
Subject: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

The RK3399 PCIe endpoint core supports only a single PCIe physcial
function (function number 0), therefore return -EINVAL if set_msi() is
called with a function number greater than 0.
The PCIe standard only allows the multi message capability (MMC) value
to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
called with a MMC value of over 0x5.

Signed-off-by: Rick Wertenbroek <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index b7865a94e..80634b690 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
struct rockchip_pcie *rockchip = &ep->rockchip;
u32 flags;

+ if (fn) {
+ dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
+ return -EINVAL;
+ }
+
+ if (mmc > 0x5) {
+ dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");
+ return -EINVAL;
+ }
+
flags = rockchip_pcie_read(rockchip,
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
--
2.25.1


2023-02-14 23:56:14

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Remove write accesses to registers that are marked "unused" (and
> therefore read-only) in the technical reference manual (TRM)
> (see RK3399 TRM 17.6.8.1)
>
> Signed-off-by: Rick Wertenbroek <[email protected]>

I checked the TRM and indeed these registers are listed as unused.
However, with this patch, nothing work for me using a Pine rockpro64
board. Keeping this patch, your series (modulo some other fixes, more
emails coming) is making things work !

So I think the bug is with the TRM, not the code. THinking logically about
htis, it makes sense: this is programming the address translation unit to
translate mmio & dma between host PCI address and local CPU space address.
If we never set the PU address, how can that unit possibly ever translate
anything ?

> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index d1a200b93..d5c477020 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -61,10 +61,6 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
> - rockchip_pcie_write(rockchip, 0,
> - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(region));
> - rockchip_pcie_write(rockchip, 0,
> - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(region));
> }
>
> static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
> @@ -114,12 +110,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
> PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> addr1 = upper_32_bits(cpu_addr);
> }
> -
> - /* CPU bus address region */
> - rockchip_pcie_write(rockchip, addr0,
> - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r));
> - rockchip_pcie_write(rockchip, addr1,
> - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r));
> }
>
> static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,

--
Damien Le Moal
Western Digital Research


2023-02-14 23:58:01

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 2/9] PCI: rockchip: Write PCI Device ID to correct register

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Write PCI Device ID (DID) to the correct register. The Device ID was not
> updated through the correct register. Device ID was written to a read-only
> register and therefore did not work. The Device ID is now set through the
> correct register. This is documented in the RK3399 TRM section 17.6.6.1.1
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>

Reviewed-by: Damien Le Moal <[email protected]>
Tested-by: Damien Le Moal <[email protected]>

--
Damien Le Moal
Western Digital Research


2023-02-14 23:59:37

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] PCI: rockchip: Assert PCI Configuration Enable bit after probe

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Assert PCI Configuration Enable bit after probe. When this bit is left to
> 0 in the endpoint mode, the RK3399 PCIe endpoint core will generate
> configuration request retry status (CRS) messages back to the root complex.
> Assert this bit after probe to allow the RK3399 PCIe endpoint core to reply
> to configuration requests from the root complex.
> This is documented in section 17.5.8.1.2 of the RK3399 TRM.
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>

Reviewed-by: Damien Le Moal <[email protected]>
Tested-by: Damien Le Moal <[email protected]>

--
Damien Le Moal
Western Digital Research


2023-02-15 01:01:19

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 4/9] PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked

On 2/14/23 23:08, Rick Wertenbroek wrote:
> The RK3399 PCIe controller should wait until the PHY PLLs are locked.
> Add poll and timeout to wait for PHY PLLs to be locked. If they cannot
> be locked generate error message and jump to error handler. Accessing
> registers in the PHY clock domain when PLLs are not locked causes hang
> The PHY PLLs status is checked through a side channel register.
> This is documented in the TRM section 17.5.8.1 "PCIe Initialization
> Sequence".
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>

A couple of nits below. Otherwise:

Reviewed-by: Damien Le Moal <[email protected]>
Tested-by: Damien Le Moal <[email protected]>

> ---
> drivers/pci/controller/pcie-rockchip.c | 16 ++++++++++++++++
> drivers/pci/controller/pcie-rockchip.h | 2 ++
> 2 files changed, 18 insertions(+)
>
> diff --git a/drivers/pci/controller/pcie-rockchip.c b/drivers/pci/controller/pcie-rockchip.c
> index 990a00e08..5f2e2dd5d 100644
> --- a/drivers/pci/controller/pcie-rockchip.c
> +++ b/drivers/pci/controller/pcie-rockchip.c

[...]

> + err = readx_poll_timeout(rockchip_pcie_read_addr, PCIE_CLIENT_SIDE_BAND_STATUS,

Nit: long line. Split it after the first comma.

> + regs, !(regs & PCIE_CLIENT_PHY_ST), RK_PHY_PLL_LOCK_SLEEP_US,
> + RK_PHY_PLL_LOCK_TIMEOUT_US);
> +

Nit: no need for this blank line.

> + if (err) {
> + dev_err(dev, "PHY PLLs could not lock, %d\n", err);
> + goto err_power_off_phy;
> + }
> +

--
Damien Le Moal
Western Digital Research


2023-02-15 01:05:18

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Add dtsi entry for RK3399 PCIe endpoint core in the device tree.
> The status is "disabled" by default, so it will not be loaded unless
> explicitly chosen to. The RK3399 PCIe endpoit core should be enabled
> with the RK3399 PCIe root complex disabled because the RK3399 PCIe
> controller can only work one mode at the time, either in "root complex"
> mode or in "endpoint" mode.
>
> Signed-off-by: Rick Wertenbroek <[email protected]>

You should also update the file:

Documentation/devicetree/bindings/pci/rockchip-pcie-ep.txt

The example there is broken...

Otherwise, this works great for me.

--
Damien Le Moal
Western Digital Research


2023-02-15 01:21:06

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 6/9] PCI: rockchip: Fix window mapping and address translation for endpoint

On 2/14/23 23:08, Rick Wertenbroek wrote:
> The RK3399 PCI endpoint core has 33 windows for PCIe space, now in the
> driver up to 32 fixed size (1M) windows are used and pages are allocated
> and mapped accordingly. The driver first used a single window and allocated
> space inside which caused translation issues (between CPU space and PCI
> space) because a window can only have a single translation at a given
> time, which if multiple pages are allocated inside will cause conflicts.
> Now each window is a single region of 1M which will always guarantee that
> the translation is not in conflict.
>
> Set the translation register addresses for physical function. As documented
> in the technical reference manual (TRM) section 17.5.5 "PCIe Address
> Translation" and section 17.6.8 "Address Translation Registers Description"
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 67 ++++++++++++-----------
> drivers/pci/controller/pcie-rockchip.h | 25 +++++----
> 2 files changed, 49 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index 4c84e403e..cbc281a6a 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -76,11 +76,17 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
> if (num_pass_bits < 8)
> num_pass_bits = 8;
>
> - cpu_addr -= rockchip->mem_res->start;
> - addr0 = ((is_nor_msg ? 0x10 : (num_pass_bits - 1)) &
> - PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
> - (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> - addr1 = upper_32_bits(is_nor_msg ? cpu_addr : pci_addr);
> + if (is_nor_msg) {
> + dev_warn(rockchip->dev, "NOR MSG\n");

I do not think this warning is needed.
In fact, if you move your patch 7 before this one, we could probably drop
the is_nor_msg == true case entirely since with your patch 7, only the
host driver uses AXI_WRAPPER_NOR_MSG. So warning and returning for that
case should be enough.

> + cpu_addr -= rockchip->mem_res->start;

This needs to be done for the !is_nor_msg case too.

> + addr0 = (0x10 & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
> + (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> + addr1 = upper_32_bits(cpu_addr);
> + } else {
> + addr0 = (num_pass_bits & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
> + (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> + addr1 = upper_32_bits(pci_addr);
> + }
> desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;
> desc1 = 0;
>
> @@ -103,12 +109,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
> rockchip_pcie_write(rockchip, desc1,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
> -
> - addr0 =
> - ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
> - (lower_32_bits(cpu_addr) &
> - PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> - addr1 = upper_32_bits(cpu_addr);

This hunk should have been removed in patch 1. But as commented, this is
needed, at least for me. Without setting the cpu addr to OB region cpu
addr register, mmio/dma does not work for me, despite the TRM saying that
these registers are unused.

> }
> }
>
> @@ -256,15 +256,7 @@ static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn,
> struct rockchip_pcie *pcie = &ep->rockchip;
> u32 r;
>
> - r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG);
> - /*
> - * Region 0 is reserved for configuration space and shouldn't
> - * be used elsewhere per TRM, so leave it out.
> - */
> - if (r >= ep->max_regions - 1) {
> - dev_err(&epc->dev, "no free outbound region\n");
> - return -EINVAL;
> - }
> + r = (addr >> ilog2(SZ_1M)) & 0x1f;

Locally, I added a smal helper:

static inline int rockchip_ob_region(u64 addr)
{
return (addr >> ilog2(SZ_1M)) & 0x1f;
}

That makes the code nicer and avoids having this open coded repeatedly in
different places.

>
> rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, AXI_WRAPPER_MEM_WRITE, addr,
> pci_addr, size);
> @@ -282,15 +274,11 @@ static void rockchip_pcie_ep_unmap_addr(struct pci_epc *epc, u8 fn, u8 vfn,
> struct rockchip_pcie *rockchip = &ep->rockchip;
> u32 r;
>
> - for (r = 0; r < ep->max_regions - 1; r++)
> + for (r = 0; r < ep->max_regions; r++)
> if (ep->ob_addr[r] == addr)
> break;
>
> - /*
> - * Region 0 is reserved for configuration space and shouldn't
> - * be used elsewhere per TRM, so leave it out.
> - */
> - if (r == ep->max_regions - 1)
> + if (r == ep->max_regions)
> return;
>
> rockchip_pcie_clear_ep_ob_atu(rockchip, r);
> @@ -411,6 +399,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
> u16 flags, mme, data, data_mask;
> u8 msi_count;
> u64 pci_addr, pci_addr_mask = 0xff;
> + u32 r;
>
> /* Check MSI enable bit */
> flags = rockchip_pcie_read(&ep->rockchip,
> @@ -444,12 +433,12 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> ROCKCHIP_PCIE_EP_MSI_CTRL_REG +
> PCI_MSI_ADDRESS_LO);
> - pci_addr &= GENMASK_ULL(63, 2);
>
> /* Set the outbound region if needed. */
> if (unlikely(ep->irq_pci_addr != (pci_addr & ~pci_addr_mask) ||
> ep->irq_pci_fn != fn)) {
> - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, ep->max_regions - 1,
> + r = (ep->irq_phys_addr >> ilog2(SZ_1M)) & 0x1f;
> + rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
> AXI_WRAPPER_MEM_WRITE,
> ep->irq_phys_addr,
> pci_addr & ~pci_addr_mask,
> @@ -539,6 +528,8 @@ static int rockchip_pcie_parse_ep_dt(struct rockchip_pcie *rockchip,
> if (err < 0 || ep->max_regions > MAX_REGION_LIMIT)
> ep->max_regions = MAX_REGION_LIMIT;
>
> + ep->ob_region_map = 0;
> +
> err = of_property_read_u8(dev->of_node, "max-functions",
> &ep->epc->max_functions);
> if (err < 0)
> @@ -559,7 +550,10 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
> struct rockchip_pcie *rockchip;
> struct pci_epc *epc;
> size_t max_regions;
> + struct pci_epc_mem_window *windows = NULL;
> int err;
> + u32 cfg;
> + int i;

Nit: instead of declaring this with another line, you could declare it
together with "err" above.

>
> ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL);
> if (!ep)
> @@ -606,15 +600,26 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
> /* Only enable function 0 by default */
> rockchip_pcie_write(rockchip, BIT(0), PCIE_CORE_PHY_FUNC_CFG);
>
> - err = pci_epc_mem_init(epc, rockchip->mem_res->start,
> - resource_size(rockchip->mem_res), PAGE_SIZE);
> + windows = devm_kcalloc(dev, ep->max_regions, sizeof(struct pci_epc_mem_window), GFP_KERNEL);
> + if (!windows) {
> + err = -ENOMEM;
> + goto err_uninit_port;
> + }
> + for (i = 0; i < ep->max_regions; i++) {
> + windows[i].phys_base = rockchip->mem_res->start + (SZ_1M * i);
> + windows[i].size = SZ_1M;
> + windows[i].page_size = SZ_1M;
> + }
> + err = pci_epc_multi_mem_init(epc, windows, ep->max_regions);
> + devm_kfree(dev, windows);
> +
> if (err < 0) {
> dev_err(dev, "failed to initialize the memory space\n");
> goto err_uninit_port;
> }
>
> ep->irq_cpu_addr = pci_epc_mem_alloc_addr(epc, &ep->irq_phys_addr,
> - SZ_128K);
> + SZ_1M);

The region for this needs to be marked as allocated in the ob_region_map. So:

set_bit(rockchip_ob_region(ep->irq_phys_addr),
&ep->ob_region_map);

Of note though is that this ob_region_bitmap is used to set and clear bits
*only*, it is actually never checked at all to see if there is a bug and a
mapped region is being remapped without an unmap first. Not sure it is
very useful in the end.

> if (!ep->irq_cpu_addr) {
> dev_err(dev, "failed to reserve memory space for MSI\n");
> err = -ENOMEM;
> diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> index f3a5ff1cf..72e427a0f 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -134,6 +134,7 @@
>
> #define PCIE_RC_RP_ATS_BASE 0x400000
> #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000
> +#define PCIE_EP_PF_CONFIG_REGS_BASE 0x800000
> #define PCIE_RC_CONFIG_BASE 0xa00000
> #define PCIE_EP_CONFIG_BASE 0xa00000
> #define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
> @@ -228,13 +229,14 @@
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
> #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
> #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
> -#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12))
> +#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \
> + (PCIE_EP_PF_CONFIG_REGS_BASE + (((fn) << 12) & GENMASK(19, 12)))
> +#define ROCKCHIP_PCIE_EP_VIRT_FUNC_BASE(fn) \
> + (PCIE_EP_PF_CONFIG_REGS_BASE + 0x10000 + (((fn) << 12) & GENMASK(19, 12)))
> #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \
> - (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008)
> + (PCIE_CORE_AXI_CONF_BASE + 0x0828 + (fn) * 0x0040 + (bar) * 0x0008)
> #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar) \
> - (PCIE_RC_RP_ATS_BASE + 0x0844 + (fn) * 0x0040 + (bar) * 0x0008)
> -#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x0000 + ((r) & 0x1f) * 0x0020)
> + (PCIE_CORE_AXI_CONF_BASE + 0x082c + (fn) * 0x0040 + (bar) * 0x0008)
> #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN_MASK GENMASK(19, 12)
> #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN(devfn) \
> (((devfn) << 12) & \
> @@ -242,20 +244,19 @@
> #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK GENMASK(27, 20)
> #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS(bus) \
> (((bus) << 20) & ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK)
> +#define PCIE_RC_EP_ATR_OB_REGIONS_1_32 (PCIE_CORE_AXI_CONF_BASE + 0x0020)
> +#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
> + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0000 + ((r) & 0x1f) * 0x0020)
> #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x0004 + ((r) & 0x1f) * 0x0020)
> + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0004 + ((r) & 0x1f) * 0x0020)
> #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_HARDCODED_RID BIT(23)
> #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK GENMASK(31, 24)
> #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(devfn) \
> (((devfn) << 24) & ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK)
> #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x0008 + ((r) & 0x1f) * 0x0020)
> + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0008 + ((r) & 0x1f) * 0x0020)
> #define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x000c + ((r) & 0x1f) * 0x0020)
> -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x0018 + ((r) & 0x1f) * 0x0020)
> -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r) \
> - (PCIE_RC_RP_ATS_BASE + 0x001c + ((r) & 0x1f) * 0x0020)
> + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x000c + ((r) & 0x1f) * 0x0020)
>
> #define ROCKCHIP_PCIE_CORE_EP_FUNC_BAR_CFG0(fn) \
> (PCIE_CORE_CTRL_MGMT_BASE + 0x0240 + (fn) * 0x0008)

--
Damien Le Moal
Western Digital Research


2023-02-15 01:27:20

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 7/9] PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Fix legacy IRQ generation for RK3399 PCIe endpoint core according to
> the technical reference manual (TRM). Assert and deassert legacy
> interrupt (INTx) through the legacy interrupt control register
> ("PCIE_CLIENT_LEGACY_INT_CTRL") instead of manually generating a PCIe
> message. The generation of the legacy interrupt was tested and validated
> with the PCIe endpoint test driver.
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>

Some nits below. But otherwise works fine for me.

> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 38 +++++------------------
> drivers/pci/controller/pcie-rockchip.h | 6 ++++
> 2 files changed, 14 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index cbc281a6a..ca5b363ba 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -328,45 +328,23 @@ static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep, u8 fn,
> u8 intx, bool is_asserted)
> {
> struct rockchip_pcie *rockchip = &ep->rockchip;
> - u32 r = ep->max_regions - 1;
> - u32 offset;
> - u32 status;
> - u8 msg_code;
> -
> - if (unlikely(ep->irq_pci_addr != ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR ||
> - ep->irq_pci_fn != fn)) {
> - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
> - AXI_WRAPPER_NOR_MSG,
> - ep->irq_phys_addr, 0, 0);
> - ep->irq_pci_addr = ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR;
> - ep->irq_pci_fn = fn;
> - }
>
> intx &= 3;
> if (is_asserted) {
> ep->irq_pending |= BIT(intx);
> - msg_code = ROCKCHIP_PCIE_MSG_CODE_ASSERT_INTA + intx;
> } else {
> ep->irq_pending &= ~BIT(intx);
> - msg_code = ROCKCHIP_PCIE_MSG_CODE_DEASSERT_INTA + intx;
> }
>
> - status = rockchip_pcie_read(rockchip,
> - ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> - ROCKCHIP_PCIE_EP_CMD_STATUS);
> - status &= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
> -
> - if ((status != 0) ^ (ep->irq_pending != 0)) {
> - status ^= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
> - rockchip_pcie_write(rockchip, status,
> - ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> - ROCKCHIP_PCIE_EP_CMD_STATUS);
> + if (is_asserted) {
> + rockchip_pcie_write(rockchip,
> + PCIE_CLIENT_INT_IN_ASSERT | PCIE_CLIENT_INT_PEND_ST_PEND,
> + PCIE_CLIENT_LEGACY_INT_CTRL);
> + } else {
> + rockchip_pcie_write(rockchip,
> + PCIE_CLIENT_INT_IN_DEASSERT | PCIE_CLIENT_INT_PEND_ST_NORMAL,
> + PCIE_CLIENT_LEGACY_INT_CTRL);
> }

With this change, you have now twice "if (is_asserted) {", which is not
necessary. You can simplify the code a bit:

static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep,
u8 fn, u8 intx, bool do_assert)
{

u8 msg_code;



intx &= 3;

if (do_assert) {

ep->irq_pending |= BIT(intx);

msg_code = ROCKCHIP_PCIE_MSG_CODE_ASSERT_INTA + intx;

rockchip_pcie_write(&ep->rockchip,

PCIE_CLIENT_INT_IN_ASSERT |

PCIE_CLIENT_INT_PEND_ST_PEND,

PCIE_CLIENT_LEGACY_INT_CTRL);

return;

}



ep->irq_pending &= ~BIT(intx);

msg_code = ROCKCHIP_PCIE_MSG_CODE_DEASSERT_INTA + intx;

rockchip_pcie_write(&ep->rockchip,

PCIE_CLIENT_INT_IN_DEASSERT |

PCIE_CLIENT_INT_PEND_ST_NORMAL,

PCIE_CLIENT_LEGACY_INT_CTRL);

}

Note also the renaming of the argument "is_asserted" to "do_assert". The
name is_asserted is badly misleading considering the english meaning given
that it is true when we *must* do the assert and false when we must
deassert. So do_assert as a name better match the use of that argument I
think.

> -
> - offset =
> - ROCKCHIP_PCIE_MSG_ROUTING(ROCKCHIP_PCIE_MSG_ROUTING_LOCAL_INTX) |
> - ROCKCHIP_PCIE_MSG_CODE(msg_code) | ROCKCHIP_PCIE_MSG_NO_DATA;
> - writel(0, ep->irq_cpu_addr + offset);
> }
>
> static int rockchip_pcie_ep_send_legacy_irq(struct rockchip_pcie_ep *ep, u8 fn,
> diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> index 72e427a0f..e90c2a2b8 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -39,6 +39,12 @@
> #define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0)
> #define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080)
> #define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
> +#define PCIE_CLIENT_LEGACY_INT_CTRL (PCIE_CLIENT_BASE + 0x0c)
> +#define PCIE_CLIENT_INT_IN_ASSERT HIWORD_UPDATE_BIT(0x0002)
> +#define PCIE_CLIENT_INT_IN_DEASSERT HIWORD_UPDATE(0x0002, 0)
> +#define PCIE_CLIENT_INT_PEND_ST_PEND HIWORD_UPDATE_BIT(0x0001)
> +#define PCIE_CLIENT_INT_PEND_ST_NORMAL HIWORD_UPDATE(0x0001, 0)
> +#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
> #define PCIE_CLIENT_PHY_ST BIT(12)
> #define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c)
> #define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0)

--
Damien Le Moal
Western Digital Research


2023-02-15 01:34:42

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Previously u16 variables were used to access 32-bit registers, this
> resulted in not all of the data being read from the registers. Also
> the left shift of more than 16-bits would result in moving data out
> of the variable. Use u32 variables to access 32-bit registers
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
> drivers/pci/controller/pcie-rockchip.h | 1 +
> 2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index ca5b363ba..b7865a94e 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -292,15 +292,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> {
> struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> struct rockchip_pcie *rockchip = &ep->rockchip;
> - u16 flags;
> + u32 flags;
>
> flags = rockchip_pcie_read(rockchip,
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
> flags |=
> - ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
> - PCI_MSI_FLAGS_64BIT;
> + (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |

ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET is 17 and multi_msg_cap is a u8...
Not nice.

Locally, I added the local variable:

u32 mmc = multi_msg_cap;

And use mmc instead of multi_msg_cap to avoid issues. Also,

> + (PCI_MSI_FLAGS_64BIT << ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET);
> flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP;
> rockchip_pcie_write(rockchip, flags,
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> @@ -312,7 +312,7 @@ static int rockchip_pcie_ep_get_msi(struct pci_epc *epc, u8 fn, u8 vfn)
> {
> struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> struct rockchip_pcie *rockchip = &ep->rockchip;
> - u16 flags;
> + u32 flags;
>
> flags = rockchip_pcie_read(rockchip,
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> @@ -374,7 +374,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
> u8 interrupt_num)
> {
> struct rockchip_pcie *rockchip = &ep->rockchip;
> - u16 flags, mme, data, data_mask;
> + u32 flags, mme, data, data_mask;
> u8 msi_count;
> u64 pci_addr, pci_addr_mask = 0xff;
> u32 r;
> diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> index e90c2a2b8..11dbf53cd 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -227,6 +227,7 @@
> #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
> #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
> +#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16

You are not using this macro anywhere. The name is also not very
descriptive. Better have it as:

#define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)

to match the TRM name and be clear that the bit indicates if MSI is
enabled or not.

> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20

--
Damien Le Moal
Western Digital Research


2023-02-15 01:39:57

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On 2/14/23 23:08, Rick Wertenbroek wrote:
> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> function (function number 0), therefore return -EINVAL if set_msi() is
> called with a function number greater than 0.
> The PCIe standard only allows the multi message capability (MMC) value
> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> called with a MMC value of over 0x5.
>
> Signed-off-by: Rick Wertenbroek <[email protected]>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index b7865a94e..80634b690 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> struct rockchip_pcie *rockchip = &ep->rockchip;
> u32 flags;
>
> + if (fn) {
> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> + return -EINVAL;
> + }

Checking this here is late... Given that at most only one physical
function is supported, the check should be in rockchip_pcie_parse_ep_dt().
Something like:

err = of_property_read_u8(dev->of_node, "max-functions",
&ep->epc->max_functions);

if (err < 0 || ep->epc->max_functions > 1)

ep->epc->max_functions = 1;

And all the macros with the (fn) argument could also be simplified
(argument fn removed) since fn will always be 0.

> +
> + if (mmc > 0x5) {
> + dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");

Long line. Please split it after the comma.

> + return -EINVAL;
> + }
> +
> flags = rockchip_pcie_read(rockchip,
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> ROCKCHIP_PCIE_EP_MSI_CTRL_REG);

Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
here all the time.

--
Damien Le Moal
Western Digital Research


2023-02-15 01:51:27

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 2/14/23 23:08, Rick Wertenbroek wrote:
> This is a series of patches that fixes the PCIe endpoint controller driver
> for the Rockchip RK3399 SoC. The driver was introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> The original driver had issues and would not allow for the RK3399 to
> operate in PCIe endpoint mode correctly. This patch series fixes that so
> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> endpoint. This is v2 of the patch series and addresses the concerns that
> were raised during the review of the first version.
>
> Thank you in advance for reviewing these changes and hopefully
> getting this merged. Having a functional PCIe endpoint controller
> driver for the RK3399 would allow to develop further PCIe endpoint
> functions through the Linux PCIe endpoint framework using this SoC.
>
> Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> did not work.
>
> Summary of problems with the driver :
>
> * Missing dtsi entry
> * Could not update Device ID (DID)
> * The endpoint could not be configured by a host computer because the
> endpoint kept sending Configuration Request Retry Status (CRS) messages
> * The kernel would sometimes hang on probe due to access to registers in
> a clock domain of which the PLLs were not locked
> * The memory window mapping and address translation mechanism had
> conflicting mappings and did not follow the technical reference manual
> as to how the address translation should be done
> * Legacy IRQs were not generated by the endpoint
> * Message Signaled interrupts (MSI) were not generated by the endpoint
>
> The problems have been addressed and validated through tests (see below).
>
> Summary of changes :
>
> This patch series is composed of 9 patches that do the following :
> * Remove writes to unused registers in the PCIe core register space.
> The registers that were written to is marked "unused" and read
> only in the technical reference manual of the RK3399 SoC.
> * Write PCI Device ID (DID) to correct register, the DID was written to
> a read only register and therefore would not update the DID.
> * Assert PCI Configuration Enable bit after probe so that it would stop
> sending Configuration Request Retry Status (CRS) messages to the
> host once configured, without this the host would retry until
> timeout and cancel the PCI configuration.
> * Add poll and timeout to wait for PHY PLLs to be locked, this
> is the only patch that also applies to the root complex function
> of the PCIe core controller, without this the kernel would
> sometimes access registers in the PHY PLL clock domain when the PLLs
> were not yet locked and the system would hang. This was hackily solved
> in other non mainline patches (e.g., in armbian) with a "msleep()"
> that was added after PHY PLL configuration but without realizing
> why it was needed. A poll with timeout seems like a sane approach.
> * Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
> in "disabled" status by default, so unless it is explicitly enabled
> it will not conflict with the PCIe root complex controller entry.
> Developers that will enable it would know that the root complex function
> then must be disabled, this can be done in the board level DTS.
> * Fix window mapping and address translation for endpoint. The window
> mapping and address translation did not follow the technical reference
> manual and a single memory region was used which resulted in conflicting
> address translations for memory allocated in that region. The current
> patch allows to allocate up to 32 memory windows with 1MB pages.
> * Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
> were not sent by the device because their generation did not follow the
> instructions in the technical reference manual. They now work.
> * Use u32 variable to access 32-bit registers, u16 variables were used to
> access and manipulate data of 32-bit registers, this would lead to
> overflows e.g., when left shifting more than 16 bits.
> * Add parameter check for RK3399 PCIe endpoint core set_msi(), return
> -EINVAL when incompatible parameters are passed.
>
> Validation on real hardware:
>
> This patch series has been tested with kernel 6.0.19 (and 5.19)
> on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single computer
> board connected to a host computer through PCIe x1 and x4. The PCIe
> endpoint test function driver was loaded on the SoC and the PCIe endpoint
> test driver was loaded on the host computer. The following tests were
> executed through this setup :
>
> * enumeration of the PCIe endpoint device (lspci)
> lspci -vvv
> * validation of PCI header and capabilities
> setpci and lspci -xxxx
> * device was recognized by host computer dans PCIe endpoint test driver
> was loaded
> lspci -v states "Kernel modules: pci_endpoint_test"
> * tested the BARs 0-5
> sudo /usr/bin/pcitest -b 0
> ...
> sudo /usr/bin/pcitest -b 5
> * tested legacy interrupt through the test driver
> sudo /usr/bin/pcitest -i 0
> sudo /usr/bin/pcitest -l
> * tested MSI interrupt through the test driver
> sudo /usr/bin/pcitest -i 1
> sudo /usr/bin/pcitest -m 1
> * tested read/write to and from host through the test driver with checksum
> sudo /usr/bin/pcitest -r -s 1024
> sudo /usr/bin/pcitest -w -s 1024
> * tested read/write with DMA enabled (all read/write tests also did IRQ)
> sudo /usr/bin/pcitest -r -d -s 8192
> sudo /usr/bin/pcitest -w -d -s 8192
>
> Commands used on the SoC to launch the endpoint function (configfs) :
>
> modprobe -i pci-epf-test
> mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
> echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
> echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
> echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
> ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
> /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
> echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start
>
> Note: to enable the endpoint controller on the board the file :
> arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
> Was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
> to "okay". This is not submitted as a patch because most users
> will use the PCIe core controller in host (root complex) mode
> rather than endpoint mode.
>
> I have tested and confirmed all basic functionality required for the
> endpoint with the test driver and tools. With the previous state of
> the driver the device would not even be enumerated by the host
> computer (mainly because of CRS messages being sent back to the root
> complex) and tests would not pass (driver would not even be loaded
> because DID was not set correctly) and then only the BAR test would
> pass. Now all tests pass as stated above.

Note about that: with your series applied, nothing was working for me on
my pine Rockpro64 board (AMD Ryzen host). I got weird/unstable behavior
and the host IOMMU screaming about IO page faults due to the endpoint
doing weird pci accesses. Running the host with IOMMU on really helps in
debugging this stuff :)

With the few fixes to your series I commented about, things started to
work better, but still very unstable. More debugging and I found out that
the pci-epf-test drivers, both host and endpoint sides, have nasty
problems that lead to reporting failures when things are actually working,
or outright dummy things being done that trigger errors (e.g. bad DMA
synchronization triggers IOMMU page faults reports). I have a dozen fix
patches for these drivers. Will clean them up and post ASAP.

With the test drivers fixed + the fixes to your series, I have the
pci_test.sh tests passing 100% of the time, repeatedly (in a loop). All solid.

However, I am still seeing issues with my ongoing work with a NVMe
endpoint driver function: I see everything working when the host BIOS
pokes at the NVMe "drive" it sees (all good, that is normal), but once
Linux nvme driver probe kicks in, IRQs are essentially dead: the nvme
driver does not see anything strange and allocates IRQs (1 first, which
ends up being INTX, then multiple MSI one for each completion queue), but
on the endpoint side, attempting to raise MSI or INTX IRQs result in error
as the rockchip-ep driver sees both INTX and MSI as disabled. No clue what
is going on. I suspect that a pci reset may have happened and corrupted
the core configuration. However, the EPC/EPF infrastructure does not
catch/process PCI resets as far as I can tell. That may be the issue.
I do not see this issue with the epf test driver, because I suspect the
host BIOS not knowing anything about that device, it does not touch it.
This all may depend on the host & BIOS. Not sure. Need to try with
different hosts. Just FYI :)


--
Damien Le Moal
Western Digital Research


2023-02-15 02:39:02

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 7/9] PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core

On 2/14/23 23:08, Rick Wertenbroek wrote:
> Fix legacy IRQ generation for RK3399 PCIe endpoint core according to
> the technical reference manual (TRM). Assert and deassert legacy
> interrupt (INTx) through the legacy interrupt control register
> ("PCIE_CLIENT_LEGACY_INT_CTRL") instead of manually generating a PCIe
> message. The generation of the legacy interrupt was tested and validated
> with the PCIe endpoint test driver.
>
> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> Cc: [email protected]
> Signed-off-by: Rick Wertenbroek <[email protected]>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 38 +++++------------------
> drivers/pci/controller/pcie-rockchip.h | 6 ++++
> 2 files changed, 14 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index cbc281a6a..ca5b363ba 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -328,45 +328,23 @@ static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep, u8 fn,
> u8 intx, bool is_asserted)
> {
> struct rockchip_pcie *rockchip = &ep->rockchip;
> - u32 r = ep->max_regions - 1;
> - u32 offset;
> - u32 status;
> - u8 msg_code;
> -
> - if (unlikely(ep->irq_pci_addr != ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR ||
> - ep->irq_pci_fn != fn)) {
> - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
> - AXI_WRAPPER_NOR_MSG,
> - ep->irq_phys_addr, 0, 0);
> - ep->irq_pci_addr = ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR;

By the way, ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR is now unused. Remove it
too please.

--
Damien Le Moal
Western Digital Research


2023-02-15 09:04:51

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On Wed, Feb 15, 2023 at 12:56 AM Damien Le Moal
<[email protected]> wrote:
>
> I checked the TRM and indeed these registers are listed as unused.
> However, with this patch, nothing work for me using a Pine rockpro64
> board. Keeping this patch, your series (modulo some other fixes, more
> emails coming) is making things work !

Hello, Thank you for testing the driver and commenting, I'll incorporate your
suggestions in the next version of this series.

This patch alone does not make the driver work. Without the fixes to the
address windows and translation found in [PATCH v2 6/9] ("PCI: rockchip:
Fix window mapping and address translation for endpoint") transfers will not
work. However, as you said, with the patch series, the driver works.
Good to see that you have the driver working on the rockpro64 which is a
very similar but different board than the one I used (FriendlyElec NanoPC-T4).

> So I think the bug is with the TRM, not the code. THinking logically about
> htis, it makes sense: this is programming the address translation unit to
> translate mmio & dma between host PCI address and local CPU space address.
> If we never set the PU address, how can that unit possibly ever translate
> anything ?

No, the bug is not in the TRM:
The RK3399 PCIe endpoint core has the physical address space of 64MB
@ 0xF800'0000 to access the PCIe address space (TRM 17.5.4).
This space is split into 33 windows, one of 32MBytes and 32 of 1MByte.
Read-write accesses by the CPU to that region will be translated. Each
window has a mapping that is configured through the ATR Configuration
Register Address Map (TRM 17.6.8) and the registers addr0 and addr1
will dictate the translation between the window (a physical CPU addr)
into a PCI space address (with this the unit can translate). The other
registers are for the PCIe header descriptor.
The translation process is documented in TRM 17.5.5.1.1
The core will translate all read-write accesses to the windows that fall
in the 64MB space @ 0xF800'0000 and generate the PCIe addresses
and headers according to the values in the registers in the ATR
Configuration Register Address Map (@ 0xFDC0'0000).

Translation does indeed take place and works
but requires the changes in [PATCH v2 6/9] ("PCI: rockchip:
Fix window mapping and address translation for endpoint")
because it was broken from the start...

The two writes that were removed are to unused (read-only) registers.
The writes don't do anything, manually writing and reading back these
addresses will always lead to 0 (they are read-only). So they are removed.

2023-02-15 09:17:40

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On 2/15/23 18:04, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 12:56 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> I checked the TRM and indeed these registers are listed as unused.
>> However, with this patch, nothing work for me using a Pine rockpro64
>> board. Keeping this patch, your series (modulo some other fixes, more
>> emails coming) is making things work !
>
> Hello, Thank you for testing the driver and commenting, I'll incorporate your
> suggestions in the next version of this series.
>
> This patch alone does not make the driver work. Without the fixes to the
> address windows and translation found in [PATCH v2 6/9] ("PCI: rockchip:
> Fix window mapping and address translation for endpoint") transfers will not
> work. However, as you said, with the patch series, the driver works.
> Good to see that you have the driver working on the rockpro64 which is a
> very similar but different board than the one I used (FriendlyElec NanoPC-T4).
>
>> So I think the bug is with the TRM, not the code. THinking logically about
>> htis, it makes sense: this is programming the address translation unit to
>> translate mmio & dma between host PCI address and local CPU space address.
>> If we never set the PU address, how can that unit possibly ever translate
>> anything ?
>
> No, the bug is not in the TRM:
> The RK3399 PCIe endpoint core has the physical address space of 64MB
> @ 0xF800'0000 to access the PCIe address space (TRM 17.5.4).
> This space is split into 33 windows, one of 32MBytes and 32 of 1MByte.
> Read-write accesses by the CPU to that region will be translated. Each
> window has a mapping that is configured through the ATR Configuration
> Register Address Map (TRM 17.6.8) and the registers addr0 and addr1
> will dictate the translation between the window (a physical CPU addr)
> into a PCI space address (with this the unit can translate). The other
> registers are for the PCIe header descriptor.
> The translation process is documented in TRM 17.5.5.1.1
> The core will translate all read-write accesses to the windows that fall
> in the 64MB space @ 0xF800'0000 and generate the PCIe addresses
> and headers according to the values in the registers in the ATR
> Configuration Register Address Map (@ 0xFDC0'0000).
>
> Translation does indeed take place and works
> but requires the changes in [PATCH v2 6/9] ("PCI: rockchip:
> Fix window mapping and address translation for endpoint")
> because it was broken from the start...
>
> The two writes that were removed are to unused (read-only) registers.
> The writes don't do anything, manually writing and reading back these
> addresses will always lead to 0 (they are read-only). So they are removed.

OK. I tried so many things to get something working that I probably got
confused with this one :) Let me retry with this patch 1 to see if I get
the same results, which is: pci-epf-test solidly working (with the patches
I sent earlier today), and my on-going nvme epf driver working-ish (BIOS
OK, but IRQs to Linux miserably failing to be sent from EP).

--
Damien Le Moal
Western Digital Research


2023-02-15 10:00:10

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On 2/15/23 18:04, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 12:56 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> I checked the TRM and indeed these registers are listed as unused.
>> However, with this patch, nothing work for me using a Pine rockpro64
>> board. Keeping this patch, your series (modulo some other fixes, more
>> emails coming) is making things work !
>
> Hello, Thank you for testing the driver and commenting, I'll incorporate your
> suggestions in the next version of this series.
>
> This patch alone does not make the driver work. Without the fixes to the
> address windows and translation found in [PATCH v2 6/9] ("PCI: rockchip:
> Fix window mapping and address translation for endpoint") transfers will not
> work. However, as you said, with the patch series, the driver works.
> Good to see that you have the driver working on the rockpro64 which is a
> very similar but different board than the one I used (FriendlyElec NanoPC-T4).
>
>> So I think the bug is with the TRM, not the code. THinking logically about
>> htis, it makes sense: this is programming the address translation unit to
>> translate mmio & dma between host PCI address and local CPU space address.
>> If we never set the PU address, how can that unit possibly ever translate
>> anything ?
>
> No, the bug is not in the TRM:
> The RK3399 PCIe endpoint core has the physical address space of 64MB
> @ 0xF800'0000 to access the PCIe address space (TRM 17.5.4).
> This space is split into 33 windows, one of 32MBytes and 32 of 1MByte.
> Read-write accesses by the CPU to that region will be translated. Each
> window has a mapping that is configured through the ATR Configuration
> Register Address Map (TRM 17.6.8) and the registers addr0 and addr1
> will dictate the translation between the window (a physical CPU addr)
> into a PCI space address (with this the unit can translate). The other
> registers are for the PCIe header descriptor.
> The translation process is documented in TRM 17.5.5.1.1
> The core will translate all read-write accesses to the windows that fall
> in the 64MB space @ 0xF800'0000 and generate the PCIe addresses
> and headers according to the values in the registers in the ATR
> Configuration Register Address Map (@ 0xFDC0'0000).
>
> Translation does indeed take place and works
> but requires the changes in [PATCH v2 6/9] ("PCI: rockchip:
> Fix window mapping and address translation for endpoint")
> because it was broken from the start...
>
> The two writes that were removed are to unused (read-only) registers.
> The writes don't do anything, manually writing and reading back these
> addresses will always lead to 0 (they are read-only). So they are removed.

OK. Tested again and it is working-ish...

./pcitest.sh
## Loading pci_endpoint_test
## BAR tests
BAR0: OKAY
BAR1: OKAY
BAR2: OKAY
BAR3: OKAY
BAR4: OKAY
BAR5: OKAY

## Legacy interrupt tests
SET IRQ TYPE TO LEGACY: OKAY
LEGACY IRQ: OKAY

## MSI interrupt tests
SET IRQ TYPE TO MSI: OKAY
MSI1: OKAY
MSI2: OKAY
MSI3: OKAY
MSI4: OKAY
MSI5: OKAY
MSI6: OKAY
MSI7: OKAY
MSI8: OKAY
MSI9: OKAY
MSI10: OKAY
MSI11: OKAY
MSI12: OKAY
MSI13: OKAY
MSI14: OKAY
MSI15: OKAY
MSI16: OKAY

## Read Tests (DMA)
READ ( 1 bytes): OKAY
READ ( 1024 bytes): OKAY
READ ( 1025 bytes): OKAY
READ ( 4096 bytes): OKAY
READ ( 131072 bytes): OKAY
READ (1024000 bytes): OKAY
READ (1024001 bytes): OKAY
READ (1048576 bytes): OKAY

## Write Tests (DMA)
WRITE ( 1 bytes): OKAY
WRITE ( 1024 bytes): OKAY
WRITE ( 1025 bytes): OKAY
WRITE ( 4096 bytes): OKAY
WRITE ( 131072 bytes): OKAY
WRITE (1024000 bytes): OKAY

Then stops here doing the 1024001 B case. The host waits for a completion that
does not come. On the EP side, I see:

[ 94.307215] pci_epf_test pci_epf_test.0: READ src addr 0xffd00000, 1024001 B
[ 94.307960] pci_epc fd000000.pcie-ep: Map region 1 phys addr 0xfa100000 to
pci addr 0xffd00000, 1024001 B
[ 94.308924] rockchip-pcie-ep fd000000.pcie-ep: Set atu: region 1, cpu addr
0xfa100000, pci addr 0xffd00000, 1024001 B
[ 132.309645] dma-pl330 ff6e0000.dma-controller: Reset Channel-2
CS-20000e FTC-40000

^^^^^^^^^^^^^^^
The DMA engine does not like something at all. Back to where I was when I tried
your series initially, which is why I tried removing patch 1 to see what happens...

[ 132.370479] pci_epf_test pci_epf_test.0: READ => Size: 1024001 B, DMA: YES,
Time: 38.059623935 s, Rate: 26 KB/s
[ 132.372152] pci_epc fd000000.pcie-ep: Unmap region 1
[ 132.372780] pci_epf_test pci_epf_test.0: RAISE MSI IRQ 1
[ 132.373312] rockchip-pcie-ep fd000000.pcie-ep: Send MSI IRQ 1
[ 132.373844] rockchip-pcie-ep fd000000.pcie-ep: MSI disabled
[ 132.374388] pci_epf_test pci_epf_test.0: Raise IRQ failed -22

And it looks like the PCI core crashed or something because MSI does not work
anymore as well (note that this is wheat I see with my nvme epf driver too, but
I do not have that DMA channel reset message...)

If I run the tests without DMA (mmio only), everything seems fine:

## Read Tests (No DMA)
READ ( 1 bytes): OKAY
READ ( 1024 bytes): OKAY
READ ( 1025 bytes): OKAY
READ (1024000 bytes): OKAY
READ (1024001 bytes): OKAY

## Write Tests (No DMA)
WRITE ( 1 bytes): OKAY
WRITE ( 1024 bytes): OKAY
WRITE ( 1025 bytes): OKAY
WRITE (1024000 bytes): OKAY
WRITE (1024001 bytes): OKAY

## Copy Tests (No DMA)
COPY ( 1 bytes): OKAY
COPY ( 1024 bytes): OKAY
COPY ( 1025 bytes): OKAY
COPY (1024000 bytes): OKAY
COPY (1024001 bytes): OKAY

So it looks like translation is working with your patch, but that the driver is
still missing something for DMA to work correctly...

Will keep digging.

Note that this is all tested with the patch series fixing pci-epf-test and
pci_endpoint_test drivers that I posted earlier today. As mentioned, my host is
an AMD Ryzen PC.

--
Damien Le Moal
Western Digital Research


2023-02-15 10:28:45

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Wed, Feb 15, 2023 at 2:51 AM Damien Le Moal
<[email protected]> wrote:
>
> Note about that: with your series applied, nothing was working for me on
> my pine Rockpro64 board (AMD Ryzen host). I got weird/unstable behavior
> and the host IOMMU screaming about IO page faults due to the endpoint
> doing weird pci accesses. Running the host with IOMMU on really helps in
> debugging this stuff :)

Thank you for testing, I have also tested with a Ryzen host, I have IOMMU
enabled as well.

>
> With the few fixes to your series I commented about, things started to
> work better, but still very unstable. More debugging and I found out that
> the pci-epf-test drivers, both host and endpoint sides, have nasty
> problems that lead to reporting failures when things are actually working,
> or outright dummy things being done that trigger errors (e.g. bad DMA
> synchronization triggers IOMMU page faults reports). I have a dozen fix
> patches for these drivers. Will clean them up and post ASAP.
>
> With the test drivers fixed + the fixes to your series, I have the
> pci_test.sh tests passing 100% of the time, repeatedly (in a loop). All solid.
>

Good to hear that it now works, I'll try them as well.

> However, I am still seeing issues with my ongoing work with a NVMe
> endpoint driver function: I see everything working when the host BIOS
> pokes at the NVMe "drive" it sees (all good, that is normal), but once
> Linux nvme driver probe kicks in, IRQs are essentially dead: the nvme
> driver does not see anything strange and allocates IRQs (1 first, which
> ends up being INTX, then multiple MSI one for each completion queue), but
> on the endpoint side, attempting to raise MSI or INTX IRQs result in error
> as the rockchip-ep driver sees both INTX and MSI as disabled. No clue what
> is going on. I suspect that a pci reset may have happened and corrupted
> the core configuration. However, the EPC/EPF infrastructure does not
> catch/process PCI resets as far as I can tell. That may be the issue.
> I do not see this issue with the epf test driver, because I suspect the
> host BIOS not knowing anything about that device, it does not touch it.
> This all may depend on the host & BIOS. Not sure. Need to try with
> different hosts. Just FYI :)
>

Interesting that you are working on this, I started to patch the RK3399 PCIe
endpoint controller driver for a similar project, I want to run our NVMe
firmware in a Linux PCIe endpoint function.

For the IRQs there are two things that come to mind:
1) The host driver could actually disable them and work in polling mode,
I have seen that with different versions of the Linux kernel NVMe driver
sometimes it would choose to use polling instead of IRQs for the queues.
So maybe it's just the
2) The RK3399 PCIe endpoint controller is said to be able only to generate
one type of interrupt at a given time. "It is capable of generating MSI or
Legacy interrupt if the PCIe is configured as EP. Notes that one PCIe
component can't generate both types of interrupts. It is either one or the
other." (see TRM 17.5.9 Interrupt Support).
I don't know exactly what the TRM means the the controller cannot
use both interrupts at the same time, but this might be a path to explore

2023-02-15 10:42:13

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 2/15/23 19:28, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 2:51 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> Note about that: with your series applied, nothing was working for me on
>> my pine Rockpro64 board (AMD Ryzen host). I got weird/unstable behavior
>> and the host IOMMU screaming about IO page faults due to the endpoint
>> doing weird pci accesses. Running the host with IOMMU on really helps in
>> debugging this stuff :)
>
> Thank you for testing, I have also tested with a Ryzen host, I have IOMMU
> enabled as well.
>
>>
>> With the few fixes to your series I commented about, things started to
>> work better, but still very unstable. More debugging and I found out that
>> the pci-epf-test drivers, both host and endpoint sides, have nasty
>> problems that lead to reporting failures when things are actually working,
>> or outright dummy things being done that trigger errors (e.g. bad DMA
>> synchronization triggers IOMMU page faults reports). I have a dozen fix
>> patches for these drivers. Will clean them up and post ASAP.
>>
>> With the test drivers fixed + the fixes to your series, I have the
>> pci_test.sh tests passing 100% of the time, repeatedly (in a loop). All solid.
>>
>
> Good to hear that it now works, I'll try them as well.
>
>> However, I am still seeing issues with my ongoing work with a NVMe
>> endpoint driver function: I see everything working when the host BIOS
>> pokes at the NVMe "drive" it sees (all good, that is normal), but once
>> Linux nvme driver probe kicks in, IRQs are essentially dead: the nvme
>> driver does not see anything strange and allocates IRQs (1 first, which
>> ends up being INTX, then multiple MSI one for each completion queue), but
>> on the endpoint side, attempting to raise MSI or INTX IRQs result in error
>> as the rockchip-ep driver sees both INTX and MSI as disabled. No clue what
>> is going on. I suspect that a pci reset may have happened and corrupted
>> the core configuration. However, the EPC/EPF infrastructure does not
>> catch/process PCI resets as far as I can tell. That may be the issue.
>> I do not see this issue with the epf test driver, because I suspect the
>> host BIOS not knowing anything about that device, it does not touch it.
>> This all may depend on the host & BIOS. Not sure. Need to try with
>> different hosts. Just FYI :)
>>
>
> Interesting that you are working on this, I started to patch the RK3399 PCIe
> endpoint controller driver for a similar project, I want to run our NVMe
> firmware in a Linux PCIe endpoint function.
>
> For the IRQs there are two things that come to mind:
> 1) The host driver could actually disable them and work in polling mode,
> I have seen that with different versions of the Linux kernel NVMe driver
> sometimes it would choose to use polling instead of IRQs for the queues.
> So maybe it's just the
> 2) The RK3399 PCIe endpoint controller is said to be able only to generate
> one type of interrupt at a given time. "It is capable of generating MSI or
> Legacy interrupt if the PCIe is configured as EP. Notes that one PCIe
> component can't generate both types of interrupts. It is either one or the
> other." (see TRM 17.5.9 Interrupt Support).
> I don't know exactly what the TRM means the the controller cannot
> use both interrupts at the same time, but this might be a path to explore

The host says that both INTX is enabled and MSI disabled when the nvme driver
starts probing. That driver starts probe with a single vector to enable the
device first and use the admin SQ/CQ for indentify etc. Then, that IRQ is freed
and multiple MSI vectors allocated, one for each admin + IO queue pair.
The problem is that on the endpoint, the driver says that both INTX and MSI are
disabled but the host at least sees INTX enabled, and the first IRQ allocated
for the probe enables MSI and gets one vector. But that MSI enable is not seen
by the EP, and the EP also says that INTX is disabled, contrary to what the host
says.

When the BIOS probe the drive, both INTX and MSI are OK. Only one IRQ is used by
the BIOS and I tried both by setting & disabling MSI. What I think happens is
that there may be a PCI reset/FLR or something similar, and that screws up the
core config... I do not have a PCI bus analyzer, so hard to debug :)

I did hack both the host nvme driver and EP driver to print PCI link status etc,
but I do not see anything strange there. Furthermore, the BAR accesses and admin
SQ/CQ commands and cqe exchange is working as I get the identify commands from
the host and the host sees the cqe, but after a timeout as it never receives any
IRQ... I would like to try testing without the BIOS touching the EP nvme
controller. But not sure how to do that. Probably should ignore the first CC.EN
enable event I see, which is from the BIOS.

--
Damien Le Moal
Western Digital Research


2023-02-15 10:46:25

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers

From: Damien Le Moal
> Sent: 15 February 2023 01:34
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > Previously u16 variables were used to access 32-bit registers, this
> > resulted in not all of the data being read from the registers. Also
> > the left shift of more than 16-bits would result in moving data out
> > of the variable. Use u32 variables to access 32-bit registers
> >
> > Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> > Cc: [email protected]
> > Signed-off-by: Rick Wertenbroek <[email protected]>
> > ---
> > drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
> > drivers/pci/controller/pcie-rockchip.h | 1 +
> > 2 files changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > index ca5b363ba..b7865a94e 100644
> > --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > @@ -292,15 +292,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> > {
> > struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> > struct rockchip_pcie *rockchip = &ep->rockchip;
> > - u16 flags;
> > + u32 flags;
> >
> > flags = rockchip_pcie_read(rockchip,
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> > ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> > flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
> > flags |=
> > - ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
> > - PCI_MSI_FLAGS_64BIT;
> > + (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
>
> ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET is 17 and multi_msg_cap is a u8...
> Not nice.

It really doesn't matter.
As soon as you do any arithmetic char and short are promoted to int.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-02-15 11:20:44

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers

On 2/15/23 19:46, David Laight wrote:
> From: Damien Le Moal
>> Sent: 15 February 2023 01:34
>>
>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>> Previously u16 variables were used to access 32-bit registers, this
>>> resulted in not all of the data being read from the registers. Also
>>> the left shift of more than 16-bits would result in moving data out
>>> of the variable. Use u32 variables to access 32-bit registers
>>>
>>> Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
>>> Cc: [email protected]
>>> Signed-off-by: Rick Wertenbroek <[email protected]>
>>> ---
>>> drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
>>> drivers/pci/controller/pcie-rockchip.h | 1 +
>>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>> index ca5b363ba..b7865a94e 100644
>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>> @@ -292,15 +292,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>> {
>>> struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
>>> struct rockchip_pcie *rockchip = &ep->rockchip;
>>> - u16 flags;
>>> + u32 flags;
>>>
>>> flags = rockchip_pcie_read(rockchip,
>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
>>> ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
>>> flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
>>> flags |=
>>> - ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
>>> - PCI_MSI_FLAGS_64BIT;
>>> + (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
>>
>> ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET is 17 and multi_msg_cap is a u8...
>> Not nice.
>
> It really doesn't matter.
> As soon as you do any arithmetic char and short are promoted to int.

OK. Thanks.


--
Damien Le Moal
Western Digital Research


2023-02-16 07:28:53

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On 2/15/23 18:58, Damien Le Moal wrote:
[...]
> WRITE ( 131072 bytes): OKAY
> WRITE (1024000 bytes): OKAY
>
> Then stops here doing the 1024001 B case. The host waits for a completion that
> does not come. On the EP side, I see:
>
> [ 94.307215] pci_epf_test pci_epf_test.0: READ src addr 0xffd00000, 1024001 B
> [ 94.307960] pci_epc fd000000.pcie-ep: Map region 1 phys addr 0xfa100000 to
> pci addr 0xffd00000, 1024001 B
> [ 94.308924] rockchip-pcie-ep fd000000.pcie-ep: Set atu: region 1, cpu addr
> 0xfa100000, pci addr 0xffd00000, 1024001 B
> [ 132.309645] dma-pl330 ff6e0000.dma-controller: Reset Channel-2
> CS-20000e FTC-40000
>
> ^^^^^^^^^^^^^^^
> The DMA engine does not like something at all. Back to where I was when I tried
> your series initially, which is why I tried removing patch 1 to see what happens...
>
> [ 132.370479] pci_epf_test pci_epf_test.0: READ => Size: 1024001 B, DMA: YES,
> Time: 38.059623935 s, Rate: 26 KB/s
> [ 132.372152] pci_epc fd000000.pcie-ep: Unmap region 1
> [ 132.372780] pci_epf_test pci_epf_test.0: RAISE MSI IRQ 1
> [ 132.373312] rockchip-pcie-ep fd000000.pcie-ep: Send MSI IRQ 1
> [ 132.373844] rockchip-pcie-ep fd000000.pcie-ep: MSI disabled
> [ 132.374388] pci_epf_test pci_epf_test.0: Raise IRQ failed -22
>
> And it looks like the PCI core crashed or something because MSI does not work
> anymore as well (note that this is wheat I see with my nvme epf driver too, but
> I do not have that DMA channel reset message...)
>
> If I run the tests without DMA (mmio only), everything seems fine:
>
> ## Read Tests (No DMA)
> READ ( 1 bytes): OKAY
> READ ( 1024 bytes): OKAY
> READ ( 1025 bytes): OKAY
> READ (1024000 bytes): OKAY
> READ (1024001 bytes): OKAY
>
> ## Write Tests (No DMA)
> WRITE ( 1 bytes): OKAY
> WRITE ( 1024 bytes): OKAY
> WRITE ( 1025 bytes): OKAY
> WRITE (1024000 bytes): OKAY
> WRITE (1024001 bytes): OKAY
>
> ## Copy Tests (No DMA)
> COPY ( 1 bytes): OKAY
> COPY ( 1024 bytes): OKAY
> COPY ( 1025 bytes): OKAY
> COPY (1024000 bytes): OKAY
> COPY (1024001 bytes): OKAY
>
> So it looks like translation is working with your patch, but that the driver is
> still missing something for DMA to work correctly...

I kept testing this and realized that I was not getting a consistent behavior.
Sometimes all tests passed, but would not repeat (running again would fail
everything), sometimes NMIs from bad accesses, and other times "hang" (test not
completing but no real machine hang/crash). So it started to hint at something
randomly initialized...

Re-reading the TRM, particularly section 17.5.5.1.1, I realized that the lower
16 bits of the desc2 register are used for the translation, but we never set
them with the current code. Only desc0 and desc1... So I added a write(0) to
desc2 and now it is finally working well. Running the tests in a loop, they all
pass and no bad behavior is observed.

My cleaned-up rockchip_pcie_prog_ep_ob_atu() function now looks like this:

static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
u32 r, u32 type, u64 phys_addr,
u64 pci_addr, size_t size)
{
u64 sz = 1ULL << fls64(size - 1);
int num_pass_bits = ilog2(sz);
u32 addr0, addr1, desc0;

/* Sanity checks */
if (WARN_ON_ONCE(type == AXI_WRAPPER_NOR_MSG))
return;
if (WARN_ON_ONCE(ALIGN_DOWN(phys_addr, SZ_1M) != phys_addr))
return;
if (WARN_ON_ONCE(rockchip_ob_region(phys_addr + size - 1) != r))
return;

/* We must pass at least 8 bits of PCI bus address */
if (num_pass_bits < 8)
num_pass_bits = 8;

/* PCI bus address region */
addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
(lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
addr1 = upper_32_bits(pci_addr);
desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;

rockchip_pcie_write(rockchip, addr0,
ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
rockchip_pcie_write(rockchip, addr1,
ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
rockchip_pcie_write(rockchip, desc0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r));
}

And the function rockchip_pcie_clear_ep_ob_atu() also clears desc2:

static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
u32 region)
{
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(region));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(region));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
rockchip_pcie_write(rockchip, 0,
ROCKCHIP_PCIE_AT_OB_REGION_DESC2(region));
}

Thoughts ?

--
Damien Le Moal
Western Digital Research


2023-02-16 08:43:55

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On Thu, Feb 16, 2023 at 8:28 AM Damien Le Moal
<[email protected]> wrote:
>
> On 2/15/23 18:58, Damien Le Moal wrote:
> [...]
> > WRITE ( 131072 bytes): OKAY
> > WRITE (1024000 bytes): OKAY
> >
> > Then stops here doing the 1024001 B case. The host waits for a completion that
> > does not come. On the EP side, I see:
> >
> > [ 94.307215] pci_epf_test pci_epf_test.0: READ src addr 0xffd00000, 1024001 B
> > [ 94.307960] pci_epc fd000000.pcie-ep: Map region 1 phys addr 0xfa100000 to
> > pci addr 0xffd00000, 1024001 B
> > [ 94.308924] rockchip-pcie-ep fd000000.pcie-ep: Set atu: region 1, cpu addr
> > 0xfa100000, pci addr 0xffd00000, 1024001 B
> > [ 132.309645] dma-pl330 ff6e0000.dma-controller: Reset Channel-2
> > CS-20000e FTC-40000
> >
> > ^^^^^^^^^^^^^^^
> > The DMA engine does not like something at all. Back to where I was when I tried
> > your series initially, which is why I tried removing patch 1 to see what happens...
> >
> > [ 132.370479] pci_epf_test pci_epf_test.0: READ => Size: 1024001 B, DMA: YES,
> > Time: 38.059623935 s, Rate: 26 KB/s
> > [ 132.372152] pci_epc fd000000.pcie-ep: Unmap region 1
> > [ 132.372780] pci_epf_test pci_epf_test.0: RAISE MSI IRQ 1
> > [ 132.373312] rockchip-pcie-ep fd000000.pcie-ep: Send MSI IRQ 1
> > [ 132.373844] rockchip-pcie-ep fd000000.pcie-ep: MSI disabled
> > [ 132.374388] pci_epf_test pci_epf_test.0: Raise IRQ failed -22
> >
> > And it looks like the PCI core crashed or something because MSI does not work
> > anymore as well (note that this is wheat I see with my nvme epf driver too, but
> > I do not have that DMA channel reset message...)
> >
> > If I run the tests without DMA (mmio only), everything seems fine:
> >
> > ## Read Tests (No DMA)
> > READ ( 1 bytes): OKAY
> > READ ( 1024 bytes): OKAY
> > READ ( 1025 bytes): OKAY
> > READ (1024000 bytes): OKAY
> > READ (1024001 bytes): OKAY
> >
> > ## Write Tests (No DMA)
> > WRITE ( 1 bytes): OKAY
> > WRITE ( 1024 bytes): OKAY
> > WRITE ( 1025 bytes): OKAY
> > WRITE (1024000 bytes): OKAY
> > WRITE (1024001 bytes): OKAY
> >
> > ## Copy Tests (No DMA)
> > COPY ( 1 bytes): OKAY
> > COPY ( 1024 bytes): OKAY
> > COPY ( 1025 bytes): OKAY
> > COPY (1024000 bytes): OKAY
> > COPY (1024001 bytes): OKAY
> >
> > So it looks like translation is working with your patch, but that the driver is
> > still missing something for DMA to work correctly...
>
> I kept testing this and realized that I was not getting a consistent behavior.
> Sometimes all tests passed, but would not repeat (running again would fail
> everything), sometimes NMIs from bad accesses, and other times "hang" (test not
> completing but no real machine hang/crash). So it started to hint at something
> randomly initialized...
>
> Re-reading the TRM, particularly section 17.5.5.1.1, I realized that the lower
> 16 bits of the desc2 register are used for the translation, but we never set
> them with the current code. Only desc0 and desc1... So I added a write(0) to
> desc2 and now it is finally working well. Running the tests in a loop, they all
> pass and no bad behavior is observed.
>
> My cleaned-up rockchip_pcie_prog_ep_ob_atu() function now looks like this:
>
> static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
> u32 r, u32 type, u64 phys_addr,
> u64 pci_addr, size_t size)
> {
> u64 sz = 1ULL << fls64(size - 1);
> int num_pass_bits = ilog2(sz);
> u32 addr0, addr1, desc0;
>
> /* Sanity checks */
> if (WARN_ON_ONCE(type == AXI_WRAPPER_NOR_MSG))
> return;
> if (WARN_ON_ONCE(ALIGN_DOWN(phys_addr, SZ_1M) != phys_addr))
> return;
> if (WARN_ON_ONCE(rockchip_ob_region(phys_addr + size - 1) != r))
> return;
>
> /* We must pass at least 8 bits of PCI bus address */
> if (num_pass_bits < 8)
> num_pass_bits = 8;
>
> /* PCI bus address region */
> addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
> (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
> addr1 = upper_32_bits(pci_addr);
> desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;
>
> rockchip_pcie_write(rockchip, addr0,
> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
> rockchip_pcie_write(rockchip, addr1,
> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
> rockchip_pcie_write(rockchip, desc0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r));
> }
>
> And the function rockchip_pcie_clear_ep_ob_atu() also clears desc2:
>
> static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
> u32 region)
> {
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(region));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(region));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
> rockchip_pcie_write(rockchip, 0,
> ROCKCHIP_PCIE_AT_OB_REGION_DESC2(region));
> }
>
> Thoughts ?
>
> --
> Damien Le Moal
> Western Digital Research
>

desc2 dictates the bits [79-64] of the PCIe header descriptor.
These bits are for the PF TLP Processing hints.
I did not set them because I thought the default value (0) would be fine.
The TRM section 17.6.8.2.5 says that this register values are reset
to 0, therefore, the write here (0) to desc2 should not change anything unless
that register is written somewhere (I don't think it is).
Anyways, it's not a bad idea to set desc2 to 0 in those two functions.

Rick

2023-02-16 09:54:41

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 1/9] PCI: rockchip: Remove writes to unused registers

On 2/16/23 17:43, Rick Wertenbroek wrote:
> On Thu, Feb 16, 2023 at 8:28 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> On 2/15/23 18:58, Damien Le Moal wrote:
>> [...]
>>> WRITE ( 131072 bytes): OKAY
>>> WRITE (1024000 bytes): OKAY
>>>
>>> Then stops here doing the 1024001 B case. The host waits for a completion that
>>> does not come. On the EP side, I see:
>>>
>>> [ 94.307215] pci_epf_test pci_epf_test.0: READ src addr 0xffd00000, 1024001 B
>>> [ 94.307960] pci_epc fd000000.pcie-ep: Map region 1 phys addr 0xfa100000 to
>>> pci addr 0xffd00000, 1024001 B
>>> [ 94.308924] rockchip-pcie-ep fd000000.pcie-ep: Set atu: region 1, cpu addr
>>> 0xfa100000, pci addr 0xffd00000, 1024001 B
>>> [ 132.309645] dma-pl330 ff6e0000.dma-controller: Reset Channel-2
>>> CS-20000e FTC-40000
>>>
>>> ^^^^^^^^^^^^^^^
>>> The DMA engine does not like something at all. Back to where I was when I tried
>>> your series initially, which is why I tried removing patch 1 to see what happens...
>>>
>>> [ 132.370479] pci_epf_test pci_epf_test.0: READ => Size: 1024001 B, DMA: YES,
>>> Time: 38.059623935 s, Rate: 26 KB/s
>>> [ 132.372152] pci_epc fd000000.pcie-ep: Unmap region 1
>>> [ 132.372780] pci_epf_test pci_epf_test.0: RAISE MSI IRQ 1
>>> [ 132.373312] rockchip-pcie-ep fd000000.pcie-ep: Send MSI IRQ 1
>>> [ 132.373844] rockchip-pcie-ep fd000000.pcie-ep: MSI disabled
>>> [ 132.374388] pci_epf_test pci_epf_test.0: Raise IRQ failed -22
>>>
>>> And it looks like the PCI core crashed or something because MSI does not work
>>> anymore as well (note that this is wheat I see with my nvme epf driver too, but
>>> I do not have that DMA channel reset message...)
>>>
>>> If I run the tests without DMA (mmio only), everything seems fine:
>>>
>>> ## Read Tests (No DMA)
>>> READ ( 1 bytes): OKAY
>>> READ ( 1024 bytes): OKAY
>>> READ ( 1025 bytes): OKAY
>>> READ (1024000 bytes): OKAY
>>> READ (1024001 bytes): OKAY
>>>
>>> ## Write Tests (No DMA)
>>> WRITE ( 1 bytes): OKAY
>>> WRITE ( 1024 bytes): OKAY
>>> WRITE ( 1025 bytes): OKAY
>>> WRITE (1024000 bytes): OKAY
>>> WRITE (1024001 bytes): OKAY
>>>
>>> ## Copy Tests (No DMA)
>>> COPY ( 1 bytes): OKAY
>>> COPY ( 1024 bytes): OKAY
>>> COPY ( 1025 bytes): OKAY
>>> COPY (1024000 bytes): OKAY
>>> COPY (1024001 bytes): OKAY
>>>
>>> So it looks like translation is working with your patch, but that the driver is
>>> still missing something for DMA to work correctly...
>>
>> I kept testing this and realized that I was not getting a consistent behavior.
>> Sometimes all tests passed, but would not repeat (running again would fail
>> everything), sometimes NMIs from bad accesses, and other times "hang" (test not
>> completing but no real machine hang/crash). So it started to hint at something
>> randomly initialized...
>>
>> Re-reading the TRM, particularly section 17.5.5.1.1, I realized that the lower
>> 16 bits of the desc2 register are used for the translation, but we never set
>> them with the current code. Only desc0 and desc1... So I added a write(0) to
>> desc2 and now it is finally working well. Running the tests in a loop, they all
>> pass and no bad behavior is observed.
>>
>> My cleaned-up rockchip_pcie_prog_ep_ob_atu() function now looks like this:
>>
>> static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
>> u32 r, u32 type, u64 phys_addr,
>> u64 pci_addr, size_t size)
>> {
>> u64 sz = 1ULL << fls64(size - 1);
>> int num_pass_bits = ilog2(sz);
>> u32 addr0, addr1, desc0;
>>
>> /* Sanity checks */
>> if (WARN_ON_ONCE(type == AXI_WRAPPER_NOR_MSG))
>> return;
>> if (WARN_ON_ONCE(ALIGN_DOWN(phys_addr, SZ_1M) != phys_addr))
>> return;
>> if (WARN_ON_ONCE(rockchip_ob_region(phys_addr + size - 1) != r))
>> return;
>>
>> /* We must pass at least 8 bits of PCI bus address */
>> if (num_pass_bits < 8)
>> num_pass_bits = 8;
>>
>> /* PCI bus address region */
>> addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
>> (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
>> addr1 = upper_32_bits(pci_addr);
>> desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;
>>
>> rockchip_pcie_write(rockchip, addr0,
>> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
>> rockchip_pcie_write(rockchip, addr1,
>> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
>> rockchip_pcie_write(rockchip, desc0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r));
>> }
>>
>> And the function rockchip_pcie_clear_ep_ob_atu() also clears desc2:
>>
>> static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
>> u32 region)
>> {
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(region));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(region));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
>> rockchip_pcie_write(rockchip, 0,
>> ROCKCHIP_PCIE_AT_OB_REGION_DESC2(region));
>> }
>>
>> Thoughts ?
>>
>> --
>> Damien Le Moal
>> Western Digital Research
>>
>
> desc2 dictates the bits [79-64] of the PCIe header descriptor.
> These bits are for the PF TLP Processing hints.
> I did not set them because I thought the default value (0) would be fine.
> The TRM section 17.6.8.2.5 says that this register values are reset
> to 0, therefore, the write here (0) to desc2 should not change anything unless
> that register is written somewhere (I don't think it is).
> Anyways, it's not a bad idea to set desc2 to 0 in those two functions.

I wonder if that register changes when TLPs are processed... So when the region
is remapped, previous values still there cause the problems I am seeing ?

As mentioned, with this "fix", the pcitest.sh is now solid. But I am still
seeing the same issues with my nvme endpoint driver when Linux takes over from
BIOS. Bios works, but not Linux, still no IRQs at all. So there is likely still
another issue that I cannot see at the moment. No hints whatsoever.

--
Damien Le Moal
Western Digital Research


2023-02-21 10:48:35

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
<[email protected]> wrote:
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > The RK3399 PCIe endpoint core supports only a single PCIe physcial
> > function (function number 0), therefore return -EINVAL if set_msi() is
> > called with a function number greater than 0.
> > The PCIe standard only allows the multi message capability (MMC) value
> > to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> > called with a MMC value of over 0x5.
> >
> > Signed-off-by: Rick Wertenbroek <[email protected]>
> > ---
> > drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > index b7865a94e..80634b690 100644
> > --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> > struct rockchip_pcie *rockchip = &ep->rockchip;
> > u32 flags;
> >
> > + if (fn) {
> > + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> > + return -EINVAL;
> > + }
>
> Checking this here is late... Given that at most only one physical
> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> Something like:
>
> err = of_property_read_u8(dev->of_node, "max-functions",
> &ep->epc->max_functions);
>
> if (err < 0 || ep->epc->max_functions > 1)
>
> ep->epc->max_functions = 1;
>

Yes, this could be moved to the probe, thanks.

> And all the macros with the (fn) argument could also be simplified
> (argument fn removed) since fn will always be 0.

These functions cannot be simplified because they have to follow the signature
given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
function number as a parameter. If we change the function signature we won't
be able to assign these functions to the pc_epc_ops structure

static const struct pci_epc_ops rockchip_pcie_epc_ops = {
.write_header = rockchip_pcie_ep_write_header,
.set_bar = rockchip_pcie_ep_set_bar,
.clear_bar = rockchip_pcie_ep_clear_bar,
.map_addr = rockchip_pcie_ep_map_addr,
.unmap_addr = rockchip_pcie_ep_unmap_addr,
.set_msi = rockchip_pcie_ep_set_msi,
.get_msi = rockchip_pcie_ep_get_msi,
.raise_irq = rockchip_pcie_ep_raise_irq,
.start = rockchip_pcie_ep_start,
.get_features = rockchip_pcie_ep_get_features,
};

>
> > +
> > + if (mmc > 0x5) {
> > + dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");
>
> Long line. Please split it after the comma.
>
> > + return -EINVAL;
> > + }
> > +
> > flags = rockchip_pcie_read(rockchip,
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> > ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
>
> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> here all the time.

Yes, this could be an improvement but this is the way it is written
everywhere in this
driver, I chose to keep it so as to remain coherent with the rest of the driver.
Cleaning this is not so important since this code will not be
rewritten / changed so
often. But I agree that it might be nicer. But, on the other side if
at some point
support for virtual functions would be added, the offsets would need
to be computed
based on the virtual function number and the code would be written
like it is now,
so I suggest keeping this the way it is for now.

>
> --
> Damien Le Moal
> Western Digital Research
>

Thank you for your comments.

Regards
Rick

2023-02-21 10:55:41

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On 2/21/23 19:47, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>> called with a function number greater than 0.
>>> The PCIe standard only allows the multi message capability (MMC) value
>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>> called with a MMC value of over 0x5.
>>>
>>> Signed-off-by: Rick Wertenbroek <[email protected]>
>>> ---
>>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>> index b7865a94e..80634b690 100644
>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>> struct rockchip_pcie *rockchip = &ep->rockchip;
>>> u32 flags;
>>>
>>> + if (fn) {
>>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>> + return -EINVAL;
>>> + }
>>
>> Checking this here is late... Given that at most only one physical
>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>> Something like:
>>
>> err = of_property_read_u8(dev->of_node, "max-functions",
>> &ep->epc->max_functions);
>>
>> if (err < 0 || ep->epc->max_functions > 1)
>>
>> ep->epc->max_functions = 1;
>>
>
> Yes, this could be moved to the probe, thanks.
>
>> And all the macros with the (fn) argument could also be simplified
>> (argument fn removed) since fn will always be 0.
>
> These functions cannot be simplified because they have to follow the signature
> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> function number as a parameter. If we change the function signature we won't
> be able to assign these functions to the pc_epc_ops structure

I was not suggesting to change the functions signature. I was suggesting
dropping the fn argument for the *macros*, e.g.

ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE

since fn is always 0.

That said, I am not entirely sure if the limit really is 1 function at most. The
TRM seems to be suggesting that up to 4 functions can be supported...

[...]

>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>> here all the time.
>
> Yes, this could be an improvement but this is the way it is written
> everywhere in this
> driver, I chose to keep it so as to remain coherent with the rest of the driver.
> Cleaning this is not so important since this code will not be
> rewritten / changed so
> often. But I agree that it might be nicer. But, on the other side if
> at some point
> support for virtual functions would be added, the offsets would need
> to be computed
> based on the virtual function number and the code would be written
> like it is now,
> so I suggest keeping this the way it is for now.

Yes, sure, this can be cleaned later.

A more pressing problem is the lack of support for MSIX despite the fact that
the controller supports that *and* advertize it as a capability. That is what
was causing my problem with the Linux nvme driver and my prototype nvme epf
function driver: the host driver was seeing MSIX support (1 vector supported by
default), and so was allocating one MSIX for the device probe. But on the EP
end, it is MSI or INTX only... Working on adding that to solve this issue.

--
Damien Le Moal
Western Digital Research


2023-02-21 13:19:43

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
<[email protected]> wrote:
>
> On 2/21/23 19:47, Rick Wertenbroek wrote:
> > On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> > <[email protected]> wrote:
> >>
> >> On 2/14/23 23:08, Rick Wertenbroek wrote:
> >>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> >>> function (function number 0), therefore return -EINVAL if set_msi() is
> >>> called with a function number greater than 0.
> >>> The PCIe standard only allows the multi message capability (MMC) value
> >>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> >>> called with a MMC value of over 0x5.
> >>>
> >>> Signed-off-by: Rick Wertenbroek <[email protected]>
> >>> ---
> >>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> >>> 1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> >>> index b7865a94e..80634b690 100644
> >>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> >>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> >>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> >>> struct rockchip_pcie *rockchip = &ep->rockchip;
> >>> u32 flags;
> >>>
> >>> + if (fn) {
> >>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> >>> + return -EINVAL;
> >>> + }
> >>
> >> Checking this here is late... Given that at most only one physical
> >> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> >> Something like:
> >>
> >> err = of_property_read_u8(dev->of_node, "max-functions",
> >> &ep->epc->max_functions);
> >>
> >> if (err < 0 || ep->epc->max_functions > 1)
> >>
> >> ep->epc->max_functions = 1;
> >>
> >
> > Yes, this could be moved to the probe, thanks.
> >
> >> And all the macros with the (fn) argument could also be simplified
> >> (argument fn removed) since fn will always be 0.
> >
> > These functions cannot be simplified because they have to follow the signature
> > given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> > function number as a parameter. If we change the function signature we won't
> > be able to assign these functions to the pc_epc_ops structure
>
> I was not suggesting to change the functions signature. I was suggesting
> dropping the fn argument for the *macros*, e.g.
>
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>
> since fn is always 0.
>
> That said, I am not entirely sure if the limit really is 1 function at most. The
> TRM seems to be suggesting that up to 4 functions can be supported...
>
> [...]
>
> >> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> >> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> >> here all the time.
> >
> > Yes, this could be an improvement but this is the way it is written
> > everywhere in this
> > driver, I chose to keep it so as to remain coherent with the rest of the driver.
> > Cleaning this is not so important since this code will not be
> > rewritten / changed so
> > often. But I agree that it might be nicer. But, on the other side if
> > at some point
> > support for virtual functions would be added, the offsets would need
> > to be computed
> > based on the virtual function number and the code would be written
> > like it is now,
> > so I suggest keeping this the way it is for now.
>
> Yes, sure, this can be cleaned later.
>
> A more pressing problem is the lack of support for MSIX despite the fact that
> the controller supports that *and* advertize it as a capability. That is what
> was causing my problem with the Linux nvme driver and my prototype nvme epf
> function driver: the host driver was seeing MSIX support (1 vector supported by
> default), and so was allocating one MSIX for the device probe. But on the EP
> end, it is MSI or INTX only... Working on adding that to solve this issue.
>

I have seen this too, the controller advertises the capability. However, the TRM
(section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
So the solution should be to modify the probe function of the endpoint
controller
so that the MSI-X capability would not be advertised, this should fix
your problem.

I wonder if one could still implement MSI-X because from the endpoint we would
just need to implement it as a message (TLP) over PCIe (because the space for
the vectors is allocated and written, so that part should be ok). I am
not an expert
on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
requires some fields in the PCIe header descriptor to be filled with values that
cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.

Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
send such interrupts. Once this is fixed you should be able to have your NVMe
function running.

Regards.
Rick


> --
> Damien Le Moal
> Western Digital Research
>

2023-02-21 16:38:49

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On Tue, Feb 21, 2023 at 2:19 PM Rick Wertenbroek
<[email protected]> wrote:
>
> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
> <[email protected]> wrote:
> >
> > On 2/21/23 19:47, Rick Wertenbroek wrote:
> > > On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> > > <[email protected]> wrote:
> > >>
> > >> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > >>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> > >>> function (function number 0), therefore return -EINVAL if set_msi() is
> > >>> called with a function number greater than 0.
> > >>> The PCIe standard only allows the multi message capability (MMC) value
> > >>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> > >>> called with a MMC value of over 0x5.
> > >>>
> > >>> Signed-off-by: Rick Wertenbroek <[email protected]>
> > >>> ---
> > >>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> > >>> 1 file changed, 10 insertions(+)
> > >>>
> > >>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> index b7865a94e..80634b690 100644
> > >>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> > >>> struct rockchip_pcie *rockchip = &ep->rockchip;
> > >>> u32 flags;
> > >>>
> > >>> + if (fn) {
> > >>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> > >>> + return -EINVAL;
> > >>> + }
> > >>
> > >> Checking this here is late... Given that at most only one physical
> > >> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> > >> Something like:
> > >>
> > >> err = of_property_read_u8(dev->of_node, "max-functions",
> > >> &ep->epc->max_functions);
> > >>
> > >> if (err < 0 || ep->epc->max_functions > 1)
> > >>
> > >> ep->epc->max_functions = 1;
> > >>
> > >
> > > Yes, this could be moved to the probe, thanks.
> > >
> > >> And all the macros with the (fn) argument could also be simplified
> > >> (argument fn removed) since fn will always be 0.
> > >
> > > These functions cannot be simplified because they have to follow the signature
> > > given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> > > function number as a parameter. If we change the function signature we won't
> > > be able to assign these functions to the pc_epc_ops structure
> >
> > I was not suggesting to change the functions signature. I was suggesting
> > dropping the fn argument for the *macros*, e.g.
> >
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
> >
> > since fn is always 0.
> >
> > That said, I am not entirely sure if the limit really is 1 function at most. The
> > TRM seems to be suggesting that up to 4 functions can be supported...
> >
> > [...]
> >
> > >> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> > >> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> > >> here all the time.
> > >
> > > Yes, this could be an improvement but this is the way it is written
> > > everywhere in this
> > > driver, I chose to keep it so as to remain coherent with the rest of the driver.
> > > Cleaning this is not so important since this code will not be
> > > rewritten / changed so
> > > often. But I agree that it might be nicer. But, on the other side if
> > > at some point
> > > support for virtual functions would be added, the offsets would need
> > > to be computed
> > > based on the virtual function number and the code would be written
> > > like it is now,
> > > so I suggest keeping this the way it is for now.
> >
> > Yes, sure, this can be cleaned later.
> >
> > A more pressing problem is the lack of support for MSIX despite the fact that
> > the controller supports that *and* advertize it as a capability. That is what
> > was causing my problem with the Linux nvme driver and my prototype nvme epf
> > function driver: the host driver was seeing MSIX support (1 vector supported by
> > default), and so was allocating one MSIX for the device probe. But on the EP
> > end, it is MSI or INTX only... Working on adding that to solve this issue.
> >
>
> I have seen this too, the controller advertises the capability. However, the TRM
> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
> So the solution should be to modify the probe function of the endpoint
> controller
> so that the MSI-X capability would not be advertised, this should fix
> your problem.
>
> I wonder if one could still implement MSI-X because from the endpoint we would
> just need to implement it as a message (TLP) over PCIe (because the space for
> the vectors is allocated and written, so that part should be ok). I am
> not an expert
> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
> requires some fields in the PCIe header descriptor to be filled with values that
> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>
> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
> send such interrupts. Once this is fixed you should be able to have your NVMe
> function running.
>
> Regards.
> Rick
>

It is possible to disable MSI-X by skipping the MSI-X capability
structure in the PCIe
capabilities structures linked-list.
The current linked list is MSI cap (0x90) -> MSI-X cap (0xb0) -> PCIe
Device cap (0xc0)
So we want to set MSI (0x90) -> PCIe Device cap (0xc0)

This can be done by writing 0xc0 to bits 15-8 of 0xFDA0'0090 (MSI cap).
I tested this quickly through devmem2 before loading the endpoint
function driver
and it fixes the issue of MSI-X capabilities being advertised to the host.

In the driver the changes would look like this;
In the probe function you can disable MSI-X as follows:

@@ -631,6 +618,28 @@ static int rockchip_pcie_ep_probe(struct
platform_device *pdev)

ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;

+ /*
+ * Disable MSI-X because the controller is not capable of MSI-X
+ * This requires to skip the MSI-X capabilities entry in the
+ * chain of PCIe capabilities, we get the next pointer from the
+ * MSI-X entry and set that in the MSI capability entry, this way
+ * the MSI-X entry is skipped (left out of the linked-list)
+ */
+ cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
+ ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
+
+ cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
+
+ cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
+ ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
+
+ cfg_msi |= cfg_msix_cp;
+
+ rockchip_pcie_write(rockchip, cfg_msi,
+ PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
+
rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
PCIE_CLIENT_CONFIG);

return 0;
err_epc_mem_exit:
pci_epc_mem_exit(epc);

In the pcie-rockchip.h add the following #defines:

@@ -216,21 +227,28 @@
#define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
#define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
+#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8
+#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8)
+#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
#define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
+#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0
+#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8
+#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8)
#define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3

I will add this to the next version of the patch set.
Thank you Damien for pointing this out ! This should solve the issues
you have with
your NVMe endpoint function regarding MSI-X interrupts.

Regards
Rick

>
> > --
> > Damien Le Moal
> > Western Digital Research
> >

2023-02-21 21:58:13

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On 2/21/23 22:19, Rick Wertenbroek wrote:
> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> On 2/21/23 19:47, Rick Wertenbroek wrote:
>>> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
>>> <[email protected]> wrote:
>>>>
>>>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>>>> called with a function number greater than 0.
>>>>> The PCIe standard only allows the multi message capability (MMC) value
>>>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>>>> called with a MMC value of over 0x5.
>>>>>
>>>>> Signed-off-by: Rick Wertenbroek <[email protected]>
>>>>> ---
>>>>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>>> 1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> index b7865a94e..80634b690 100644
>>>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>>> struct rockchip_pcie *rockchip = &ep->rockchip;
>>>>> u32 flags;
>>>>>
>>>>> + if (fn) {
>>>>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>>>> + return -EINVAL;
>>>>> + }
>>>>
>>>> Checking this here is late... Given that at most only one physical
>>>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>>>> Something like:
>>>>
>>>> err = of_property_read_u8(dev->of_node, "max-functions",
>>>> &ep->epc->max_functions);
>>>>
>>>> if (err < 0 || ep->epc->max_functions > 1)
>>>>
>>>> ep->epc->max_functions = 1;
>>>>
>>>
>>> Yes, this could be moved to the probe, thanks.
>>>
>>>> And all the macros with the (fn) argument could also be simplified
>>>> (argument fn removed) since fn will always be 0.
>>>
>>> These functions cannot be simplified because they have to follow the signature
>>> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
>>> function number as a parameter. If we change the function signature we won't
>>> be able to assign these functions to the pc_epc_ops structure
>>
>> I was not suggesting to change the functions signature. I was suggesting
>> dropping the fn argument for the *macros*, e.g.
>>
>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>>
>> since fn is always 0.
>>
>> That said, I am not entirely sure if the limit really is 1 function at most. The
>> TRM seems to be suggesting that up to 4 functions can be supported...
>>
>> [...]
>>
>>>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>>>> here all the time.
>>>
>>> Yes, this could be an improvement but this is the way it is written
>>> everywhere in this
>>> driver, I chose to keep it so as to remain coherent with the rest of the driver.
>>> Cleaning this is not so important since this code will not be
>>> rewritten / changed so
>>> often. But I agree that it might be nicer. But, on the other side if
>>> at some point
>>> support for virtual functions would be added, the offsets would need
>>> to be computed
>>> based on the virtual function number and the code would be written
>>> like it is now,
>>> so I suggest keeping this the way it is for now.
>>
>> Yes, sure, this can be cleaned later.
>>
>> A more pressing problem is the lack of support for MSIX despite the fact that
>> the controller supports that *and* advertize it as a capability. That is what
>> was causing my problem with the Linux nvme driver and my prototype nvme epf
>> function driver: the host driver was seeing MSIX support (1 vector supported by
>> default), and so was allocating one MSIX for the device probe. But on the EP
>> end, it is MSI or INTX only... Working on adding that to solve this issue.
>>
>
> I have seen this too, the controller advertises the capability. However, the TRM
> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
> So the solution should be to modify the probe function of the endpoint
> controller
> so that the MSI-X capability would not be advertised, this should fix
> your problem.

Yep, that is what I did for now: write 0 in the capability ID of the MSIX
capability list entry. A cleaner solution would be to change the next entry
pointer of the entry preceding the MSIX entry. Will send a patch for that.

>
> I wonder if one could still implement MSI-X because from the endpoint we would
> just need to implement it as a message (TLP) over PCIe (because the space for
> the vectors is allocated and written, so that part should be ok). I am
> not an expert
> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
> requires some fields in the PCIe header descriptor to be filled with values that
> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>
> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
> send such interrupts. Once this is fixed you should be able to have your NVMe
> function running.
>
> Regards.
> Rick
>
>
>> --
>> Damien Le Moal
>> Western Digital Research
>>

--
Damien Le Moal
Western Digital Research


2023-02-21 22:01:54

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

On 2/22/23 01:37, Rick Wertenbroek wrote:
> On Tue, Feb 21, 2023 at 2:19 PM Rick Wertenbroek
> <[email protected]> wrote:
>>
>> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
>> <[email protected]> wrote:
>>>
>>> On 2/21/23 19:47, Rick Wertenbroek wrote:
>>>> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
>>>> <[email protected]> wrote:
>>>>>
>>>>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>>>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>>>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>>>>> called with a function number greater than 0.
>>>>>> The PCIe standard only allows the multi message capability (MMC) value
>>>>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>>>>> called with a MMC value of over 0x5.
>>>>>>
>>>>>> Signed-off-by: Rick Wertenbroek <[email protected]>
>>>>>> ---
>>>>>> drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>>>> 1 file changed, 10 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> index b7865a94e..80634b690 100644
>>>>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>>>> struct rockchip_pcie *rockchip = &ep->rockchip;
>>>>>> u32 flags;
>>>>>>
>>>>>> + if (fn) {
>>>>>> + dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>
>>>>> Checking this here is late... Given that at most only one physical
>>>>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>>>>> Something like:
>>>>>
>>>>> err = of_property_read_u8(dev->of_node, "max-functions",
>>>>> &ep->epc->max_functions);
>>>>>
>>>>> if (err < 0 || ep->epc->max_functions > 1)
>>>>>
>>>>> ep->epc->max_functions = 1;
>>>>>
>>>>
>>>> Yes, this could be moved to the probe, thanks.
>>>>
>>>>> And all the macros with the (fn) argument could also be simplified
>>>>> (argument fn removed) since fn will always be 0.
>>>>
>>>> These functions cannot be simplified because they have to follow the signature
>>>> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
>>>> function number as a parameter. If we change the function signature we won't
>>>> be able to assign these functions to the pc_epc_ops structure
>>>
>>> I was not suggesting to change the functions signature. I was suggesting
>>> dropping the fn argument for the *macros*, e.g.
>>>
>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>>>
>>> since fn is always 0.
>>>
>>> That said, I am not entirely sure if the limit really is 1 function at most. The
>>> TRM seems to be suggesting that up to 4 functions can be supported...
>>>
>>> [...]
>>>
>>>>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>>>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>>>>> here all the time.
>>>>
>>>> Yes, this could be an improvement but this is the way it is written
>>>> everywhere in this
>>>> driver, I chose to keep it so as to remain coherent with the rest of the driver.
>>>> Cleaning this is not so important since this code will not be
>>>> rewritten / changed so
>>>> often. But I agree that it might be nicer. But, on the other side if
>>>> at some point
>>>> support for virtual functions would be added, the offsets would need
>>>> to be computed
>>>> based on the virtual function number and the code would be written
>>>> like it is now,
>>>> so I suggest keeping this the way it is for now.
>>>
>>> Yes, sure, this can be cleaned later.
>>>
>>> A more pressing problem is the lack of support for MSIX despite the fact that
>>> the controller supports that *and* advertize it as a capability. That is what
>>> was causing my problem with the Linux nvme driver and my prototype nvme epf
>>> function driver: the host driver was seeing MSIX support (1 vector supported by
>>> default), and so was allocating one MSIX for the device probe. But on the EP
>>> end, it is MSI or INTX only... Working on adding that to solve this issue.
>>>
>>
>> I have seen this too, the controller advertises the capability. However, the TRM
>> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
>> So the solution should be to modify the probe function of the endpoint
>> controller
>> so that the MSI-X capability would not be advertised, this should fix
>> your problem.
>>
>> I wonder if one could still implement MSI-X because from the endpoint we would
>> just need to implement it as a message (TLP) over PCIe (because the space for
>> the vectors is allocated and written, so that part should be ok). I am
>> not an expert
>> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
>> requires some fields in the PCIe header descriptor to be filled with values that
>> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>>
>> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
>> send such interrupts. Once this is fixed you should be able to have your NVMe
>> function running.
>>
>> Regards.
>> Rick
>>
>
> It is possible to disable MSI-X by skipping the MSI-X capability
> structure in the PCIe
> capabilities structures linked-list.
> The current linked list is MSI cap (0x90) -> MSI-X cap (0xb0) -> PCIe
> Device cap (0xc0)
> So we want to set MSI (0x90) -> PCIe Device cap (0xc0)
>
> This can be done by writing 0xc0 to bits 15-8 of 0xFDA0'0090 (MSI cap).
> I tested this quickly through devmem2 before loading the endpoint
> function driver
> and it fixes the issue of MSI-X capabilities being advertised to the host.
>
> In the driver the changes would look like this;
> In the probe function you can disable MSI-X as follows:
>
> @@ -631,6 +618,28 @@ static int rockchip_pcie_ep_probe(struct
> platform_device *pdev)
>
> ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
>
> + /*
> + * Disable MSI-X because the controller is not capable of MSI-X
> + * This requires to skip the MSI-X capabilities entry in the

s/capabilities/capability

> + * chain of PCIe capabilities, we get the next pointer from the
> + * MSI-X entry and set that in the MSI capability entry, this way
> + * the MSI-X entry is skipped (left out of the linked-list)
> + */
> + cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
> + cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
> +
> + cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> + ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
> ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
> +
> + cfg_msi |= cfg_msix_cp;
> +
> + rockchip_pcie_write(rockchip, cfg_msi,
> + PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
> rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
> PCIE_CLIENT_CONFIG);
>
> return 0;
> err_epc_mem_exit:
> pci_epc_mem_exit(epc);
>
> In the pcie-rockchip.h add the following #defines:
>
> @@ -216,21 +227,28 @@
> #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
> #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
> +#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8
> +#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8)
> +#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8)
> #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
> #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
>
> I will add this to the next version of the patch set.
> Thank you Damien for pointing this out ! This should solve the issues
> you have with
> your NVMe endpoint function regarding MSI-X interrupts.

OK. I replied to your previous mail with the same idea. Looks good :)
Will test that instead of my dirty hack that puts 0 in the MSIX capability ID.

>
> Regards
> Rick
>
>>
>>> --
>>> Damien Le Moal
>>> Western Digital Research
>>>

--
Damien Le Moal
Western Digital Research


2023-03-14 00:03:13

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 2/14/23 23:08, Rick Wertenbroek wrote:
> This is a series of patches that fixes the PCIe endpoint controller driver
> for the Rockchip RK3399 SoC. The driver was introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> The original driver had issues and would not allow for the RK3399 to
> operate in PCIe endpoint mode correctly. This patch series fixes that so
> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> endpoint. This is v2 of the patch series and addresses the concerns that
> were raised during the review of the first version.

Rick,

Are you going to send a rebased V3 soon ? I have a couple of additional
patches to add on top of your series...


>
> Thank you in advance for reviewing these changes and hopefully
> getting this merged. Having a functional PCIe endpoint controller
> driver for the RK3399 would allow to develop further PCIe endpoint
> functions through the Linux PCIe endpoint framework using this SoC.
>
> Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> did not work.
>
> Summary of problems with the driver :
>
> * Missing dtsi entry
> * Could not update Device ID (DID)
> * The endpoint could not be configured by a host computer because the
> endpoint kept sending Configuration Request Retry Status (CRS) messages
> * The kernel would sometimes hang on probe due to access to registers in
> a clock domain of which the PLLs were not locked
> * The memory window mapping and address translation mechanism had
> conflicting mappings and did not follow the technical reference manual
> as to how the address translation should be done
> * Legacy IRQs were not generated by the endpoint
> * Message Signaled interrupts (MSI) were not generated by the endpoint
>
> The problems have been addressed and validated through tests (see below).
>
> Summary of changes :
>
> This patch series is composed of 9 patches that do the following :
> * Remove writes to unused registers in the PCIe core register space.
> The registers that were written to is marked "unused" and read
> only in the technical reference manual of the RK3399 SoC.
> * Write PCI Device ID (DID) to correct register, the DID was written to
> a read only register and therefore would not update the DID.
> * Assert PCI Configuration Enable bit after probe so that it would stop
> sending Configuration Request Retry Status (CRS) messages to the
> host once configured, without this the host would retry until
> timeout and cancel the PCI configuration.
> * Add poll and timeout to wait for PHY PLLs to be locked, this
> is the only patch that also applies to the root complex function
> of the PCIe core controller, without this the kernel would
> sometimes access registers in the PHY PLL clock domain when the PLLs
> were not yet locked and the system would hang. This was hackily solved
> in other non mainline patches (e.g., in armbian) with a "msleep()"
> that was added after PHY PLL configuration but without realizing
> why it was needed. A poll with timeout seems like a sane approach.
> * Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
> in "disabled" status by default, so unless it is explicitly enabled
> it will not conflict with the PCIe root complex controller entry.
> Developers that will enable it would know that the root complex function
> then must be disabled, this can be done in the board level DTS.
> * Fix window mapping and address translation for endpoint. The window
> mapping and address translation did not follow the technical reference
> manual and a single memory region was used which resulted in conflicting
> address translations for memory allocated in that region. The current
> patch allows to allocate up to 32 memory windows with 1MB pages.
> * Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
> were not sent by the device because their generation did not follow the
> instructions in the technical reference manual. They now work.
> * Use u32 variable to access 32-bit registers, u16 variables were used to
> access and manipulate data of 32-bit registers, this would lead to
> overflows e.g., when left shifting more than 16 bits.
> * Add parameter check for RK3399 PCIe endpoint core set_msi(), return
> -EINVAL when incompatible parameters are passed.
>
> Validation on real hardware:
>
> This patch series has been tested with kernel 6.0.19 (and 5.19)
> on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single computer
> board connected to a host computer through PCIe x1 and x4. The PCIe
> endpoint test function driver was loaded on the SoC and the PCIe endpoint
> test driver was loaded on the host computer. The following tests were
> executed through this setup :
>
> * enumeration of the PCIe endpoint device (lspci)
> lspci -vvv
> * validation of PCI header and capabilities
> setpci and lspci -xxxx
> * device was recognized by host computer dans PCIe endpoint test driver
> was loaded
> lspci -v states "Kernel modules: pci_endpoint_test"
> * tested the BARs 0-5
> sudo /usr/bin/pcitest -b 0
> ...
> sudo /usr/bin/pcitest -b 5
> * tested legacy interrupt through the test driver
> sudo /usr/bin/pcitest -i 0
> sudo /usr/bin/pcitest -l
> * tested MSI interrupt through the test driver
> sudo /usr/bin/pcitest -i 1
> sudo /usr/bin/pcitest -m 1
> * tested read/write to and from host through the test driver with checksum
> sudo /usr/bin/pcitest -r -s 1024
> sudo /usr/bin/pcitest -w -s 1024
> * tested read/write with DMA enabled (all read/write tests also did IRQ)
> sudo /usr/bin/pcitest -r -d -s 8192
> sudo /usr/bin/pcitest -w -d -s 8192
>
> Commands used on the SoC to launch the endpoint function (configfs) :
>
> modprobe -i pci-epf-test
> mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
> echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
> echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
> echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
> ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
> /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
> echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start
>
> Note: to enable the endpoint controller on the board the file :
> arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
> Was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
> to "okay". This is not submitted as a patch because most users
> will use the PCIe core controller in host (root complex) mode
> rather than endpoint mode.
>
> I have tested and confirmed all basic functionality required for the
> endpoint with the test driver and tools. With the previous state of
> the driver the device would not even be enumerated by the host
> computer (mainly because of CRS messages being sent back to the root
> complex) and tests would not pass (driver would not even be loaded
> because DID was not set correctly) and then only the BAR test would
> pass. Now all tests pass as stated above.
>
> Best regards
> Rick
>
> Rick Wertenbroek (9):
> PCI: rockchip: Remove writes to unused registers
> PCI: rockchip: Write PCI Device ID to correct register
> PCI: rockchip: Assert PCI Configuration Enable bit after probe
> PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
> arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core
> PCI: rockchip: Fix window mapping and address translation for endpoint
> PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
> PCI: rockchip: Use u32 variable to access 32-bit registers
> PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core
> set_msi()
>
> arch/arm64/boot/dts/rockchip/rk3399.dtsi | 23 ++++
> drivers/pci/controller/pcie-rockchip-ep.c | 143 ++++++++++------------
> drivers/pci/controller/pcie-rockchip.c | 16 +++
> drivers/pci/controller/pcie-rockchip.h | 36 ++++--
> 4 files changed, 128 insertions(+), 90 deletions(-)
>

--
Damien Le Moal
Western Digital Research


2023-03-14 07:57:53

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Tue, Mar 14, 2023 at 1:02 AM Damien Le Moal
<[email protected]> wrote:
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > This is a series of patches that fixes the PCIe endpoint controller driver
> > for the Rockchip RK3399 SoC. The driver was introduced in
> > cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> > The original driver had issues and would not allow for the RK3399 to
> > operate in PCIe endpoint mode correctly. This patch series fixes that so
> > that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> > endpoint. This is v2 of the patch series and addresses the concerns that
> > were raised during the review of the first version.
>
> Rick,
>
> Are you going to send a rebased V3 soon ? I have a couple of additional
> patches to add on top of your series...
>

I'll try to send a V3 this week. The changes to V2 will be the issues
raised and discussed on the V2 here in the mailing list with the additional
code for removing the unsupported MSI-X capability list (was discussed
in the mailing list as well).

>
> >
> > Thank you in advance for reviewing these changes and hopefully
> > getting this merged. Having a functional PCIe endpoint controller
> > driver for the RK3399 would allow to develop further PCIe endpoint
> > functions through the Linux PCIe endpoint framework using this SoC.
> >
> > Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
> > cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> > did not work.
> >
> > Summary of problems with the driver :
> >
> > * Missing dtsi entry
> > * Could not update Device ID (DID)
> > * The endpoint could not be configured by a host computer because the
> > endpoint kept sending Configuration Request Retry Status (CRS) messages
> > * The kernel would sometimes hang on probe due to access to registers in
> > a clock domain of which the PLLs were not locked
> > * The memory window mapping and address translation mechanism had
> > conflicting mappings and did not follow the technical reference manual
> > as to how the address translation should be done
> > * Legacy IRQs were not generated by the endpoint
> > * Message Signaled interrupts (MSI) were not generated by the endpoint
> >
> > The problems have been addressed and validated through tests (see below).
> >
> > Summary of changes :
> >
> > This patch series is composed of 9 patches that do the following :
> > * Remove writes to unused registers in the PCIe core register space.
> > The registers that were written to is marked "unused" and read
> > only in the technical reference manual of the RK3399 SoC.
> > * Write PCI Device ID (DID) to correct register, the DID was written to
> > a read only register and therefore would not update the DID.
> > * Assert PCI Configuration Enable bit after probe so that it would stop
> > sending Configuration Request Retry Status (CRS) messages to the
> > host once configured, without this the host would retry until
> > timeout and cancel the PCI configuration.
> > * Add poll and timeout to wait for PHY PLLs to be locked, this
> > is the only patch that also applies to the root complex function
> > of the PCIe core controller, without this the kernel would
> > sometimes access registers in the PHY PLL clock domain when the PLLs
> > were not yet locked and the system would hang. This was hackily solved
> > in other non mainline patches (e.g., in armbian) with a "msleep()"
> > that was added after PHY PLL configuration but without realizing
> > why it was needed. A poll with timeout seems like a sane approach.
> > * Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
> > in "disabled" status by default, so unless it is explicitly enabled
> > it will not conflict with the PCIe root complex controller entry.
> > Developers that will enable it would know that the root complex function
> > then must be disabled, this can be done in the board level DTS.
> > * Fix window mapping and address translation for endpoint. The window
> > mapping and address translation did not follow the technical reference
> > manual and a single memory region was used which resulted in conflicting
> > address translations for memory allocated in that region. The current
> > patch allows to allocate up to 32 memory windows with 1MB pages.
> > * Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
> > were not sent by the device because their generation did not follow the
> > instructions in the technical reference manual. They now work.
> > * Use u32 variable to access 32-bit registers, u16 variables were used to
> > access and manipulate data of 32-bit registers, this would lead to
> > overflows e.g., when left shifting more than 16 bits.
> > * Add parameter check for RK3399 PCIe endpoint core set_msi(), return
> > -EINVAL when incompatible parameters are passed.
> >
> > Validation on real hardware:
> >
> > This patch series has been tested with kernel 6.0.19 (and 5.19)
> > on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single computer
> > board connected to a host computer through PCIe x1 and x4. The PCIe
> > endpoint test function driver was loaded on the SoC and the PCIe endpoint
> > test driver was loaded on the host computer. The following tests were
> > executed through this setup :
> >
> > * enumeration of the PCIe endpoint device (lspci)
> > lspci -vvv
> > * validation of PCI header and capabilities
> > setpci and lspci -xxxx
> > * device was recognized by host computer dans PCIe endpoint test driver
> > was loaded
> > lspci -v states "Kernel modules: pci_endpoint_test"
> > * tested the BARs 0-5
> > sudo /usr/bin/pcitest -b 0
> > ...
> > sudo /usr/bin/pcitest -b 5
> > * tested legacy interrupt through the test driver
> > sudo /usr/bin/pcitest -i 0
> > sudo /usr/bin/pcitest -l
> > * tested MSI interrupt through the test driver
> > sudo /usr/bin/pcitest -i 1
> > sudo /usr/bin/pcitest -m 1
> > * tested read/write to and from host through the test driver with checksum
> > sudo /usr/bin/pcitest -r -s 1024
> > sudo /usr/bin/pcitest -w -s 1024
> > * tested read/write with DMA enabled (all read/write tests also did IRQ)
> > sudo /usr/bin/pcitest -r -d -s 8192
> > sudo /usr/bin/pcitest -w -d -s 8192
> >
> > Commands used on the SoC to launch the endpoint function (configfs) :
> >
> > modprobe -i pci-epf-test
> > mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
> > echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
> > echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
> > echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
> > ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
> > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
> > echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start
> >
> > Note: to enable the endpoint controller on the board the file :
> > arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
> > Was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
> > to "okay". This is not submitted as a patch because most users
> > will use the PCIe core controller in host (root complex) mode
> > rather than endpoint mode.
> >
> > I have tested and confirmed all basic functionality required for the
> > endpoint with the test driver and tools. With the previous state of
> > the driver the device would not even be enumerated by the host
> > computer (mainly because of CRS messages being sent back to the root
> > complex) and tests would not pass (driver would not even be loaded
> > because DID was not set correctly) and then only the BAR test would
> > pass. Now all tests pass as stated above.
> >
> > Best regards
> > Rick
> >
> > Rick Wertenbroek (9):
> > PCI: rockchip: Remove writes to unused registers
> > PCI: rockchip: Write PCI Device ID to correct register
> > PCI: rockchip: Assert PCI Configuration Enable bit after probe
> > PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
> > arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core
> > PCI: rockchip: Fix window mapping and address translation for endpoint
> > PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
> > PCI: rockchip: Use u32 variable to access 32-bit registers
> > PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core
> > set_msi()
> >
> > arch/arm64/boot/dts/rockchip/rk3399.dtsi | 23 ++++
> > drivers/pci/controller/pcie-rockchip-ep.c | 143 ++++++++++------------
> > drivers/pci/controller/pcie-rockchip.c | 16 +++
> > drivers/pci/controller/pcie-rockchip.h | 36 ++++--
> > 4 files changed, 128 insertions(+), 90 deletions(-)
> >
>
> --
> Damien Le Moal
> Western Digital Research
>

2023-03-14 08:13:40

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 3/14/23 16:57, Rick Wertenbroek wrote:
> On Tue, Mar 14, 2023 at 1:02 AM Damien Le Moal
> <[email protected]> wrote:
>>
>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>> This is a series of patches that fixes the PCIe endpoint controller driver
>>> for the Rockchip RK3399 SoC. The driver was introduced in
>>> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
>>> The original driver had issues and would not allow for the RK3399 to
>>> operate in PCIe endpoint mode correctly. This patch series fixes that so
>>> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
>>> endpoint. This is v2 of the patch series and addresses the concerns that
>>> were raised during the review of the first version.
>>
>> Rick,
>>
>> Are you going to send a rebased V3 soon ? I have a couple of additional
>> patches to add on top of your series...
>>
>
> I'll try to send a V3 this week. The changes to V2 will be the issues
> raised and discussed on the V2 here in the mailing list with the additional
> code for removing the unsupported MSI-X capability list (was discussed
> in the mailing list as well).

Thanks.

Additional patch needed to avoid problems with this controller is that
we need to set ".align = 256" in the features. Otherwise, things do not
work well. This is because the ATU drops the low 8-bits of the PCI
addresses. It is a one liner patch, so feel free to add it to your series.

I also noticed random issues wich seem to be due to link-up timing... We
probably will need to implement a poll thread to detect and notify with
the linkup callback when we actually have the link established with the
host (see the dw-ep controller which does something similar).


From: Damien Le Moal <[email protected]>
Date: Thu, 9 Mar 2023 16:37:24 +0900
Subject: [PATCH] PCI: rockchip: Set address alignment for endpoint mode

The address translation unit of the rockchip EP controller does not use
the lower 8 bits of a PCIe-space address to map local memory. Thus we
must set the align feature field to 256 to let the user know about this
constraint.

Signed-off-by: Damien Le Moal <[email protected]>
---
drivers/pci/controller/pcie-rockchip-ep.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c
b/drivers/pci/controller/pcie-rockchip-ep.c
index 12db9a9d92af..c6a23db84967 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -471,6 +471,7 @@ static const struct pci_epc_features
rockchip_pcie_epc_features = {
.linkup_notifier = false,
.msi_capable = true,
.msix_capable = false,
+ .align = 256,
};

static const struct pci_epc_features*
--
2.39.2


--
Damien Le Moal
Western Digital Research


2023-03-14 14:53:53

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Tue, Mar 14, 2023 at 9:10 AM Damien Le Moal
<[email protected]> wrote:
>
> On 3/14/23 16:57, Rick Wertenbroek wrote:
> > On Tue, Mar 14, 2023 at 1:02 AM Damien Le Moal
> > <[email protected]> wrote:
> >>
> >> On 2/14/23 23:08, Rick Wertenbroek wrote:
> >>> This is a series of patches that fixes the PCIe endpoint controller driver
> >>> for the Rockchip RK3399 SoC. The driver was introduced in
> >>> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> >>> The original driver had issues and would not allow for the RK3399 to
> >>> operate in PCIe endpoint mode correctly. This patch series fixes that so
> >>> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> >>> endpoint. This is v2 of the patch series and addresses the concerns that
> >>> were raised during the review of the first version.
> >>
> >> Rick,
> >>
> >> Are you going to send a rebased V3 soon ? I have a couple of additional
> >> patches to add on top of your series...
> >>
> >
> > I'll try to send a V3 this week. The changes to V2 will be the issues
> > raised and discussed on the V2 here in the mailing list with the additional
> > code for removing the unsupported MSI-X capability list (was discussed
> > in the mailing list as well).
>
> Thanks.
>
> Additional patch needed to avoid problems with this controller is that
> we need to set ".align = 256" in the features. Otherwise, things do not
> work well. This is because the ATU drops the low 8-bits of the PCI
> addresses. It is a one liner patch, so feel free to add it to your series.
>
> I also noticed random issues wich seem to be due to link-up timing... We
> probably will need to implement a poll thread to detect and notify with
> the linkup callback when we actually have the link established with the
> host (see the dw-ep controller which does something similar).
>

Hello Damien,
I also noticed random issues I suspect to be related to link status or power
state, in my case it sometimes happens that the BARs (0-6) in the config
space get reset to 0. This is not due to the driver because the driver never
ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
17.6.4.1.5-17.6.4.1.10).
I don't think the host rewrites them because lspci shows the BARs as
"[virtual]" which means they have been assigned by host but have 0
value in the endpoint device (when lspci rereads the PCI config header).
See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422

So I suspect the controller detects something related to link status or
power state and internally (in hardware) resets those registers. It's not
the kernel code, it never accesses these regs. The problem occurs
very randomly, sometimes in a few seconds, sometimes I cannot see
it for a whole day.

Is this similar to what you are experiencing ?
Do you have any idea as to what could make these registers to be reset
(I could not find anything in the TRM, also nothing in the driver seems to
cause it).

>
> From: Damien Le Moal <[email protected]>
> Date: Thu, 9 Mar 2023 16:37:24 +0900
> Subject: [PATCH] PCI: rockchip: Set address alignment for endpoint mode
>
> The address translation unit of the rockchip EP controller does not use
> the lower 8 bits of a PCIe-space address to map local memory. Thus we
> must set the align feature field to 256 to let the user know about this
> constraint.
>
> Signed-off-by: Damien Le Moal <[email protected]>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c
> b/drivers/pci/controller/pcie-rockchip-ep.c
> index 12db9a9d92af..c6a23db84967 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -471,6 +471,7 @@ static const struct pci_epc_features
> rockchip_pcie_epc_features = {
> .linkup_notifier = false,
> .msi_capable = true,
> .msix_capable = false,
> + .align = 256,
> };
>
> static const struct pci_epc_features*
> --
> 2.39.2
>
>
> --
> Damien Le Moal
> Western Digital Research
>

Do you want me to include this patch in the V3 series or will you
submit another patch series for the changes you applied on the RK3399 PCIe
endpoint controller ? I don't know if you prefer to build the V3
together or if you
prefer to submit another patch series on top of mine. Let me know.

Best regards.
Rick

2023-03-14 15:47:56

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 8/9] PCI: rockchip: Use u32 variable to access 32-bit registers

On Wed, Feb 15, 2023 at 2:34 AM Damien Le Moal
<[email protected]> wrote:
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > Previously u16 variables were used to access 32-bit registers, this
> > resulted in not all of the data being read from the registers. Also
> > the left shift of more than 16-bits would result in moving data out
> > of the variable. Use u32 variables to access 32-bit registers
> >
> > Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> > Cc: [email protected]
> > Signed-off-by: Rick Wertenbroek <[email protected]>
> > ---
> > drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
> > drivers/pci/controller/pcie-rockchip.h | 1 +
> > 2 files changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > index ca5b363ba..b7865a94e 100644
> > --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > @@ -292,15 +292,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> > {
> > struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> > struct rockchip_pcie *rockchip = &ep->rockchip;
> > - u16 flags;
> > + u32 flags;
> >
> > flags = rockchip_pcie_read(rockchip,
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> > ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> > flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
> > flags |=
> > - ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
> > - PCI_MSI_FLAGS_64BIT;
> > + (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
>
> ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET is 17 and multi_msg_cap is a u8...
> Not nice.
>
> Locally, I added the local variable:
>
> u32 mmc = multi_msg_cap;
>
> And use mmc instead of multi_msg_cap to avoid issues. Also,
>
> > + (PCI_MSI_FLAGS_64BIT << ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET);
> > flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP;
> > rockchip_pcie_write(rockchip, flags,
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> > @@ -312,7 +312,7 @@ static int rockchip_pcie_ep_get_msi(struct pci_epc *epc, u8 fn, u8 vfn)
> > {
> > struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> > struct rockchip_pcie *rockchip = &ep->rockchip;
> > - u16 flags;
> > + u32 flags;
> >
> > flags = rockchip_pcie_read(rockchip,
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> > @@ -374,7 +374,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
> > u8 interrupt_num)
> > {
> > struct rockchip_pcie *rockchip = &ep->rockchip;
> > - u16 flags, mme, data, data_mask;
> > + u32 flags, mme, data, data_mask;
> > u8 msi_count;
> > u64 pci_addr, pci_addr_mask = 0xff;
> > u32 r;
> > diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> > index e90c2a2b8..11dbf53cd 100644
> > --- a/drivers/pci/controller/pcie-rockchip.h
> > +++ b/drivers/pci/controller/pcie-rockchip.h
> > @@ -227,6 +227,7 @@
> > #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
> > #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
> > #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
> > +#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
>
> You are not using this macro anywhere. The name is also not very
> descriptive. Better have it as:
>
> #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
>
> to match the TRM name and be clear that the bit indicates if MSI is
> enabled or not.
>

This is already the case, the #define is already there.
#define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
The other offset is for all the MSI flags together, this offset is used
when the flags are retrieved altogether, see rockchip_pcie_ep_set_msi()
The ME bit is also used when we need to check the bit specifically, see
rockchip_pcie_ep_get_msi().

> > #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
> > #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
> > #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
>
> --
> Damien Le Moal
> Western Digital Research
>

2023-03-14 22:54:28

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 3/14/23 23:53, Rick Wertenbroek wrote:
> Hello Damien,
> I also noticed random issues I suspect to be related to link status or power
> state, in my case it sometimes happens that the BARs (0-6) in the config
> space get reset to 0. This is not due to the driver because the driver never
> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
> 17.6.4.1.5-17.6.4.1.10).
> I don't think the host rewrites them because lspci shows the BARs as
> "[virtual]" which means they have been assigned by host but have 0
> value in the endpoint device (when lspci rereads the PCI config header).
> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
>
> So I suspect the controller detects something related to link status or
> power state and internally (in hardware) resets those registers. It's not
> the kernel code, it never accesses these regs. The problem occurs
> very randomly, sometimes in a few seconds, sometimes I cannot see
> it for a whole day.
>
> Is this similar to what you are experiencing ?

Yes. I sometimes get NMIs after starting the function driver, when my function
driver starts probing the bar registers after seeing the host changing one
register. And the link also comes up with 4 lanes or 2 lanes, random.

> Do you have any idea as to what could make these registers to be reset
> (I could not find anything in the TRM, also nothing in the driver seems to
> cause it).

My thinking is that since we do not have a linkup notifier, the function driver
starts setting things up without the link established (e.g. when the host is
still powered down). Once the host start booting and pic link is established,
things may be reset in the hardware... That is the only thing I can think of.

And yes, there are definitely something going on with the power states too I
think: if I let things idle for a few minutes, everything stops working: no
activity seen on the endpoint over the BARs. I tried enabling the sys and client
interrupts to see if I can see power state changes, or if clearing the
interrupts helps (they are masked by default), but no change. And booting the
host with pci_aspm=off does not help either. Also tried to change all the
capabilities related to link & power states to "off" (not supported), and no
change either. So currently, I am out of ideas regarding that one.

I am trying to make progress on my endpoint driver (nvme function) to be sure it
is not a bug there that breaks things. I may still have something bad because
when I enable the BIOS native NVMe driver on the host, either the host does not
boot, or grub crashes with memory corruptions. Overall, not yet very stable and
still trying to sort out the root cause of that.


> Do you want me to include this patch in the V3 series or will you
> submit another patch series for the changes you applied on the RK3399 PCIe
> endpoint controller ? I don't know if you prefer to build the V3
> together or if you
> prefer to submit another patch series on top of mine. Let me know.

If it is no trouble, please include it with your series. Will be easier to
retest everything together :)

--
Damien Le Moal
Western Digital Research


2023-03-15 00:01:02

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 3/15/23 07:54, Damien Le Moal wrote:
> On 3/14/23 23:53, Rick Wertenbroek wrote:
>> Hello Damien,
>> I also noticed random issues I suspect to be related to link status or power
>> state, in my case it sometimes happens that the BARs (0-6) in the config
>> space get reset to 0. This is not due to the driver because the driver never
>> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
>> 17.6.4.1.5-17.6.4.1.10).
>> I don't think the host rewrites them because lspci shows the BARs as
>> "[virtual]" which means they have been assigned by host but have 0
>> value in the endpoint device (when lspci rereads the PCI config header).
>> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
>>
>> So I suspect the controller detects something related to link status or
>> power state and internally (in hardware) resets those registers. It's not
>> the kernel code, it never accesses these regs. The problem occurs
>> very randomly, sometimes in a few seconds, sometimes I cannot see
>> it for a whole day.
>>
>> Is this similar to what you are experiencing ?
>
> Yes. I sometimes get NMIs after starting the function driver, when my function
> driver starts probing the bar registers after seeing the host changing one
> register. And the link also comes up with 4 lanes or 2 lanes, random.
>
>> Do you have any idea as to what could make these registers to be reset
>> (I could not find anything in the TRM, also nothing in the driver seems to
>> cause it).
>
> My thinking is that since we do not have a linkup notifier, the function driver
> starts setting things up without the link established (e.g. when the host is
> still powered down). Once the host start booting and pic link is established,
> things may be reset in the hardware... That is the only thing I can think of.
>
> And yes, there are definitely something going on with the power states too I
> think: if I let things idle for a few minutes, everything stops working: no
> activity seen on the endpoint over the BARs. I tried enabling the sys and client
> interrupts to see if I can see power state changes, or if clearing the
> interrupts helps (they are masked by default), but no change. And booting the
> host with pci_aspm=off does not help either. Also tried to change all the
> capabilities related to link & power states to "off" (not supported), and no
> change either. So currently, I am out of ideas regarding that one.
>
> I am trying to make progress on my endpoint driver (nvme function) to be sure it
> is not a bug there that breaks things. I may still have something bad because
> when I enable the BIOS native NVMe driver on the host, either the host does not
> boot, or grub crashes with memory corruptions. Overall, not yet very stable and
> still trying to sort out the root cause of that.

By the way, enabling the interrupts to see the error notifications, I do see a
lot of retry timeout and other recoverable errors. So the issues I am seeing
could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
... ?). Not sure. I do not have a PCI analyzer handy :)

I attached the patches I used to enable the EP interrupts. Enabling debug prints
will tell you what is going on. That may give you some hints on your setup ?

--
Damien Le Moal
Western Digital Research


Attachments:
0001-arm-rk3399-rockpro64-Add-interrupts-to-PCI-endpoint-.patch (1.12 kB)
0001-PCI-rockchip-Add-client-and-subsys-interrupts-for-en.patch (6.15 kB)
Download all attachments

2023-03-16 12:53:46

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Wed, Mar 15, 2023 at 1:00 AM Damien Le Moal
<[email protected]> wrote:
>
> On 3/15/23 07:54, Damien Le Moal wrote:
> > On 3/14/23 23:53, Rick Wertenbroek wrote:
> >> Hello Damien,
> >> I also noticed random issues I suspect to be related to link status or power
> >> state, in my case it sometimes happens that the BARs (0-6) in the config
> >> space get reset to 0. This is not due to the driver because the driver never
> >> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
> >> 17.6.4.1.5-17.6.4.1.10).
> >> I don't think the host rewrites them because lspci shows the BARs as
> >> "[virtual]" which means they have been assigned by host but have 0
> >> value in the endpoint device (when lspci rereads the PCI config header).
> >> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
> >>
> >> So I suspect the controller detects something related to link status or
> >> power state and internally (in hardware) resets those registers. It's not
> >> the kernel code, it never accesses these regs. The problem occurs
> >> very randomly, sometimes in a few seconds, sometimes I cannot see
> >> it for a whole day.
> >>
> >> Is this similar to what you are experiencing ?
> >
> > Yes. I sometimes get NMIs after starting the function driver, when my function
> > driver starts probing the bar registers after seeing the host changing one
> > register. And the link also comes up with 4 lanes or 2 lanes, random.

Hello, I have never had it come up with only 2 lanes, I get 4 consistently.
I have it connected through a M.2 to female PCIe 16x (4x electrically
connected),
then through a male-to-male PCIe 4x cable with TX/RX swap, then through a
16x extender. All three cables are approx 25cm. It seems stable.

> >
> >> Do you have any idea as to what could make these registers to be reset
> >> (I could not find anything in the TRM, also nothing in the driver seems to
> >> cause it).
> >
> > My thinking is that since we do not have a linkup notifier, the function driver
> > starts setting things up without the link established (e.g. when the host is
> > still powered down). Once the host start booting and pic link is established,
> > things may be reset in the hardware... That is the only thing I can think of.

This might be worth investigating, I'll look into it, but it seems
many of the EP
drivers don't have a Linkup notifier,
drivers/pci/controller/dwc/pci-dra7xx.c has
one, but most of the other EP drivers don't have them, so it might not be
absolutely required.

> >
> > And yes, there are definitely something going on with the power states too I
> > think: if I let things idle for a few minutes, everything stops working: no
> > activity seen on the endpoint over the BARs. I tried enabling the sys and client
> > interrupts to see if I can see power state changes, or if clearing the
> > interrupts helps (they are masked by default), but no change. And booting the
> > host with pci_aspm=off does not help either. Also tried to change all the
> > capabilities related to link & power states to "off" (not supported), and no
> > change either. So currently, I am out of ideas regarding that one.
> >
> > I am trying to make progress on my endpoint driver (nvme function) to be sure it
> > is not a bug there that breaks things. I may still have something bad because
> > when I enable the BIOS native NVMe driver on the host, either the host does not
> > boot, or grub crashes with memory corruptions. Overall, not yet very stable and
> > still trying to sort out the root cause of that.

I am also working on an NVMe driver but I have our NVMe firmware running in
userspace so our endpoint function driver only exposes the BARs as UIO
mapped memory and has a simple interface to generate IRQs to host / initiate
DMA transfers.

So that driver does very little in itself and I still have problems
with the BARs
getting unmapped (reset to 0) randomly. I hope your patches for monitoring
the IRQs will shed some light on this. I also observed the BARs getting reset
with the pcie ep test function driver, so I don't think it necessarily
is the function
that is to blame, rather the controller itself (also because none of
the kernel code
should / does access the BARs registers @0xfd80'0010).

>
> By the way, enabling the interrupts to see the error notifications, I do see a
> lot of retry timeout and other recoverable errors. So the issues I am seeing
> could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
> ... ?). Not sure. I do not have a PCI analyzer handy :)
>
> I attached the patches I used to enable the EP interrupts. Enabling debug prints
> will tell you what is going on. That may give you some hints on your setup ?
>
> --
> Damien Le Moal
> Western Digital Research

Thank you for these patches. I will try them and see if they give me more info.

Also, I will delay the release of the v3 of my patch series because of
these issues.
The v3 only incorporates the changes discussed here in the mailing list so your
version should be up to date. If you want me to send you the series in
its current
state let me know.

But I will need some more debugging, I'll release the v3 when the driver is more
stable. I don't when, I don't have that much time on this project. Thanks for
your understanding.

Rick

2023-03-16 16:37:14

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Thu, Mar 16, 2023 at 1:52 PM Rick Wertenbroek
<[email protected]> wrote:
>
> On Wed, Mar 15, 2023 at 1:00 AM Damien Le Moal
> <[email protected]> wrote:
> >
> > On 3/15/23 07:54, Damien Le Moal wrote:
> > > On 3/14/23 23:53, Rick Wertenbroek wrote:
> > >> Hello Damien,
> > >> I also noticed random issues I suspect to be related to link status or power
> > >> state, in my case it sometimes happens that the BARs (0-6) in the config
> > >> space get reset to 0. This is not due to the driver because the driver never
> > >> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM
> > >> 17.6.4.1.5-17.6.4.1.10).
> > >> I don't think the host rewrites them because lspci shows the BARs as
> > >> "[virtual]" which means they have been assigned by host but have 0
> > >> value in the endpoint device (when lspci rereads the PCI config header).
> > >> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
> > >>
> > >> So I suspect the controller detects something related to link status or
> > >> power state and internally (in hardware) resets those registers. It's not
> > >> the kernel code, it never accesses these regs. The problem occurs
> > >> very randomly, sometimes in a few seconds, sometimes I cannot see
> > >> it for a whole day.
> > >>
> > >> Is this similar to what you are experiencing ?
> > >
> > > Yes. I sometimes get NMIs after starting the function driver, when my function
> > > driver starts probing the bar registers after seeing the host changing one
> > > register. And the link also comes up with 4 lanes or 2 lanes, random.
>
> Hello, I have never had it come up with only 2 lanes, I get 4 consistently.
> I have it connected through a M.2 to female PCIe 16x (4x electrically
> connected),
> then through a male-to-male PCIe 4x cable with TX/RX swap, then through a
> 16x extender. All three cables are approx 25cm. It seems stable.
>
> > >
> > >> Do you have any idea as to what could make these registers to be reset
> > >> (I could not find anything in the TRM, also nothing in the driver seems to
> > >> cause it).
> > >
> > > My thinking is that since we do not have a linkup notifier, the function driver
> > > starts setting things up without the link established (e.g. when the host is
> > > still powered down). Once the host start booting and pic link is established,
> > > things may be reset in the hardware... That is the only thing I can think of.
>
> This might be worth investigating, I'll look into it, but it seems
> many of the EP
> drivers don't have a Linkup notifier,
> drivers/pci/controller/dwc/pci-dra7xx.c has
> one, but most of the other EP drivers don't have them, so it might not be
> absolutely required.
>
> > >
> > > And yes, there are definitely something going on with the power states too I
> > > think: if I let things idle for a few minutes, everything stops working: no
> > > activity seen on the endpoint over the BARs. I tried enabling the sys and client
> > > interrupts to see if I can see power state changes, or if clearing the
> > > interrupts helps (they are masked by default), but no change. And booting the
> > > host with pci_aspm=off does not help either. Also tried to change all the
> > > capabilities related to link & power states to "off" (not supported), and no
> > > change either. So currently, I am out of ideas regarding that one.
> > >
> > > I am trying to make progress on my endpoint driver (nvme function) to be sure it
> > > is not a bug there that breaks things. I may still have something bad because
> > > when I enable the BIOS native NVMe driver on the host, either the host does not
> > > boot, or grub crashes with memory corruptions. Overall, not yet very stable and
> > > still trying to sort out the root cause of that.
>
> I am also working on an NVMe driver but I have our NVMe firmware running in
> userspace so our endpoint function driver only exposes the BARs as UIO
> mapped memory and has a simple interface to generate IRQs to host / initiate
> DMA transfers.
>
> So that driver does very little in itself and I still have problems
> with the BARs
> getting unmapped (reset to 0) randomly. I hope your patches for monitoring
> the IRQs will shed some light on this. I also observed the BARs getting reset
> with the pcie ep test function driver, so I don't think it necessarily
> is the function
> that is to blame, rather the controller itself (also because none of
> the kernel code
> should / does access the BARs registers @0xfd80'0010).
>
> >
> > By the way, enabling the interrupts to see the error notifications, I do see a
> > lot of retry timeout and other recoverable errors. So the issues I am seeing
> > could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
> > ... ?). Not sure. I do not have a PCI analyzer handy :)

I have enabled the IRQs and messages thanks to your patches but I don't get
messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable.
The main issue I face is still that after a random amount of time, the BARs are
reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs
(e.g., host writing the BAR values to the config header), but I don't think that
the problem comes from a TLP issued from the host. (it might be).

I don't think it's a buffer overflow / out-of-bounds access by kernel
code for two reasons
1) The values in the config space around the BARs is coherent and unchanged
2) The bars are reset to 0 and not a random value

I suspect a hardware reset of those registers issued internally in the
PCIe controller,
I don't know why (it might be a link related event or power state
related event).

I have also experienced very slow behavior with the PCI endpoint test driver,
e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to
come from LCRC errors, when I check the "LCRC Error count register"
@0xFD90'0214 I can see it drastically increase between two calls of pcitest
(when I mean drastically it means by 6607 (0x19CF) for example).

The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though.

I have tried to shorten the cabling by removing one of the PCIe extenders, that
didn't change the issues much.

Any ideas as to why I see a large number of TLPs with LCRC errors in them ?
Do you experience the same ? What are your values in 0xFD90'0214 when
running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing
0xFFFF to it in case it reaches the maximum value of 0xFFFF).




> >
> > I attached the patches I used to enable the EP interrupts. Enabling debug prints
> > will tell you what is going on. That may give you some hints on your setup ?
> >
> > --
> > Damien Le Moal
> > Western Digital Research
>
> Thank you for these patches. I will try them and see if they give me more info.
>
> Also, I will delay the release of the v3 of my patch series because of
> these issues.
> The v3 only incorporates the changes discussed here in the mailing list so your
> version should be up to date. If you want me to send you the series in
> its current
> state let me know.
>
> But I will need some more debugging, I'll release the v3 when the driver is more
> stable. I don't when, I don't have that much time on this project. Thanks for
> your understanding.
>
> Rick

2023-03-16 22:11:29

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On 3/17/23 01:34, Rick Wertenbroek wrote:
>>> By the way, enabling the interrupts to see the error notifications, I do see a
>>> lot of retry timeout and other recoverable errors. So the issues I am seeing
>>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
>>> ... ?). Not sure. I do not have a PCI analyzer handy :)
>
> I have enabled the IRQs and messages thanks to your patches but I don't get
> messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable.
> The main issue I face is still that after a random amount of time, the BARs are
> reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs
> (e.g., host writing the BAR values to the config header), but I don't think that
> the problem comes from a TLP issued from the host. (it might be).

Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer
timed out" and "replay timer rolled over after 4 transmissions of the same TLP"
but also some "phy error detected on receive side"... Need to try to rework my
cable setup I guess.

As for the BARs being reset to 0, I have not checked, but it may be why I see
things not working after some inactivity. Will check that. We may be seeing the
same regarding that.

> I don't think it's a buffer overflow / out-of-bounds access by kernel
> code for two reasons
> 1) The values in the config space around the BARs is coherent and unchanged
> 2) The bars are reset to 0 and not a random value
>
> I suspect a hardware reset of those registers issued internally in the
> PCIe controller,
> I don't know why (it might be a link related event or power state
> related event).
>
> I have also experienced very slow behavior with the PCI endpoint test driver,
> e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to
> come from LCRC errors, when I check the "LCRC Error count register"
> @0xFD90'0214 I can see it drastically increase between two calls of pcitest
> (when I mean drastically it means by 6607 (0x19CF) for example).
>
> The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though.
>
> I have tried to shorten the cabling by removing one of the PCIe extenders, that
> didn't change the issues much.
>
> Any ideas as to why I see a large number of TLPs with LCRC errors in them ?
> Do you experience the same ? What are your values in 0xFD90'0214 when
> running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing
> 0xFFFF to it in case it reaches the maximum value of 0xFFFF).

I have not checked. But I will look at these counters to see what I have there.


--
Damien Le Moal
Western Digital Research


2023-04-13 13:56:12

by Lorenzo Pieralisi

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Fri, Mar 17, 2023 at 07:09:04AM +0900, Damien Le Moal wrote:
> On 3/17/23 01:34, Rick Wertenbroek wrote:
> >>> By the way, enabling the interrupts to see the error notifications, I do see a
> >>> lot of retry timeout and other recoverable errors. So the issues I am seeing
> >>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
> >>> ... ?). Not sure. I do not have a PCI analyzer handy :)
> >
> > I have enabled the IRQs and messages thanks to your patches but I don't get
> > messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable.
> > The main issue I face is still that after a random amount of time, the BARs are
> > reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs
> > (e.g., host writing the BAR values to the config header), but I don't think that
> > the problem comes from a TLP issued from the host. (it might be).
>
> Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer
> timed out" and "replay timer rolled over after 4 transmissions of the same TLP"
> but also some "phy error detected on receive side"... Need to try to rework my
> cable setup I guess.
>
> As for the BARs being reset to 0, I have not checked, but it may be why I see
> things not working after some inactivity. Will check that. We may be seeing the
> same regarding that.
>
> > I don't think it's a buffer overflow / out-of-bounds access by kernel
> > code for two reasons
> > 1) The values in the config space around the BARs is coherent and unchanged
> > 2) The bars are reset to 0 and not a random value
> >
> > I suspect a hardware reset of those registers issued internally in the
> > PCIe controller,
> > I don't know why (it might be a link related event or power state
> > related event).
> >
> > I have also experienced very slow behavior with the PCI endpoint test driver,
> > e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to
> > come from LCRC errors, when I check the "LCRC Error count register"
> > @0xFD90'0214 I can see it drastically increase between two calls of pcitest
> > (when I mean drastically it means by 6607 (0x19CF) for example).
> >
> > The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though.
> >
> > I have tried to shorten the cabling by removing one of the PCIe extenders, that
> > didn't change the issues much.
> >
> > Any ideas as to why I see a large number of TLPs with LCRC errors in them ?
> > Do you experience the same ? What are your values in 0xFD90'0214 when
> > running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing
> > 0xFFFF to it in case it reaches the maximum value of 0xFFFF).
>
> I have not checked. But I will look at these counters to see what I have there.

Hi,

checking where are we with this thread and whether there is something to
consider for v6.4, if testing succeeds.

Thanks,
Lorenzo

2023-04-13 14:36:59

by Rick Wertenbroek

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] PCI: rockchip: Fix RK3399 PCIe endpoint controller driver

On Thu, Apr 13, 2023 at 3:49 PM Lorenzo Pieralisi <[email protected]> wrote:
>
> On Fri, Mar 17, 2023 at 07:09:04AM +0900, Damien Le Moal wrote:
> > On 3/17/23 01:34, Rick Wertenbroek wrote:
> > >>> By the way, enabling the interrupts to see the error notifications, I do see a
> > >>> lot of retry timeout and other recoverable errors. So the issues I am seeing
> > >>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
> > >>> ... ?). Not sure. I do not have a PCI analyzer handy :)
> > >
> > > I have enabled the IRQs and messages thanks to your patches but I don't get
> > > messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable.
> > > The main issue I face is still that after a random amount of time, the BARs are
> > > reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs
> > > (e.g., host writing the BAR values to the config header), but I don't think that
> > > the problem comes from a TLP issued from the host. (it might be).
> >
> > Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer
> > timed out" and "replay timer rolled over after 4 transmissions of the same TLP"
> > but also some "phy error detected on receive side"... Need to try to rework my
> > cable setup I guess.
> >
> > As for the BARs being reset to 0, I have not checked, but it may be why I see
> > things not working after some inactivity. Will check that. We may be seeing the
> > same regarding that.
> >
> > > I don't think it's a buffer overflow / out-of-bounds access by kernel
> > > code for two reasons
> > > 1) The values in the config space around the BARs is coherent and unchanged
> > > 2) The bars are reset to 0 and not a random value
> > >
> > > I suspect a hardware reset of those registers issued internally in the
> > > PCIe controller,
> > > I don't know why (it might be a link related event or power state
> > > related event).
> > >
> > > I have also experienced very slow behavior with the PCI endpoint test driver,
> > > e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to
> > > come from LCRC errors, when I check the "LCRC Error count register"
> > > @0xFD90'0214 I can see it drastically increase between two calls of pcitest
> > > (when I mean drastically it means by 6607 (0x19CF) for example).
> > >
> > > The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though.
> > >
> > > I have tried to shorten the cabling by removing one of the PCIe extenders, that
> > > didn't change the issues much.
> > >
> > > Any ideas as to why I see a large number of TLPs with LCRC errors in them ?
> > > Do you experience the same ? What are your values in 0xFD90'0214 when
> > > running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing
> > > 0xFFFF to it in case it reaches the maximum value of 0xFFFF).
> >
> > I have not checked. But I will look at these counters to see what I have there.
>
> Hi,
>
> checking where are we with this thread and whether there is something to
> consider for v6.4, if testing succeeds.
>
> Thanks,
> Lorenzo

Hello,
Thank you for considering this.

There is a V3 of this patch series [1|, that fixes the issues
encountered with the V2.
The debugging following this thread was discussed off-list with Damien Le Moal.
The V3 has been tested successfully by Damien Le Moal [2]

I will submit a V4 next week, since there are minor changes that were
suggested in
the threads for the V3 (mostly minor changes in code style, macros, comments).

I hope it can be considered for v6.4, thank you.

[1] https://lore.kernel.org/linux-pci/[email protected]/T/#mc8f2589ff04862175cb0c906b38cb37a90db0e42
[2] https://lore.kernel.org/linux-pci/[email protected]/


Notes on what was discovered off-list :

The issues regarding BAR reset were due to power supply issues (PCI cable
jumping host 3V3 supply to SoC 3V3 supply, and are fixed with proper cabling).
a few LCRC errors are normal with PCIe, the number will depend on
signal integrity
at the physical layer (cabling).


Best regards,
Rick