2024-02-12 16:54:57

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 00/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

This series addresses a few problems with the sc8280xp PCIe
implementation.

The DWC PCIe controller can either use its internal MSI controller or an
external one such as the GICv3 ITS. Enabling the latter allows for
assigning affinity to individual interrupts, but results in a large
amount of Correctable Errors being logged on both the Lenovo ThinkPad
X13s and the sc8280xp-crd reference design.

It turns out that these errors are always generated, but for some yet to
be determined reason, the AER interrupts are never received when using
the internal MSI controller, which makes the link errors harder to
notice.

On the X13s, there is a large number of errors generated when bringing
up the link on boot. This is related to the fact that UEFI firmware has
already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
link at Gen3 generates a massive amount of errors until the Wi-Fi
firmware is restarted.

A recent commit enabling ASPM on certain Qualcomm platforms introduced
further errors when using the Wi-Fi on the X13s as well as when
accessing the NVMe on the CRD. The exact reason for this has not yet
been identified, but disabling ASPM L0s makes the errors go away. This
could suggest that either the current ASPM implementation is incomplete
or that L0s is not supported with these devices.

Note that the X13s and CRD use the same Wi-Fi controller, but the errors
are only generated on the X13s. The NVMe controller on my X13s does not
support L0s so there are no issues there, unlike on the CRD which uses a
different controller. The modem on the CRD does not generate any errors,
but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
when not used, which could indicate that there are bigger problems with
the ASPM implementation. I don't have a modem on my X13s so I have not
been able to test whether L0s causes an trouble there.

Enabling AER error reporting on sc8280xp could similarly also reveal
existing problems with the related sa8295p and sa8540p platforms as they
share the base dtsi.

The last four patches, marked as RFC, adds support for disabling ASPM
L0s in the devicetree and disables it selectively for the X13s Wi-Fi
and CRD NVMe. If it turns out that the Qualcomm PCIe implementation is
incomplete, we may need to disable ASPM (L0s) completely in the driver
instead.

Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
significant impact on the power consumption

The DT bindings and PCI patch are expected to go through the PCI tree,
while Bjorn A takes the devicetree updates through the Qualcomm tree.

Johan


Johan Hovold (10):
dt-bindings: PCI: qcom: Allow 'required-opps'
dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe
dt-bindings: PCI: qcom: Allow 'aspm-no-l0s'
PCI: qcom: Add support for disabling ASPM L0s in devicetree
arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe
arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi

.../devicetree/bindings/pci/qcom,pcie.yaml | 6 +++++-
arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 4 ++++
.../qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 3 +++
arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 17 +++++++++++++++-
drivers/pci/controller/dwc/pcie-qcom.c | 20 +++++++++++++++++++
5 files changed, 48 insertions(+), 2 deletions(-)

--
2.43.0



2024-02-12 16:55:44

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 04/10] arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed

Limit the WiFi PCIe link speed to Gen2 speed (500 GB/s), which is the
speed that Windows uses.

Signed-off-by: Johan Hovold <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
index f34c572253f5..8c1fccf8847a 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
@@ -563,6 +563,8 @@ &pcie3a_phy {
};

&pcie4 {
+ max-link-speed = <2>;
+
perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;

--
2.43.0


2024-02-12 16:56:08

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 10/10] arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi

Enabling ASPM L0s on the Lenovo Thinkpad X13s results in Correctable
Errors (BadTLP, Timeout) when accessing the Wi-Fi controller so disable
it for now.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Signed-off-by: Johan Hovold <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index ff4b896b1bbf..aed857feface 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -864,6 +864,7 @@ &pcie3a_phy {

&pcie4 {
max-link-speed = <2>;
+ aspm-no-l0s;

perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;
--
2.43.0


2024-02-12 16:56:09

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 06/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

The DWC PCIe controller can be used with its internal MSI controller or
with an external one such as the GICv3 Interrupt Translation Service
(ITS).

Add the msi-map properties needed to use the GIC ITS. This will also
make Linux switch to the ITS implementation, which allows for assigning
affinity to individual MSIs.

Note that using the GIC ITS on SC8280XP will cause Advanced Error
Reporting (AER) interrupts to be received on errors unlike when using
the internal MSI controller. This will specifically lead to
notifications about Correctable Errors being logged for the Wi-Fi
controller the Lenovo ThinkPad X13s when the AER driver is enabled.

Signed-off-by: Johan Hovold <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
index 36382b1bd965..ee6026f4f12c 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
@@ -1737,6 +1737,8 @@ pcie4: pcie@1c00000 {
linux,pci-domain = <6>;
num-lanes = <1>;

+ msi-map = <0x0 &its 0xe0000 0x10000>;
+
interrupts = <GIC_SPI 141 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 142 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 143 IRQ_TYPE_LEVEL_HIGH>,
@@ -1838,6 +1840,8 @@ pcie3b: pcie@1c08000 {
linux,pci-domain = <5>;
num-lanes = <2>;

+ msi-map = <0x0 &its 0xd0000 0x10000>;
+
interrupts = <GIC_SPI 145 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 146 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 147 IRQ_TYPE_LEVEL_HIGH>,
@@ -1937,6 +1941,8 @@ pcie3a: pcie@1c10000 {
linux,pci-domain = <4>;
num-lanes = <4>;

+ msi-map = <0x0 &its 0xc0000 0x10000>;
+
interrupts = <GIC_SPI 312 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 313 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 314 IRQ_TYPE_LEVEL_HIGH>,
@@ -2039,6 +2045,8 @@ pcie2b: pcie@1c18000 {
linux,pci-domain = <3>;
num-lanes = <2>;

+ msi-map = <0x0 &its 0xb0000 0x10000>;
+
interrupts = <GIC_SPI 306 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 307 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 308 IRQ_TYPE_LEVEL_HIGH>,
@@ -2138,6 +2146,8 @@ pcie2a: pcie@1c20000 {
linux,pci-domain = <2>;
num-lanes = <4>;

+ msi-map = <0x0 &its 0xa0000 0x10000>;
+
interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 523 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 524 IRQ_TYPE_LEVEL_HIGH>,
@@ -4426,7 +4436,7 @@ intc: interrupt-controller@17a00000 {
#size-cells = <2>;
ranges;

- msi-controller@17a40000 {
+ its: msi-controller@17a40000 {
compatible = "arm,gic-v3-its";
reg = <0 0x17a40000 0 0x20000>;
msi-controller;
--
2.43.0


2024-02-12 16:56:09

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

Whether the 'msi-map-mask' property is needed or not depends on how the
MSI interrupts are mapped and it should therefore not be described as
required.

Note that the current schema fails to detect omissions of the mask
property if the internal MSI controller properties are also present.

Signed-off-by: Johan Hovold <[email protected]>
---
Documentation/devicetree/bindings/pci/qcom,pcie.yaml | 1 -
1 file changed, 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
index 5eda4e72f681..b28517047db2 100644
--- a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
+++ b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
@@ -146,7 +146,6 @@ anyOf:
- "#interrupt-cells"
- required:
- msi-map
- - msi-map-mask

allOf:
- $ref: /schemas/pci/pci-bus.yaml#
--
2.43.0


2024-02-12 16:56:09

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 05/10] arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed

Limit the WiFi PCIe link speed to Gen2 speed (500 GB/s), which is the
speed that the boot firmware has brought up the link at (and that
Windows uses).

This is specifically needed to avoid a large amount of link errors when
restarting the link during boot (but which are currently not reported).

This may potentially also help with intermittent failures to download
the ath11k firmware during boot which can be seen when there is a
longer delay between restarting the link and loading the WiFi driver
(e.g. when using full disk encryption).

Fixes: 123b30a75623 ("arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller")
Cc: [email protected] # 6.2
Signed-off-by: Johan Hovold <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index 511d53d9c5a1..ff4b896b1bbf 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -863,6 +863,8 @@ &pcie3a_phy {
};

&pcie4 {
+ max-link-speed = <2>;
+
perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;

--
2.43.0


2024-02-12 16:56:09

by Johan Hovold

[permalink] [raw]
Subject: [RFC 09/10] arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe

Enabling ASPM L0s on the CRD results in a large amount of Correctable
Errors (Timeout) when accessing the NVMe controller so disable it for
now.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Signed-off-by: Johan Hovold <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
index 8c1fccf8847a..a428ba624ce1 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
@@ -525,6 +525,8 @@ keyboard@68 {
};

&pcie2a {
+ aspm-no-l0s;
+
perst-gpios = <&tlmm 143 GPIO_ACTIVE_LOW>;
wake-gpios = <&tlmm 145 GPIO_ACTIVE_LOW>;

--
2.43.0


2024-02-12 16:56:10

by Johan Hovold

[permalink] [raw]
Subject: [PATCH 01/10] dt-bindings: PCI: qcom: Allow 'required-opps'

Some Qualcomm SoCs require a minimum performance level for the power
domain so add 'required-opps' to the binding.

Signed-off-by: Johan Hovold <[email protected]>
---
Documentation/devicetree/bindings/pci/qcom,pcie.yaml | 3 +++
1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
index a93ab3b54066..5eda4e72f681 100644
--- a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
+++ b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
@@ -123,6 +123,9 @@ properties:
description: GPIO controlled connection to PERST# signal
maxItems: 1

+ required-opps:
+ maxItems: 1
+
wake-gpios:
description: GPIO controlled connection to WAKE# signal
maxItems: 1
--
2.43.0


2024-02-12 17:05:20

by Johan Hovold

[permalink] [raw]
Subject: [RFC 08/10] PCI: qcom: Add support for disabling ASPM L0s in devicetree

A recent commit started enabling ASPM unconditionally when the hardware
claims to support it. This triggers Correctable Errors for some PCIe
devices on machines like the Lenovo ThinkPad X13s, which could indicate
an incomplete driver ASPM implementation or that the hardware does in
fact not support L0s.

Add support for disabling ASPM L0s in the devicetree when it is not
supported on a particular machine and controller.

Note that only the 1.9.0 ops enable ASPM currently.

Fixes: a9a023c05697 ("PCI: qcom: Add support for disabling ASPM L0s in devicetree")
Signed-off-by: Johan Hovold <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 2455decc574a..071741b81644 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
return 0;
}

+static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
+{
+ u16 offset;
+ u32 val;
+
+ if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
+ return;
+
+ offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+ dw_pcie_dbi_ro_wr_en(pci);
+
+ val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+ val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
+ writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
+
+ dw_pcie_dbi_ro_wr_dis(pci);
+}
+
static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
{
u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
@@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)

static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
{
+ qcom_pcie_clear_aspm_l0s(pcie->pci);
qcom_pcie_clear_hpc(pcie->pci);

return 0;
--
2.43.0


2024-02-12 19:35:05

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [RFC 08/10] PCI: qcom: Add support for disabling ASPM L0s in devicetree

On Mon, Feb 12, 2024 at 05:50:41PM +0100, Johan Hovold wrote:
> A recent commit started enabling ASPM unconditionally when the hardware
> claims to support it. This triggers Correctable Errors for some PCIe
> devices on machines like the Lenovo ThinkPad X13s, which could indicate
> an incomplete driver ASPM implementation or that the hardware does in
> fact not support L0s.

I think it would be useful for debugging purposes to identify the
specific commit. Maybe it's 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for
platforms supporting 1.9.0 ops") ?

> Add support for disabling ASPM L0s in the devicetree when it is not
> supported on a particular machine and controller.
>
> Note that only the 1.9.0 ops enable ASPM currently.
>
> Fixes: a9a023c05697 ("PCI: qcom: Add support for disabling ASPM L0s in devicetree")

I don't see this SHA1 in the PCI tree; is it a stable SHA1 from
somewhere else?

> Signed-off-by: Johan Hovold <[email protected]>
> ---
> drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 2455decc574a..071741b81644 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
> return 0;
> }
>
> +static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
> +{
> + u16 offset;
> + u32 val;
> +
> + if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
> + return;
> +
> + offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +
> + dw_pcie_dbi_ro_wr_en(pci);
> +
> + val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
> + val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
> + writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
> +
> + dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
> static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
> {
> u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> @@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
>
> static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
> {
> + qcom_pcie_clear_aspm_l0s(pcie->pci);
> qcom_pcie_clear_hpc(pcie->pci);
>
> return 0;
> --
> 2.43.0
>

2024-02-12 20:21:00

by Johan Hovold

[permalink] [raw]
Subject: Re: [RFC 08/10] PCI: qcom: Add support for disabling ASPM L0s in devicetree

On Mon, Feb 12, 2024 at 01:34:49PM -0600, Bjorn Helgaas wrote:
> On Mon, Feb 12, 2024 at 05:50:41PM +0100, Johan Hovold wrote:
> > A recent commit started enabling ASPM unconditionally when the hardware
> > claims to support it. This triggers Correctable Errors for some PCIe
> > devices on machines like the Lenovo ThinkPad X13s, which could indicate
> > an incomplete driver ASPM implementation or that the hardware does in
> > fact not support L0s.
>
> I think it would be useful for debugging purposes to identify the
> specific commit. Maybe it's 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for
> platforms supporting 1.9.0 ops") ?
>
> > Add support for disabling ASPM L0s in the devicetree when it is not
> > supported on a particular machine and controller.
> >
> > Note that only the 1.9.0 ops enable ASPM currently.
> >
> > Fixes: a9a023c05697 ("PCI: qcom: Add support for disabling ASPM L0s in devicetree")
>
> I don't see this SHA1 in the PCI tree; is it a stable SHA1 from
> somewhere else?

Yes, sorry, this was indeed supposed to say

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")

as it would be a dependency for the follow-on fixes.

Johan

2024-02-14 06:36:16

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH 00/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

On Mon, Feb 12, 2024 at 05:50:33PM +0100, Johan Hovold wrote:
> This series addresses a few problems with the sc8280xp PCIe
> implementation.
>
> The DWC PCIe controller can either use its internal MSI controller or an
> external one such as the GICv3 ITS. Enabling the latter allows for
> assigning affinity to individual interrupts, but results in a large
> amount of Correctable Errors being logged on both the Lenovo ThinkPad
> X13s and the sc8280xp-crd reference design.
>
> It turns out that these errors are always generated,

How did you confirm this?

> but for some yet to
> be determined reason, the AER interrupts are never received when using
> the internal MSI controller, which makes the link errors harder to
> notice.
>

If you manually inject the errors using "aer-inject", are you not seeing the AER
errors with internal MSI controller as well?

> On the X13s, there is a large number of errors generated when bringing
> up the link on boot. This is related to the fact that UEFI firmware has
> already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
> link at Gen3 generates a massive amount of errors until the Wi-Fi
> firmware is restarted.
>
> A recent commit enabling ASPM on certain Qualcomm platforms introduced
> further errors when using the Wi-Fi on the X13s as well as when
> accessing the NVMe on the CRD. The exact reason for this has not yet
> been identified, but disabling ASPM L0s makes the errors go away. This
> could suggest that either the current ASPM implementation is incomplete
> or that L0s is not supported with these devices.
>

What are those "further errors" you are seeing with ASPM enabled? Are those
errors appear with GIC ITS or with internal MSI controller as well?

> Note that the X13s and CRD use the same Wi-Fi controller, but the errors
> are only generated on the X13s. The NVMe controller on my X13s does not
> support L0s so there are no issues there, unlike on the CRD which uses a
> different controller. The modem on the CRD does not generate any errors,
> but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
> when not used, which could indicate that there are bigger problems with
> the ASPM implementation. I don't have a modem on my X13s so I have not
> been able to test whether L0s causes an trouble there.
>
> Enabling AER error reporting on sc8280xp could similarly also reveal
> existing problems with the related sa8295p and sa8540p platforms as they
> share the base dtsi.
>
> The last four patches, marked as RFC, adds support for disabling ASPM
> L0s in the devicetree and disables it selectively for the X13s Wi-Fi
> and CRD NVMe. If it turns out that the Qualcomm PCIe implementation is
> incomplete, we may need to disable ASPM (L0s) completely in the driver
> instead.
>

If the device is not supporting L0s, then it as to be disabled in the device,
not in the PCIe controller, no?

> Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
> significant impact on the power consumption
>
> The DT bindings and PCI patch are expected to go through the PCI tree,
> while Bjorn A takes the devicetree updates through the Qualcomm tree.
>

Since I took a stab at enabling the GIC ITS previously, I noticed that the NVMe
performance got a slight dip. And that was one of the reasons (apart from AER
errors) that I never submitted the patch.

Could you share the NVMe benchmark (fio) with this series?

> Johan
>
>
> Johan Hovold (10):
> dt-bindings: PCI: qcom: Allow 'required-opps'
> dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
> arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
> arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
> arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
> arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

Is this patch based on the version I shared with you long back? If so, I'd
expect to have some credit. If you came up with your own version, then ignore
this comment.

- Mani

> dt-bindings: PCI: qcom: Allow 'aspm-no-l0s'
> PCI: qcom: Add support for disabling ASPM L0s in devicetree
> arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe
> arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi
>
> .../devicetree/bindings/pci/qcom,pcie.yaml | 6 +++++-
> arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 4 ++++
> .../qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 3 +++
> arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 17 +++++++++++++++-
> drivers/pci/controller/dwc/pcie-qcom.c | 20 +++++++++++++++++++
> 5 files changed, 48 insertions(+), 2 deletions(-)
>
> --
> 2.43.0
>

--
மணிவண்ணன் சதாசிவம்

2024-02-14 11:16:11

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 00/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

On Wed, Feb 14, 2024 at 12:05:54PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Feb 12, 2024 at 05:50:33PM +0100, Johan Hovold wrote:
> > This series addresses a few problems with the sc8280xp PCIe
> > implementation.
> >
> > The DWC PCIe controller can either use its internal MSI controller or an
> > external one such as the GICv3 ITS. Enabling the latter allows for
> > assigning affinity to individual interrupts, but results in a large
> > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > X13s and the sc8280xp-crd reference design.
> >
> > It turns out that these errors are always generated,
>
> How did you confirm this?

You can see that error flags being set in the controller and endpoint,
for example, using lspci -vv:

CESta: RxErr- BadTLP+ BadDLLP- Rollover- Timeout- AdvNonFatalErr-

> > but for some yet to
> > be determined reason, the AER interrupts are never received when using
> > the internal MSI controller, which makes the link errors harder to
> > notice.
>
> If you manually inject the errors using "aer-inject", are you not seeing the AER
> errors with internal MSI controller as well?

I haven't tried that, I'm just reporting that that piece of
functionality is currently broken and that that partly explains why the
ASPM problems went unnoticed.

> > On the X13s, there is a large number of errors generated when bringing
> > up the link on boot. This is related to the fact that UEFI firmware has
> > already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
> > link at Gen3 generates a massive amount of errors until the Wi-Fi
> > firmware is restarted.
> >
> > A recent commit enabling ASPM on certain Qualcomm platforms introduced
> > further errors when using the Wi-Fi on the X13s as well as when
> > accessing the NVMe on the CRD. The exact reason for this has not yet
> > been identified, but disabling ASPM L0s makes the errors go away. This
> > could suggest that either the current ASPM implementation is incomplete
> > or that L0s is not supported with these devices.
>
> What are those "further errors" you are seeing with ASPM enabled? Are those
> errors appear with GIC ITS or with internal MSI controller as well?

Further errors as in further correctable errors that are not related to
the errors seen when resetting the X13s Wi-Fi link at boot.

These show up, for example, when accessing the NVMe on the CRD or when
using the Wi-Fi on the X13s. These errors go away when L0s is disabled.

And yes, you see them with both the external and internal MSI controller
(in the latter case, by looking at the error flags mentioned above).

> > Note that the X13s and CRD use the same Wi-Fi controller, but the errors
> > are only generated on the X13s. The NVMe controller on my X13s does not
> > support L0s so there are no issues there, unlike on the CRD which uses a
> > different controller. The modem on the CRD does not generate any errors,
> > but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
> > when not used, which could indicate that there are bigger problems with
> > the ASPM implementation. I don't have a modem on my X13s so I have not
> > been able to test whether L0s causes an trouble there.
> >
> > Enabling AER error reporting on sc8280xp could similarly also reveal
> > existing problems with the related sa8295p and sa8540p platforms as they
> > share the base dtsi.
> >
> > The last four patches, marked as RFC, adds support for disabling ASPM
> > L0s in the devicetree and disables it selectively for the X13s Wi-Fi
> > and CRD NVMe. If it turns out that the Qualcomm PCIe implementation is
> > incomplete, we may need to disable ASPM (L0s) completely in the driver
> > instead.
>
> If the device is not supporting L0s, then it as to be disabled in the device,
> not in the PCIe controller, no?

Well, we don't know yet where the problem lies, just that enabling L0s
results in a large number of correctable errors.

Until yesterday I had not seen any such errors for the Wi-Fi on the CRD,
which uses essentially the same ath11k controller, so there was no clear
indication that this was necessarily a problem with the devices either.

> > Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
> > significant impact on the power consumption
> >
> > The DT bindings and PCI patch are expected to go through the PCI tree,
> > while Bjorn A takes the devicetree updates through the Qualcomm tree.
>
> Since I took a stab at enabling the GIC ITS previously, I noticed that the NVMe
> performance got a slight dip. And that was one of the reasons (apart from AER
> errors) that I never submitted the patch.
>
> Could you share the NVMe benchmark (fio) with this series?

Did you have any particular benchmark in mind?

I have run multiple fio benchmarks and while the results vary with the
parameters, the impact of switching to ITS (so that not all PCIe
interrupts are processed on CPU0) is generally favourable.

A raw sequential read shows no change in throughput on either the X13s
or the CRD even if for some reason this test performs really badly on
the X13s (i.e. regardless of which MSI controller is used):

crd-rseq-read: IOPS=11.1k, BW=2764MiB/s (2898MB/s)(81.0GiB/30003msec)
X13s-rseq-read: IOPS=508, BW=127MiB/s (134MB/s)(3841MiB/30169msec)

Another benchmark I've used against a mounted ext4 partition shows a 2x
improvement in throughput with ITS for sequential and random reads and
writes on the X13s:

seq-read: IOPS=88.4k, BW=345MiB/s (362MB/s)(10.0GiB/29657msec)
rand-read: IOPS=21.2k, BW=82.8MiB/s (86.8MB/s)(4967MiB/60001msec)
seq-write: IOPS=162k, BW=632MiB/s (662MB/s)(10.0GiB/16213msec)
rand-write: IOPS=142k, BW=555MiB/s (582MB/s)(10.0GiB/18439msec)

while the results are essentially unchanged with a larger block size and
queue depth (32/2m instead of 4/4k):

seq-read: IOPS=1095, BW=2191MiB/s (2298MB/s)(10.0GiB/4673msec)
rand-read: IOPS=1020, BW=2041MiB/s (2140MB/s)(10.0GiB/5017msec)
seq-write: IOPS=918, BW=1837MiB/s (1926MB/s)(10.0GiB/5574msec)
rand-write: IOPS=826, BW=1653MiB/s (1734MB/s)(10.0GiB/6194msec)

> > Johan Hovold (10):
> > dt-bindings: PCI: qcom: Allow 'required-opps'
> > dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
> > arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
> > arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
> > arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
> > arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe
>
> Is this patch based on the version I shared with you long back? If so, I'd
> expect to have some credit. If you came up with your own version, then ignore
> this comment.

No, this patch has beeen created and evaluated from scratch based on the
downstream direwolf dts, which has these five 'msi-map' properties.

I debated whether I should base it on your version instead, but in the
end it would have a new commit message and only these properties from
the downstream dtsi would remain (you also removed existing properties
IIRC). So while it's certainly inspired by your work, this has been done
from scratch, including the testing.

If you prefer I can make this clear in the commit message, but adding a
Co-developed-by didn't seem quite right either as I did this work
without your involvement. But perhaps that would be better?

Johan

2024-02-14 11:57:29

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH 01/10] dt-bindings: PCI: qcom: Allow 'required-opps'

On 12/02/2024 17:50, Johan Hovold wrote:
> Some Qualcomm SoCs require a minimum performance level for the power
> domain so add 'required-opps' to the binding.
>
> Signed-off-by: Johan Hovold <[email protected]>
> ---
> Documentation/devicetree/bindings/pci/qcom,pcie.yaml | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
> index a93ab3b54066..5eda4e72f681 100644
> --- a/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
> +++ b/Documentation/devicetree/bindings/pci/qcom,pcie.yaml
> @@ -123,6 +123,9 @@ properties:
> description: GPIO controlled connection to PERST# signal
> maxItems: 1
>
> + required-opps:
> + maxItems: 1
> +

Just letting know that this might conflict with:
https://lore.kernel.org/all/[email protected]/

(I would be happy if my series got applied, so people can base their
worn easily on it)

Best regards,
Krzysztof


2024-02-14 12:01:40

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On 12/02/2024 17:50, Johan Hovold wrote:
> Whether the 'msi-map-mask' property is needed or not depends on how the
> MSI interrupts are mapped and it should therefore not be described as
> required.

I could imagine that on all devices the interrupts are mapped in a way
you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
without msi-map-mask?

Best regards,
Krzysztof


2024-02-14 12:12:07

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH 01/10] dt-bindings: PCI: qcom: Allow 'required-opps'

On 12/02/2024 17:50, Johan Hovold wrote:
> Some Qualcomm SoCs require a minimum performance level for the power
> domain so add 'required-opps' to the binding.
>
> Signed-off-by: Johan Hovold <[email protected]>


Reviewed-by: Krzysztof Kozlowski <[email protected]>

Best regards,
Krzysztof


2024-02-14 12:55:26

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Wed, Feb 14, 2024 at 01:01:20PM +0100, Krzysztof Kozlowski wrote:
> On 12/02/2024 17:50, Johan Hovold wrote:
> > Whether the 'msi-map-mask' property is needed or not depends on how the
> > MSI interrupts are mapped and it should therefore not be described as
> > required.
>
> I could imagine that on all devices the interrupts are mapped in a way
> you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
> without msi-map-mask?

I don't have access to the documentation so I'll leave that for you guys
to determine. I do note that the downstream DT does not use it and that
we have a new devicetree in linux-next which also does not have it:

https://lore.kernel.org/r/20240125-topic-sm8650-upstream-pcie-its-v1-1-cb506deeb43e@linaro.org

But at least the latter looks like an omission that should be fixed.

Johan

2024-02-14 13:39:18

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On 14/02/2024 13:54, Johan Hovold wrote:
> On Wed, Feb 14, 2024 at 01:01:20PM +0100, Krzysztof Kozlowski wrote:
>> On 12/02/2024 17:50, Johan Hovold wrote:
>>> Whether the 'msi-map-mask' property is needed or not depends on how the
>>> MSI interrupts are mapped and it should therefore not be described as
>>> required.
>>
>> I could imagine that on all devices the interrupts are mapped in a way
>> you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
>> without msi-map-mask?
>
> I don't have access to the documentation so I'll leave that for you guys
> to determine. I do note that the downstream DT does not use it and that
> we have a new devicetree in linux-next which also does not have it:
>
> https://lore.kernel.org/r/20240125-topic-sm8650-upstream-pcie-its-v1-1-cb506deeb43e@linaro.org
>
> But at least the latter looks like an omission that should be fixed.

Hm, either that or the mask for sm8450 was not needed as well. Anyway,
thanks for explanation, appreciated!


Acked-by: Krzysztof Kozlowski <[email protected]>

Best regards,
Krzysztof


2024-02-15 20:50:53

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH 06/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

On 12.02.2024 17:50, Johan Hovold wrote:
> The DWC PCIe controller can be used with its internal MSI controller or
> with an external one such as the GICv3 Interrupt Translation Service
> (ITS).
>
> Add the msi-map properties needed to use the GIC ITS. This will also
> make Linux switch to the ITS implementation, which allows for assigning
> affinity to individual MSIs.
>
> Note that using the GIC ITS on SC8280XP will cause Advanced Error
> Reporting (AER) interrupts to be received on errors unlike when using
> the internal MSI controller. This will specifically lead to
> notifications about Correctable Errors being logged for the Wi-Fi
> controller the Lenovo ThinkPad X13s when the AER driver is enabled.
>
> Signed-off-by: Johan Hovold <[email protected]>
> ---

The numbers match

Reviewed-by: Konrad Dybcio <[email protected]>

Konrad


2024-02-15 21:01:27

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [PATCH 04/10] arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed

On 12.02.2024 17:50, Johan Hovold wrote:
> Limit the WiFi PCIe link speed to Gen2 speed (500 GB/s), which is the

MB/s

> speed that Windows uses.
>
> Signed-off-by: Johan Hovold <[email protected]>
> ---

Hm.. I'dve assumed it ships with a WLAN card that supports moving
more bandwidth.. Is it always at gen2?

Konrad

2024-02-16 07:12:34

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 04/10] arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed

On Thu, Feb 15, 2024 at 09:47:01PM +0100, Konrad Dybcio wrote:
> On 12.02.2024 17:50, Johan Hovold wrote:
> > Limit the WiFi PCIe link speed to Gen2 speed (500 GB/s), which is the
>
> MB/s

Indeed, thanks for spotting that.

> > speed that Windows uses.
> >
> > Signed-off-by: Johan Hovold <[email protected]>
> > ---
>
> Hm.. I'dve assumed it ships with a WLAN card that supports moving
> more bandwidth.. Is it always at gen2?

I don't know how the Windows driver works, but the UEFI firmware has
brought the link up at Gen2 and that's also what Windows reported when I
checked. But I was not actually using the wifi when I did so.

But yes, it seems we may be limiting the theoretical maximum data rate
for the wifi this way.

As this appears to fix wifi startup issue reported by one user, and
allows us to enable ITS and AER reporting, perhaps that's acceptable
until the Linux driver can manage to scale the link speed (or we figure
out a more elaborate way of restarting the link at boot).

The PCIe link errors could also indicate that the wifi can not be run
any faster than this on these machines even if my guess is something is
wrong with ASPM implementation. Hopefully Qualcomm will be able to shed
some light on that.

Johan

2024-02-16 12:04:11

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 04/10] arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed

On Fri, Feb 16, 2024 at 08:12:46AM +0100, Johan Hovold wrote:
> On Thu, Feb 15, 2024 at 09:47:01PM +0100, Konrad Dybcio wrote:
> > On 12.02.2024 17:50, Johan Hovold wrote:
> > > Limit the WiFi PCIe link speed to Gen2 speed (500 GB/s), which is the
> >
> > MB/s
>
> Indeed, thanks for spotting that.
>
> > > speed that Windows uses.

> > Hm.. I'dve assumed it ships with a WLAN card that supports moving
> > more bandwidth.. Is it always at gen2?

> But yes, it seems we may be limiting the theoretical maximum data rate
> for the wifi this way.

It looks like the peak wifi speed for these chips is 3.6 Gbps, and it
may be lower for the X13s (and in practice). So 500 MB/s should be more
than enough.

https://www.qualcomm.com/products/technology/wi-fi/fastconnect/fastconnect-6900

Johan

2024-02-16 14:54:38

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH 00/10] arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe

On Wed, Feb 14, 2024 at 12:09:15PM +0100, Johan Hovold wrote:
> On Wed, Feb 14, 2024 at 12:05:54PM +0530, Manivannan Sadhasivam wrote:
> > On Mon, Feb 12, 2024 at 05:50:33PM +0100, Johan Hovold wrote:
> > > This series addresses a few problems with the sc8280xp PCIe
> > > implementation.
> > >
> > > The DWC PCIe controller can either use its internal MSI controller or an
> > > external one such as the GICv3 ITS. Enabling the latter allows for
> > > assigning affinity to individual interrupts, but results in a large
> > > amount of Correctable Errors being logged on both the Lenovo ThinkPad
> > > X13s and the sc8280xp-crd reference design.
> > >
> > > It turns out that these errors are always generated,
> >
> > How did you confirm this?
>
> You can see that error flags being set in the controller and endpoint,
> for example, using lspci -vv:
>
> CESta: RxErr- BadTLP+ BadDLLP- Rollover- Timeout- AdvNonFatalErr-
>

Okay.

> > > but for some yet to
> > > be determined reason, the AER interrupts are never received when using
> > > the internal MSI controller, which makes the link errors harder to
> > > notice.
> >
> > If you manually inject the errors using "aer-inject", are you not seeing the AER
> > errors with internal MSI controller as well?
>
> I haven't tried that, I'm just reporting that that piece of
> functionality is currently broken and that that partly explains why the
> ASPM problems went unnoticed.
>

I just gave it a shot and I could see the AER interrupts raised for correctable
errors with internal MSI controller.

Now I'm puzzled why this is not getting triggered by default. I'll check with
the hardware team if they have any clue.

> > > On the X13s, there is a large number of errors generated when bringing
> > > up the link on boot. This is related to the fact that UEFI firmware has
> > > already enabled the Wi-Fi PCIe link at Gen2 speed and restarting the
> > > link at Gen3 generates a massive amount of errors until the Wi-Fi
> > > firmware is restarted.
> > >
> > > A recent commit enabling ASPM on certain Qualcomm platforms introduced
> > > further errors when using the Wi-Fi on the X13s as well as when
> > > accessing the NVMe on the CRD. The exact reason for this has not yet
> > > been identified, but disabling ASPM L0s makes the errors go away. This
> > > could suggest that either the current ASPM implementation is incomplete
> > > or that L0s is not supported with these devices.
> >
> > What are those "further errors" you are seeing with ASPM enabled? Are those
> > errors appear with GIC ITS or with internal MSI controller as well?
>
> Further errors as in further correctable errors that are not related to
> the errors seen when resetting the X13s Wi-Fi link at boot.
>
> These show up, for example, when accessing the NVMe on the CRD or when
> using the Wi-Fi on the X13s. These errors go away when L0s is disabled.
>
> And yes, you see them with both the external and internal MSI controller
> (in the latter case, by looking at the error flags mentioned above).
>

Hmm.

> > > Note that the X13s and CRD use the same Wi-Fi controller, but the errors
> > > are only generated on the X13s. The NVMe controller on my X13s does not
> > > support L0s so there are no issues there, unlike on the CRD which uses a
> > > different controller. The modem on the CRD does not generate any errors,
> > > but both the NVMe and modem keeps bouncing in and out of L0s/L1 also
> > > when not used, which could indicate that there are bigger problems with
> > > the ASPM implementation. I don't have a modem on my X13s so I have not
> > > been able to test whether L0s causes an trouble there.
> > >
> > > Enabling AER error reporting on sc8280xp could similarly also reveal
> > > existing problems with the related sa8295p and sa8540p platforms as they
> > > share the base dtsi.
> > >
> > > The last four patches, marked as RFC, adds support for disabling ASPM
> > > L0s in the devicetree and disables it selectively for the X13s Wi-Fi
> > > and CRD NVMe. If it turns out that the Qualcomm PCIe implementation is
> > > incomplete, we may need to disable ASPM (L0s) completely in the driver
> > > instead.
> >
> > If the device is not supporting L0s, then it as to be disabled in the device,
> > not in the PCIe controller, no?
>
> Well, we don't know yet where the problem lies, just that enabling L0s
> results in a large number of correctable errors.
>
> Until yesterday I had not seen any such errors for the Wi-Fi on the CRD,
> which uses essentially the same ath11k controller, so there was no clear
> indication that this was necessarily a problem with the devices either.
>

I'll confirm the L0s compatibility with the hardware team.

> > > Note that disabling ASPM L0s for the X13s Wi-Fi does not seem to have a
> > > significant impact on the power consumption
> > >
> > > The DT bindings and PCI patch are expected to go through the PCI tree,
> > > while Bjorn A takes the devicetree updates through the Qualcomm tree.
> >
> > Since I took a stab at enabling the GIC ITS previously, I noticed that the NVMe
> > performance got a slight dip. And that was one of the reasons (apart from AER
> > errors) that I never submitted the patch.
> >
> > Could you share the NVMe benchmark (fio) with this series?
>
> Did you have any particular benchmark in mind?
>
> I have run multiple fio benchmarks and while the results vary with the
> parameters, the impact of switching to ITS (so that not all PCIe
> interrupts are processed on CPU0) is generally favourable.
>
> A raw sequential read shows no change in throughput on either the X13s
> or the CRD even if for some reason this test performs really badly on
> the X13s (i.e. regardless of which MSI controller is used):
>
> crd-rseq-read: IOPS=11.1k, BW=2764MiB/s (2898MB/s)(81.0GiB/30003msec)
> X13s-rseq-read: IOPS=508, BW=127MiB/s (134MB/s)(3841MiB/30169msec)
>
> Another benchmark I've used against a mounted ext4 partition shows a 2x
> improvement in throughput with ITS for sequential and random reads and
> writes on the X13s:
>
> seq-read: IOPS=88.4k, BW=345MiB/s (362MB/s)(10.0GiB/29657msec)
> rand-read: IOPS=21.2k, BW=82.8MiB/s (86.8MB/s)(4967MiB/60001msec)
> seq-write: IOPS=162k, BW=632MiB/s (662MB/s)(10.0GiB/16213msec)
> rand-write: IOPS=142k, BW=555MiB/s (582MB/s)(10.0GiB/18439msec)
>
> while the results are essentially unchanged with a larger block size and
> queue depth (32/2m instead of 4/4k):
>
> seq-read: IOPS=1095, BW=2191MiB/s (2298MB/s)(10.0GiB/4673msec)
> rand-read: IOPS=1020, BW=2041MiB/s (2140MB/s)(10.0GiB/5017msec)
> seq-write: IOPS=918, BW=1837MiB/s (1926MB/s)(10.0GiB/5574msec)
> rand-write: IOPS=826, BW=1653MiB/s (1734MB/s)(10.0GiB/6194msec)
>

Ok, this looks promising. Long back when I tried the benchmark (seq & rand r/w),
performance dropped slightly with GIC ITS. But looks like things have changed.

> > > Johan Hovold (10):
> > > dt-bindings: PCI: qcom: Allow 'required-opps'
> > > dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
> > > arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP
> > > arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
> > > arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
> > > arm64: dts: qcom: sc8280xp: enable GICv3 ITS for PCIe
> >
> > Is this patch based on the version I shared with you long back? If so, I'd
> > expect to have some credit. If you came up with your own version, then ignore
> > this comment.
>
> No, this patch has beeen created and evaluated from scratch based on the
> downstream direwolf dts, which has these five 'msi-map' properties.
>
> I debated whether I should base it on your version instead, but in the
> end it would have a new commit message and only these properties from
> the downstream dtsi would remain (you also removed existing properties
> IIRC). So while it's certainly inspired by your work, this has been done
> from scratch, including the testing.
>
> If you prefer I can make this clear in the commit message, but adding a
> Co-developed-by didn't seem quite right either as I did this work
> without your involvement. But perhaps that would be better?
>

Nah. As I said, if you have created the patch without basing on my version,
then no credit is required. I just wanted to know since I shared the patch
earlier.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-02-16 16:54:48

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Wed, Feb 14, 2024 at 02:38:57PM +0100, Krzysztof Kozlowski wrote:
> On 14/02/2024 13:54, Johan Hovold wrote:
> > On Wed, Feb 14, 2024 at 01:01:20PM +0100, Krzysztof Kozlowski wrote:
> >> On 12/02/2024 17:50, Johan Hovold wrote:
> >>> Whether the 'msi-map-mask' property is needed or not depends on how the
> >>> MSI interrupts are mapped and it should therefore not be described as
> >>> required.
> >>
> >> I could imagine that on all devices the interrupts are mapped in a way
> >> you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
> >> without msi-map-mask?
> >
> > I don't have access to the documentation so I'll leave that for you guys
> > to determine. I do note that the downstream DT does not use it and that
> > we have a new devicetree in linux-next which also does not have it:
> >
> > https://lore.kernel.org/r/20240125-topic-sm8650-upstream-pcie-its-v1-1-cb506deeb43e@linaro.org
> >
> > But at least the latter looks like an omission that should be fixed.
>
> Hm, either that or the mask for sm8450 was not needed as well. Anyway,
> thanks for explanation, appreciated!
>

msi-map-mask is definitely needed as it would allow all the devices under the
same bus to reuse the MSI identifier. Currently, excluding this property will
not cause any issue since there is a single device under each bus. But we cannot
assume that is going to be the case on all boards.

I will submit a patch to fix SM8650.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-02-20 07:41:39

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Fri, Feb 16, 2024 at 10:24:06PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Feb 14, 2024 at 02:38:57PM +0100, Krzysztof Kozlowski wrote:
> > On 14/02/2024 13:54, Johan Hovold wrote:
> > > On Wed, Feb 14, 2024 at 01:01:20PM +0100, Krzysztof Kozlowski wrote:
> > >> On 12/02/2024 17:50, Johan Hovold wrote:
> > >>> Whether the 'msi-map-mask' property is needed or not depends on how the
> > >>> MSI interrupts are mapped and it should therefore not be described as
> > >>> required.
> > >>
> > >> I could imagine that on all devices the interrupts are mapped in a way
> > >> you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
> > >> without msi-map-mask?
> > >
> > > I don't have access to the documentation so I'll leave that for you guys
> > > to determine. I do note that the downstream DT does not use it and that
> > > we have a new devicetree in linux-next which also does not have it:
> > >
> > > https://lore.kernel.org/r/20240125-topic-sm8650-upstream-pcie-its-v1-1-cb506deeb43e@linaro.org
> > >
> > > But at least the latter looks like an omission that should be fixed.
> >
> > Hm, either that or the mask for sm8450 was not needed as well. Anyway,
> > thanks for explanation, appreciated!
>
> msi-map-mask is definitely needed as it would allow all the devices under the
> same bus to reuse the MSI identifier. Currently, excluding this property will
> not cause any issue since there is a single device under each bus. But we cannot
> assume that is going to be the case on all boards.

Are you saying that there is never a use case for an identity mapping?
Just on Qualcomm hardware or in general?

It looks like we have a fairly large number of mainline devicetrees that
do use an identity mapping here (i.e. do not specify 'msi-map-mask') and
the binding document also has an explicit example of this.

Documentation/devicetree/bindings/pci/pci-msi.txt

> I will submit a patch to fix SM8650.

Johan

2024-02-20 08:43:05

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Tue, Feb 20, 2024 at 08:41:25AM +0100, Johan Hovold wrote:
> On Fri, Feb 16, 2024 at 10:24:06PM +0530, Manivannan Sadhasivam wrote:

> > msi-map-mask is definitely needed as it would allow all the devices under the
> > same bus to reuse the MSI identifier. Currently, excluding this property will
> > not cause any issue since there is a single device under each bus. But we cannot
> > assume that is going to be the case on all boards.
>
> Are you saying that there is never a use case for an identity mapping?
> Just on Qualcomm hardware or in general?
>
> It looks like we have a fairly large number of mainline devicetrees that
> do use an identity mapping here (i.e. do not specify 'msi-map-mask') and
> the binding document also has an explicit example of this.
>
> Documentation/devicetree/bindings/pci/pci-msi.txt

The above should have said "linear mapping" as the msi-base is not
always identical to the rid-base, but you get the point.

Johan

2024-02-21 05:26:29

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Tue, Feb 20, 2024 at 08:41:25AM +0100, Johan Hovold wrote:
> On Fri, Feb 16, 2024 at 10:24:06PM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Feb 14, 2024 at 02:38:57PM +0100, Krzysztof Kozlowski wrote:
> > > On 14/02/2024 13:54, Johan Hovold wrote:
> > > > On Wed, Feb 14, 2024 at 01:01:20PM +0100, Krzysztof Kozlowski wrote:
> > > >> On 12/02/2024 17:50, Johan Hovold wrote:
> > > >>> Whether the 'msi-map-mask' property is needed or not depends on how the
> > > >>> MSI interrupts are mapped and it should therefore not be described as
> > > >>> required.
> > > >>
> > > >> I could imagine that on all devices the interrupts are mapped in a way
> > > >> you need to provide msi-map-mask. IOW, can there be a Qualcomm platform
> > > >> without msi-map-mask?
> > > >
> > > > I don't have access to the documentation so I'll leave that for you guys
> > > > to determine. I do note that the downstream DT does not use it and that
> > > > we have a new devicetree in linux-next which also does not have it:
> > > >
> > > > https://lore.kernel.org/r/20240125-topic-sm8650-upstream-pcie-its-v1-1-cb506deeb43e@linaro.org
> > > >
> > > > But at least the latter looks like an omission that should be fixed.
> > >
> > > Hm, either that or the mask for sm8450 was not needed as well. Anyway,
> > > thanks for explanation, appreciated!
> >
> > msi-map-mask is definitely needed as it would allow all the devices under the
> > same bus to reuse the MSI identifier. Currently, excluding this property will
> > not cause any issue since there is a single device under each bus. But we cannot
> > assume that is going to be the case on all boards.
>
> Are you saying that there is never a use case for an identity mapping?
> Just on Qualcomm hardware or in general?
>
> It looks like we have a fairly large number of mainline devicetrees that
> do use an identity mapping here (i.e. do not specify 'msi-map-mask') and
> the binding document also has an explicit example of this.
>
> Documentation/devicetree/bindings/pci/pci-msi.txt

I don't know how other platforms supposed to work without this property for more
than one devices. Maybe they were not tested enough?

But for sure, Qcom SoCs require either per device MSI identifier or
msi-map-mask.

- Mani

--
மணிவண்ணன் சதாசிவம்

2024-02-21 10:32:45

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Wed, Feb 21, 2024 at 10:56:07AM +0530, Manivannan Sadhasivam wrote:
> On Tue, Feb 20, 2024 at 08:41:25AM +0100, Johan Hovold wrote:
> > On Fri, Feb 16, 2024 at 10:24:06PM +0530, Manivannan Sadhasivam wrote:

> > > msi-map-mask is definitely needed as it would allow all the devices under the
> > > same bus to reuse the MSI identifier. Currently, excluding this property will
> > > not cause any issue since there is a single device under each bus. But we cannot
> > > assume that is going to be the case on all boards.
> >
> > Are you saying that there is never a use case for an identity mapping?
> > Just on Qualcomm hardware or in general?
> >
> > It looks like we have a fairly large number of mainline devicetrees that
> > do use an identity mapping here (i.e. do not specify 'msi-map-mask') and
> > the binding document also has an explicit example of this.
> >
> > Documentation/devicetree/bindings/pci/pci-msi.txt
>
> I don't know how other platforms supposed to work without this property for more
> than one devices. Maybe they were not tested enough?

Seems a bit far fetched since it's also an example in the binding.

In fact, only the two Qualcomm platforms that you added 'msi-map-mask'
for use it.

> But for sure, Qcom SoCs require either per device MSI identifier or
> msi-map-mask.

But isn't the mapping set up by the boot firmware and can differ between
platforms?

The mapping on sc8280xp looks quite different from sm8450/sm8650:

msi-map = <0x0 &gic_its 0x5981 0x1>,
<0x100 &gic_its 0x5980 0x1>;
msi-map-mask = <0xff00>;

Here it's obvious that the mask is needed, whereas for sc8280xp:

msi-map = <0x0 &its 0xa0000 0x10000>;

it's not obvious what the mask should be. In fact, it looks like
Qualcomm intended a linear mapping here as the length is 0x10000 and
they left out the mask.

And after digging through the X13s ACPI tables, this is indeed how the
hardware is configured, which means that we should not use a
'msi-map-mask' property for sc8280xp and that this patch is correct.

Johan

2024-02-22 03:53:34

by Manivannan Sadhasivam

[permalink] [raw]
Subject: Re: [PATCH 02/10] dt-bindings: PCI: qcom: Do not require 'msi-map-mask'

On Wed, Feb 21, 2024 at 11:30:58AM +0100, Johan Hovold wrote:
> On Wed, Feb 21, 2024 at 10:56:07AM +0530, Manivannan Sadhasivam wrote:
> > On Tue, Feb 20, 2024 at 08:41:25AM +0100, Johan Hovold wrote:
> > > On Fri, Feb 16, 2024 at 10:24:06PM +0530, Manivannan Sadhasivam wrote:
>
> > > > msi-map-mask is definitely needed as it would allow all the devices under the
> > > > same bus to reuse the MSI identifier. Currently, excluding this property will
> > > > not cause any issue since there is a single device under each bus. But we cannot
> > > > assume that is going to be the case on all boards.
> > >
> > > Are you saying that there is never a use case for an identity mapping?
> > > Just on Qualcomm hardware or in general?
> > >
> > > It looks like we have a fairly large number of mainline devicetrees that
> > > do use an identity mapping here (i.e. do not specify 'msi-map-mask') and
> > > the binding document also has an explicit example of this.
> > >
> > > Documentation/devicetree/bindings/pci/pci-msi.txt
> >
> > I don't know how other platforms supposed to work without this property for more
> > than one devices. Maybe they were not tested enough?
>
> Seems a bit far fetched since it's also an example in the binding.
>
> In fact, only the two Qualcomm platforms that you added 'msi-map-mask'
> for use it.
>
> > But for sure, Qcom SoCs require either per device MSI identifier or
> > msi-map-mask.
>
> But isn't the mapping set up by the boot firmware and can differ between
> platforms?
>
> The mapping on sc8280xp looks quite different from sm8450/sm8650:
>
> msi-map = <0x0 &gic_its 0x5981 0x1>,
> <0x100 &gic_its 0x5980 0x1>;
> msi-map-mask = <0xff00>;
>
> Here it's obvious that the mask is needed, whereas for sc8280xp:
>
> msi-map = <0x0 &its 0xa0000 0x10000>;
>
> it's not obvious what the mask should be. In fact, it looks like
> Qualcomm intended a linear mapping here as the length is 0x10000 and
> they left out the mask.
>
> And after digging through the X13s ACPI tables, this is indeed how the
> hardware is configured, which means that we should not use a
> 'msi-map-mask' property for sc8280xp and that this patch is correct.
>

Right. Confirmed the same with the hw team. On Qcom SoCs ITS mapping is
relatively similar to SMMU stream IDs. So on SM8450 and other mobile targets
making use of SMMUv2, only 128 SIDs are available, hence only 128 MSI
identifiers. But on SC8280XP and other similar ones, SMMUv3 is used, so there
are 65536 SIDs available and also the MSI identifiers. So yes, this SoC indeed
supports linear mapping of MSI identifiers and so the mask is not required.

Thanks!

- Mani

--
மணிவண்ணன் சதாசிவம்