2021-06-07 11:31:02

by Punit Agrawal

[permalink] [raw]
Subject: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi,

This is the third iteration to improve handling of the 64-bit
attribute on non-prefetchable host bridge ranges. Previous version can
be found at [0][1].

This version is a small update over the previous version - changelog
below. If there is no futher feedback on the patches, please consider
merging them.

Thanks,
Punit

Changes:
v3:
* Improved commit log for clarity (Patch 1)
* Added Tested-by tags

v2:
* Check ranges PCI / bus addresses rather than CPU addresses
* (new) Restrict 32-bit size warnings on ranges that don't have the 64-bit attribute set
* Refactor the 32-bit size warning to the range parsing loop. This
change also prints the warnings right after the window mappings are
logged.


[0] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[1] https://lore.kernel.org/linux-pci/[email protected]/

Punit Agrawal (4):
PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
PCI: of: Relax the condition for warning about non-prefetchable memory
aperture size
PCI: of: Refactor the check for non-prefetchable 32-bit window
arm64: dts: rockchip: Update PCI host bridge window to 32-bit address
memory

arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
drivers/pci/of.c | 17 ++++++++++++-----
2 files changed, 13 insertions(+), 6 deletions(-)

--
2.30.2


2021-06-07 11:32:21

by Punit Agrawal

[permalink] [raw]
Subject: [PATCH v3 3/4] PCI: of: Refactor the check for non-prefetchable 32-bit window

Recently, an override was added for non-prefetchable host bridge
windows below 4GB that have the 64-bit address attribute set. As many
of the conditions for the check overlap with the check for
non-prefetchable window size, refactor the code to unify the ranges
validation into devm_of_pci_get_host_bridge_resources().

As an added benefit, the warning message is now printed right after
the range mapping giving the user a better indication of where the
issue is.

Signed-off-by: Punit Agrawal <[email protected]>
Tested-by: Alexandru Elisei <[email protected]>
Cc: Vidya Sagar <[email protected]>
---
drivers/pci/of.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 38fe2589beb0..0580c654127e 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -355,11 +355,15 @@ static int devm_of_pci_get_host_bridge_resources(struct device *dev,
*io_base = range.cpu_addr;
} else if (resource_type(res) == IORESOURCE_MEM) {
if (!(res->flags & IORESOURCE_PREFETCH)) {
- if (res->flags & IORESOURCE_MEM_64)
+ if (res->flags & IORESOURCE_MEM_64) {
if (!upper_32_bits(range.pci_addr + range.size - 1)) {
dev_warn(dev, "Clearing 64-bit flag for non-prefetchable memory below 4GB\n");
res->flags &= ~IORESOURCE_MEM_64;
}
+ } else {
+ if (upper_32_bits(resource_size(res)))
+ dev_warn(dev, "Memory resource size exceeds max for 32 bits\n");
+ }
}
}

@@ -579,12 +583,6 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
break;
case IORESOURCE_MEM:
res_valid |= !(res->flags & IORESOURCE_PREFETCH);
-
- if (!(res->flags & IORESOURCE_PREFETCH))
- if (!(res->flags & IORESOURCE_MEM_64) &&
- upper_32_bits(resource_size(res)))
- dev_warn(dev, "Memory resource size exceeds max for 32 bits\n");
-
break;
}
}
--
2.30.2

2021-06-07 11:32:40

by Punit Agrawal

[permalink] [raw]
Subject: [PATCH v3 1/4] PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB

Some host bridges advertise non-prefetchable memory windows that are
entirely located below 4GB but are marked as 64-bit address memory.

Since commit 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
flags for 64-bit memory addresses"), the OF PCI range parser takes a
stricter view and treats 64-bit address ranges as advertised while
before such ranges were treated as 32-bit.

A PCI root port modelled as a PCI-to-PCI bridge cannot forward 64-bit
non-prefetchable memory ranges. As a result, the change in behaviour
due to the commit causes failure to allocate 32-bit BAR from a 64-bit
non-prefetchable window.

In order to not break platforms where non-prefetchable memory ranges
lie entirely below 4GB, clear the 64-bit flag.

Suggested-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Punit Agrawal <[email protected]>
Tested-by: Alexandru Elisei <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Rob Herring <[email protected]>
---
drivers/pci/of.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 85dcb7097da4..1e45186a5715 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -353,6 +353,14 @@ static int devm_of_pci_get_host_bridge_resources(struct device *dev,
dev_warn(dev, "More than one I/O resource converted for %pOF. CPU base address for old range lost!\n",
dev_node);
*io_base = range.cpu_addr;
+ } else if (resource_type(res) == IORESOURCE_MEM) {
+ if (!(res->flags & IORESOURCE_PREFETCH)) {
+ if (res->flags & IORESOURCE_MEM_64)
+ if (!upper_32_bits(range.pci_addr + range.size - 1)) {
+ dev_warn(dev, "Clearing 64-bit flag for non-prefetchable memory below 4GB\n");
+ res->flags &= ~IORESOURCE_MEM_64;
+ }
+ }
}

pci_add_resource_offset(resources, res, res->start - range.pci_addr);
--
2.30.2

2021-06-07 11:33:14

by Punit Agrawal

[permalink] [raw]
Subject: [PATCH v3 2/4] PCI: of: Relax the condition for warning about non-prefetchable memory aperture size

Commit fede8526cc48 ("PCI: of: Warn if non-prefetchable memory
aperture size is > 32-bit") introduced a warning for non-prefetchable
resources that need more than 32bits to resolve. It turns out that the
check is too restrictive and should be applicable to only resources
that are limited to host bridge windows that don't have the ability to
map 64-bit address space.

Relax the condition to only warn when the resource size requires >
32-bits and doesn't allow mapping to 64-bit addresses.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Punit Agrawal <[email protected]>
Tested-by: Alexandru Elisei <[email protected]>
Cc: Vidya Sagar <[email protected]>
---
drivers/pci/of.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 1e45186a5715..38fe2589beb0 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -581,7 +581,8 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
res_valid |= !(res->flags & IORESOURCE_PREFETCH);

if (!(res->flags & IORESOURCE_PREFETCH))
- if (upper_32_bits(resource_size(res)))
+ if (!(res->flags & IORESOURCE_MEM_64) &&
+ upper_32_bits(resource_size(res)))
dev_warn(dev, "Memory resource size exceeds max for 32 bits\n");

break;
--
2.30.2

2021-06-07 11:33:54

by Punit Agrawal

[permalink] [raw]
Subject: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

The PCIe host bridge on RK3399 advertises a single 64-bit memory
address range even though it lies entirely below 4GB.

Previously the OF PCI range parser treated 64-bit ranges more
leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
the code takes a stricter view and treats the ranges as advertised in
the device tree (i.e, as 64-bit).

The change in behaviour causes failure when allocating bus addresses
to devices connected behind a PCI-to-PCI bridge that require
non-prefetchable memory ranges. The allocation failure was observed
for certain Samsung NVMe drives connected to RockPro64 boards.

Update the host bridge window attributes to treat it as 32-bit address
memory. This fixes the allocation failure observed since commit
9d57e61bf723.

Reported-by: Alexandru Elisei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Robin Murphy <[email protected]>
Signed-off-by: Punit Agrawal <[email protected]>
Tested-by: Alexandru Elisei <[email protected]>
Cc: Heiko Stuebner <[email protected]>
Cc: Rob Herring <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 634a91af8e83..4b854eb21f72 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -227,7 +227,7 @@ pcie0: pcie@f8000000 {
<&pcie_phy 2>, <&pcie_phy 3>;
phy-names = "pcie-phy-0", "pcie-phy-1",
"pcie-phy-2", "pcie-phy-3";
- ranges = <0x83000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
+ ranges = <0x82000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
<0x81000000 0x0 0xfbe00000 0x0 0xfbe00000 0x0 0x100000>;
resets = <&cru SRST_PCIE_CORE>, <&cru SRST_PCIE_MGMT>,
<&cru SRST_PCIE_MGMT_STICKY>, <&cru SRST_PCIE_PIPE>,
--
2.30.2

2021-06-08 19:24:15

by Vidya Sagar

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] PCI: of: Relax the condition for warning about non-prefetchable memory aperture size



On 6/7/2021 4:58 PM, Punit Agrawal wrote:
> External email: Use caution opening links or attachments
>
>
> Commit fede8526cc48 ("PCI: of: Warn if non-prefetchable memory
> aperture size is > 32-bit") introduced a warning for non-prefetchable
> resources that need more than 32bits to resolve. It turns out that the
> check is too restrictive and should be applicable to only resources
> that are limited to host bridge windows that don't have the ability to
> map 64-bit address space.
I think the host bridge windows having the ability to map 64-bit address
space is different from restricting the non-prefetchable memory aperture
size to 32-bit.
Whether the host bridge uses internal translations or not to map the
non-prefetchable resources to 64-bit space, the size needs to be
programmed in the host bridge's 'Memory Limit Register (Offset 22h)'
which can represent sizes only fit into 32-bits.
Host bridges having the ability to map 64-bit address spaces gives
flexibility to utilize the vast 64-bit space for the (restrictive)
non-prefetchable memory (i.e. mapping non-prefetchable BARs of endpoints
to the 64-bit space in CPU's view) and get it translated internally and
put a 32-bit address on the PCIe bus finally.

- Vidya Sagar
>
> Relax the condition to only warn when the resource size requires >
> 32-bits and doesn't allow mapping to 64-bit addresses.
>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Punit Agrawal <[email protected]>
> Tested-by: Alexandru Elisei <[email protected]>
> Cc: Vidya Sagar <[email protected]>
> ---
> drivers/pci/of.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 1e45186a5715..38fe2589beb0 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -581,7 +581,8 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
> res_valid |= !(res->flags & IORESOURCE_PREFETCH);
>
> if (!(res->flags & IORESOURCE_PREFETCH))
> - if (upper_32_bits(resource_size(res)))
> + if (!(res->flags & IORESOURCE_MEM_64) &&
> + upper_32_bits(resource_size(res)))
> dev_warn(dev, "Memory resource size exceeds max for 32 bits\n");
>
> break;
> --
> 2.30.2
>

2021-06-09 16:36:33

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi Punit,

On Mon, 07 Jun 2021 12:28:52 +0100,
Punit Agrawal <[email protected]> wrote:
>
> Hi,
>
> This is the third iteration to improve handling of the 64-bit
> attribute on non-prefetchable host bridge ranges. Previous version can
> be found at [0][1].
>
> This version is a small update over the previous version - changelog
> below. If there is no futher feedback on the patches, please consider
> merging them.

Thanks for this. This brings my test machine back to life:

Acked-by: Marc Zyngier <[email protected]>
Tested-by: Marc Zyngier <[email protected]>

Any chance this could hit upstream shortly? RK3399 is a fairly popular
SoC, and a number of us are running test boxes based on it.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-06-10 00:25:20

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB

[+cc Leonardo]

On Mon, Jun 07, 2021 at 08:28:53PM +0900, Punit Agrawal wrote:
> Some host bridges advertise non-prefetchable memory windows that are
> entirely located below 4GB but are marked as 64-bit address memory.
>
> Since commit 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
> flags for 64-bit memory addresses"), the OF PCI range parser takes a
> stricter view and treats 64-bit address ranges as advertised while
> before such ranges were treated as 32-bit.
>
> A PCI root port modelled as a PCI-to-PCI bridge cannot forward 64-bit
> non-prefetchable memory ranges. As a result, the change in behaviour
> due to the commit causes failure to allocate 32-bit BAR from a 64-bit
> non-prefetchable window.
>
> In order to not break platforms where non-prefetchable memory ranges
> lie entirely below 4GB, clear the 64-bit flag.

I don't think we should care about the address width DT supplies for a
host bridge window. Prior to 9d57e61bf723, I don't think we *did*
care because of_bus_pci_get_flags() threw away that information.

My proposal for a commit log, including information about the problem
report and a "Fixes:" tag:

Alexandru and Qu reported this resource allocation failure on
ROCKPro64 v2 and ROCK Pi 4B, both based on the RK3399:

pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff 64bit]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0: BAR 14: no space for [mem size 0x00100000]
pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]

"BAR 14" is the PCI bridge's 32-bit non-prefetchable window, and our
PCI allocation code isn't smart enough to allocate it in a host
bridge window marked as 64-bit, even though this should work fine.

A DT host bridge description includes the windows from the CPU
address space to the PCI bus space. On a few architectures
(microblaze, powerpc, sparc), the DT may also describe PCI devices
themselves, including their BARs.

Before 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
flags for 64-bit memory addresses"), of_bus_pci_get_flags() ignored
the fact that some DT addresses described 64-bit windows and BARs.
That was a problem because the virtio virtual NIC has a 32-bit BAR
and a 64-bit BAR, and the driver couldn't distinguish them.

9d57e61bf723 set IORESOURCE_MEM_64 for those 64-bit DT ranges, which
fixed the virtio driver. But it also set IORESOURCE_MEM_64 for host
bridge windows, which exposed the fact that the PCI allocator isn't
smart enough to put 32-bit resources in those 64-bit windows.

Clear IORESOURCE_MEM_64 from host bridge windows since we don't need
that information.

Fixes: 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
Reported-at: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Alexandru Elisei <[email protected]>
Reported-by: Qu Wenruo <[email protected]>

> Suggested-by: Ard Biesheuvel <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Punit Agrawal <[email protected]>
> Tested-by: Alexandru Elisei <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Rob Herring <[email protected]>
> ---
> drivers/pci/of.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 85dcb7097da4..1e45186a5715 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -353,6 +353,14 @@ static int devm_of_pci_get_host_bridge_resources(struct device *dev,
> dev_warn(dev, "More than one I/O resource converted for %pOF. CPU base address for old range lost!\n",
> dev_node);
> *io_base = range.cpu_addr;
> + } else if (resource_type(res) == IORESOURCE_MEM) {
> + if (!(res->flags & IORESOURCE_PREFETCH)) {
> + if (res->flags & IORESOURCE_MEM_64)
> + if (!upper_32_bits(range.pci_addr + range.size - 1)) {
> + dev_warn(dev, "Clearing 64-bit flag for non-prefetchable memory below 4GB\n");
> + res->flags &= ~IORESOURCE_MEM_64;
> + }
> + }

Why do we need to check IORESOURCE_PREFETCH, IORESOURCE_MEM_64, and
upper_32_bits()? If I understand this correctly, prior to
9d57e61bf723, IORESOURCE_MEM_64 was *never* set here. Isn't something
like this sufficient?

} else if (resource_type(res) == IORESOURCE_MEM) {
res->flags &= ~IORESOURCE_MEM_64;
}

I'm not sure we need a warning either. We didn't warn before
9d57e61bf723, and there's nothing the user needs to do anyway.

> }
>
> pci_add_resource_offset(resources, res, res->start - range.pci_addr);
> --
> 2.30.2
>

2021-06-10 04:08:03

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] PCI: of: Relax the condition for warning about non-prefetchable memory aperture size

On Wed, Jun 09, 2021 at 12:36:08AM +0530, Vidya Sagar wrote:
> On 6/7/2021 4:58 PM, Punit Agrawal wrote:
> >
> > Commit fede8526cc48 ("PCI: of: Warn if non-prefetchable memory
> > aperture size is > 32-bit") introduced a warning for non-prefetchable
> > resources that need more than 32bits to resolve. It turns out that the
> > check is too restrictive and should be applicable to only resources
> > that are limited to host bridge windows that don't have the ability to
> > map 64-bit address space.
>
> I think the host bridge windows having the ability to map 64-bit address
> space is different from restricting the non-prefetchable memory aperture
> size to 32-bit.

> Whether the host bridge uses internal translations or not to map the
> non-prefetchable resources to 64-bit space, the size needs to be programmed
> in the host bridge's 'Memory Limit Register (Offset 22h)' which can
> represent sizes only fit into 32-bits.

> Host bridges having the ability to map 64-bit address spaces gives
> flexibility to utilize the vast 64-bit space for the (restrictive)
> non-prefetchable memory (i.e. mapping non-prefetchable BARs of endpoints to
> the 64-bit space in CPU's view) and get it translated internally and put a
> 32-bit address on the PCIe bus finally.

The vastness of the 64-bit space in the CPU view only helps with
non-prefetchable memory if you have multiple host bridges with
different CPU-to-PCI translations. Each root bus can only carve up
4GB of PCI memory space for use by its non-prefetchable memory
windows.

Of course, if we're willing to give up the performance, there's
nothing to prevent us from using non-prefetchable space for
*prefetchable* resources, as in my example below.

I think the fede8526cc48 commit log is incorrect, or at least
incomplete:

As per PCIe spec r5.0, sec 7.5.1.3.8 only 32-bit BAR registers are defined
for non-prefetchable memory and hence a warning should be reported when
the size of them go beyond 32-bits.

7.5.1.3.8 is talking about non-prefetchable PCI-to-PCI bridge windows,
not BARs. AFAIK, 64-bit BARs may be non-prefetchable. The warning is
in pci_parse_request_of_pci_ranges(), which isn't looking at
PCI-to-PCI bridge windows; it's looking at PCI host bridge windows.
It's legal for a host bridge to have only non-prefetchable windows,
and prefetchable PCI BARs can be placed in them.

For example, we could have the following:

pci_bus 0000:00: root bus resource [mem 0x80000000-0x1_ffffffff] (6GB)
pci 0000:00:00.0: PCI bridge to [bus 01-7f]
pci 0000:00:00.0: bridge window [mem 0x80000000-0xbfffffff] (1GB)
pci 0000:00:00.0: bridge window [mem 0x1_00000000-0x1_7fffffff 64bit pref] (2GB)
pci 0000:00:00.1: PCI bridge to [bus 80-ff]
pci 0000:00:00.1: bridge window [mem 0xc0000000-0xffffffff] (1GB)
pci 0000:00:00.1: bridge window [mem 0x1_80000000-0x1_ffffffff 64bit pref] (2GB)

Here the host bridge window is 6GB and is not prefetchable. The
PCI-to-PCI bridge non-prefetchable windows are 1GB each and the bases
and limits fit in 32 bits. The prefetchable windows are 2GB each, and
we're allowed but not required to put these in prefetchable host
bridge windows.

So I'm not convinced this warning is valid to begin with. It may be
that this host bridge configuration isn't optimal, and we might want
an informational message, but I think it's *legal*.

> > Relax the condition to only warn when the resource size requires >
> > 32-bits and doesn't allow mapping to 64-bit addresses.
> >
> > Link: https://lore.kernel.org/r/[email protected]
> > Signed-off-by: Punit Agrawal <[email protected]>
> > Tested-by: Alexandru Elisei <[email protected]>
> > Cc: Vidya Sagar <[email protected]>
> > ---
> > drivers/pci/of.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > index 1e45186a5715..38fe2589beb0 100644
> > --- a/drivers/pci/of.c
> > +++ b/drivers/pci/of.c
> > @@ -581,7 +581,8 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
> > res_valid |= !(res->flags & IORESOURCE_PREFETCH);
> >
> > if (!(res->flags & IORESOURCE_PREFETCH))
> > - if (upper_32_bits(resource_size(res)))
> > + if (!(res->flags & IORESOURCE_MEM_64) &&
> > + upper_32_bits(resource_size(res)))
> > dev_warn(dev, "Memory resource size exceeds max for 32 bits\n");
> >
> > break;
> > --
> > 2.30.2
> >

2021-06-10 09:09:49

by Anand Moon

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi Punit,

On Mon, 7 Jun 2021 at 17:02, Punit Agrawal <[email protected]> wrote:
>
> Hi,
>
> This is the third iteration to improve handling of the 64-bit
> attribute on non-prefetchable host bridge ranges. Previous version can
> be found at [0][1].
>
> This version is a small update over the previous version - changelog
> below. If there is no futher feedback on the patches, please consider
> merging them.
>
> Thanks,
> Punit
>

Thanks for your work, I have tested this on my RK3399 (Odroid N1) SBC.
It partially works on this board. So I looked into
RK3399TRM_V1.3_Part2 for more details.

17.6.7.1.45 Root Complex BAR Configuration Register
It looks like these config changes are related to the issue you are
trying to solve.

On this basis here are the code changes I made in the driver for testing.

[alarm@alarm linux-rockchip-5.y-devel]$ git diff
drivers/pci/controller/pcie-rockchip.h
drivers/pci/controller/pcie-rockchip.c
diff --git a/drivers/pci/controller/pcie-rockchip.c
b/drivers/pci/controller/pcie-rockchip.c
index 990a00e08bc5..18e315de9f04 100644
--- a/drivers/pci/controller/pcie-rockchip.c
+++ b/drivers/pci/controller/pcie-rockchip.c
@@ -405,9 +405,20 @@ void rockchip_pcie_cfg_configuration_accesses(
struct rockchip_pcie *rockchip, u32 type)
{
u32 ob_desc_0;
+ u32 status;

/* Configuration Accesses for region 0 */
- rockchip_pcie_write(rockchip, 0x0, PCIE_RC_BAR_CONF);
+ status = rockchip_pcie_read(rockchip, PCIE_RC_BAR_CONF);
+ status |= FIELD_PREP(PCIE_RC_BAR_RCBAR0A, 0x14) |
+ FIELD_PREP(PCIE_RC_BAR_RCBAR0C, 0x6) |
+ FIELD_PREP(PCIE_RC_BAR_RCBAR1A, 0x14) |
+ FIELD_PREP(PCIE_RC_BAR_RCBAR1C, 0x4) |
+ PCIE_RC_BAR_RCBARPME |
+ PCIE_RC_BAR_RCBARPMS |
+ PCIE_RC_BAR_RCBARPIE |
+ PCIE_RC_BAR_RCBARPIS |
+ PCIE_RC_BAR_RCBCE;
+ rockchip_pcie_write(rockchip, status, PCIE_RC_BAR_CONF);

rockchip_pcie_write(rockchip,
(RC_REGION_0_ADDR_TRANS_L + RC_REGION_0_PASS_BITS),
diff --git a/drivers/pci/controller/pcie-rockchip.h
b/drivers/pci/controller/pcie-rockchip.h
index 1650a5087450..a68ca18de4ec 100644
--- a/drivers/pci/controller/pcie-rockchip.h
+++ b/drivers/pci/controller/pcie-rockchip.h
@@ -114,6 +114,16 @@
#define PCIE_CORE_INT_MASK (PCIE_CORE_CTRL_MGMT_BASE + 0x210)
#define PCIE_CORE_PHY_FUNC_CFG (PCIE_CORE_CTRL_MGMT_BASE + 0x2c0)
#define PCIE_RC_BAR_CONF (PCIE_CORE_CTRL_MGMT_BASE + 0x300)
+#define PCIE_RC_BAR_RCBAR0A GENMASK(5, 0)
+#define PCIE_RC_BAR_RCBAR0C GENMASK(8, 6)
+#define PCIE_RC_BAR_RCBAR1A GENMASK(13, 9)
+#define PCIE_RC_BAR_RCBAR1C GENMASK(16, 14)
+#define PCIE_RC_BAR_RCBARPME BIT(17)
+#define PCIE_RC_BAR_RCBARPMS BIT(18)
+#define PCIE_RC_BAR_RCBARPIE BIT(19)
+#define PCIE_RC_BAR_RCBARPIS BIT(20)
+#define PCIE_RC_BAR_RCBCE BIT(31)
+
#define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_DISABLED 0x0
#define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_IO_32BITS 0x1
#define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_MEM_32BITS 0x4

But I am getting the following error at my end.
--------------------
[ OK ] Found device /dev/ttyS2.
[ 7.089794] rockchip-pcie f8000000.pcie: host bridge /pcie@f8000000 ranges:
[ 7.092945] rk808-rtc rk808-rtc: registered as rtc0
[ 7.103462] rockchip-pcie f8000000.pcie: MEM
0x00fa000000..0x00fbdfffff -> 0x00fa000000
[ 7.113384] rockchip-pcie f8000000.pcie: IO
0x00fbe00000..0x00fbefffff -> 0x00fbe00000
[ 7.123294] rockchip-pcie f8000000.pcie: no vpcie12v regulator found
[ 7.123524] rk808-rtc rk808-rtc: setting system clock to
2021-06-10T07:36:35 UTC (1623310595)
[ 7.130589] rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
[ 7.187409] mc: Linux media interface: v0.10
[ 7.210178] rockchip-pcie f8000000.pcie: PCI host bridge to bus 0000:00
[ 7.217643] pci_bus 0000:00: root bus resource [bus 00-1f]
[ 7.224126] pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff]
[ 7.234975] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
(bus address [0xfbe00000-0xfbefffff])
[ 7.246527] pci 0000:00:00.0: [1d87:0100] type 01 class 0x060400
[ 7.264087] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x003fffff 64bit]
[ 7.271770] pci 0000:00:00.0: supports D1
[ 7.276265] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[ 7.284724] pci 0000:00:00.0: bridge configuration invalid ([bus
00-00]), reconfiguring
[ 7.293874] pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
[ 7.300656] pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
[ 7.308542] pci 0000:01:00.0: reg 0x10: [io size 0x0008]
[ 7.309498] videodev: Linux video capture interface: v2.00
[ 7.314606] pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
[ 7.328593] pci 0000:01:00.0: reg 0x14: [io size 0x0004]
[ 7.334661] pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
[ 7.342545] pci 0000:01:00.0: reg 0x18: [io size 0x0008]
[ 7.342578] pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
[ 7.342581] pci 0000:01:00.0: reg 0x1c: [io size 0x0004]
[ 7.342609] pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
[ 7.342612] pci 0000:01:00.0: reg 0x20: [io size 0x0010]
[ 7.342640] pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
[ 7.383491] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[ 7.383533] pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 512)
[ 7.385978] panfrost ff9a0000.gpu: clock rate = 500000000
[ 7.387283] rk_gmac-dwmac fe300000.ethernet: IRQ eth_wake_irq not found
[ 7.387295] rk_gmac-dwmac fe300000.ethernet: IRQ eth_lpi not found
[ 7.387377] rk_gmac-dwmac fe300000.ethernet: PTP uses main clock
[ 7.387614] rk_gmac-dwmac fe300000.ethernet: clock input or output? (input).
[ 7.387623] rk_gmac-dwmac fe300000.ethernet: TX delay(0x28).
[ 7.387631] rk_gmac-dwmac fe300000.ethernet: RX delay(0x11).
[ 7.387643] rk_gmac-dwmac fe300000.ethernet: integrated PHY? (no).
[ 7.387682] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
[ 7.387688] rk_gmac-dwmac fe300000.ethernet: clock input from PHY
[ 7.391688] panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init
[panfrost]] Failed to register cooling device
[ 7.392704] rk_gmac-dwmac fe300000.ethernet: init for RGMII
[ 7.392908] rk_gmac-dwmac fe300000.ethernet: User ID: 0x10, Synopsys ID: 0x35
[ 7.392921] rk_gmac-dwmac fe300000.ethernet: DWMAC1000
[ 7.392926] rk_gmac-dwmac fe300000.ethernet: DMA HW capability
register supported
[ 7.392931] rk_gmac-dwmac fe300000.ethernet: RX Checksum Offload
Engine supported
[ 7.392935] rk_gmac-dwmac fe300000.ethernet: COE Type 2
[ 7.392939] rk_gmac-dwmac fe300000.ethernet: TX Checksum insertion supported
[ 7.392943] rk_gmac-dwmac fe300000.ethernet: Wake-Up On Lan supported
[ 7.393016] rk_gmac-dwmac fe300000.ethernet: Normal descriptors
[ 7.393022] rk_gmac-dwmac fe300000.ethernet: Ring mode enabled
[ 7.393026] rk_gmac-dwmac fe300000.ethernet: Enable RX Mitigation
via HW Watchdog Timer
[ 7.393070] rk_gmac-dwmac fe300000.ethernet: Unbalanced pm_runtime_enable!
[ 7.393421] libphy: stmmac: probed
[ 7.399404] rockchip-pcie f8000000.pcie: local interrupt received
[ 7.405075] panfrost ff9a0000.gpu: mali-t860 id 0x860 major 0x2
minor 0x0 status 0x0
[ 7.412338] rockchip-pcie f8000000.pcie: an error was observed in
the flow control advertisements from the other side
[ 7.412377] rockchip-pcie f8000000.pcie: phy link changes
[ 7.417620] pci_bus 0000:01: fixups for bus
[ 7.417624] pci_bus 0000:01: bus scan returning with max=01
[ 7.417628] pci_bus 0000:01: busn_res: [bus 01-1f] end is updated to 01
[ 7.417637] pci_bus 0000:00: bus scan returning with max=01
[ 7.417650] pci 0000:00:00.0: BAR 0: assigned [mem
0xfa000000-0xfa3fffff 64bit]
[ 7.417662] pci 0000:00:00.0: BAR 14: assigned [mem 0xfa400000-0xfa4fffff]
[ 7.417666] pci 0000:00:00.0: BAR 13: assigned [io 0x1000-0x1fff]
[ 7.417671] pci 0000:01:00.0: BAR 6: assigned [mem
0xfa400000-0xfa40ffff pref]
[ 7.417675] pci 0000:01:00.0: BAR 5: assigned [mem 0xfa410000-0xfa4101ff]
[ 7.417689] pci 0000:01:00.0: BAR 4: assigned [io 0x1000-0x100f]
[ 7.417702] pci 0000:01:00.0: BAR 0: assigned [io 0x1010-0x1017]
[ 7.417715] pci 0000:01:00.0: BAR 2: assigned [io 0x1018-0x101f]
[ 7.417729] pci 0000:01:00.0: BAR 1: assigned [io 0x1020-0x1023]
[ 7.417742] pci 0000:01:00.0: BAR 3: assigned [io 0x1024-0x1027]
[ 7.417756] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 7.417760] pci 0000:00:00.0: bridge window [io 0x1000-0x1fff]
[ 7.417766] pci 0000:00:00.0: bridge window [mem 0xfa400000-0xfa4fffff]
[ 7.417872] pcieport 0000:00:00.0: assign IRQ: got 93
[ 7.417884] pcieport 0000:00:00.0: enabling device (0000 -> 0003)
[ 7.417895] pcieport 0000:00:00.0: enabling bus mastering
[ 7.418081] SError Interrupt on CPU5, code 0xbf000002 -- SError
[ 7.418084] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
[ 7.418085] Hardware name: Hardkernel ODROID-N1 (DT)
[ 7.418086] Workqueue: events_unbound deferred_probe_work_func
[ 7.418089] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 7.418091] pc : __pci_enable_msix_range+0x498/0x6b0
[ 7.418092] lr : __pci_enable_msix_range+0x44c/0x6b0
[ 7.418093] sp : ffff80001249b710
[ 7.418094] x29: ffff80001249b710 x28: 0000000000000000 x27: ffff0000059ba000
[ 7.418098] x26: ffff0000059ba0c0 x25: 0000000000000001 x24: 0000000000000000
[ 7.418102] x23: 000000000000000c x22: 0000000000000000 x21: ffff0000059ba2e8
[ 7.418106] x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000002
[ 7.418110] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 7.418114] x14: 0000000000000001 x13: 00000000000e98d4 x12: 0000000000000040
[ 7.418117] x11: ffff00000081afe8 x10: ffff00000081afea x9 : ffff800011d129e0
[ 7.418121] x8 : 0000000000000000 x7 : ffff800011fca91c x6 : 0000000000000000
[ 7.418125] x5 : 000000000000c011 x4 : ffff8000178000b0 x3 : 0000000000000002
[ 7.418128] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000004efe680
[ 7.418132] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 7.418134] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
[ 7.418136] Hardware name: Hardkernel ODROID-N1 (DT)
[ 7.418137] Workqueue: events_unbound deferred_probe_work_func
[ 7.418139] Call trace:
[ 7.418140] dump_backtrace+0x0/0x1a0
[ 7.418141] show_stack+0x1c/0x70
[ 7.418142] dump_stack+0xd0/0x12c
[ 7.418143] panic+0x16c/0x334
[ 7.418145] add_taint+0x0/0xb0
[ 7.418146] arm64_serror_panic+0x7c/0x88
[ 7.418147] do_serror+0x68/0x70
[ 7.418148] el1_error+0x80/0xf8
[ 7.418149] __pci_enable_msix_range+0x498/0x6b0
[ 7.418150] pci_alloc_irq_vectors_affinity+0xbc/0x13c
[ 7.418152] pcie_port_device_register+0x11c/0x40c
[ 7.418153] pcie_portdrv_probe+0x48/0x100
[ 7.418154] local_pci_probe+0x44/0xb0
[ 7.418155] pci_device_probe+0x114/0x1b0
[ 7.418156] really_probe+0xe4/0x504
[ 7.418157] driver_probe_device+0x64/0xcc
[ 7.418158] __device_attach_driver+0xb8/0x114
[ 7.418159] bus_for_each_drv+0x78/0xd0
[ 7.418160] __device_attach+0xd8/0x180
[ 7.418161] device_attach+0x18/0x24
[ 7.418162] pci_bus_add_device+0x54/0xc0
[ 7.418163] pci_bus_add_devices+0x40/0x90
[ 7.418165] pci_host_probe+0x44/0xc4
[ 7.418166] rockchip_pcie_probe+0x2fc/0x4d4 [pcie_rockchip_host]
[ 7.418167] platform_probe+0x6c/0xdc
[ 7.418168] really_probe+0xe4/0x504
[ 7.418169] driver_probe_device+0x64/0xcc
[ 7.418170] __device_attach_driver+0xb8/0x114
[ 7.418171] bus_for_each_drv+0x78/0xd0
[ 7.418172] __device_attach+0xd8/0x180
[ 7.418173] device_initial_probe+0x18/0x2c
[ 7.418174] bus_probe_device+0xa0/0xac
[ 7.418175] deferred_probe_work_func+0x88/0xc0
[ 7.418176] process_one_work+0x1cc/0x350
[ 7.418177] worker_thread+0x13c/0x470
[ 7.418178] kthread+0x158/0x160
[ 7.418179] ret_from_fork+0x10/0x30
[ 8.063493] SMP: stopping secondary CPUs
[ 8.063495] Kernel Offset: disabled
[ 8.063496] CPU features: 0x10001031,20000846
[ 8.063498] Memory Limit: none

Thanks
-Anand

> _______________________________________________
> Linux-rockchip mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-rockchip

2021-06-10 13:38:52

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB

Hi Bjorn,

Bjorn Helgaas <[email protected]> writes:

> [+cc Leonardo]
>
> On Mon, Jun 07, 2021 at 08:28:53PM +0900, Punit Agrawal wrote:
>> Some host bridges advertise non-prefetchable memory windows that are
>> entirely located below 4GB but are marked as 64-bit address memory.
>>
>> Since commit 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
>> flags for 64-bit memory addresses"), the OF PCI range parser takes a
>> stricter view and treats 64-bit address ranges as advertised while
>> before such ranges were treated as 32-bit.
>>
>> A PCI root port modelled as a PCI-to-PCI bridge cannot forward 64-bit
>> non-prefetchable memory ranges. As a result, the change in behaviour
>> due to the commit causes failure to allocate 32-bit BAR from a 64-bit
>> non-prefetchable window.
>>
>> In order to not break platforms where non-prefetchable memory ranges
>> lie entirely below 4GB, clear the 64-bit flag.
>
> I don't think we should care about the address width DT supplies for a
> host bridge window. Prior to 9d57e61bf723, I don't think we *did*
> care because of_bus_pci_get_flags() threw away that information.
>
> My proposal for a commit log, including information about the problem
> report and a "Fixes:" tag:
>
> Alexandru and Qu reported this resource allocation failure on
> ROCKPro64 v2 and ROCK Pi 4B, both based on the RK3399:
>
> pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff 64bit]
> pci 0000:00:00.0: PCI bridge to [bus 01]
> pci 0000:00:00.0: BAR 14: no space for [mem size 0x00100000]
> pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
>
> "BAR 14" is the PCI bridge's 32-bit non-prefetchable window, and our
> PCI allocation code isn't smart enough to allocate it in a host
> bridge window marked as 64-bit, even though this should work fine.
>
> A DT host bridge description includes the windows from the CPU
> address space to the PCI bus space. On a few architectures
> (microblaze, powerpc, sparc), the DT may also describe PCI devices
> themselves, including their BARs.
>
> Before 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
> flags for 64-bit memory addresses"), of_bus_pci_get_flags() ignored
> the fact that some DT addresses described 64-bit windows and BARs.
> That was a problem because the virtio virtual NIC has a 32-bit BAR
> and a 64-bit BAR, and the driver couldn't distinguish them.

Many thanks for demystifying the motivation for 9d57e61bf723. Not being
familiar with the usage of DT to describe PCI devices I was missing this
context.

> 9d57e61bf723 set IORESOURCE_MEM_64 for those 64-bit DT ranges, which
> fixed the virtio driver. But it also set IORESOURCE_MEM_64 for host
> bridge windows, which exposed the fact that the PCI allocator isn't
> smart enough to put 32-bit resources in those 64-bit windows.
>
> Clear IORESOURCE_MEM_64 from host bridge windows since we don't need
> that information.
>
> Fixes: 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
> Reported-at: https://lore.kernel.org/lkml/[email protected]/
> Reported-by: Alexandru Elisei <[email protected]>
> Reported-by: Qu Wenruo <[email protected]>

Thank you for commit log - without all the pieces I was struggling to
clearly describe the details. And I missed the appropriate tags as
well. I've updated the commit log based on your suggestion.

>> Suggested-by: Ard Biesheuvel <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Tested-by: Alexandru Elisei <[email protected]>
>> Cc: Bjorn Helgaas <[email protected]>
>> Cc: Rob Herring <[email protected]>
>> ---
>> drivers/pci/of.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
>> index 85dcb7097da4..1e45186a5715 100644
>> --- a/drivers/pci/of.c
>> +++ b/drivers/pci/of.c
>> @@ -353,6 +353,14 @@ static int devm_of_pci_get_host_bridge_resources(struct device *dev,
>> dev_warn(dev, "More than one I/O resource converted for %pOF. CPU base address for old range lost!\n",
>> dev_node);
>> *io_base = range.cpu_addr;
>> + } else if (resource_type(res) == IORESOURCE_MEM) {
>> + if (!(res->flags & IORESOURCE_PREFETCH)) {
>> + if (res->flags & IORESOURCE_MEM_64)
>> + if (!upper_32_bits(range.pci_addr + range.size - 1)) {
>> + dev_warn(dev, "Clearing 64-bit flag for non-prefetchable memory below 4GB\n");
>> + res->flags &= ~IORESOURCE_MEM_64;
>> + }
>> + }
>
> Why do we need to check IORESOURCE_PREFETCH, IORESOURCE_MEM_64, and
> upper_32_bits()? If I understand this correctly, prior to
> 9d57e61bf723, IORESOURCE_MEM_64 was *never* set here. Isn't something
> like this sufficient?
>
> } else if (resource_type(res) == IORESOURCE_MEM) {
> res->flags &= ~IORESOURCE_MEM_64;
> }

Based on the discussion in the original thread[0], I was working with
the assumption that we don't want to lose the IORESOURCE_MEM_64 flag
other than in the problem scenario, i.e., non-prefetchable memory below
4GB.

You suggestion is simpler and also solves the issue by effectively
reverting the impact of 9d57e61bf723 on BAR allocation. If there are no
objections I will take this approach for the next update.

To aid future readers I will also add the following comment -

/*
* PCI allocation cannot correctly allocate 32-bit non-prefetchable BAR
* in host bridge windows marked as 64-bit.
*/

> I'm not sure we need a warning either. We didn't warn before
> 9d57e61bf723, and there's nothing the user needs to do anyway.

The warning was a nudge (probably too subtle) to get the user to upgrade
their DT to drop the 64-bit marker on the host bridge window. With your
suggestion, the DT change is not needed anymore - though it may still be
worth dropping the 64-bit marker.

Thanks,
Punit

[0] https://lore.kernel.org/linux-pci/CAMj1kXGF_JmuZ+rRA55-NrTQ6f20fhcHc=62AGJ71eHNU8AoBQ@mail.gmail.com/

2021-06-10 14:14:12

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] PCI: of: Relax the condition for warning about non-prefetchable memory aperture size

Hi Bjorn,

Bjorn Helgaas <[email protected]> writes:

> On Wed, Jun 09, 2021 at 12:36:08AM +0530, Vidya Sagar wrote:
>> On 6/7/2021 4:58 PM, Punit Agrawal wrote:
>> >
>> > Commit fede8526cc48 ("PCI: of: Warn if non-prefetchable memory
>> > aperture size is > 32-bit") introduced a warning for non-prefetchable
>> > resources that need more than 32bits to resolve. It turns out that the
>> > check is too restrictive and should be applicable to only resources
>> > that are limited to host bridge windows that don't have the ability to
>> > map 64-bit address space.
>>
>> I think the host bridge windows having the ability to map 64-bit address
>> space is different from restricting the non-prefetchable memory aperture
>> size to 32-bit.
>
>> Whether the host bridge uses internal translations or not to map the
>> non-prefetchable resources to 64-bit space, the size needs to be programmed
>> in the host bridge's 'Memory Limit Register (Offset 22h)' which can
>> represent sizes only fit into 32-bits.
>
>> Host bridges having the ability to map 64-bit address spaces gives
>> flexibility to utilize the vast 64-bit space for the (restrictive)
>> non-prefetchable memory (i.e. mapping non-prefetchable BARs of endpoints to
>> the 64-bit space in CPU's view) and get it translated internally and put a
>> 32-bit address on the PCIe bus finally.
>
> The vastness of the 64-bit space in the CPU view only helps with
> non-prefetchable memory if you have multiple host bridges with
> different CPU-to-PCI translations. Each root bus can only carve up
> 4GB of PCI memory space for use by its non-prefetchable memory
> windows.
>
> Of course, if we're willing to give up the performance, there's
> nothing to prevent us from using non-prefetchable space for
> *prefetchable* resources, as in my example below.
>
> I think the fede8526cc48 commit log is incorrect, or at least
> incomplete:
>
> As per PCIe spec r5.0, sec 7.5.1.3.8 only 32-bit BAR registers are defined
> for non-prefetchable memory and hence a warning should be reported when
> the size of them go beyond 32-bits.
>
> 7.5.1.3.8 is talking about non-prefetchable PCI-to-PCI bridge windows,
> not BARs. AFAIK, 64-bit BARs may be non-prefetchable. The warning is
> in pci_parse_request_of_pci_ranges(), which isn't looking at
> PCI-to-PCI bridge windows; it's looking at PCI host bridge windows.
> It's legal for a host bridge to have only non-prefetchable windows,
> and prefetchable PCI BARs can be placed in them.
>
> For example, we could have the following:
>
> pci_bus 0000:00: root bus resource [mem 0x80000000-0x1_ffffffff] (6GB)
> pci 0000:00:00.0: PCI bridge to [bus 01-7f]
> pci 0000:00:00.0: bridge window [mem 0x80000000-0xbfffffff] (1GB)
> pci 0000:00:00.0: bridge window [mem 0x1_00000000-0x1_7fffffff 64bit pref] (2GB)
> pci 0000:00:00.1: PCI bridge to [bus 80-ff]
> pci 0000:00:00.1: bridge window [mem 0xc0000000-0xffffffff] (1GB)
> pci 0000:00:00.1: bridge window [mem 0x1_80000000-0x1_ffffffff 64bit pref] (2GB)
>
> Here the host bridge window is 6GB and is not prefetchable. The
> PCI-to-PCI bridge non-prefetchable windows are 1GB each and the bases
> and limits fit in 32 bits. The prefetchable windows are 2GB each, and
> we're allowed but not required to put these in prefetchable host
> bridge windows.
>
> So I'm not convinced this warning is valid to begin with. It may be
> that this host bridge configuration isn't optimal, and we might want
> an informational message, but I think it's *legal*.

By "optimal" - are you referring to the use of non-prefetchable space
for prefetchable window?

Also, if the warning doesn't apply to PCI host bridge windows, should I
drop it in the next update? Or leave out this and the next patch to be
dealt with separately.

Thanks,
Punit

[...]

2021-06-10 14:21:09

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi Marc,

Marc Zyngier <[email protected]> writes:

> Hi Punit,
>
> On Mon, 07 Jun 2021 12:28:52 +0100,
> Punit Agrawal <[email protected]> wrote:
>>
>> Hi,
>>
>> This is the third iteration to improve handling of the 64-bit
>> attribute on non-prefetchable host bridge ranges. Previous version can
>> be found at [0][1].
>>
>> This version is a small update over the previous version - changelog
>> below. If there is no futher feedback on the patches, please consider
>> merging them.
>
> Thanks for this. This brings my test machine back to life:
>
> Acked-by: Marc Zyngier <[email protected]>
> Tested-by: Marc Zyngier <[email protected]>

Thanks for taking the patches for a spin. Based on the comments on Patch
1, there'll at least be another update. I'll copy you when I send that
out.

Thanks,
Punit

[...]

2021-06-10 14:30:20

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi Anand,

Anand Moon <[email protected]> writes:

> Hi Punit,
>
> On Mon, 7 Jun 2021 at 17:02, Punit Agrawal <[email protected]> wrote:
>>
>> Hi,
>>
>> This is the third iteration to improve handling of the 64-bit
>> attribute on non-prefetchable host bridge ranges. Previous version can
>> be found at [0][1].
>>
>> This version is a small update over the previous version - changelog
>> below. If there is no futher feedback on the patches, please consider
>> merging them.
>>
>> Thanks,
>> Punit
>>
>
> Thanks for your work, I have tested this on my RK3399 (Odroid N1) SBC.
> It partially works on this board. So I looked into
> RK3399TRM_V1.3_Part2 for more details.
>
> 17.6.7.1.45 Root Complex BAR Configuration Register
> It looks like these config changes are related to the issue you are
> trying to solve.
>
> On this basis here are the code changes I made in the driver for testing.
>
> [alarm@alarm linux-rockchip-5.y-devel]$ git diff
> drivers/pci/controller/pcie-rockchip.h
> drivers/pci/controller/pcie-rockchip.c
> diff --git a/drivers/pci/controller/pcie-rockchip.c
> b/drivers/pci/controller/pcie-rockchip.c
> index 990a00e08bc5..18e315de9f04 100644
> --- a/drivers/pci/controller/pcie-rockchip.c
> +++ b/drivers/pci/controller/pcie-rockchip.c
> @@ -405,9 +405,20 @@ void rockchip_pcie_cfg_configuration_accesses(
> struct rockchip_pcie *rockchip, u32 type)
> {
> u32 ob_desc_0;
> + u32 status;
>
> /* Configuration Accesses for region 0 */
> - rockchip_pcie_write(rockchip, 0x0, PCIE_RC_BAR_CONF);
> + status = rockchip_pcie_read(rockchip, PCIE_RC_BAR_CONF);
> + status |= FIELD_PREP(PCIE_RC_BAR_RCBAR0A, 0x14) |
> + FIELD_PREP(PCIE_RC_BAR_RCBAR0C, 0x6) |
> + FIELD_PREP(PCIE_RC_BAR_RCBAR1A, 0x14) |
> + FIELD_PREP(PCIE_RC_BAR_RCBAR1C, 0x4) |
> + PCIE_RC_BAR_RCBARPME |
> + PCIE_RC_BAR_RCBARPMS |
> + PCIE_RC_BAR_RCBARPIE |
> + PCIE_RC_BAR_RCBARPIS |
> + PCIE_RC_BAR_RCBCE;
> + rockchip_pcie_write(rockchip, status, PCIE_RC_BAR_CONF);
>
> rockchip_pcie_write(rockchip,
> (RC_REGION_0_ADDR_TRANS_L + RC_REGION_0_PASS_BITS),
> diff --git a/drivers/pci/controller/pcie-rockchip.h
> b/drivers/pci/controller/pcie-rockchip.h
> index 1650a5087450..a68ca18de4ec 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -114,6 +114,16 @@
> #define PCIE_CORE_INT_MASK (PCIE_CORE_CTRL_MGMT_BASE + 0x210)
> #define PCIE_CORE_PHY_FUNC_CFG (PCIE_CORE_CTRL_MGMT_BASE + 0x2c0)
> #define PCIE_RC_BAR_CONF (PCIE_CORE_CTRL_MGMT_BASE + 0x300)
> +#define PCIE_RC_BAR_RCBAR0A GENMASK(5, 0)
> +#define PCIE_RC_BAR_RCBAR0C GENMASK(8, 6)
> +#define PCIE_RC_BAR_RCBAR1A GENMASK(13, 9)
> +#define PCIE_RC_BAR_RCBAR1C GENMASK(16, 14)
> +#define PCIE_RC_BAR_RCBARPME BIT(17)
> +#define PCIE_RC_BAR_RCBARPMS BIT(18)
> +#define PCIE_RC_BAR_RCBARPIE BIT(19)
> +#define PCIE_RC_BAR_RCBARPIS BIT(20)
> +#define PCIE_RC_BAR_RCBCE BIT(31)
> +
> #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_DISABLED 0x0
> #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_IO_32BITS 0x1
> #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_MEM_32BITS 0x4
>
> But I am getting the following error at my end.
> --------------------
> [ OK ] Found device /dev/ttyS2.
> [ 7.089794] rockchip-pcie f8000000.pcie: host bridge /pcie@f8000000 ranges:
> [ 7.092945] rk808-rtc rk808-rtc: registered as rtc0
> [ 7.103462] rockchip-pcie f8000000.pcie: MEM
> 0x00fa000000..0x00fbdfffff -> 0x00fa000000
> [ 7.113384] rockchip-pcie f8000000.pcie: IO
> 0x00fbe00000..0x00fbefffff -> 0x00fbe00000
> [ 7.123294] rockchip-pcie f8000000.pcie: no vpcie12v regulator found
> [ 7.123524] rk808-rtc rk808-rtc: setting system clock to
> 2021-06-10T07:36:35 UTC (1623310595)
> [ 7.130589] rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
> [ 7.187409] mc: Linux media interface: v0.10
> [ 7.210178] rockchip-pcie f8000000.pcie: PCI host bridge to bus 0000:00
> [ 7.217643] pci_bus 0000:00: root bus resource [bus 00-1f]
> [ 7.224126] pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff]
> [ 7.234975] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
> (bus address [0xfbe00000-0xfbefffff])
> [ 7.246527] pci 0000:00:00.0: [1d87:0100] type 01 class 0x060400
> [ 7.264087] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x003fffff 64bit]
> [ 7.271770] pci 0000:00:00.0: supports D1
> [ 7.276265] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> [ 7.284724] pci 0000:00:00.0: bridge configuration invalid ([bus
> 00-00]), reconfiguring
> [ 7.293874] pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
> [ 7.300656] pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
> [ 7.308542] pci 0000:01:00.0: reg 0x10: [io size 0x0008]
> [ 7.309498] videodev: Linux video capture interface: v2.00
> [ 7.314606] pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
> [ 7.328593] pci 0000:01:00.0: reg 0x14: [io size 0x0004]
> [ 7.334661] pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
> [ 7.342545] pci 0000:01:00.0: reg 0x18: [io size 0x0008]
> [ 7.342578] pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
> [ 7.342581] pci 0000:01:00.0: reg 0x1c: [io size 0x0004]
> [ 7.342609] pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
> [ 7.342612] pci 0000:01:00.0: reg 0x20: [io size 0x0010]
> [ 7.342640] pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
> [ 7.383491] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
> [ 7.383533] pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 512)
> [ 7.385978] panfrost ff9a0000.gpu: clock rate = 500000000
> [ 7.387283] rk_gmac-dwmac fe300000.ethernet: IRQ eth_wake_irq not found
> [ 7.387295] rk_gmac-dwmac fe300000.ethernet: IRQ eth_lpi not found
> [ 7.387377] rk_gmac-dwmac fe300000.ethernet: PTP uses main clock
> [ 7.387614] rk_gmac-dwmac fe300000.ethernet: clock input or output? (input).
> [ 7.387623] rk_gmac-dwmac fe300000.ethernet: TX delay(0x28).
> [ 7.387631] rk_gmac-dwmac fe300000.ethernet: RX delay(0x11).
> [ 7.387643] rk_gmac-dwmac fe300000.ethernet: integrated PHY? (no).
> [ 7.387682] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
> [ 7.387688] rk_gmac-dwmac fe300000.ethernet: clock input from PHY
> [ 7.391688] panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init
> [panfrost]] Failed to register cooling device
> [ 7.392704] rk_gmac-dwmac fe300000.ethernet: init for RGMII
> [ 7.392908] rk_gmac-dwmac fe300000.ethernet: User ID: 0x10, Synopsys ID: 0x35
> [ 7.392921] rk_gmac-dwmac fe300000.ethernet: DWMAC1000
> [ 7.392926] rk_gmac-dwmac fe300000.ethernet: DMA HW capability
> register supported
> [ 7.392931] rk_gmac-dwmac fe300000.ethernet: RX Checksum Offload
> Engine supported
> [ 7.392935] rk_gmac-dwmac fe300000.ethernet: COE Type 2
> [ 7.392939] rk_gmac-dwmac fe300000.ethernet: TX Checksum insertion supported
> [ 7.392943] rk_gmac-dwmac fe300000.ethernet: Wake-Up On Lan supported
> [ 7.393016] rk_gmac-dwmac fe300000.ethernet: Normal descriptors
> [ 7.393022] rk_gmac-dwmac fe300000.ethernet: Ring mode enabled
> [ 7.393026] rk_gmac-dwmac fe300000.ethernet: Enable RX Mitigation
> via HW Watchdog Timer
> [ 7.393070] rk_gmac-dwmac fe300000.ethernet: Unbalanced pm_runtime_enable!
> [ 7.393421] libphy: stmmac: probed
> [ 7.399404] rockchip-pcie f8000000.pcie: local interrupt received
> [ 7.405075] panfrost ff9a0000.gpu: mali-t860 id 0x860 major 0x2
> minor 0x0 status 0x0
> [ 7.412338] rockchip-pcie f8000000.pcie: an error was observed in
> the flow control advertisements from the other side
> [ 7.412377] rockchip-pcie f8000000.pcie: phy link changes
> [ 7.417620] pci_bus 0000:01: fixups for bus
> [ 7.417624] pci_bus 0000:01: bus scan returning with max=01
> [ 7.417628] pci_bus 0000:01: busn_res: [bus 01-1f] end is updated to 01
> [ 7.417637] pci_bus 0000:00: bus scan returning with max=01
> [ 7.417650] pci 0000:00:00.0: BAR 0: assigned [mem
> 0xfa000000-0xfa3fffff 64bit]
> [ 7.417662] pci 0000:00:00.0: BAR 14: assigned [mem 0xfa400000-0xfa4fffff]
> [ 7.417666] pci 0000:00:00.0: BAR 13: assigned [io 0x1000-0x1fff]
> [ 7.417671] pci 0000:01:00.0: BAR 6: assigned [mem
> 0xfa400000-0xfa40ffff pref]
> [ 7.417675] pci 0000:01:00.0: BAR 5: assigned [mem 0xfa410000-0xfa4101ff]
> [ 7.417689] pci 0000:01:00.0: BAR 4: assigned [io 0x1000-0x100f]
> [ 7.417702] pci 0000:01:00.0: BAR 0: assigned [io 0x1010-0x1017]
> [ 7.417715] pci 0000:01:00.0: BAR 2: assigned [io 0x1018-0x101f]
> [ 7.417729] pci 0000:01:00.0: BAR 1: assigned [io 0x1020-0x1023]
> [ 7.417742] pci 0000:01:00.0: BAR 3: assigned [io 0x1024-0x1027]
> [ 7.417756] pci 0000:00:00.0: PCI bridge to [bus 01]
> [ 7.417760] pci 0000:00:00.0: bridge window [io 0x1000-0x1fff]
> [ 7.417766] pci 0000:00:00.0: bridge window [mem 0xfa400000-0xfa4fffff]
> [ 7.417872] pcieport 0000:00:00.0: assign IRQ: got 93
> [ 7.417884] pcieport 0000:00:00.0: enabling device (0000 -> 0003)
> [ 7.417895] pcieport 0000:00:00.0: enabling bus mastering
> [ 7.418081] SError Interrupt on CPU5, code 0xbf000002 -- SError
> [ 7.418084] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
> 5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
> [ 7.418085] Hardware name: Hardkernel ODROID-N1 (DT)
> [ 7.418086] Workqueue: events_unbound deferred_probe_work_func
> [ 7.418089] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [ 7.418091] pc : __pci_enable_msix_range+0x498/0x6b0
> [ 7.418092] lr : __pci_enable_msix_range+0x44c/0x6b0
> [ 7.418093] sp : ffff80001249b710
> [ 7.418094] x29: ffff80001249b710 x28: 0000000000000000 x27: ffff0000059ba000
> [ 7.418098] x26: ffff0000059ba0c0 x25: 0000000000000001 x24: 0000000000000000
> [ 7.418102] x23: 000000000000000c x22: 0000000000000000 x21: ffff0000059ba2e8
> [ 7.418106] x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000002
> [ 7.418110] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [ 7.418114] x14: 0000000000000001 x13: 00000000000e98d4 x12: 0000000000000040
> [ 7.418117] x11: ffff00000081afe8 x10: ffff00000081afea x9 : ffff800011d129e0
> [ 7.418121] x8 : 0000000000000000 x7 : ffff800011fca91c x6 : 0000000000000000
> [ 7.418125] x5 : 000000000000c011 x4 : ffff8000178000b0 x3 : 0000000000000002
> [ 7.418128] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000004efe680
> [ 7.418132] Kernel panic - not syncing: Asynchronous SError Interrupt
> [ 7.418134] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
> 5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
> [ 7.418136] Hardware name: Hardkernel ODROID-N1 (DT)
> [ 7.418137] Workqueue: events_unbound deferred_probe_work_func
> [ 7.418139] Call trace:
> [ 7.418140] dump_backtrace+0x0/0x1a0
> [ 7.418141] show_stack+0x1c/0x70
> [ 7.418142] dump_stack+0xd0/0x12c
> [ 7.418143] panic+0x16c/0x334
> [ 7.418145] add_taint+0x0/0xb0
> [ 7.418146] arm64_serror_panic+0x7c/0x88
> [ 7.418147] do_serror+0x68/0x70
> [ 7.418148] el1_error+0x80/0xf8
> [ 7.418149] __pci_enable_msix_range+0x498/0x6b0
> [ 7.418150] pci_alloc_irq_vectors_affinity+0xbc/0x13c
> [ 7.418152] pcie_port_device_register+0x11c/0x40c
> [ 7.418153] pcie_portdrv_probe+0x48/0x100
> [ 7.418154] local_pci_probe+0x44/0xb0
> [ 7.418155] pci_device_probe+0x114/0x1b0
> [ 7.418156] really_probe+0xe4/0x504
> [ 7.418157] driver_probe_device+0x64/0xcc
> [ 7.418158] __device_attach_driver+0xb8/0x114
> [ 7.418159] bus_for_each_drv+0x78/0xd0
> [ 7.418160] __device_attach+0xd8/0x180
> [ 7.418161] device_attach+0x18/0x24
> [ 7.418162] pci_bus_add_device+0x54/0xc0
> [ 7.418163] pci_bus_add_devices+0x40/0x90
> [ 7.418165] pci_host_probe+0x44/0xc4
> [ 7.418166] rockchip_pcie_probe+0x2fc/0x4d4 [pcie_rockchip_host]
> [ 7.418167] platform_probe+0x6c/0xdc
> [ 7.418168] really_probe+0xe4/0x504
> [ 7.418169] driver_probe_device+0x64/0xcc
> [ 7.418170] __device_attach_driver+0xb8/0x114
> [ 7.418171] bus_for_each_drv+0x78/0xd0
> [ 7.418172] __device_attach+0xd8/0x180
> [ 7.418173] device_initial_probe+0x18/0x2c
> [ 7.418174] bus_probe_device+0xa0/0xac
> [ 7.418175] deferred_probe_work_func+0x88/0xc0
> [ 7.418176] process_one_work+0x1cc/0x350
> [ 7.418177] worker_thread+0x13c/0x470
> [ 7.418178] kthread+0x158/0x160
> [ 7.418179] ret_from_fork+0x10/0x30
> [ 8.063493] SMP: stopping secondary CPUs
> [ 8.063495] Kernel Offset: disabled
> [ 8.063496] CPU features: 0x10001031,20000846
> [ 8.063498] Memory Limit: none

I am not sure I follow the changes you've made. Also the issue in the
bootlog seems to be different (unrelated?) to the one I am looking to
fix - failing to allocate non-prefetchable 32-bit BARs from 64-bit
resource windows.

Not sure how I can help.

Thanks,
Punit

[...]

2021-06-10 18:32:42

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB

On Thu, Jun 10, 2021 at 10:34:56PM +0900, Punit Agrawal wrote:
> Hi Bjorn,
>
> Bjorn Helgaas <[email protected]> writes:
>
> > [+cc Leonardo]
> >
> > On Mon, Jun 07, 2021 at 08:28:53PM +0900, Punit Agrawal wrote:
> >> Some host bridges advertise non-prefetchable memory windows that are
> >> entirely located below 4GB but are marked as 64-bit address memory.
> >>
> >> Since commit 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
> >> flags for 64-bit memory addresses"), the OF PCI range parser takes a
> >> stricter view and treats 64-bit address ranges as advertised while
> >> before such ranges were treated as 32-bit.
> >>
> >> A PCI root port modelled as a PCI-to-PCI bridge cannot forward 64-bit
> >> non-prefetchable memory ranges. As a result, the change in behaviour
> >> due to the commit causes failure to allocate 32-bit BAR from a 64-bit
> >> non-prefetchable window.
> >>
> >> In order to not break platforms where non-prefetchable memory ranges
> >> lie entirely below 4GB, clear the 64-bit flag.
> >
> > I don't think we should care about the address width DT supplies for a
> > host bridge window. Prior to 9d57e61bf723, I don't think we *did*
> > care because of_bus_pci_get_flags() threw away that information.
> >
> > My proposal for a commit log, including information about the problem
> > report and a "Fixes:" tag:
> >
> > Alexandru and Qu reported this resource allocation failure on
> > ROCKPro64 v2 and ROCK Pi 4B, both based on the RK3399:
> >
> > pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff 64bit]
> > pci 0000:00:00.0: PCI bridge to [bus 01]
> > pci 0000:00:00.0: BAR 14: no space for [mem size 0x00100000]
> > pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
> >
> > "BAR 14" is the PCI bridge's 32-bit non-prefetchable window, and our
> > PCI allocation code isn't smart enough to allocate it in a host
> > bridge window marked as 64-bit, even though this should work fine.
> >
> > A DT host bridge description includes the windows from the CPU
> > address space to the PCI bus space. On a few architectures
> > (microblaze, powerpc, sparc), the DT may also describe PCI devices
> > themselves, including their BARs.
> >
> > Before 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource
> > flags for 64-bit memory addresses"), of_bus_pci_get_flags() ignored
> > the fact that some DT addresses described 64-bit windows and BARs.
> > That was a problem because the virtio virtual NIC has a 32-bit BAR
> > and a 64-bit BAR, and the driver couldn't distinguish them.
>
> Many thanks for demystifying the motivation for 9d57e61bf723. Not being
> familiar with the usage of DT to describe PCI devices I was missing this
> context.

The use of DT to describe PCI devices is a mystery to me, too. I'm
guessing this is related to hypervisors that don't fully virtualize
PCI devices.

> > 9d57e61bf723 set IORESOURCE_MEM_64 for those 64-bit DT ranges, which
> > fixed the virtio driver. But it also set IORESOURCE_MEM_64 for host
> > bridge windows, which exposed the fact that the PCI allocator isn't
> > smart enough to put 32-bit resources in those 64-bit windows.
> >
> > Clear IORESOURCE_MEM_64 from host bridge windows since we don't need
> > that information.
> >
> > Fixes: 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
> > Reported-at: https://lore.kernel.org/lkml/[email protected]/
> > Reported-by: Alexandru Elisei <[email protected]>
> > Reported-by: Qu Wenruo <[email protected]>
>
> Thank you for commit log - without all the pieces I was struggling to
> clearly describe the details. And I missed the appropriate tags as
> well. I've updated the commit log based on your suggestion.
>
> >> Suggested-by: Ard Biesheuvel <[email protected]>
> >> Link: https://lore.kernel.org/r/[email protected]
> >> Signed-off-by: Punit Agrawal <[email protected]>
> >> Tested-by: Alexandru Elisei <[email protected]>
> >> Cc: Bjorn Helgaas <[email protected]>
> >> Cc: Rob Herring <[email protected]>
> >> ---
> >> drivers/pci/of.c | 8 ++++++++
> >> 1 file changed, 8 insertions(+)
> >>
> >> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> >> index 85dcb7097da4..1e45186a5715 100644
> >> --- a/drivers/pci/of.c
> >> +++ b/drivers/pci/of.c
> >> @@ -353,6 +353,14 @@ static int devm_of_pci_get_host_bridge_resources(struct device *dev,
> >> dev_warn(dev, "More than one I/O resource converted for %pOF. CPU base address for old range lost!\n",
> >> dev_node);
> >> *io_base = range.cpu_addr;
> >> + } else if (resource_type(res) == IORESOURCE_MEM) {
> >> + if (!(res->flags & IORESOURCE_PREFETCH)) {
> >> + if (res->flags & IORESOURCE_MEM_64)
> >> + if (!upper_32_bits(range.pci_addr + range.size - 1)) {
> >> + dev_warn(dev, "Clearing 64-bit flag for non-prefetchable memory below 4GB\n");
> >> + res->flags &= ~IORESOURCE_MEM_64;
> >> + }
> >> + }
> >
> > Why do we need to check IORESOURCE_PREFETCH, IORESOURCE_MEM_64, and
> > upper_32_bits()? If I understand this correctly, prior to
> > 9d57e61bf723, IORESOURCE_MEM_64 was *never* set here. Isn't something
> > like this sufficient?
> >
> > } else if (resource_type(res) == IORESOURCE_MEM) {
> > res->flags &= ~IORESOURCE_MEM_64;
> > }
>
> Based on the discussion in the original thread[0], I was working with
> the assumption that we don't want to lose the IORESOURCE_MEM_64 flag
> other than in the problem scenario, i.e., non-prefetchable memory below
> 4GB.
>
> You suggestion is simpler and also solves the issue by effectively
> reverting the impact of 9d57e61bf723 on BAR allocation. If there are no
> objections I will take this approach for the next update.
>
> To aid future readers I will also add the following comment -
>
> /*
> * PCI allocation cannot correctly allocate 32-bit non-prefetchable BAR
> * in host bridge windows marked as 64-bit.
> */
>
> > I'm not sure we need a warning either. We didn't warn before
> > 9d57e61bf723, and there's nothing the user needs to do anyway.
>
> The warning was a nudge (probably too subtle) to get the user to upgrade
> their DT to drop the 64-bit marker on the host bridge window. With your
> suggestion, the DT change is not needed anymore - though it may still be
> worth dropping the 64-bit marker.

I'm certainly not a DT expert, and Rob would know better.

The doc I'm looking at ([1]), says in sec 2.2.1.1 that for an address
in 32-bit-address Memory Space, the high-order address bits "hh...hh
must be zero" and only the 32 bits in "ll...ll" are usable.

That suggests to me that the DT probably *should* use 64-bit-address
Memory Space for things that don't fit in 32 bits. But when we use
such an address for PCI host bridge windows, I don't think the
distinction is useful, so I think we should just drop the 64-bit
indication silently.

> [0] https://lore.kernel.org/linux-pci/CAMj1kXGF_JmuZ+rRA55-NrTQ6f20fhcHc=62AGJ71eHNU8AoBQ@mail.gmail.com/

[1] PCI Bus Binding to: IEEE Std 1275-1994 Standard for Boot
(Initialization Configuration) Firmware, Revision 2.1 [this is
ancient, and I would welcome a pointer to something better]

2021-06-10 18:39:10

by Anand Moon

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

Hi Punit,

On Thu, 10 Jun 2021 at 19:55, Punit Agrawal <[email protected]> wrote:
>
> Hi Anand,
>
> Anand Moon <[email protected]> writes:
>
> > Hi Punit,
> >
> > On Mon, 7 Jun 2021 at 17:02, Punit Agrawal <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> This is the third iteration to improve handling of the 64-bit
> >> attribute on non-prefetchable host bridge ranges. Previous version can
> >> be found at [0][1].
> >>
> >> This version is a small update over the previous version - changelog
> >> below. If there is no futher feedback on the patches, please consider
> >> merging them.
> >>
> >> Thanks,
> >> Punit
> >>
> >
> > Thanks for your work, I have tested this on my RK3399 (Odroid N1) SBC.
> > It partially works on this board. So I looked into
> > RK3399TRM_V1.3_Part2 for more details.
> >
> > 17.6.7.1.45 Root Complex BAR Configuration Register
> > It looks like these config changes are related to the issue you are
> > trying to solve.
> >
> > On this basis here are the code changes I made in the driver for testing.
> >
> > [alarm@alarm linux-rockchip-5.y-devel]$ git diff
> > drivers/pci/controller/pcie-rockchip.h
> > drivers/pci/controller/pcie-rockchip.c
> > diff --git a/drivers/pci/controller/pcie-rockchip.c
> > b/drivers/pci/controller/pcie-rockchip.c
> > index 990a00e08bc5..18e315de9f04 100644
> > --- a/drivers/pci/controller/pcie-rockchip.c
> > +++ b/drivers/pci/controller/pcie-rockchip.c
> > @@ -405,9 +405,20 @@ void rockchip_pcie_cfg_configuration_accesses(
> > struct rockchip_pcie *rockchip, u32 type)
> > {
> > u32 ob_desc_0;
> > + u32 status;
> >
> > /* Configuration Accesses for region 0 */
> > - rockchip_pcie_write(rockchip, 0x0, PCIE_RC_BAR_CONF);
> > + status = rockchip_pcie_read(rockchip, PCIE_RC_BAR_CONF);
> > + status |= FIELD_PREP(PCIE_RC_BAR_RCBAR0A, 0x14) |
> > + FIELD_PREP(PCIE_RC_BAR_RCBAR0C, 0x6) |
> > + FIELD_PREP(PCIE_RC_BAR_RCBAR1A, 0x14) |
> > + FIELD_PREP(PCIE_RC_BAR_RCBAR1C, 0x4) |
> > + PCIE_RC_BAR_RCBARPME |
> > + PCIE_RC_BAR_RCBARPMS |
> > + PCIE_RC_BAR_RCBARPIE |
> > + PCIE_RC_BAR_RCBARPIS |
> > + PCIE_RC_BAR_RCBCE;
> > + rockchip_pcie_write(rockchip, status, PCIE_RC_BAR_CONF);
> >
> > rockchip_pcie_write(rockchip,
> > (RC_REGION_0_ADDR_TRANS_L + RC_REGION_0_PASS_BITS),
> > diff --git a/drivers/pci/controller/pcie-rockchip.h
> > b/drivers/pci/controller/pcie-rockchip.h
> > index 1650a5087450..a68ca18de4ec 100644
> > --- a/drivers/pci/controller/pcie-rockchip.h
> > +++ b/drivers/pci/controller/pcie-rockchip.h
> > @@ -114,6 +114,16 @@
> > #define PCIE_CORE_INT_MASK (PCIE_CORE_CTRL_MGMT_BASE + 0x210)
> > #define PCIE_CORE_PHY_FUNC_CFG (PCIE_CORE_CTRL_MGMT_BASE + 0x2c0)
> > #define PCIE_RC_BAR_CONF (PCIE_CORE_CTRL_MGMT_BASE + 0x300)
> > +#define PCIE_RC_BAR_RCBAR0A GENMASK(5, 0)
> > +#define PCIE_RC_BAR_RCBAR0C GENMASK(8, 6)
> > +#define PCIE_RC_BAR_RCBAR1A GENMASK(13, 9)
> > +#define PCIE_RC_BAR_RCBAR1C GENMASK(16, 14)
> > +#define PCIE_RC_BAR_RCBARPME BIT(17)
> > +#define PCIE_RC_BAR_RCBARPMS BIT(18)
> > +#define PCIE_RC_BAR_RCBARPIE BIT(19)
> > +#define PCIE_RC_BAR_RCBARPIS BIT(20)
> > +#define PCIE_RC_BAR_RCBCE BIT(31)
> > +
> > #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_DISABLED 0x0
> > #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_IO_32BITS 0x1
> > #define ROCKCHIP_PCIE_CORE_BAR_CFG_CTRL_MEM_32BITS 0x4
> >
> > But I am getting the following error at my end.
> > --------------------
> > [ OK ] Found device /dev/ttyS2.
> > [ 7.089794] rockchip-pcie f8000000.pcie: host bridge /pcie@f8000000 ranges:
> > [ 7.092945] rk808-rtc rk808-rtc: registered as rtc0
> > [ 7.103462] rockchip-pcie f8000000.pcie: MEM
> > 0x00fa000000..0x00fbdfffff -> 0x00fa000000
> > [ 7.113384] rockchip-pcie f8000000.pcie: IO
> > 0x00fbe00000..0x00fbefffff -> 0x00fbe00000
> > [ 7.123294] rockchip-pcie f8000000.pcie: no vpcie12v regulator found
> > [ 7.123524] rk808-rtc rk808-rtc: setting system clock to
> > 2021-06-10T07:36:35 UTC (1623310595)
> > [ 7.130589] rockchip-pcie f8000000.pcie: no vpcie3v3 regulator found
> > [ 7.187409] mc: Linux media interface: v0.10
> > [ 7.210178] rockchip-pcie f8000000.pcie: PCI host bridge to bus 0000:00
> > [ 7.217643] pci_bus 0000:00: root bus resource [bus 00-1f]
> > [ 7.224126] pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff]
> > [ 7.234975] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
> > (bus address [0xfbe00000-0xfbefffff])
> > [ 7.246527] pci 0000:00:00.0: [1d87:0100] type 01 class 0x060400
> > [ 7.264087] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x003fffff 64bit]
> > [ 7.271770] pci 0000:00:00.0: supports D1
> > [ 7.276265] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> > [ 7.284724] pci 0000:00:00.0: bridge configuration invalid ([bus
> > 00-00]), reconfiguring
> > [ 7.293874] pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
> > [ 7.300656] pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
> > [ 7.308542] pci 0000:01:00.0: reg 0x10: [io size 0x0008]
> > [ 7.309498] videodev: Linux video capture interface: v2.00
> > [ 7.314606] pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
> > [ 7.328593] pci 0000:01:00.0: reg 0x14: [io size 0x0004]
> > [ 7.334661] pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
> > [ 7.342545] pci 0000:01:00.0: reg 0x18: [io size 0x0008]
> > [ 7.342578] pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
> > [ 7.342581] pci 0000:01:00.0: reg 0x1c: [io size 0x0004]
> > [ 7.342609] pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
> > [ 7.342612] pci 0000:01:00.0: reg 0x20: [io size 0x0010]
> > [ 7.342640] pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
> > [ 7.383491] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
> > [ 7.383533] pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 512)
> > [ 7.385978] panfrost ff9a0000.gpu: clock rate = 500000000
> > [ 7.387283] rk_gmac-dwmac fe300000.ethernet: IRQ eth_wake_irq not found
> > [ 7.387295] rk_gmac-dwmac fe300000.ethernet: IRQ eth_lpi not found
> > [ 7.387377] rk_gmac-dwmac fe300000.ethernet: PTP uses main clock
> > [ 7.387614] rk_gmac-dwmac fe300000.ethernet: clock input or output? (input).
> > [ 7.387623] rk_gmac-dwmac fe300000.ethernet: TX delay(0x28).
> > [ 7.387631] rk_gmac-dwmac fe300000.ethernet: RX delay(0x11).
> > [ 7.387643] rk_gmac-dwmac fe300000.ethernet: integrated PHY? (no).
> > [ 7.387682] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
> > [ 7.387688] rk_gmac-dwmac fe300000.ethernet: clock input from PHY
> > [ 7.391688] panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init
> > [panfrost]] Failed to register cooling device
> > [ 7.392704] rk_gmac-dwmac fe300000.ethernet: init for RGMII
> > [ 7.392908] rk_gmac-dwmac fe300000.ethernet: User ID: 0x10, Synopsys ID: 0x35
> > [ 7.392921] rk_gmac-dwmac fe300000.ethernet: DWMAC1000
> > [ 7.392926] rk_gmac-dwmac fe300000.ethernet: DMA HW capability
> > register supported
> > [ 7.392931] rk_gmac-dwmac fe300000.ethernet: RX Checksum Offload
> > Engine supported
> > [ 7.392935] rk_gmac-dwmac fe300000.ethernet: COE Type 2
> > [ 7.392939] rk_gmac-dwmac fe300000.ethernet: TX Checksum insertion supported
> > [ 7.392943] rk_gmac-dwmac fe300000.ethernet: Wake-Up On Lan supported
> > [ 7.393016] rk_gmac-dwmac fe300000.ethernet: Normal descriptors
> > [ 7.393022] rk_gmac-dwmac fe300000.ethernet: Ring mode enabled
> > [ 7.393026] rk_gmac-dwmac fe300000.ethernet: Enable RX Mitigation
> > via HW Watchdog Timer
> > [ 7.393070] rk_gmac-dwmac fe300000.ethernet: Unbalanced pm_runtime_enable!
> > [ 7.393421] libphy: stmmac: probed
> > [ 7.399404] rockchip-pcie f8000000.pcie: local interrupt received
> > [ 7.405075] panfrost ff9a0000.gpu: mali-t860 id 0x860 major 0x2
> > minor 0x0 status 0x0
> > [ 7.412338] rockchip-pcie f8000000.pcie: an error was observed in
> > the flow control advertisements from the other side
> > [ 7.412377] rockchip-pcie f8000000.pcie: phy link changes
> > [ 7.417620] pci_bus 0000:01: fixups for bus
> > [ 7.417624] pci_bus 0000:01: bus scan returning with max=01
> > [ 7.417628] pci_bus 0000:01: busn_res: [bus 01-1f] end is updated to 01
> > [ 7.417637] pci_bus 0000:00: bus scan returning with max=01
> > [ 7.417650] pci 0000:00:00.0: BAR 0: assigned [mem
> > 0xfa000000-0xfa3fffff 64bit]
> > [ 7.417662] pci 0000:00:00.0: BAR 14: assigned [mem 0xfa400000-0xfa4fffff]
> > [ 7.417666] pci 0000:00:00.0: BAR 13: assigned [io 0x1000-0x1fff]
> > [ 7.417671] pci 0000:01:00.0: BAR 6: assigned [mem
> > 0xfa400000-0xfa40ffff pref]
> > [ 7.417675] pci 0000:01:00.0: BAR 5: assigned [mem 0xfa410000-0xfa4101ff]
> > [ 7.417689] pci 0000:01:00.0: BAR 4: assigned [io 0x1000-0x100f]
> > [ 7.417702] pci 0000:01:00.0: BAR 0: assigned [io 0x1010-0x1017]
> > [ 7.417715] pci 0000:01:00.0: BAR 2: assigned [io 0x1018-0x101f]
> > [ 7.417729] pci 0000:01:00.0: BAR 1: assigned [io 0x1020-0x1023]
> > [ 7.417742] pci 0000:01:00.0: BAR 3: assigned [io 0x1024-0x1027]
> > [ 7.417756] pci 0000:00:00.0: PCI bridge to [bus 01]
> > [ 7.417760] pci 0000:00:00.0: bridge window [io 0x1000-0x1fff]
> > [ 7.417766] pci 0000:00:00.0: bridge window [mem 0xfa400000-0xfa4fffff]
> > [ 7.417872] pcieport 0000:00:00.0: assign IRQ: got 93
> > [ 7.417884] pcieport 0000:00:00.0: enabling device (0000 -> 0003)
> > [ 7.417895] pcieport 0000:00:00.0: enabling bus mastering
> > [ 7.418081] SError Interrupt on CPU5, code 0xbf000002 -- SError
> > [ 7.418084] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
> > 5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
> > [ 7.418085] Hardware name: Hardkernel ODROID-N1 (DT)
> > [ 7.418086] Workqueue: events_unbound deferred_probe_work_func
> > [ 7.418089] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> > [ 7.418091] pc : __pci_enable_msix_range+0x498/0x6b0
> > [ 7.418092] lr : __pci_enable_msix_range+0x44c/0x6b0
> > [ 7.418093] sp : ffff80001249b710
> > [ 7.418094] x29: ffff80001249b710 x28: 0000000000000000 x27: ffff0000059ba000
> > [ 7.418098] x26: ffff0000059ba0c0 x25: 0000000000000001 x24: 0000000000000000
> > [ 7.418102] x23: 000000000000000c x22: 0000000000000000 x21: ffff0000059ba2e8
> > [ 7.418106] x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000002
> > [ 7.418110] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > [ 7.418114] x14: 0000000000000001 x13: 00000000000e98d4 x12: 0000000000000040
> > [ 7.418117] x11: ffff00000081afe8 x10: ffff00000081afea x9 : ffff800011d129e0
> > [ 7.418121] x8 : 0000000000000000 x7 : ffff800011fca91c x6 : 0000000000000000
> > [ 7.418125] x5 : 000000000000c011 x4 : ffff8000178000b0 x3 : 0000000000000002
> > [ 7.418128] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000004efe680
> > [ 7.418132] Kernel panic - not syncing: Asynchronous SError Interrupt
> > [ 7.418134] CPU: 5 PID: 90 Comm: kworker/u12:2 Not tainted
> > 5.13.0-rc5-xn1ml-00004-g128f93a6219c-dirty #6
> > [ 7.418136] Hardware name: Hardkernel ODROID-N1 (DT)
> > [ 7.418137] Workqueue: events_unbound deferred_probe_work_func
> > [ 7.418139] Call trace:
> > [ 7.418140] dump_backtrace+0x0/0x1a0
> > [ 7.418141] show_stack+0x1c/0x70
> > [ 7.418142] dump_stack+0xd0/0x12c
> > [ 7.418143] panic+0x16c/0x334
> > [ 7.418145] add_taint+0x0/0xb0
> > [ 7.418146] arm64_serror_panic+0x7c/0x88
> > [ 7.418147] do_serror+0x68/0x70
> > [ 7.418148] el1_error+0x80/0xf8
> > [ 7.418149] __pci_enable_msix_range+0x498/0x6b0
> > [ 7.418150] pci_alloc_irq_vectors_affinity+0xbc/0x13c
> > [ 7.418152] pcie_port_device_register+0x11c/0x40c
> > [ 7.418153] pcie_portdrv_probe+0x48/0x100
> > [ 7.418154] local_pci_probe+0x44/0xb0
> > [ 7.418155] pci_device_probe+0x114/0x1b0
> > [ 7.418156] really_probe+0xe4/0x504
> > [ 7.418157] driver_probe_device+0x64/0xcc
> > [ 7.418158] __device_attach_driver+0xb8/0x114
> > [ 7.418159] bus_for_each_drv+0x78/0xd0
> > [ 7.418160] __device_attach+0xd8/0x180
> > [ 7.418161] device_attach+0x18/0x24
> > [ 7.418162] pci_bus_add_device+0x54/0xc0
> > [ 7.418163] pci_bus_add_devices+0x40/0x90
> > [ 7.418165] pci_host_probe+0x44/0xc4
> > [ 7.418166] rockchip_pcie_probe+0x2fc/0x4d4 [pcie_rockchip_host]
> > [ 7.418167] platform_probe+0x6c/0xdc
> > [ 7.418168] really_probe+0xe4/0x504
> > [ 7.418169] driver_probe_device+0x64/0xcc
> > [ 7.418170] __device_attach_driver+0xb8/0x114
> > [ 7.418171] bus_for_each_drv+0x78/0xd0
> > [ 7.418172] __device_attach+0xd8/0x180
> > [ 7.418173] device_initial_probe+0x18/0x2c
> > [ 7.418174] bus_probe_device+0xa0/0xac
> > [ 7.418175] deferred_probe_work_func+0x88/0xc0
> > [ 7.418176] process_one_work+0x1cc/0x350
> > [ 7.418177] worker_thread+0x13c/0x470
> > [ 7.418178] kthread+0x158/0x160
> > [ 7.418179] ret_from_fork+0x10/0x30
> > [ 8.063493] SMP: stopping secondary CPUs
> > [ 8.063495] Kernel Offset: disabled
> > [ 8.063496] CPU features: 0x10001031,20000846
> > [ 8.063498] Memory Limit: none
>
> I am not sure I follow the changes you've made. Also the issue in the
> bootlog seems to be different (unrelated?) to the one I am looking to
> fix - failing to allocate non-prefetchable 32-bit BARs from 64-bit
> resource windows.
>
> Not sure how I can help.

Ok, no problem.

Thanks
-Anand

>
> Thanks,
> Punit
>
> [...]
>

2021-06-10 20:03:30

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] PCI: of: Relax the condition for warning about non-prefetchable memory aperture size

On Thu, Jun 10, 2021 at 11:11:10PM +0900, Punit Agrawal wrote:
> Hi Bjorn,
>
> Bjorn Helgaas <[email protected]> writes:
>
> > On Wed, Jun 09, 2021 at 12:36:08AM +0530, Vidya Sagar wrote:
> >> On 6/7/2021 4:58 PM, Punit Agrawal wrote:
> >> >
> >> > Commit fede8526cc48 ("PCI: of: Warn if non-prefetchable memory
> >> > aperture size is > 32-bit") introduced a warning for non-prefetchable
> >> > resources that need more than 32bits to resolve. It turns out that the
> >> > check is too restrictive and should be applicable to only resources
> >> > that are limited to host bridge windows that don't have the ability to
> >> > map 64-bit address space.
> >>
> >> I think the host bridge windows having the ability to map 64-bit address
> >> space is different from restricting the non-prefetchable memory aperture
> >> size to 32-bit.
> >
> >> Whether the host bridge uses internal translations or not to map the
> >> non-prefetchable resources to 64-bit space, the size needs to be programmed
> >> in the host bridge's 'Memory Limit Register (Offset 22h)' which can
> >> represent sizes only fit into 32-bits.
> >
> >> Host bridges having the ability to map 64-bit address spaces gives
> >> flexibility to utilize the vast 64-bit space for the (restrictive)
> >> non-prefetchable memory (i.e. mapping non-prefetchable BARs of endpoints to
> >> the 64-bit space in CPU's view) and get it translated internally and put a
> >> 32-bit address on the PCIe bus finally.
> >
> > The vastness of the 64-bit space in the CPU view only helps with
> > non-prefetchable memory if you have multiple host bridges with
> > different CPU-to-PCI translations. Each root bus can only carve up
> > 4GB of PCI memory space for use by its non-prefetchable memory
> > windows.
> >
> > Of course, if we're willing to give up the performance, there's
> > nothing to prevent us from using non-prefetchable space for
> > *prefetchable* resources, as in my example below.
> >
> > I think the fede8526cc48 commit log is incorrect, or at least
> > incomplete:
> >
> > As per PCIe spec r5.0, sec 7.5.1.3.8 only 32-bit BAR registers are defined
> > for non-prefetchable memory and hence a warning should be reported when
> > the size of them go beyond 32-bits.
> >
> > 7.5.1.3.8 is talking about non-prefetchable PCI-to-PCI bridge windows,
> > not BARs. AFAIK, 64-bit BARs may be non-prefetchable. The warning is
> > in pci_parse_request_of_pci_ranges(), which isn't looking at
> > PCI-to-PCI bridge windows; it's looking at PCI host bridge windows.
> > It's legal for a host bridge to have only non-prefetchable windows,
> > and prefetchable PCI BARs can be placed in them.
> >
> > For example, we could have the following:
> >
> > pci_bus 0000:00: root bus resource [mem 0x80000000-0x1_ffffffff] (6GB)
> > pci 0000:00:00.0: PCI bridge to [bus 01-7f]
> > pci 0000:00:00.0: bridge window [mem 0x80000000-0xbfffffff] (1GB)
> > pci 0000:00:00.0: bridge window [mem 0x1_00000000-0x1_7fffffff 64bit pref] (2GB)
> > pci 0000:00:00.1: PCI bridge to [bus 80-ff]
> > pci 0000:00:00.1: bridge window [mem 0xc0000000-0xffffffff] (1GB)
> > pci 0000:00:00.1: bridge window [mem 0x1_80000000-0x1_ffffffff 64bit pref] (2GB)
> >
> > Here the host bridge window is 6GB and is not prefetchable. The
> > PCI-to-PCI bridge non-prefetchable windows are 1GB each and the bases
> > and limits fit in 32 bits. The prefetchable windows are 2GB each, and
> > we're allowed but not required to put these in prefetchable host
> > bridge windows.
> >
> > So I'm not convinced this warning is valid to begin with. It may be
> > that this host bridge configuration isn't optimal, and we might want
> > an informational message, but I think it's *legal*.
>
> By "optimal" - are you referring to the use of non-prefetchable space
> for prefetchable window?

Yes. I just meant that we don't know the specific capabilities of the
host bridge, and firmware or the native driver may not have configured
it in the optimal way.

> Also, if the warning doesn't apply to PCI host bridge windows, should I
> drop it in the next update? Or leave out this and the next patch to be
> dealt with separately.

I'd like to hear Vidya's thoughts on this first in case I'm
misinterpreting something.

In the meantime, I think it's not terrible if you leave this as-is for
now. Worst-case we'll get some warnings that we might not need, but
IIUC, patches 2/4 and 3/4 don't fix a functional problem.

I don't know whether the IORESOURCE_PREFETCH bit on host bridge
windows is important or not. I *think* it's common for ACPI host
bridge descriptions to have no windows described as "prefetchable" (at
least, my garden-variety laptop has none):

pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff window]
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:00:01.0: bridge window [mem 0xc0000000-0xd1ffffff 64bit pref]
pci 0000:00:1c.1: PCI bridge to [bus 03]
pci 0000:00:1c.1: bridge window [mem 0xd2100000-0xd2afffff 64bit pref]
pci 0000:00:1d.6: PCI bridge to [bus 06-3e]
pci 0000:00:1d.6: bridge window [mem 0x90000000-0xb1ffffff 64bit pref]

I guess we must just rely on the fact that BIOS has already programmed
those prefetchable windows? I really don't know how this works, to be
honest.

Bjorn

2021-06-10 21:54:42

by Heiko Stuebner

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

Hi,

Am Montag, 7. Juni 2021, 13:28:56 CEST schrieb Punit Agrawal:
> The PCIe host bridge on RK3399 advertises a single 64-bit memory
> address range even though it lies entirely below 4GB.
>
> Previously the OF PCI range parser treated 64-bit ranges more
> leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
> Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
> the code takes a stricter view and treats the ranges as advertised in
> the device tree (i.e, as 64-bit).
>
> The change in behaviour causes failure when allocating bus addresses
> to devices connected behind a PCI-to-PCI bridge that require
> non-prefetchable memory ranges. The allocation failure was observed
> for certain Samsung NVMe drives connected to RockPro64 boards.
>
> Update the host bridge window attributes to treat it as 32-bit address
> memory. This fixes the allocation failure observed since commit
> 9d57e61bf723.
>
> Reported-by: Alexandru Elisei <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Suggested-by: Robin Murphy <[email protected]>
> Signed-off-by: Punit Agrawal <[email protected]>
> Tested-by: Alexandru Elisei <[email protected]>
> Cc: Heiko Stuebner <[email protected]>
> Cc: Rob Herring <[email protected]>

just for clarity, should I just pick this patch separately for 5.13-rc to
make it easy for people using current kernel devicetrees, or should
this wait for the update mentioned in the cover-letter response
and should go all together through the PCI tree?

If so, I can provide an
Acked-by: Heiko Stuebner <[email protected]>



> ---
> arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> index 634a91af8e83..4b854eb21f72 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
> @@ -227,7 +227,7 @@ pcie0: pcie@f8000000 {
> <&pcie_phy 2>, <&pcie_phy 3>;
> phy-names = "pcie-phy-0", "pcie-phy-1",
> "pcie-phy-2", "pcie-phy-3";
> - ranges = <0x83000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
> + ranges = <0x82000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
> <0x81000000 0x0 0xfbe00000 0x0 0xfbe00000 0x0 0x100000>;
> resets = <&cru SRST_PCIE_CORE>, <&cru SRST_PCIE_MGMT>,
> <&cru SRST_PCIE_MGMT_STICKY>, <&cru SRST_PCIE_PIPE>,
>




2021-06-11 14:43:33

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

Hi Heiko,

Heiko Stübner <[email protected]> writes:

> Hi,
>
> Am Montag, 7. Juni 2021, 13:28:56 CEST schrieb Punit Agrawal:
>> The PCIe host bridge on RK3399 advertises a single 64-bit memory
>> address range even though it lies entirely below 4GB.
>>
>> Previously the OF PCI range parser treated 64-bit ranges more
>> leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
>> Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
>> the code takes a stricter view and treats the ranges as advertised in
>> the device tree (i.e, as 64-bit).
>>
>> The change in behaviour causes failure when allocating bus addresses
>> to devices connected behind a PCI-to-PCI bridge that require
>> non-prefetchable memory ranges. The allocation failure was observed
>> for certain Samsung NVMe drives connected to RockPro64 boards.
>>
>> Update the host bridge window attributes to treat it as 32-bit address
>> memory. This fixes the allocation failure observed since commit
>> 9d57e61bf723.
>>
>> Reported-by: Alexandru Elisei <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> Suggested-by: Robin Murphy <[email protected]>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Tested-by: Alexandru Elisei <[email protected]>
>> Cc: Heiko Stuebner <[email protected]>
>> Cc: Rob Herring <[email protected]>
>
> just for clarity, should I just pick this patch separately for 5.13-rc to
> make it easy for people using current kernel devicetrees, or should
> this wait for the update mentioned in the cover-letter response
> and should go all together through the PCI tree?

The device tree change is independent of the other patches in the
series. It would be great if you can pick this one - I am waiting on
feedback from Rob before sending an update on the remaining patches.

Thanks,
Punit

> If so, I can provide an
> Acked-by: Heiko Stuebner <[email protected]>
>
>
>
>> ---
>> arch/arm64/boot/dts/rockchip/rk3399.dtsi | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> index 634a91af8e83..4b854eb21f72 100644
>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
>> @@ -227,7 +227,7 @@ pcie0: pcie@f8000000 {
>> <&pcie_phy 2>, <&pcie_phy 3>;
>> phy-names = "pcie-phy-0", "pcie-phy-1",
>> "pcie-phy-2", "pcie-phy-3";
>> - ranges = <0x83000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
>> + ranges = <0x82000000 0x0 0xfa000000 0x0 0xfa000000 0x0 0x1e00000>,
>> <0x81000000 0x0 0xfbe00000 0x0 0xfbe00000 0x0 0x100000>;
>> resets = <&cru SRST_PCIE_CORE>, <&cru SRST_PCIE_MGMT>,
>> <&cru SRST_PCIE_MGMT_STICKY>, <&cru SRST_PCIE_PIPE>,
>>
>
>
>
>
>
> _______________________________________________
> Linux-rockchip mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-rockchip

2021-06-11 22:17:07

by Heiko Stuebner

[permalink] [raw]
Subject: Re: (subset) [PATCH v3 0/4] PCI: of: Improvements to handle 64-bit attribute for non-prefetchable ranges

On Mon, 7 Jun 2021 20:28:52 +0900, Punit Agrawal wrote:
> This is the third iteration to improve handling of the 64-bit
> attribute on non-prefetchable host bridge ranges. Previous version can
> be found at [0][1].
>
> This version is a small update over the previous version - changelog
> below. If there is no futher feedback on the patches, please consider
> merging them.
>
> [...]

Applied, thanks!

[4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory
commit: 8efe01b4386ab38a36b99cfdc1dc02c38a8898c3

Best regards,
--
Heiko Stuebner <[email protected]>

2021-06-15 21:30:34

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

On Thu, Jun 10, 2021 at 3:50 PM Heiko Stübner <[email protected]> wrote:
>
> Hi,
>
> Am Montag, 7. Juni 2021, 13:28:56 CEST schrieb Punit Agrawal:
> > The PCIe host bridge on RK3399 advertises a single 64-bit memory
> > address range even though it lies entirely below 4GB.
> >
> > Previously the OF PCI range parser treated 64-bit ranges more
> > leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
> > Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
> > the code takes a stricter view and treats the ranges as advertised in
> > the device tree (i.e, as 64-bit).
> >
> > The change in behaviour causes failure when allocating bus addresses
> > to devices connected behind a PCI-to-PCI bridge that require
> > non-prefetchable memory ranges. The allocation failure was observed
> > for certain Samsung NVMe drives connected to RockPro64 boards.
> >
> > Update the host bridge window attributes to treat it as 32-bit address
> > memory. This fixes the allocation failure observed since commit
> > 9d57e61bf723.
> >
> > Reported-by: Alexandru Elisei <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> > Suggested-by: Robin Murphy <[email protected]>
> > Signed-off-by: Punit Agrawal <[email protected]>
> > Tested-by: Alexandru Elisei <[email protected]>
> > Cc: Heiko Stuebner <[email protected]>
> > Cc: Rob Herring <[email protected]>
>
> just for clarity, should I just pick this patch separately for 5.13-rc to
> make it easy for people using current kernel devicetrees, or should
> this wait for the update mentioned in the cover-letter response
> and should go all together through the PCI tree?

This was dropped from v4, but should still be applied IMO.

Acked-by: Rob Herring <[email protected]>

2021-06-15 21:52:17

by Heiko Stuebner

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

Am Dienstag, 15. Juni 2021, 23:29:12 CEST schrieb Rob Herring:
> On Thu, Jun 10, 2021 at 3:50 PM Heiko St?bner <[email protected]> wrote:
> >
> > Hi,
> >
> > Am Montag, 7. Juni 2021, 13:28:56 CEST schrieb Punit Agrawal:
> > > The PCIe host bridge on RK3399 advertises a single 64-bit memory
> > > address range even though it lies entirely below 4GB.
> > >
> > > Previously the OF PCI range parser treated 64-bit ranges more
> > > leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
> > > Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
> > > the code takes a stricter view and treats the ranges as advertised in
> > > the device tree (i.e, as 64-bit).
> > >
> > > The change in behaviour causes failure when allocating bus addresses
> > > to devices connected behind a PCI-to-PCI bridge that require
> > > non-prefetchable memory ranges. The allocation failure was observed
> > > for certain Samsung NVMe drives connected to RockPro64 boards.
> > >
> > > Update the host bridge window attributes to treat it as 32-bit address
> > > memory. This fixes the allocation failure observed since commit
> > > 9d57e61bf723.
> > >
> > > Reported-by: Alexandru Elisei <[email protected]>
> > > Link: https://lore.kernel.org/r/[email protected]
> > > Suggested-by: Robin Murphy <[email protected]>
> > > Signed-off-by: Punit Agrawal <[email protected]>
> > > Tested-by: Alexandru Elisei <[email protected]>
> > > Cc: Heiko Stuebner <[email protected]>
> > > Cc: Rob Herring <[email protected]>
> >
> > just for clarity, should I just pick this patch separately for 5.13-rc to
> > make it easy for people using current kernel devicetrees, or should
> > this wait for the update mentioned in the cover-letter response
> > and should go all together through the PCI tree?
>
> This was dropped from v4, but should still be applied IMO.

It was probably dropped because I applied it ;-)

It's part of armsoc already [0] and should make its way into
5.13 shortly.


Heiko


[0] https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/commit/?h=arm/fixes&id=8efe01b4386ab38a36b99cfdc1dc02c38a8898c3

>
> Acked-by: Rob Herring <[email protected]>
>




2021-06-16 18:42:22

by Punit Agrawal

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] arm64: dts: rockchip: Update PCI host bridge window to 32-bit address memory

Heiko Stübner <[email protected]> writes:

> Am Dienstag, 15. Juni 2021, 23:29:12 CEST schrieb Rob Herring:
>> On Thu, Jun 10, 2021 at 3:50 PM Heiko Stübner <[email protected]> wrote:
>> >
>> > Hi,
>> >
>> > Am Montag, 7. Juni 2021, 13:28:56 CEST schrieb Punit Agrawal:
>> > > The PCIe host bridge on RK3399 advertises a single 64-bit memory
>> > > address range even though it lies entirely below 4GB.
>> > >
>> > > Previously the OF PCI range parser treated 64-bit ranges more
>> > > leniently (i.e., as 32-bit), but since commit 9d57e61bf723 ("of/pci:
>> > > Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
>> > > the code takes a stricter view and treats the ranges as advertised in
>> > > the device tree (i.e, as 64-bit).
>> > >
>> > > The change in behaviour causes failure when allocating bus addresses
>> > > to devices connected behind a PCI-to-PCI bridge that require
>> > > non-prefetchable memory ranges. The allocation failure was observed
>> > > for certain Samsung NVMe drives connected to RockPro64 boards.
>> > >
>> > > Update the host bridge window attributes to treat it as 32-bit address
>> > > memory. This fixes the allocation failure observed since commit
>> > > 9d57e61bf723.
>> > >
>> > > Reported-by: Alexandru Elisei <[email protected]>
>> > > Link: https://lore.kernel.org/r/[email protected]
>> > > Suggested-by: Robin Murphy <[email protected]>
>> > > Signed-off-by: Punit Agrawal <[email protected]>
>> > > Tested-by: Alexandru Elisei <[email protected]>
>> > > Cc: Heiko Stuebner <[email protected]>
>> > > Cc: Rob Herring <[email protected]>
>> >
>> > just for clarity, should I just pick this patch separately for 5.13-rc to
>> > make it easy for people using current kernel devicetrees, or should
>> > this wait for the update mentioned in the cover-letter response
>> > and should go all together through the PCI tree?
>>
>> This was dropped from v4, but should still be applied IMO.
>
> It was probably dropped because I applied it ;-)
>
> It's part of armsoc already [0] and should make its way into
> 5.13 shortly.

Thanks for sending the patch along. I left a note to the effect in v4
but it's easy to miss.

Hopefully all sorted now.

[...]