2010-06-02 16:59:21

by Bjorn Helgaas

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL

I think the basic problem is that Yinghai's patch broke your system,
and this is a regression between 2.6.33 and 2.6.34.

We could use a quirk like yours (which looks fine, BTW) to cover up
this regression, but I don't like that approach because other machines
are probably affected by the same issue, and we'd have to find and
fix them one-by-one.

I think it'd be better to figure out the problem with 3e3da00c01d
and fix or revert it. I said earlier that I wasn't in favor of just
reverting it, and I still don't like that option because it will
likely break something. But Yinghai didn't supply any details about
the system that 3e3da00c01d fixed, so I don't know how to fix things
so both that system and yours work.

I assume that 2.6.34 with 3e3da00c01d reverted will work fine even
without "pci=use_crs". Can you try that and attach the dmesg log?


2010-06-11 21:49:10

by Bjorn Helgaas

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL

[If you haven't been following this bug, the report is at [3].]

Here's a theory. I'm not an expert in HyperTransport, so maybe somebody
who knows HyperTransport and/or VIA chipsets can validate or refute it.

This is based on the _HyperTransport I/O Link Specification_, rev 3.10b [1],
and the _BIOS and Kernel Developer's Guide (BKDG) for AMD Family 10h
Processors_ [2].

In a nutshell, I think the problem is that amd_bus.c treats a
HyperTransport (HT) host bridge as though it were a PCI host bridge. In
particular, when an HT chain contains more than one PCI host bridge, the
HT host bridge apertures encompass all the PCI host bridges, but
amd_bus.c mistakenly assigns all those resources to one PCI host bridge.

From a software point of view, HyperTransport is similar but not
identical to PCI. It is possible to make native HyperTransport
peripheral devices, but PCI devices must be attached via a
HyperTransport-to-PCI bridge [1, sec 4.1].

A PCI host bridge has a platform-specific non-PCI connection, e.g., a
front-side bus, on the primary (upstream) side and a PCI bus on the
secondary (downstream) side. Note that in the HyperTransport spec,
"host bridge" refers to the interface from the host, e.g., CPU cores, to
a HyperTransport chain. This HyperTransport host bridge has a
HyperTransport link on the secondary side, *not* a PCI bus.

A HyperTransport-to-PCI bridge is one kind of PCI host bridge, because
the primary side is HyperTransport and the secondary side is PCI.

Graham's machine contains one HT host bridge leading to an HT chain, and
it has PCI devices on buses 00, 02, 03, 06, and 80. In addition, the HT
host bridge configuration registers appear at device 18 (hex) in bus 00
configuration space, though they are not actually PCI functions. PCI
buses 02, 03, and 06 are reachable from bus 00 via the PCI-to-PCI
bridges at 00:03.3, 00:03.2, and 00:02.0, respectively.

However, there are no PCI-to-PCI bridges that lead to bus 00 or bus 80,
so the HT chain must contain two separate PCI host bridges that lead to
them.

Now, here's the problem: amd_bus.c reads the HT host bridge configuration
and learns that it routes buses 00-ff and the related address space,
including the following range, down the HT chain at node 0, link 0:

[mem 0x80000000-0xfcffffffff]

That makes sense, because both PCI host bridges are on that HT chain, so
the HT host bridge has to forward all that address space. The problem
is that amd_bus.c assumes there's only one PCI host bridge on the
chain, so it assigns *all* that address space to PCI bus 00.

This doesn't work because parts of that address space belong to bus 80,
not bus 00, and we can't reach bus 80 from PCI bus 00. In particular,
we know that at least the following address space is routed to bus 80,
because the 80:01.0 device does work at this address, which is in the
middle of the range we found above:

[mem 0xfebfc000-0xfebfffff]

(Note that we can reach bus 80 from the HT chain, but the HT chain is
outside the PCI domain, even though some of the HT registers appear in
PCI bus 00 config space. We need a second PCI host bridge from the HT
chain to PCI bus 80.)
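
The overlap described above can be checked mechanically. The following is a minimal sketch, not kernel code: `struct addr_range` and `range_contains()` are hypothetical names standing in for the start/end pair of a kernel resource, and the two ranges are copied from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end pair of a kernel struct resource. */
struct addr_range {
    uint64_t start;
    uint64_t end;   /* inclusive, as printed in dmesg */
};

/* Return true if 'inner' lies entirely within 'outer'. */
static bool range_contains(const struct addr_range *outer,
                           const struct addr_range *inner)
{
    assert(outer->start <= outer->end && inner->start <= inner->end);
    return outer->start <= inner->start && inner->end <= outer->end;
}

/* Aperture read from the HT host bridge and assigned to PCI bus 00. */
static const struct addr_range ht_aperture = {
    0x80000000ULL, 0xfcffffffffULL
};

/* Region where the 80:01.0 device is known to respond. */
static const struct addr_range bus80_region = {
    0xfebfc000ULL, 0xfebfffffULL
};
```

Calling `range_contains(&ht_aperture, &bus80_region)` returns true, which is the conflict in question: space that is only reachable through bus 80 has been handed to bus 00.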

The HT spec does suggest that an HT/PCI host bridge should implement a
HyperTransport Bridge Header [1, sec 7.4]. This header would make the
HT/PCI host bridge look just like a PCI-to-PCI bridge, with the usual
primary/secondary/subordinate bus numbers, memory, prefetchable memory,
and I/O port apertures, etc.

If all the HT/PCI host bridges on a chain were implemented this way, I
think it probably would work to pretend the HT host bridge is a PCI host
bridge. But this sort of implementation is apparently not universal.
The VIA chipset in Graham's machine doesn't do it that way, and the
Serverworks HT-2100 chipset in the HP DL785 doesn't either.


[1] http://www.hypertransport.org/docs/twgdocs/HTC20051222-0046-0033_changes.pdf
[2] http://support.amd.com/us/Embedded_TechDocs/31116-Public-GH-BKDG_3-28_5-28-09.pdf
[3] https://bugzilla.kernel.org/show_bug.cgi?id=16007

2010-06-11 22:08:31

by Yinghai Lu

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL

On Fri, Jun 11, 2010 at 2:49 PM, Bjorn Helgaas <[email protected]> wrote:
> [If you haven't been following this bug, the report is at [3].]
>
> Here's a theory. I'm not an expert in HyperTransport, so maybe somebody
> who knows HyperTransport and/or VIA chipsets can validate or refute it.
>
> This is based on the _HyperTransport I/O Link Specification_, rev 3.10b [1],
> and the _BIOS and Kernel Developer's Guide (BKDG) for AMD Family 10h
> Processors_ [2].
>
> In a nutshell, I think the problem is that amd_bus.c treats a
> HyperTransport (HT) host bridge as though it were a PCI host bridge. In
> particular, when an HT chain contains more than one PCI host bridge, the
> HT host bridge apertures encompass all the PCI host bridges, but
> amd_bus.c mistakenly assigns all those resources to one PCI host bridge.

I don't think so. That system has only one HT chain.

May 19 23:20:33 ocham kernel: pci 0000:00:18.1 config space:
May 19 23:20:33 ocham kernel: 00: 22 10 01 11 00 00 00 00 00 00 00 06 00 00 80 00
May 19 23:20:33 ocham kernel: 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 40: 03 00 00 00 00 00 7f 00 00 00 00 00 01 00 00 00
May 19 23:20:33 ocham kernel: 50: 00 00 00 00 02 00 00 00 00 00 00 00 03 00 00 00
May 19 23:20:33 ocham kernel: 60: 00 00 00 00 04 00 00 00 00 00 00 00 05 00 00 00
May 19 23:20:33 ocham kernel: 70: 00 00 00 00 06 00 00 00 00 00 00 00 07 00 00 00
May 19 23:20:33 ocham kernel: 80: 03 00 e0 00 80 ff ef 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: b0: 03 0a 00 00 00 0b 00 00 03 00 80 00 00 ff ff 00
May 19 23:20:33 ocham kernel: c0: 13 10 00 00 00 f0 ff 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: e0: 03 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The register at 0xe0 reads ff 00 00 03 (0xff000003),

which means all PCI config operations are routed to node 0, link 0.
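
To make that decoding concrete, here is a minimal sketch that unpacks the dword at offset 0xe0 of the dump above (bytes 03 00 00 ff). The field layout (read/write enable in bits 1:0, destination node in bits 6:4, destination link in bits 9:8, bus base in bits 23:16, bus limit in bits 31:24) is an assumption based on the configuration map register description in the AMD BKDG; `struct cfg_map`, `le32()` and `decode_cfg_map()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of one configuration map register (field layout assumed). */
struct cfg_map {
    int enabled;        /* read and write enable, bits 1:0 == 3 */
    unsigned node;      /* destination node, bits 6:4 */
    unsigned link;      /* destination link, bits 9:8 */
    unsigned bus_min;   /* bus number base, bits 23:16 */
    unsigned bus_max;   /* bus number limit, bits 31:24 */
};

/* Assemble a little-endian dword from four bytes of a config space dump. */
static uint32_t le32(const uint8_t b[4])
{
    assert(b);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Split the raw register value into its fields. */
static struct cfg_map decode_cfg_map(uint32_t reg)
{
    struct cfg_map m;

    m.enabled = (reg & 0x3) == 0x3;
    m.node    = (reg >> 4) & 0x7;
    m.link    = (reg >> 8) & 0x3;
    m.bus_min = (reg >> 16) & 0xff;
    m.bus_max = (reg >> 24) & 0xff;
    return m;
}
```

Under those assumptions, the bytes 03 00 00 ff decode to an enabled entry covering buses 00 through ff on node 0, link 0.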

That VIA chip has a design problem that produces one orphan device.

May 19 23:20:33 ocham kernel: pci 0000:80:01.0 config space:
May 19 23:20:33 ocham kernel: 00: 06 11 88 32 06 00 10 00 10 00 03 04 10 00 00 00
May 19 23:20:33 ocham kernel: 10: 04 c0 bf fe 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 20: 00 00 00 00 00 00 00 00 00 00 00 00 49 18 88 08
May 19 23:20:33 ocham kernel: 30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 00 00
May 19 23:20:33 ocham kernel: 40: 00 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 50: 01 60 42 c8 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 60: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 70: 10 00 91 00 00 00 00 00 00 00 30 00 00 00 00 00
May 19 23:20:33 ocham kernel: 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
May 19 23:20:33 ocham kernel: f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
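
The 80:01.0 dump above also shows where the orphan device sits in memory. A minimal sketch, assuming the standard PCI BAR encoding (bit 0 selects I/O, bits 2:1 give the memory type, the low four bits of a memory BAR are flags); the helper names are hypothetical. The dwords at offsets 0x10 and 0x14 of the dump are 0xfebfc004 and 0x00000000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 set means the BAR describes I/O port space, not memory. */
static bool bar_is_io(uint32_t bar)
{
    return (bar & 0x1) != 0;
}

/* Memory type field (bits 2:1) equal to 2 marks a 64-bit BAR. */
static bool bar_is_64bit(uint32_t bar)
{
    return !bar_is_io(bar) && ((bar >> 1) & 0x3) == 0x2;
}

/*
 * Base address of a memory BAR. 'hi' is the following dword and is
 * only used when the BAR is 64 bits wide.
 */
static uint64_t mem_bar_base(uint32_t lo, uint32_t hi)
{
    uint64_t base;

    assert(!bar_is_io(lo));
    base = lo & 0xfffffff0u;        /* strip the flag bits */
    if (bar_is_64bit(lo))
        base |= (uint64_t)hi << 32;
    return base;
}
```

With these helpers, `mem_bar_base(0xfebfc004u, 0)` yields 0xfebfc000 for a 64-bit memory BAR. The dump alone does not give the BAR size.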

YH

2010-06-11 23:09:06

by Yinghai Lu

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL


please check if this one workaround the problem

Thanks

Yinghai Lu

[PATCH] x86, pci: handle fallout pci devices with peer root bus

Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/pci/bus_numa.c | 4 +++-
kernel/resource.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/pci/bus_numa.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/bus_numa.c
+++ linux-2.6/arch/x86/pci/bus_numa.c
@@ -22,7 +22,8 @@ void x86_pci_root_bus_res_quirks(struct
return;

for (i = 0; i < pci_root_num; i++) {
- if (pci_root_info[i].bus_min == b->number)
+ if (pci_root_info[i].bus_min <= b->number &&
+ pci_root_info[i].bus_max >= b->number)
break;
}

@@ -37,6 +38,7 @@ void x86_pci_root_bus_res_quirks(struct
for (j = 0; j < info->res_num; j++) {
struct resource *res;
struct resource *root;
+ struct resource *tmp;

res = &info->res[j];
pci_bus_add_resource(b, res, 0);
Index: linux-2.6/kernel/resource.c
===================================================================
--- linux-2.6.orig/kernel/resource.c
+++ linux-2.6/kernel/resource.c
@@ -451,7 +451,7 @@ static struct resource * __insert_resour
if (!first)
return first;

- if (first == parent)
+ if (first == parent || first == new)
return first;

if ((first->start > new->start) || (first->end < new->end))

2010-06-14 14:18:42

by Bjorn Helgaas

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL

On Friday, June 11, 2010 05:06:49 pm Yinghai Lu wrote:
>
> please check if this one workaround the problem
>
> Thanks
>
> Yinghai Lu
>
> [PATCH] x86, pci: handle fallout pci devices with peer root bus
>
> Signed-off-by: Yinghai Lu <[email protected]>

This patch apparently does cover up the problem, but it fails on
so many levels:

- incomprehensible summary
- no changelog
- no bugzilla pointer
- unrelated junk in patch ("tmp")
- completely unexplained change to generic resource.c
- no indication that we understand the root cause

> ---
> arch/x86/pci/bus_numa.c | 4 +++-
> kernel/resource.c | 2 +-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/bus_numa.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/bus_numa.c
> +++ linux-2.6/arch/x86/pci/bus_numa.c
> @@ -22,7 +22,8 @@ void x86_pci_root_bus_res_quirks(struct
> return;
>
> for (i = 0; i < pci_root_num; i++) {
> - if (pci_root_info[i].bus_min == b->number)
> + if (pci_root_info[i].bus_min <= b->number &&
> + pci_root_info[i].bus_max >= b->number)
> break;
> }
>
> @@ -37,6 +38,7 @@ void x86_pci_root_bus_res_quirks(struct
> for (j = 0; j < info->res_num; j++) {
> struct resource *res;
> struct resource *root;
> + struct resource *tmp;
>
> res = &info->res[j];
> pci_bus_add_resource(b, res, 0);
> Index: linux-2.6/kernel/resource.c
> ===================================================================
> --- linux-2.6.orig/kernel/resource.c
> +++ linux-2.6/kernel/resource.c
> @@ -451,7 +451,7 @@ static struct resource * __insert_resour
> if (!first)
> return first;
>
> - if (first == parent)
> + if (first == parent || first == new)
> return first;
>
> if ((first->start > new->start) || (first->end < new->end))
>

2010-06-14 17:49:54

by Yinghai Lu

Subject: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus


Graham bisected
| commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
| x86/pci: AMD one chain system to use pci read out res

cause the SND_HDA_INTEL doesn't work anymore.

https://bugzilla.kernel.org/show_bug.cgi?id=16007

It turns out that his system with via chipset only have one hypertransport
chain, but does have one extra orphan device 80:01.0

PCI: Probing PCI hardware (bus 00)
PCI: Discovered primary peer bus 80 [IRQ]

node 0 link 0: io port [1000, ffffff]
TOM: 0000000080000000 aka 2048M
node 0 link 0: mmio [e0000000, efffffff]
node 0 link 0: mmio [a0000, bffff]
node 0 link 0: mmio [80000000, ffffffff]
bus: [00, ff] on node 0 link 0

Try to make peer root buses to share same mmio/io resources if those peer root
buses fall into the same bus range.

Also need to update insert_resource to avoid insert same resource two times.

We need this patch for 2.6.34 stable.

Reported-by: Graham Ramsey <[email protected]>
Bisected-by: Graham Ramsey <[email protected]>
Tested-by: Graham Ramsey <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
Cc: [email protected]

---
arch/x86/pci/bus_numa.c | 3 ++-
kernel/resource.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/pci/bus_numa.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/bus_numa.c
+++ linux-2.6/arch/x86/pci/bus_numa.c
@@ -22,7 +22,8 @@ void x86_pci_root_bus_res_quirks(struct
return;

for (i = 0; i < pci_root_num; i++) {
- if (pci_root_info[i].bus_min == b->number)
+ if (pci_root_info[i].bus_min <= b->number &&
+ pci_root_info[i].bus_max >= b->number)
break;
}

Index: linux-2.6/kernel/resource.c
===================================================================
--- linux-2.6.orig/kernel/resource.c
+++ linux-2.6/kernel/resource.c
@@ -451,7 +451,7 @@ static struct resource * __insert_resour
if (!first)
return first;

- if (first == parent)
+ if (first == parent || first == new)
return first;

if ((first->start > new->start) || (first->end < new->end))
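
The effect of the bus_numa.c hunk can be seen in isolation. This is a minimal sketch rather than the kernel code: `struct root_info` is a hypothetical, trimmed-down stand-in for the kernel's pci_root_info, and both lookup functions return an index or -1 instead of breaking out of a loop.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for the kernel's pci_root_info. */
struct root_info {
    int bus_min;
    int bus_max;
};

/* Old lookup: a root bus matches only if it is the first bus in the range. */
static int find_root_exact(const struct root_info *info, int n, int busnr)
{
    int i;

    assert(info && n >= 0);
    for (i = 0; i < n; i++)
        if (info[i].bus_min == busnr)
            return i;
    return -1;
}

/* Patched lookup: any root bus inside [bus_min, bus_max] matches. */
static int find_root_in_range(const struct root_info *info, int n, int busnr)
{
    int i;

    assert(info && n >= 0);
    for (i = 0; i < n; i++)
        if (info[i].bus_min <= busnr && info[i].bus_max >= busnr)
            return i;
    return -1;
}
```

With a single entry covering buses 00 through ff, as in the log above, the old lookup finds nothing for bus 80 while the patched lookup returns the same entry that bus 00 uses, so both root buses end up sharing one set of resources.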

2010-06-14 18:15:52

by Jesse Barnes

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Mon, 14 Jun 2010 10:47:59 -0700
Yinghai Lu <[email protected]> wrote:

>
> Graham bisected
> | commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
> | x86/pci: AMD one chain system to use pci read out res
>
> cause the SND_HDA_INTEL doesn't work anymore.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16007
>
> It turns out that his system with via chipset only have one hypertransport
> chain, but does have one extra orphan device 80:01.0
>
> PCI: Probing PCI hardware (bus 00)
> PCI: Discovered primary peer bus 80 [IRQ]
>
> node 0 link 0: io port [1000, ffffff]
> TOM: 0000000080000000 aka 2048M
> node 0 link 0: mmio [e0000000, efffffff]
> node 0 link 0: mmio [a0000, bffff]
> node 0 link 0: mmio [80000000, ffffffff]
> bus: [00, ff] on node 0 link 0
>
> Try to make peer root buses to share same mmio/io resources if those peer root
> buses fall into the same bus range.
>
> Also need to update insert_resource to avoid insert same resource two times.

So 3e3da00c01d050307e753fb7b3e84aefc16da0d0 was supposed to address the
case where some laptop RAM ranges ended up incorrect. Would using _CRS
on those machines also address that problem? If so, we should consider
dropping amd_bus.c like we did with intel_bus.c.

Yinghai, do you still have people from the RAM bug that could test
using _CRS data?

--
Jesse Barnes, Intel Open Source Technology Center

2010-06-14 18:23:33

by Yinghai Lu

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 11:14 AM, Jesse Barnes wrote:
> On Mon, 14 Jun 2010 10:47:59 -0700
> Yinghai Lu <[email protected]> wrote:
>
>>
>> Graham bisected
>> | commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
>> | x86/pci: AMD one chain system to use pci read out res
>>
>> cause the SND_HDA_INTEL doesn't work anymore.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=16007
>>
>> It turns out that his system with via chipset only have one hypertransport
>> chain, but does have one extra orphan device 80:01.0
>>
>> PCI: Probing PCI hardware (bus 00)
>> PCI: Discovered primary peer bus 80 [IRQ]
>>
>> node 0 link 0: io port [1000, ffffff]
>> TOM: 0000000080000000 aka 2048M
>> node 0 link 0: mmio [e0000000, efffffff]
>> node 0 link 0: mmio [a0000, bffff]
>> node 0 link 0: mmio [80000000, ffffffff]
>> bus: [00, ff] on node 0 link 0
>>
>> Try to make peer root buses to share same mmio/io resources if those peer root
>> buses fall into the same bus range.
>>
>> Also need to update insert_resource to avoid insert same resource two times.
>
> So 3e3da00c01d050307e753fb7b3e84aefc16da0d0 was supposed to address the
> case where some laptop RAM ranges ended up incorrect. Would using _CRS
> on those machines also address that problem? If so, we should consider
> dropping amd_bus.c like we did with intel_bus.c.
>
> Yinghai, do you still have people from the RAM bug that could test
> using _CRS data?

I can't find that mail anymore.

It looks like someone is using an AMD K8 Aruma laptop for FireWire development.

YH

2010-06-14 18:34:21

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 11:47:59 am Yinghai Lu wrote:
>
> Graham bisected
> | commit 3e3da00c01d050307e753fb7b3e84aefc16da0d0
> | x86/pci: AMD one chain system to use pci read out res
>
> cause the SND_HDA_INTEL doesn't work anymore.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16007
>
> It turns out that his system with via chipset only have one hypertransport
> chain, but does have one extra orphan device 80:01.0
>
> PCI: Probing PCI hardware (bus 00)
> PCI: Discovered primary peer bus 80 [IRQ]
>
> node 0 link 0: io port [1000, ffffff]
> TOM: 0000000080000000 aka 2048M
> node 0 link 0: mmio [e0000000, efffffff]
> node 0 link 0: mmio [a0000, bffff]
> node 0 link 0: mmio [80000000, ffffffff]
> bus: [00, ff] on node 0 link 0
>
> Try to make peer root buses to share same mmio/io resources if those peer root
> buses fall into the same bus range.

Yinghai, did you read https://bugzilla.kernel.org/show_bug.cgi?id=16007#c15 ?

I made the point there that an HT chain may contain multiple HT/PCI
host bridges, but you are stuck on the idea that "one HT chain == one
PCI root bus."

I have not found the "one PCI host bridge per HT chain" requirement
in the HT spec (if you find it, please point me to it).

If an HT chain may contain multiple HT/PCI host bridges, then it's
obvious that the HT host bridge registers read by amd_bus.c don't
contain enough information to correctly assign address space to the
PCI root buses.

> Also need to update insert_resource to avoid insert same resource two times.
>
> We need this patch for 2.6.34 stable.

No, we don't! Not yet, anyway. We need to find the root cause of this
problem, not just paper over it and wait for it to pop up again somewhere
else.

> Reported-by: Graham Ramsey <[email protected]>
> Bisected-by: Graham Ramsey <[email protected]>
> Tested-by: Graham Ramsey <[email protected]>
> Signed-off-by: Yinghai Lu <[email protected]>
> Cc: [email protected]
>
> ---
> arch/x86/pci/bus_numa.c | 3 ++-
> kernel/resource.c | 2 +-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/bus_numa.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/bus_numa.c
> +++ linux-2.6/arch/x86/pci/bus_numa.c
> @@ -22,7 +22,8 @@ void x86_pci_root_bus_res_quirks(struct
> return;
>
> for (i = 0; i < pci_root_num; i++) {
> - if (pci_root_info[i].bus_min == b->number)
> + if (pci_root_info[i].bus_min <= b->number &&
> + pci_root_info[i].bus_max >= b->number)
> break;
> }
>
> Index: linux-2.6/kernel/resource.c
> ===================================================================
> --- linux-2.6.orig/kernel/resource.c
> +++ linux-2.6/kernel/resource.c
> @@ -451,7 +451,7 @@ static struct resource * __insert_resour
> if (!first)
> return first;
>
> - if (first == parent)
> + if (first == parent || first == new)
> return first;
>
> if ((first->start > new->start) || (first->end < new->end))
>

2010-06-14 18:40:30

by H. Peter Anvin

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 11:34 AM, Bjorn Helgaas wrote:
>
> I made the point there that an HT chain may contain multiple HT/PCI
> host bridges, but you are stuck on the idea that "one HT chain == one
> PCI root bus."
>
> I have not found the "one PCI host bridge per HT chain" requirement
> in the HT spec (if you find it, please point me to it).
>
> If an HT chain may contain multiple HT/PCI host bridges, then it's
> obvious that the HT host bridge registers read by amd_bus.c don't
> contain enough information to correctly assign address space to the
> PCI root buses.
>

A HT-to-PCI bridge appears as a PCI-to-PCI bridge (i.e. a Header Type 1
device), not as a host bridge (a Header Type 0 device).

That is at least the software model as defined.
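
The Type 0 versus Type 1 distinction comes from the header type byte at config offset 0x0e. A minimal sketch, assuming the standard PCI layout (low seven bits give the header layout, bit 7 flags a multi-function device); the macro and function names are chosen for this example, and the byte arrays are the first 16 bytes of the two config space dumps posted earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HDR_TYPE_OFFSET   0x0e  /* header type byte in config space */
#define HDR_TYPE_NORMAL   0     /* Type 0: ordinary function */
#define HDR_TYPE_BRIDGE   1     /* Type 1: PCI-to-PCI bridge */

/* Header layout, with the multi-function flag masked off. */
static unsigned header_layout(const uint8_t *cfg)
{
    assert(cfg);
    return cfg[HDR_TYPE_OFFSET] & 0x7f;
}

/* Bit 7 of the header type byte marks a multi-function device. */
static bool is_multifunction(const uint8_t *cfg)
{
    assert(cfg);
    return (cfg[HDR_TYPE_OFFSET] & 0x80) != 0;
}

/* First 16 bytes of the 00:18.1 dump. */
static const uint8_t cfg_00_18_1[16] = {
    0x22, 0x10, 0x01, 0x11, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x80, 0x00
};

/* First 16 bytes of the 80:01.0 dump. */
static const uint8_t cfg_80_01_0[16] = {
    0x06, 0x11, 0x88, 0x32, 0x06, 0x00, 0x10, 0x00,
    0x10, 0x00, 0x03, 0x04, 0x10, 0x00, 0x00, 0x00
};
```

Both functions in those dumps decode as Type 0; only 00:18.1 carries the multi-function flag.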

-hpa

2010-06-14 18:57:58

by Yinghai Lu

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 11:39 AM, H. Peter Anvin wrote:
> On 06/14/2010 11:34 AM, Bjorn Helgaas wrote:
>>
>> I made the point there that an HT chain may contain multiple HT/PCI
>> host bridges, but you are stuck on the idea that "one HT chain == one
>> PCI root bus."

should be.

>>
>> I have not found the "one PCI host bridge per HT chain" requirement
>> in the HT spec (if you find it, please point me to it).

according to my experience with LinuxBIOS. AMD chipset, nvidia and serverworks (broadcom)

>>
>> If an HT chain may contain multiple HT/PCI host bridges, then it's
>> obvious that the HT host bridge registers read by amd_bus.c don't
>> contain enough information to correctly assign address space to the
>> PCI root buses.

the host bridges is on AMD CPUs,

>>
>
> A HT-to-PCI bridge appears as a PCI-to-PCI bridge (i.e. a Header Type 1
> device), not as a host bridge (a Header Type 0 device).
>
> That is at least the software model as defined.

One HT chain can have several HT devices, and each HT device can be an HT tunnel or an HT bridge.

If it is an HT tunnel, the next device uses the same primary PCI bus number with some add-on device number.
If it is an HT bridge, it behaves like some kind of PCI-to-PCI bridge.

The link between KT890 and vt32551? is some kind of va-link? It is not HT between them.

Somehow the southbridge vt32551 responds for the sound_intel device at 80:01.0, and that device is supposed to be under some PCI bridge.

YH

2010-06-14 19:43:36

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 12:39:54 pm H. Peter Anvin wrote:
> On 06/14/2010 11:34 AM, Bjorn Helgaas wrote:
> >
> > I made the point there that an HT chain may contain multiple HT/PCI
> > host bridges, but you are stuck on the idea that "one HT chain == one
> > PCI root bus."
> >
> > I have not found the "one PCI host bridge per HT chain" requirement
> > in the HT spec (if you find it, please point me to it).
> >
> > If an HT chain may contain multiple HT/PCI host bridges, then it's
> > obvious that the HT host bridge registers read by amd_bus.c don't
> > contain enough information to correctly assign address space to the
> > PCI root buses.
>
> A HT-to-PCI bridge appears as a PCI-to-PCI bridge (i.e. a Header Type 1
> device), not as a host bridge (a Header Type 0 device).
>
> That is at least the software model as defined.

Certainly that's what the HT I/O Link spec (v3.10, sec 7.4) suggests,
and I think I saw hints that AMD chipsets do that. I can't tell from
the HT I/O spec whether it would be an actual defect to use host bridges
instead of PCI-to-PCI bridges, and I can imagine why one might want to
leave an existing PCI host bridge design alone and merely glue on an
HT interface, rather than redesign the bridge register set.

In any case, the VIA chipset in Graham's machine does not have a
PCI-to-PCI bridge leading to bus 80 (see
https://bugzilla.kernel.org/show_bug.cgi?id=16007#c14).
However, ACPI *does* report a PCI host bridge leading to bus 80,
and the apertures it reports seem to be correct (see
https://bugzilla.kernel.org/show_bug.cgi?id=16007#c6).

Bjorn

2010-06-14 20:00:24

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 12:55:44 pm Yinghai Lu wrote:
> On 06/14/2010 11:39 AM, H. Peter Anvin wrote:
> > On 06/14/2010 11:34 AM, Bjorn Helgaas wrote:
> >>
> >> I made the point there that an HT chain may contain multiple HT/PCI
> >> host bridges, but you are stuck on the idea that "one HT chain == one
> >> PCI root bus."
>
> should be.
>
> >> I have not found the "one PCI host bridge per HT chain" requirement
> >> in the HT spec (if you find it, please point me to it).
>
> according to my experience with LinuxBIOS. AMD chipset, nvidia and serverworks (broadcom)

I'm afraid I'm still not convinced.

> >> If an HT chain may contain multiple HT/PCI host bridges, then it's
> >> obvious that the HT host bridge registers read by amd_bus.c don't
> >> contain enough information to correctly assign address space to the
> >> PCI root buses.
>
> the host bridges is on AMD CPUs,

Don't confuse the HT host bridge with the PCI host bridge. The HT I/O spec
is quite clear that it uses "host bridge" to refer to the HT host bridge,
i.e., the interface between CPUs and a HyperTransport link.

I agree that the *HT host bridge* is indeed on the AMD CPU. But that is
certainly not the same as the PCI host bridge that bridges between an HT
link and a PCI bus.

See sections 4.9.4 (HT host bridge) and 7.4 (HT/PCI host bridge), for
example.

Bjorn

2010-06-14 20:09:15

by H. Peter Anvin

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 01:00 PM, Bjorn Helgaas wrote:
>>
>> the host bridges is on AMD CPUs,
>
> Don't confuse the HT host bridge with the PCI host bridge. The HT I/O spec
> is quite clear that it uses "host bridge" to refer to the HT host bridge,
> i.e., the interface between CPUs and a HyperTransport link.
>
> I agree that the *HT host bridge* is indeed on the AMD CPU. But that is
> certainly not the same as the PCI host bridge that bridges between an HT
> link and a PCI bus.
>
> See sections 4.9.4 (HT host bridge) and 7.4 (HT/PCI host bridge), for
> example.
>

From a software point of view the latter is [largely] a PCI-to-PCI
bridge, though; it's not a root-level host bridge in the classical sense
(as noted in section 7.4).

Incidentally, in my copy of HT 3.10b, section 7.4 is marked
"HyperTransport Bridge Headers", and does not use the term "host bridge"
to refer to a secondary PCI bus. Section 4.9.4 is simply marked "Host
Bridge". As such, I think the HT spec is pretty consistent about
unambiguously referring to the HT host bridge when using the term "host
bridge".

-hpa

2010-06-14 20:20:18

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 02:08:37 pm H. Peter Anvin wrote:
> On 06/14/2010 01:00 PM, Bjorn Helgaas wrote:
> >>
> >> the host bridges is on AMD CPUs,
> >
> > Don't confuse the HT host bridge with the PCI host bridge. The HT I/O spec
> > is quite clear that it uses "host bridge" to refer to the HT host bridge,
> > i.e., the interface between CPUs and a HyperTransport link.
> >
> > I agree that the *HT host bridge* is indeed on the AMD CPU. But that is
> > certainly not the same as the PCI host bridge that bridges between an HT
> > link and a PCI bus.
> >
> > See sections 4.9.4 (HT host bridge) and 7.4 (HT/PCI host bridge), for
> > example.
>
> From a software point of view the latter is [largely] a PCI-to-PCI
> bridge, though; it's not a root-level host bridge in the classical sense
> (as noted in section 7.4).

OK, but Graham's system doesn't have anything resembling a PCI-to-PCI
bridge leading to bus 80. So while I agree that in an ideal world,
HT/PCI host bridges might always look like PCI-to-PCI bridges, it
seems this is not the case in practice.

> Incidentally, in my copy of HT 3.10b, section 7.4 is marked
> "HyperTransport Bridge Headers", and does not use the term "host bridge"
> to refer to a secondary PCI bus. Section 4.9.4 is simply marked "Host
> Bridge". As such, I think the HT spec is pretty consistent about
> unambiguously referring to the HT host bridge when using the term "host
> bridge".

Yes, absolutely. My point is that what the HT spec means by "host bridge"
is not the same as what the PCI spec and Linux mean by "PCI host bridge".

Those are two completely separate functions, and I think Yinghai is
confusing them when he says "the host bridge is on the AMD CPU and
amd_bus.c uses its config to determine PCI root bus resources."

Bjorn

2010-06-14 21:11:05

by H. Peter Anvin

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 01:20 PM, Bjorn Helgaas wrote:
>
> OK, but Graham's system doesn't have anything resembling a PCI-to-PCI
> bridge leading to bus 80. So while I agree that in an ideal world,
> HT/PCI host bridges might always look like PCI-to-PCI bridges, it
> seems this is not the case in practice.
>

Invisible PCI bridges have been known to occur in pure PCI space, too.

> Yes, absolutely. My point is that what the HT spec means by "host bridge"
> is not the same as what the PCI spec and Linux mean by "PCI host bridge".

Actually, they're *exactly* the same thing.

-hpa

2010-06-15 01:49:54

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 03:10:20 pm H. Peter Anvin wrote:
> On 06/14/2010 01:20 PM, Bjorn Helgaas wrote:
> >
> > OK, but Graham's system doesn't have anything resembling a PCI-to-PCI
> > bridge leading to bus 80. So while I agree that in an ideal world,
> > HT/PCI host bridges might always look like PCI-to-PCI bridges, it
> > seems this is not the case in practice.
>
> Invisible PCI bridges have been known to occur in pure PCI space, too.

Are you talking about PCI host bridges that don't appear in PCI config
space? I suppose those could be described as "invisible," but since
host bridges aren't architected and their primary interface isn't PCI,
it seems only natural that we'd discover them by a non-PCI mechanism.
They're invisible in PCI terms, but obviously perfectly discoverable
and configurable via ACPI.

If you ask me, it's weird that most x86 chipsets put PCI host bridge
configuration in PCI config space -- it may be convenient in some ways,
but still architecturally strange.

> > Yes, absolutely. My point is that what the HT spec means by "host bridge"
> > is not the same as what the PCI spec and Linux mean by "PCI host bridge".
>
> Actually, they're *exactly* the same thing.

If HT is identical to PCI, I agree "HT host bridge" means the same
as "PCI host bridge" (that's almost too trivial to say :-)).

I guess I'm still dubious that HT is identical to PCI. Since Graham's
box has a single HT chain, we know that all his devices are on HT
chain A. If HT is identical to PCI, that chain must be bus 00. Here
are the relevant parts of the box:

00:00.0 Host bridge: VIA K8T890CF Host Bridge
00:00.1 Host bridge: VIA VT3351 Host Bridge
00:00.2 Host bridge: VIA VT3351 Host Bridge
00:00.3 Host bridge: VIA VT3351 Host Bridge
00:00.4 Host bridge: VIA VT3351 Host Bridge
00:00.7 Host bridge: VIA VT3351 Host Bridge
00:02.0 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 06]
00:03.0 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 05]
00:03.1 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 04]
00:03.2 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 03]
00:03.3 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 02]
00:11.7 Host bridge: VIA VT8251 Ultra VLINK Controller
00:13.0 Host bridge: VIA VT8237A Host Bridge
00:18.0 Host bridge: AMD HyperTransport Technology Configuration
80:01.0 Audio device: VIA VT1708/A High Definition Audio Controller

The question is "how do we get to bus 80?" If everything behind the
AMD HT host bridge is PCI and can be understood solely in terms of
PCI specs, there must be a P2P bridge from bus 00 to bus 80. We
clearly don't have that.
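
To illustrate that check, here is a minimal sketch (not part of the
original message) that reads the "[to bus XX]" annotations from the P2P
bridge lines of the listing above and looks for one that leads to bus 80.
The function names are hypothetical, and the sketch assumes each bridge
forwards only to the single secondary bus shown in the listing.

```python
import re

# P2P bridge lines copied from the lspci listing above.
LSPCI = """\
00:02.0 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 06]
00:03.0 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 05]
00:03.1 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 04]
00:03.2 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 03]
00:03.3 PCI bridge: VIA K8T890 PCI to PCI Bridge [to bus 02]
"""

BRIDGE_RE = re.compile(
    r"([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) PCI bridge: .*\[to bus ([0-9a-f]{2})\]"
)

def secondary_buses(listing):
    """Map each P2P bridge address to the secondary bus it leads to."""
    found = {}
    for line in listing.splitlines():
        m = BRIDGE_RE.match(line)
        if m:
            found[m.group(1)] = int(m.group(2), 16)
    return found

def bridges_reaching(listing, bus):
    """List the bridges whose secondary bus equals `bus` (assumption: one bus each)."""
    return [addr for addr, sec in secondary_buses(listing).items() if sec == bus]

print(sorted(secondary_buses(LSPCI).values()))  # [2, 3, 4, 5, 6]
print(bridges_reaching(LSPCI, 0x80))            # []
```

None of the visible bridges leads to bus 80, which is the gap the
question above points at.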

I suppose one could argue that there's a non-standard P2P bridge
from bus 00 to bus 80, but I can't imagine anybody doing that.
An OS would have to have vendor-specific code just to do PCI
resource management, and that really misses the point of PCI.

It seems more likely to me that one of the VIA host bridges leads
to bus 80. PCI host bridges are not architected, so if this bridge
lives on HT chain 00, and we can think of HT as "not quite PCI,"
then it seems natural that the host bridge would be VIA-specific,
just like it was in pre-HT days.

The underlying question for all of this is "what's the future of
amd_bus.c?" or stated another way, "does AMD HT config and standard
PCI P2P bridge config tell us everything we need to know about
address space routing?"

On this machine, I claim the answer is "no," and therefore we must
use ACPI to discover and configure the host bridges, i.e., we have
to turn on "pci=use_crs". We currently turn it on automatically for
machines from 2008 and newer. I think we need to do it for older
machines, too, perhaps even whenever we use ACPI at all.
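
The difference between the two policies can be sketched as follows. This
is only an illustration of the rule as described in this message, not the
kernel code; the function names and the 2007 example year are
hypothetical.

```python
CRS_CUTOFF_YEAR = 2008  # cutoff stated in the message above

def use_crs_current(bios_year, forced=False):
    """Policy as described: on for 2008 and newer BIOSes, or when the
    user passes pci=use_crs (modelled here by forced=True)."""
    return forced or bios_year >= CRS_CUTOFF_YEAR

def use_crs_proposed(acpi_in_use):
    """Suggested policy: use _CRS whenever ACPI is in use at all."""
    return acpi_in_use

# A hypothetical machine with a 2007 BIOS date:
print(use_crs_current(2007))               # False
print(use_crs_current(2007, forced=True))  # True
print(use_crs_proposed(True))              # True
```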

Bjorn

2010-06-15 01:57:38

by H. Peter Anvin

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On 06/14/2010 06:49 PM, Bjorn Helgaas wrote:

>>
>> Invisible PCI bridges have been known to occur in pure PCI space, too.
>
> Are you talking about PCI host bridges that don't appear in PCI config
> space? I suppose those could be described as "invisible," but since
> host bridges aren't architected and their primary interface isn't PCI,
> it seems only natural that we'd discover them by a non-PCI mechanism.
> They're invisible in PCI terms, but obviously perfectly discoverable
> and configurable via ACPI.

I mean invisible PCI-PCI bridges. Yes, they exist.

> If you ask me, it's weird that most x86 chipsets put PCI host bridge
> configuration in PCI config space -- it may be convenient in some ways,
> but still architecturally strange.

It is only strange because they are non-bridge devices. PCI-Express
fixes that to some degree with the whole "root complex" notion, but
really a PCI host bridge should have been a bridge device from the start.

> I suppose one could argue that there's a non-standard P2P bridge
> from bus 00 to bus 80, but I can't imagine anybody doing that.

Ah, ye of little imagination.

> An OS would have to have vendor-specific code just to do PCI
> resource management, and that really misses the point of PCI.

This really misses the point of HT...

> It seems more likely to me that one of the VIA host bridges leads
> to bus 80. PCI host bridges are not architected, so if this bridge
> lives on HT chain 00, and we can think of HT as "not quite PCI,"
> then it seems natural that the host bridge would be VIA-specific,
> just like it was in pre-HT days.

I think the best word for it is "incompetent braindamage", but that's
just me...

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2010-06-15 15:31:06

by Bjorn Helgaas

Subject: Re: [PATCH -v2] x86, pci: Handle fallout pci devices with peer root bus

On Monday, June 14, 2010 07:56:17 pm H. Peter Anvin wrote:
> On 06/14/2010 06:49 PM, Bjorn Helgaas wrote:
>
> >> Invisible PCI bridges have been known to occur in pure PCI space, too.
> >
> > Are you talking about PCI host bridges that don't appear in PCI config
> > space? I suppose those could be described as "invisible," but since
> > host bridges aren't architected and their primary interface isn't PCI,
> > it seems only natural that we'd discover them by a non-PCI mechanism.
> > They're invisible in PCI terms, but obviously perfectly discoverable
> > and configurable via ACPI.
>
> I mean invisible PCI-PCI bridges. Yes, they exist.

Can you educate me more about these? What specifically is invisible?
Do they appear in config space? Are they in config space but merely
non-standard?

Let's say we have:

1) Invisible P2P bridge from bus X to bus 80.

2) PCI host bridge to bus 80.

Neither appears in PCI config space. In both cases, we would
discover bus 80 by blindly probing buses 00-ff. We could
distinguish them by putting a bus analyzer on bus X: if we
see bus 80 traffic on bus X, we must have case (1). If the
invisible P2P bridge happened to be below a standard P2P bridge,
we could also distinguish them by disabling the standard bridge:
if bus 80 disappeared, we'd know this is also case (1).

But in general, they seem pretty hard to distinguish, so I wonder
if it's possible that we have a case of mistaken identity, and we
only thought we had invisible P2P bridges because we started from
the assumption that systems only had a single PCI host bridge.
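
A rough sketch of the blind probe, to show why it cannot tell the two
cases apart. This is a simulation, not real config-space access:
FAKE_CONFIG and read_vendor_id are hypothetical stand-ins, populated with
three devices from Graham's listing. The only hardware fact relied on is
that a config read of the vendor ID returns 0xffff when nothing responds.

```python
ABSENT = 0xFFFF  # vendor ID read back when no device responds

# Hypothetical stand-in for config space: (bus, device, function) -> vendor ID.
FAKE_CONFIG = {
    (0x00, 0x00, 0): 0x1106,  # 00:00.0, VIA host bridge
    (0x00, 0x18, 0): 0x1022,  # 00:18.0, AMD HT configuration
    (0x80, 0x01, 0): 0x1106,  # 80:01.0, VIA audio device
}

def read_vendor_id(bus, dev, fn):
    """Simulated config read; a real probe would go through the hardware."""
    return FAKE_CONFIG.get((bus, dev, fn), ABSENT)

def blind_probe():
    """Return every bus 00-ff on which some device responds at function 0."""
    buses = []
    for bus in range(0x100):
        if any(read_vendor_id(bus, dev, 0) != ABSENT for dev in range(32)):
            buses.append(bus)
    return buses

print([f"{b:02x}" for b in blind_probe()])  # ['00', '80']
```

The probe reports bus 80 in both case (1) and case (2); nothing in its
result says how transactions reach that bus.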

> > If you ask me, it's weird that most x86 chipsets put PCI host bridge
> > configuration in PCI config space -- it may be convenient in some ways,
> > but still architecturally strange.
>
> It is only strange because they are non-bridge devices. PCI-Express
> fixes that to some degree with the whole "root complex" notion, but
> really a PCI host bridge should have been a bridge device from the start.

Well, even if host bridges had always looked like P2P bridges, we'd
still have the chicken-and-egg problem of knowing where to look for
them. The OS could use the hack of "always assume bus 00 exists and
enumerate it," but then we still have to worry about multiple segments.
So we always need a non-PCI description of where the PCI buses live.

> > I suppose one could argue that there's a non-standard P2P bridge
> > from bus 00 to bus 80, but I can't imagine anybody doing that.
>
> Ah, ye of little imagination.

Heh, nobody's ever accused me of having a vivid imagination :-)

> > An OS would have to have vendor-specific code just to do PCI
> > resource management, and that really misses the point of PCI.
>
> This really misses the point of HT...

I don't follow you here. I was trying to get at the fact that if
there are non-standard P2P bridges, an OS without a device-specific
driver would not even find devices behind the bridge (unless it
has a blind probe hack) and it would not know the bridge apertures,
so it could never change resource assignments of peers of the bridge
or devices behind the bridge.

Does HT change that reasoning somehow?

> > It seems more likely to me that one of the VIA host bridges leads
> > to bus 80. PCI host bridges are not architected, so if this bridge
> > lives on HT chain 00, and we can think of HT as "not quite PCI,"
> > then it seems natural that the host bridge would be VIA-specific,
> > just like it was in pre-HT days.
>
> I think the best word for it is "incompetent braindamage", but that's
> just me...

That's a pretty broad brush. We've dismissed many ACPI issues as
being "incompetent braindamage" on the part of BIOS engineers, only
to find out later that we really had problems in the Linux/ACPI code.
Since I know approximately nothing about the VIA chipset, and I see
plenty of warts in Linux PCI code, I'm not ready to assign blame yet.

Bjorn

2010-06-21 17:27:57

by Bjorn Helgaas

Subject: Re: [Bug 16007] x86/pci Oops with CONFIG_SND_HDA_INTEL

I think the best long-term fix is to always enable "pci=use_crs",
regardless of the BIOS date (currently we only do it for 2008 and
newer). System designers and BIOS writers expect the OS to pay
attention to that information, and indications are that Windows
does use it, so I think we will ultimately be better off if we
use the expected, best-tested path.

However, we have at least one known Linux issue (bug #16228) when
_CRS is enabled, so I'm hesitant to enable it unconditionally at
least until that is resolved.

In the short term, I think we should apply Graham's quirk from
comment #8, which enables pci=use_crs just for his system.

Here's my response to Yinghai's patches. ACPI gives us these resources:
pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff] (bus 00)
pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff] (bus 80)

Yinghai's patch (comment #17, with a v2 posted to the list but not in
the bugzilla) gives us these resources:
pci_bus 0000:00: resource 5 [mem 0x80000000-0xfcffffffff]
pci_bus 0000:80: resource 5 [mem 0x80000000-0xfcffffffff]

I think it's just a bad idea to assign the same range to both buses,
especially when the BIOS is telling us what we should be using.
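
As a small illustration, the sketch below parses the two sets of log
lines quoted above and checks whether every bus was handed the identical
range. The helper names are hypothetical; the log text is copied from
this message.

```python
import re

ACPI_LOG = """\
pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff] (bus 00)
pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff] (bus 80)
"""

PATCH_LOG = """\
pci_bus 0000:00: resource 5 [mem 0x80000000-0xfcffffffff]
pci_bus 0000:80: resource 5 [mem 0x80000000-0xfcffffffff]
"""

def mem_windows(log):
    """Return one inclusive (start, end) range per '[mem ...]' entry."""
    return [(int(start, 16), int(end, 16))
            for start, end in re.findall(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)\]", log)]

def all_identical(windows):
    """True when every bus was given exactly the same range."""
    return len(set(windows)) == 1

def size(window):
    """Number of bytes covered by an inclusive (start, end) range."""
    start, end = window
    return end - start + 1

print(all_identical(mem_windows(PATCH_LOG)))  # True
print(all_identical(mem_windows(ACPI_LOG)))   # False
print(hex(size(mem_windows(ACPI_LOG)[1])))    # 0x4000
```

With the patch, buses 00 and 80 get exactly the same range; ACPI instead
describes a separate 16 KiB window for bus 80.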

I also think it's a mistake to mess with the resource code to deal
with this specific case. A change like that makes resource.c hard
to understand and maintain in the future.