2010-01-13 06:08:55

by Jeff Garrett

Subject: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

Took a stab at getting the right emails. If I missed anyone, sorry...

I have a desktop machine with a radeon HD 4850, and on recent kernels
the radeon driver has failed with the message "No valid linear
framebuffer address". lspci on the broken configuration showed the
first memory region at d0000000 of the radeon card to be ignored. dmesg
showed there to be a host bridge window at that address which was also
ignored.

I haven't quantified "recent" kernel. I was using the Ubuntu/lucid
kernels and it broke in the 2.6.32-* timeline, at which point I switched
to vanilla 2.6.33-rc3 to see if that worked. I could probably bisect it
in a day or two when I get a little time.

Following dmesg's instructions, setting pci=use_crs causes the region
not to be ignored, and my video works again.

I'm attaching a dmesg from the broken & working configurations, lspci
-vvv output from the working configuration, and the output of acpidump.

(Between the failed/working, I also applied the patch at
http://bugzilla.kernel.org/show_bug.cgi?id=14954
to get rid of the ACPI-parsing oops. But that only fixed the oops.)

-Jeff Garrett


Attachments:
(No filename) (1.11 kB)
dmesg.failed (57.96 kB)
dmesg.working (50.53 kB)
lspci (32.91 kB)
acpidump (127.46 kB)

2010-01-13 08:44:08

by Yinghai Lu

Subject: Re: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

On Tue, Jan 12, 2010 at 9:37 PM, Jeff Garrett <[email protected]> wrote:
> Took a stab at getting the right emails. If I missed anyone, sorry...
>
> I have a desktop machine with a radeon HD 4850, and on recent kernels
> the radeon driver has failed with the message "No valid linear
> framebuffer address". lspci on the broken configuration showed the
> first memory region at d0000000 of the radeon card to be ignored. dmesg
> showed there to be a host bridge window at that address which was also
> ignored.
>
> I haven't quantified "recent" kernel. I was using the Ubuntu/lucid
> kernels and it broke in the 2.6.32-* timeline, at which point I switched
> to vanilla 2.6.33-rc3 to see if that worked. I could probably bisect it
> in a day or two when I get a little time.
>
> Following dmesg's instructions, setting pci=use_crs causes the region
> not to be ignored, and my video works again.
>
> I'm attaching a dmesg from the broken & working configurations, lspci
> -vvv output from the working configuration, and the output of acpidump.
>
> (Between the failed/working, I also applied the patch at
> http://bugzilla.kernel.org/show_bug.cgi?id=14954
> to get rid of the ACPI-parsing oops. But that only fixed the oops.)

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000bf790000 (usable)
[ 0.000000] BIOS-e820: 00000000bf790000 - 00000000bf79e000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
[ 0.000000] BIOS-e820: 00000000bf7d0000 - 00000000bf7e0000 (reserved)
[ 0.000000] BIOS-e820: 00000000bf7ec000 - 00000000c0000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[ 0.000000] BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
[ 0.000000] BIOS-e820: 0000000100000000 - 0000000340000000 (usable)
...

[ 0.833443] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
0xe0000000-0xefffffff] (base 0xe0000000)
[ 0.835028] PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved
in ACPI motherboard resources
...

[ 0.847449] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 0.847470] pci_root PNP0A08:00: ignoring host bridge windows from
ACPI; boot with "pci=use_crs" to use them
[ 0.847693] pci_root PNP0A08:00: host bridge window [io
0x0000-0x0cf7] (ignored)
[ 0.847695] pci_root PNP0A08:00: host bridge window [io
0x0d00-0xffff] (ignored)
[ 0.847697] pci_root PNP0A08:00: host bridge window [mem
0x000a0000-0x000bffff] (ignored)
[ 0.847699] pci_root PNP0A08:00: host bridge window [mem
0x000d0000-0x000dffff] (ignored)
[ 0.847701] pci_root PNP0A08:00: host bridge window [mem
0xc0000000-0xdfffffff] (ignored)
[ 0.847703] pci_root PNP0A08:00: host bridge window [mem
0xf0000000-0xfed8ffff] (ignored)
...
[ 0.848025] IOH bus: [00, fb]
[ 0.848026] IOH bus: 00 index 0 io port: [0, ffff]
[ 0.848028] IOH bus: 00 index 1 mmio: [e0000000, fdffffff]
...
[ 0.849289] PCI: peer root bus 00 res updated from pci conf
...
[ 0.849361] pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]
[ 0.849369] pci 0000:04:00.0: reg 18: [mem 0xfbee0000-0xfbeeffff 64bit]
[ 0.849374] pci 0000:04:00.0: reg 20: [io 0xe000-0xe0ff]
[ 0.849381] pci 0000:04:00.0: reg 30: [mem 0xfbec0000-0xfbedffff pref]
[ 0.849423] pci 0000:04:00.1: reg 10: [mem 0xfbefc000-0xfbefffff 64bit]
[ 0.849481] pci 0000:00:07.0: PCI bridge to [bus 04-04]
[ 0.849484] pci 0000:00:07.0: bridge window [io 0xe000-0xefff]
[ 0.849487] pci 0000:00:07.0: bridge window [mem 0xfbe00000-0xfbefffff]
[ 0.849491] pci 0000:00:07.0: bridge window [mem
0xd0000000-0xdfffffff 64bit pref]


it seems the HW IOH can only use [e0000000 - fdffffff] under that bridge...

and _CRS said the peer root bus could use [c0000000 - dfffffff]

could be that we need to check another register to decide if we can use
the values read from the IOH registers directly.

YH
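
A small illustration of the mismatch described above, using only the ranges printed in the dmesg excerpt. The names `struct window` and `window_contains` are hypothetical and exist only for this sketch; this is not the kernel's code, just a containment check showing that the Radeon BAR fits the _CRS window but not the window read from the IOH registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, matching the [start-end] notation in dmesg. */
struct window {
	uint64_t start;
	uint64_t end;
};

/* True when every address of `res` lies inside `win`. */
static bool window_contains(struct window win, struct window res)
{
	assert(win.start <= win.end && res.start <= res.end);
	return res.start >= win.start && res.end <= win.end;
}

/* Values copied from the dmesg excerpt in this message. */
static const struct window ioh_mmio   = { 0xe0000000ULL, 0xfdffffffULL };
static const struct window crs_mmio   = { 0xc0000000ULL, 0xdfffffffULL };
static const struct window radeon_bar = { 0xd0000000ULL, 0xdfffffffULL };
```

With these values, `window_contains(crs_mmio, radeon_bar)` is true and `window_contains(ioh_mmio, radeon_bar)` is false, which matches the observation that the card only works when the _CRS windows are used.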

2010-01-13 08:45:22

by Yinghai Lu

Subject: Re: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

On Wed, Jan 13, 2010 at 12:44 AM, Yinghai Lu <[email protected]> wrote:
> On Tue, Jan 12, 2010 at 9:37 PM, Jeff Garrett <[email protected]> wrote:
>> Took a stab at getting the right emails. If I missed anyone, sorry...
>>
>> I have a desktop machine with a radeon HD 4850, and on recent kernels
>> the radeon driver has failed with the message "No valid linear
>> framebuffer address". lspci on the broken configuration showed the
>> first memory region at d0000000 of the radeon card to be ignored. dmesg
>> showed there to be a host bridge window at that address which was also
>> ignored.
>>
>> I haven't quantified "recent" kernel. I was using the Ubuntu/lucid
>> kernels and it broke in the 2.6.32-* timeline, at which point I switched
>> to vanilla 2.6.33-rc3 to see if that worked. I could probably bisect it
>> in a day or two when I get a little time.
>>
>> Following dmesg's instructions, setting pci=use_crs causes the region
>> not to be ignored, and my video works again.
>>
>> I'm attaching a dmesg from the broken & working configurations, lspci
>> -vvv output from the working configuration, and the output of acpidump.
>>
>> (Between the failed/working, I also applied the patch at
>> http://bugzilla.kernel.org/show_bug.cgi?id=14954
>> to get rid of the ACPI-parsing oops. But that only fixed the oops.)
>
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> [ 0.000000] BIOS-e820: 0000000000100000 - 00000000bf790000 (usable)
> [ 0.000000] BIOS-e820: 00000000bf790000 - 00000000bf79e000 (ACPI data)
> [ 0.000000] BIOS-e820: 00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
> [ 0.000000] BIOS-e820: 00000000bf7d0000 - 00000000bf7e0000 (reserved)
> [ 0.000000] BIOS-e820: 00000000bf7ec000 - 00000000c0000000 (reserved)
> [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
> [ 0.000000] BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
> [ 0.000000] BIOS-e820: 0000000100000000 - 0000000340000000 (usable)
> ...
>
> [ 0.833443] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
> 0xe0000000-0xefffffff] (base 0xe0000000)
> [ 0.835028] PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved
> in ACPI motherboard resources
> ...
>
> [ 0.847449] ACPI: PCI Root Bridge [PCI0] (0000:00)
> [ 0.847470] pci_root PNP0A08:00: ignoring host bridge windows from
> ACPI; boot with "pci=use_crs" to use them
> [ 0.847693] pci_root PNP0A08:00: host bridge window [io
> 0x0000-0x0cf7] (ignored)
> [ 0.847695] pci_root PNP0A08:00: host bridge window [io
> 0x0d00-0xffff] (ignored)
> [ 0.847697] pci_root PNP0A08:00: host bridge window [mem
> 0x000a0000-0x000bffff] (ignored)
> [ 0.847699] pci_root PNP0A08:00: host bridge window [mem
> 0x000d0000-0x000dffff] (ignored)
> [ 0.847701] pci_root PNP0A08:00: host bridge window [mem
> 0xc0000000-0xdfffffff] (ignored)
> [ 0.847703] pci_root PNP0A08:00: host bridge window [mem
> 0xf0000000-0xfed8ffff] (ignored)
> ...
> [ 0.848025] IOH bus: [00, fb]
> [ 0.848026] IOH bus: 00 index 0 io port: [0, ffff]
> [ 0.848028] IOH bus: 00 index 1 mmio: [e0000000, fdffffff]
> ...
> [ 0.849289] PCI: peer root bus 00 res updated from pci conf
> ...
> [ 0.849361] pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]
> [ 0.849369] pci 0000:04:00.0: reg 18: [mem 0xfbee0000-0xfbeeffff 64bit]
> [ 0.849374] pci 0000:04:00.0: reg 20: [io 0xe000-0xe0ff]
> [ 0.849381] pci 0000:04:00.0: reg 30: [mem 0xfbec0000-0xfbedffff pref]
> [ 0.849423] pci 0000:04:00.1: reg 10: [mem 0xfbefc000-0xfbefffff 64bit]
> [ 0.849481] pci 0000:00:07.0: PCI bridge to [bus 04-04]
> [ 0.849484] pci 0000:00:07.0: bridge window [io 0xe000-0xefff]
> [ 0.849487] pci 0000:00:07.0: bridge window [mem 0xfbe00000-0xfbefffff]
> [ 0.849491] pci 0000:00:07.0: bridge window [mem
> 0xd0000000-0xdfffffff 64bit pref]
>
>
> it seems the HW IOH can only use [e0000000 - fdffffff] under that bridge...
>
> and _CRS said the peer root bus could use [c0000000 - dfffffff]
>
> could be that we need to check another register to decide if we can use
> the values read from the IOH registers directly.

can you send out

lspci -tvnn
lspci -vvxxxx

YH

2010-01-13 13:28:30

by Jeff Garrett

Subject: Re: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

On Wed, Jan 13, 2010 at 12:45:17AM -0800, Yinghai Lu wrote:
> On Wed, Jan 13, 2010 at 12:44 AM, Yinghai Lu <[email protected]> wrote:
> > On Tue, Jan 12, 2010 at 9:37 PM, Jeff Garrett <[email protected]> wrote:
> >> Took a stab at getting the right emails. If I missed anyone, sorry...
> >>
> >> I have a desktop machine with a radeon HD 4850, and on recent kernels
> >> the radeon driver has failed with the message "No valid linear
> >> framebuffer address". lspci on the broken configuration showed the
> >> first memory region at d0000000 of the radeon card to be ignored. dmesg
> >> showed there to be a host bridge window at that address which was also
> >> ignored.
> >> ...
> >
> > [ 0.000000] BIOS-provided physical RAM map:
> > [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> > [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> > [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > [ 0.000000] BIOS-e820: 0000000000100000 - 00000000bf790000 (usable)
> > [ 0.000000] BIOS-e820: 00000000bf790000 - 00000000bf79e000 (ACPI data)
> > [ 0.000000] BIOS-e820: 00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
> > [ 0.000000] BIOS-e820: 00000000bf7d0000 - 00000000bf7e0000 (reserved)
> > [ 0.000000] BIOS-e820: 00000000bf7ec000 - 00000000c0000000 (reserved)
> > [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
> > [ 0.000000] BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
> > [ 0.000000] BIOS-e820: 0000000100000000 - 0000000340000000 (usable)
> > ...
> >
> > [ 0.833443] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
> > 0xe0000000-0xefffffff] (base 0xe0000000)
> > [ 0.835028] PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved
> > in ACPI motherboard resources
> > ...
> >
> > [ 0.847449] ACPI: PCI Root Bridge [PCI0] (0000:00)
> > [ 0.847470] pci_root PNP0A08:00: ignoring host bridge windows from
> > ACPI; boot with "pci=use_crs" to use them
> > [ 0.847693] pci_root PNP0A08:00: host bridge window [io
> > 0x0000-0x0cf7] (ignored)
> > [ 0.847695] pci_root PNP0A08:00: host bridge window [io
> > 0x0d00-0xffff] (ignored)
> > [ 0.847697] pci_root PNP0A08:00: host bridge window [mem
> > 0x000a0000-0x000bffff] (ignored)
> > [ 0.847699] pci_root PNP0A08:00: host bridge window [mem
> > 0x000d0000-0x000dffff] (ignored)
> > [ 0.847701] pci_root PNP0A08:00: host bridge window [mem
> > 0xc0000000-0xdfffffff] (ignored)
> > [ 0.847703] pci_root PNP0A08:00: host bridge window [mem
> > 0xf0000000-0xfed8ffff] (ignored)
> > ...
> > [ 0.848025] IOH bus: [00, fb]
> > [ 0.848026] IOH bus: 00 index 0 io port: [0, ffff]
> > [ 0.848028] IOH bus: 00 index 1 mmio: [e0000000, fdffffff]
> > ...
> > [ 0.849289] PCI: peer root bus 00 res updated from pci conf
> > ...
> > [ 0.849361] pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]
> > [ 0.849369] pci 0000:04:00.0: reg 18: [mem 0xfbee0000-0xfbeeffff 64bit]
> > [ 0.849374] pci 0000:04:00.0: reg 20: [io 0xe000-0xe0ff]
> > [ 0.849381] pci 0000:04:00.0: reg 30: [mem 0xfbec0000-0xfbedffff pref]
> > [ 0.849423] pci 0000:04:00.1: reg 10: [mem 0xfbefc000-0xfbefffff 64bit]
> > [ 0.849481] pci 0000:00:07.0: PCI bridge to [bus 04-04]
> > [ 0.849484] pci 0000:00:07.0: bridge window [io 0xe000-0xefff]
> > [ 0.849487] pci 0000:00:07.0: bridge window [mem 0xfbe00000-0xfbefffff]
> > [ 0.849491] pci 0000:00:07.0: bridge window [mem
> > 0xd0000000-0xdfffffff 64bit pref]
> >
> >
> > it seems the HW IOH can only use [e0000000 - fdffffff] under that bridge...
> >
> > and _CRS said the peer root bus could use [c0000000 - dfffffff]
> >
> > could be that we need to check another register to decide if we can use
> > the values read from the IOH registers directly.
>
> can you send out
>
> lspci -tvnn
> lspci -vvxxxx
>
> YH

Certainly. Attached.

-Jeff


Attachments:
(No filename) (3.82 kB)
lspci.1 (2.33 kB)
lspci.2 (218.34 kB)

2010-01-14 07:47:39

by Yinghai Lu

Subject: Re: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

On 01/13/2010 05:24 AM, Jeff Garrett wrote:
> On Wed, Jan 13, 2010 at 12:45:17AM -0800, Yinghai Lu wrote:
>> On Wed, Jan 13, 2010 at 12:44 AM, Yinghai Lu <[email protected]> wrote:
>>> On Tue, Jan 12, 2010 at 9:37 PM, Jeff Garrett <[email protected]> wrote:
>>>> Took a stab at getting the right emails. If I missed anyone, sorry...
>>>>
>>>> I have a desktop machine with a radeon HD 4850, and on recent kernels
>>>> the radeon driver has failed with the message "No valid linear
>>>> framebuffer address". lspci on the broken configuration showed the
>>>> first memory region at d0000000 of the radeon card to be ignored. dmesg
>>>> showed there to be a host bridge window at that address which was also
>>>> ignored.
>>>> ...
>>>
>>> [ 0.000000] BIOS-provided physical RAM map:
>>> [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
>>> [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
>>> [ 0.000000] BIOS-e820: 0000000000100000 - 00000000bf790000 (usable)
>>> [ 0.000000] BIOS-e820: 00000000bf790000 - 00000000bf79e000 (ACPI data)
>>> [ 0.000000] BIOS-e820: 00000000bf79e000 - 00000000bf7d0000 (ACPI NVS)
>>> [ 0.000000] BIOS-e820: 00000000bf7d0000 - 00000000bf7e0000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000bf7ec000 - 00000000c0000000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
>>> [ 0.000000] BIOS-e820: 0000000100000000 - 0000000340000000 (usable)
>>> ...
>>>
>>> [ 0.833443] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
>>> 0xe0000000-0xefffffff] (base 0xe0000000)
>>> [ 0.835028] PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved
>>> in ACPI motherboard resources
>>> ...
>>>
>>> [ 0.847449] ACPI: PCI Root Bridge [PCI0] (0000:00)
>>> [ 0.847470] pci_root PNP0A08:00: ignoring host bridge windows from
>>> ACPI; boot with "pci=use_crs" to use them
>>> [ 0.847693] pci_root PNP0A08:00: host bridge window [io
>>> 0x0000-0x0cf7] (ignored)
>>> [ 0.847695] pci_root PNP0A08:00: host bridge window [io
>>> 0x0d00-0xffff] (ignored)
>>> [ 0.847697] pci_root PNP0A08:00: host bridge window [mem
>>> 0x000a0000-0x000bffff] (ignored)
>>> [ 0.847699] pci_root PNP0A08:00: host bridge window [mem
>>> 0x000d0000-0x000dffff] (ignored)
>>> [ 0.847701] pci_root PNP0A08:00: host bridge window [mem
>>> 0xc0000000-0xdfffffff] (ignored)
>>> [ 0.847703] pci_root PNP0A08:00: host bridge window [mem
>>> 0xf0000000-0xfed8ffff] (ignored)
>>> ...
>>> [ 0.848025] IOH bus: [00, fb]
>>> [ 0.848026] IOH bus: 00 index 0 io port: [0, ffff]
>>> [ 0.848028] IOH bus: 00 index 1 mmio: [e0000000, fdffffff]
>>> ...
>>> [ 0.849289] PCI: peer root bus 00 res updated from pci conf
>>> ...
>>> [ 0.849361] pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]
>>> [ 0.849369] pci 0000:04:00.0: reg 18: [mem 0xfbee0000-0xfbeeffff 64bit]
>>> [ 0.849374] pci 0000:04:00.0: reg 20: [io 0xe000-0xe0ff]
>>> [ 0.849381] pci 0000:04:00.0: reg 30: [mem 0xfbec0000-0xfbedffff pref]
>>> [ 0.849423] pci 0000:04:00.1: reg 10: [mem 0xfbefc000-0xfbefffff 64bit]
>>> [ 0.849481] pci 0000:00:07.0: PCI bridge to [bus 04-04]
>>> [ 0.849484] pci 0000:00:07.0: bridge window [io 0xe000-0xefff]
>>> [ 0.849487] pci 0000:00:07.0: bridge window [mem 0xfbe00000-0xfbefffff]
>>> [ 0.849491] pci 0000:00:07.0: bridge window [mem
>>> 0xd0000000-0xdfffffff 64bit pref]
>>>
>>>
>>> it seems the HW IOH can only use [e0000000 - fdffffff] under that bridge...
>>>
>>> and _CRS said the peer root bus could use [c0000000 - dfffffff]
>>>
>>> could be that we need to check another register to decide if we can use
>>> the values read from the IOH registers directly.
>>
>> can you send out
>>
>> lspci -tvnn
>> lspci -vvxxxx
>>

not sure why [0xc0000000 - 0xcfffffff] should work with 00:07.0

checked with the Intel doc; that is outside the range of the LMMIOL register...

really don't know how it works on your platform.

YH

2010-01-14 18:08:12

by Bjorn Helgaas

Subject: Re: 2.6.33-rc3: pci host bridge windows ignored (works with pci=use_crs)

On Thursday 14 January 2010 12:46:22 am Yinghai Lu wrote:
> On 01/13/2010 05:24 AM, Jeff Garrett wrote:
> > On Wed, Jan 13, 2010 at 12:45:17AM -0800, Yinghai Lu wrote:
> >> On Wed, Jan 13, 2010 at 12:44 AM, Yinghai Lu <[email protected]> wrote:
> >>> On Tue, Jan 12, 2010 at 9:37 PM, Jeff Garrett <[email protected]> wrote:
> >>>> Took a stab at getting the right emails. If I missed anyone, sorry...
> >>>>
> >>>> I have a desktop machine with a radeon HD 4850, and on recent kernels
> >>>> the radeon driver has failed with the message "No valid linear
> >>>> framebuffer address". lspci on the broken configuration showed the
> >>>> first memory region at d0000000 of the radeon card to be ignored. dmesg
> >>>> showed there to be a host bridge window at that address which was also
> >>>> ignored.

> >>> it seems the HW IOH can only use [e0000000 - fdffffff] under that bridge...
> >>>
> >>> and _CRS said the peer root bus could use [c0000000 - dfffffff]
> >>>
> >>> could be that we need to check another register to decide if we can use
> >>> the values read from the IOH registers directly.

> not sure why [0xc0000000 - 0xcfffffff] should work with 00:07.0
>
> checked with the Intel doc; that is outside the range of the LMMIOL register...
>
> really don't know how it works on your platform.

I wonder about that [mem 0xc0000000-0xcfffffff] range, too. But as
far as I can tell, we don't actually try to use that anywhere yet.
(We assign part of that range to the 00:1c.0 bridge to bus 03, but
I don't see any devices actually using it.)

One problem I see is that _CRS includes the [mem 0xd0000000-0xdfffffff]
range, but intel_bus.c didn't find it. We DO need this range for the
Radeon, and the Radeon works with "pci=use_crs", so the host bridge must
be routing it.

I think this must be a detail of the Intel host bridge, not something
specific to Jeff's Dell system. This problem should affect all systems
using that host bridge.

PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000)

ACPI: PCI Root Bridge [PCI0] (0000:00)
pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff]
pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfed8ffff]

IOH bus: [00, fb]
IOH bus: 00 index 1 mmio: [e0000000, fdffffff]

pci 0000:00:07.0: PCI bridge to [bus 04-04]
pci 0000:00:07.0: bridge window [mem 0xd0000000-0xdfffffff 64bit pref]
pci 0000:04:00.0: reg 10: [mem 0xd0000000-0xdfffffff 64bit pref]

Another problem is that intel_bus.c didn't remove the MMCONFIG region
[mem 0xe0000000-0xefffffff] from the window. I think there's code there
to do that, but there must be something wrong with it.

Bjorn
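
The window arithmetic discussed above can be sketched in userspace C. This is only an illustration under stated assumptions: `sketch_subtract_range` is a hypothetical helper written for this example (it is not the kernel's `subtract_range()` implementation), and it uses half-open `[start, end)` ranges with `end == 0` marking an unused slot, mirroring the `vt_end + 1` convention visible in the patch that follows in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open range [start, end); end == 0 marks an unused slot. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Remove [start, end) from every entry in r[0..nr).
 * Returns 0 on success, -1 if a split needs a free slot and none is left.
 */
static int sketch_subtract_range(struct range *r, size_t nr,
				 uint64_t start, uint64_t end)
{
	size_t i, j;

	assert(start < end);
	for (i = 0; i < nr; i++) {
		if (!r[i].end)
			continue;		/* unused slot */
		if (end <= r[i].start || start >= r[i].end)
			continue;		/* no overlap */
		if (start <= r[i].start && end >= r[i].end) {
			r[i].start = 0;		/* fully covered: drop it */
			r[i].end = 0;
		} else if (start <= r[i].start) {
			r[i].start = end;	/* trim the front */
		} else if (end >= r[i].end) {
			r[i].end = start;	/* trim the back */
		} else {
			/* Hole in the middle: keep the front, move the tail. */
			for (j = 0; j < nr; j++)
				if (!r[j].end)
					break;
			if (j == nr)
				return -1;
			r[j].start = end;
			r[j].end = r[i].end;
			r[i].end = start;
		}
	}
	return 0;
}
```

Applied to the numbers in this thread, subtracting the MMCONFIG region [0xe0000000, 0xf0000000) from the IOH window [0xe0000000, 0xfe000000) leaves [0xf0000000, 0xfe000000), that is, [mem 0xf0000000-0xfdffffff] in dmesg notation.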

2010-01-14 22:47:48

by Yinghai Lu

Subject: [PATCH] x86/pci: intel ioh need to subtract mmconf range



Bjorn pointed out we need to remove mmconf range

Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/pci/intel_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/intel_bus.c
+++ linux-2.6/arch/x86/pci/intel_bus.c
@@ -46,6 +46,20 @@ static inline void print_ioh_resources(s

 #define RANGE_NUM 16

+static void __devinit subtract_mmconf(struct range *range, int nr)
+{
+#ifdef CONFIG_PCI_MMCONFIG
+	struct pci_mmcfg_region *cfg;
+
+	if (list_empty(&pci_mmcfg_list))
+		return;
+
+	list_for_each_entry(cfg, &pci_mmcfg_list, list)
+		subtract_range(range, nr, cfg->res.start,
+				cfg->res.end + 1);
+#endif
+}
+
 static void __devinit pci_root_bus_res(struct pci_dev *dev)
 {
 	u16 word;
@@ -96,6 +110,7 @@ static void __devinit pci_root_bus_res(s

 		subtract_range(range, RANGE_NUM, vt_base, vt_end + 1);
 	}
+	subtract_mmconf(range, RANGE_NUM);
 	for (i = 0; i < RANGE_NUM; i++) {
 		if (!range[i].end)
 			continue;
@@ -112,8 +127,18 @@ static void __devinit pci_root_bus_res(s
 	mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
 	pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
 	mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
-	update_res(info, cap_resource(mmioh_base), cap_resource(mmioh_end),
-		   IORESOURCE_MEM, 0);
+	memset(range, 0, sizeof(range));
+	add_range(range, RANGE_NUM, 0, mmioh_base, mmioh_end + 1);
+	/* mmconf could be above 4g */
+	subtract_mmconf(range, RANGE_NUM);
+	for (i = 0; i < RANGE_NUM; i++) {
+		if (!range[i].end)
+			continue;
+
+		update_res(info, cap_resource(range[i].start),
+			   cap_resource(range[i].end - 1),
+			   IORESOURCE_MEM, 0);
+	}

 	print_ioh_resources(info);
 }

2010-01-14 23:10:04

by Bjorn Helgaas

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
>
> Bjorn pointed out we need to remove mmconf range
>
> Signed-off-by: Yinghai Lu <[email protected]>
>
> ---
> arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
> 1 file changed, 27 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/intel_bus.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
> +++ linux-2.6/arch/x86/pci/intel_bus.c
> @@ -46,6 +46,20 @@ static inline void print_ioh_resources(s
>
> #define RANGE_NUM 16
>
> +static void __devinit subtract_mmconf(struct range *range, int nr)
> +{
> +#ifdef CONFIG_PCI_MMCONFIG
> + struct pci_mmcfg_region *cfg;
> +
> + if (list_empty(&pci_mmcfg_list))
> + return;
> +
> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
> + subtract_range(range, nr, cfg->res.start,
> + cfg->res.end + 1);
> +#endif

This can't be right, can it? Let's say the kernel was built with
CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
or the kernel decides not to use MMCONFIG for some other reason.

In that case, the hardware may still be configured to support
MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
leave the window alone. We might assign some of that MMCONFIG
space to a device, but the hardware will route it to MMCONFIG,
not to the device.

Bjorn

> +}
> +
> static void __devinit pci_root_bus_res(struct pci_dev *dev)
> {
> u16 word;
> @@ -96,6 +110,7 @@ static void __devinit pci_root_bus_res(s
>
> subtract_range(range, RANGE_NUM, vt_base, vt_end + 1);
> }
> + subtract_mmconf(range, RANGE_NUM);
> for (i = 0; i < RANGE_NUM; i++) {
> if (!range[i].end)
> continue;
> @@ -112,8 +127,18 @@ static void __devinit pci_root_bus_res(s
> mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
> pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
> mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
> - update_res(info, cap_resource(mmioh_base), cap_resource(mmioh_end),
> - IORESOURCE_MEM, 0);
> + memset(range, 0, sizeof(range));
> + add_range(range, RANGE_NUM, 0, mmioh_base, mmioh_end + 1);
> + /* mmconf could be above 4g */
> + subtract_mmconf(range, RANGE_NUM);
> + for (i = 0; i < RANGE_NUM; i++) {
> + if (!range[i].end)
> + continue;
> +
> + update_res(info, cap_resource(range[i].start),
> + cap_resource(range[i].end - 1),
> + IORESOURCE_MEM, 0);
> + }
>
> print_ioh_resources(info);
> }

2010-01-14 23:39:20

by Yinghai Lu

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
>>
>> Bjorn pointed out we need to remove mmconf range
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>>
>> ---
>> arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
>> 1 file changed, 27 insertions(+), 2 deletions(-)
>>
>> Index: linux-2.6/arch/x86/pci/intel_bus.c
>> ===================================================================
>> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
>> +++ linux-2.6/arch/x86/pci/intel_bus.c
>> @@ -46,6 +46,20 @@ static inline void print_ioh_resources(s
>>
>> #define RANGE_NUM 16
>>
>> +static void __devinit subtract_mmconf(struct range *range, int nr)
>> +{
>> +#ifdef CONFIG_PCI_MMCONFIG
>> + struct pci_mmcfg_region *cfg;
>> +
>> + if (list_empty(&pci_mmcfg_list))
>> + return;
>> +
>> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
>> + subtract_range(range, nr, cfg->res.start,
>> + cfg->res.end + 1);
>> +#endif
>
> This can't be right, can it? Let's say the kernel was built with
> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
> or the kernel decides not to use MMCONFIG for some other reason.
>
> In that case, the hardware may still be configured to support
> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
> leave the window alone. We might assign some of that MMCONFIG
> space to a device, but the hardware will route it to MMCONFIG,
> not to the device.

so if there is mmconf specified, we just skip the whole function?

YH

2010-01-14 23:49:16

by Bjorn Helgaas

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
> > On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
> >>
> >> Bjorn pointed out we need to remove mmconf range
> >>
> >> Signed-off-by: Yinghai Lu <[email protected]>
> >>
> >> ---
> >> arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
> >> 1 file changed, 27 insertions(+), 2 deletions(-)
> >>
> >> Index: linux-2.6/arch/x86/pci/intel_bus.c
> >> ===================================================================
> >> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
> >> +++ linux-2.6/arch/x86/pci/intel_bus.c
> >> @@ -46,6 +46,20 @@ static inline void print_ioh_resources(s
> >>
> >> #define RANGE_NUM 16
> >>
> >> +static void __devinit subtract_mmconf(struct range *range, int nr)
> >> +{
> >> +#ifdef CONFIG_PCI_MMCONFIG
> >> + struct pci_mmcfg_region *cfg;
> >> +
> >> + if (list_empty(&pci_mmcfg_list))
> >> + return;
> >> +
> >> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
> >> + subtract_range(range, nr, cfg->res.start,
> >> + cfg->res.end + 1);
> >> +#endif
> >
> > This can't be right, can it? Let's say the kernel was built with
> > CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
> > or the kernel decides not to use MMCONFIG for some other reason.
> >
> > In that case, the hardware may still be configured to support
> > MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
> > leave the window alone. We might assign some of that MMCONFIG
> > space to a device, but the hardware will route it to MMCONFIG,
> > not to the device.
>
> so if there is mmconf specified, we just skip the whole function?

No, I'm saying that intel_bus.c must ALWAYS remove the MMCONFIG region
from the host bridge apertures, even if Linux isn't using MMCONFIG.

That means intel_bus.c has to be smart enough to figure out on its
own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
to do it, because mmconfig-shared.c might not be there.

Bjorn
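
The failure mode Bjorn describes can be shown with a short sketch. Everything here is an assumption made for illustration: `struct region`, `in_any`, and `offered_to_devices` are hypothetical names, and the "kernel's list" is modelled as a plain array rather than the real `pci_mmcfg_list`. The point is only that an empty list leaves MMCONFIG addresses inside the window even though the hardware still decodes them as MMCONFIG.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, as printed in dmesg. */
struct region {
	uint64_t start;
	uint64_t end;
};

/* True if addr falls inside any of the n regions. */
static bool in_any(const struct region *regs, size_t n, uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= regs[i].start && addr <= regs[i].end)
			return true;
	return false;
}

/*
 * An address is offered to devices when it lies in the host bridge
 * window and is not in any MMCONFIG region the kernel knows about.
 */
static bool offered_to_devices(struct region window,
			       const struct region *known_mmcfg, size_t n,
			       uint64_t addr)
{
	return addr >= window.start && addr <= window.end &&
	       !in_any(known_mmcfg, n, addr);
}
```

With the window [0xe0000000-0xfdffffff] and an empty list (for example after "pci=nommconf"), `offered_to_devices` returns true for 0xe8000000, although the hardware region [0xe0000000-0xefffffff] still claims that address. Passing the hardware region as the known list makes it return false.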

2010-01-15 00:40:29

by Yinghai Lu

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
> On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
>> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
>>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
>>>>
>>>> Bjorn pointed out we need to remove mmconf range
>>>>
>>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>>
>>>> ---
>>>> arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
>>>> 1 file changed, 27 insertions(+), 2 deletions(-)
>>>>
>>>> Index: linux-2.6/arch/x86/pci/intel_bus.c
>>>> ===================================================================
>>>> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
>>>> +++ linux-2.6/arch/x86/pci/intel_bus.c
>>>> @@ -46,6 +46,20 @@ static inline void print_ioh_resources(s
>>>>
>>>> #define RANGE_NUM 16
>>>>
>>>> +static void __devinit subtract_mmconf(struct range *range, int nr)
>>>> +{
>>>> +#ifdef CONFIG_PCI_MMCONFIG
>>>> + struct pci_mmcfg_region *cfg;
>>>> +
>>>> + if (list_empty(&pci_mmcfg_list))
>>>> + return;
>>>> +
>>>> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
>>>> + subtract_range(range, nr, cfg->res.start,
>>>> + cfg->res.end + 1);
>>>> +#endif
>>>
>>> This can't be right, can it? Let's say the kernel was built with
>>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
>>> or the kernel decides not to use MMCONFIG for some other reason.
>>>
>>> In that case, the hardware may still be configured to support
>>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
>>> leave the window alone. We might assign some of that MMCONFIG
>>> space to a device, but the hardware will route it to MMCONFIG,
>>> not to the device.
>>
>> so if there is mmconf specified, we just skip the whole function?
>
> No, I'm saying that intel_bus.c must ALWAYS remove the MMCONFIG region
> from the host bridge apertures, even if Linux isn't using MMCONFIG.
>
> That means intel_bus.c has to be smart enough to figure out on its
> own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
> to do it, because mmconfig-shared.c might not be there.

that seems to go too far...

Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range

Bjorn pointed out we need to remove mmconf range

-v2: if mmconf is not there, get out early.

Signed-off-by: Yinghai Lu <[email protected]>

---
arch/x86/pci/Makefile | 3 ++-
arch/x86/pci/intel_bus.c | 30 ++++++++++++++++++++++++++++--
2 files changed, 30 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/pci/intel_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/intel_bus.c
+++ linux-2.6/arch/x86/pci/intel_bus.c
@@ -46,6 +46,18 @@ static inline void print_ioh_resources(s

#define RANGE_NUM 16

+static void __devinit subtract_mmconf(struct range *range, int nr)
+{
+	struct pci_mmcfg_region *cfg;
+
+	if (list_empty(&pci_mmcfg_list))
+		return;
+
+	list_for_each_entry(cfg, &pci_mmcfg_list, list)
+		subtract_range(range, nr, cfg->res.start,
+				cfg->res.end + 1);
+}
+
 static void __devinit pci_root_bus_res(struct pci_dev *dev)
 {
 	u16 word;
@@ -58,6 +70,9 @@ static void __devinit pci_root_bus_res(s
 	struct range range[RANGE_NUM];
 	int i;

+	if (list_empty(&pci_mmcfg_list))
+		return;
+
 	/* some sys doesn't get mmconf enabled */
 	if (dev->cfg_size < 0x200)
 		return;
@@ -96,6 +111,7 @@ static void __devinit pci_root_bus_res(s

 		subtract_range(range, RANGE_NUM, vt_base, vt_end + 1);
 	}
+	subtract_mmconf(range, RANGE_NUM);
 	for (i = 0; i < RANGE_NUM; i++) {
 		if (!range[i].end)
 			continue;
@@ -112,8 +128,18 @@ static void __devinit pci_root_bus_res(s
 	mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
 	pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
 	mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
-	update_res(info, cap_resource(mmioh_base), cap_resource(mmioh_end),
-			IORESOURCE_MEM, 0);
+	memset(range, 0, sizeof(range));
+	add_range(range, RANGE_NUM, 0, mmioh_base, mmioh_end + 1);
+	/* mmconf could be above 4g */
+	subtract_mmconf(range, RANGE_NUM);
+	for (i = 0; i < RANGE_NUM; i++) {
+		if (!range[i].end)
+			continue;
+
+		update_res(info, cap_resource(range[i].start),
+				cap_resource(range[i].end - 1),
+				IORESOURCE_MEM, 0);
+	}

 	print_ioh_resources(info);
 }
Index: linux-2.6/arch/x86/pci/Makefile
===================================================================
--- linux-2.6.orig/arch/x86/pci/Makefile
+++ linux-2.6/arch/x86/pci/Makefile
@@ -14,7 +14,8 @@ obj-$(CONFIG_X86_VISWS) += visws.o
obj-$(CONFIG_X86_NUMAQ) += numaq_32.o

obj-y += common.o early.o
-obj-y += amd_bus.o bus_numa.o intel_bus.o
+obj-y += amd_bus.o bus_numa.o
+obj-$(CONFIG_PCI_MMCONFIG) += intel_bus.o

ifeq ($(CONFIG_PCI_DEBUG),y)
EXTRA_CFLAGS += -DDEBUG
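
For readers following the patch, the range arithmetic it relies on can be sketched in userspace C. This is only a minimal sketch under stated assumptions, not the kernel's add_range()/subtract_range(): the helper names range_add() and range_subtract() are hypothetical, ranges are treated as half-open [start, end), and an entry with end == 0 is an unused slot, which matches the `if (!range[i].end) continue;` test in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end); end == 0 marks an unused slot. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: store [start, end) in the first free slot.
 * Returns 0 on success (or if the range is empty), -1 if the table is full. */
int range_add(struct range *r, size_t nr, uint64_t start, uint64_t end)
{
	size_t i;

	if (start >= end)
		return 0;

	for (i = 0; i < nr; i++) {
		if (!r[i].end) {
			r[i].start = start;
			r[i].end = end;
			return 0;
		}
	}
	return -1;
}

/* Hypothetical helper: remove [start, end) from every entry, splitting an
 * entry in two when the hole falls strictly inside it. */
int range_subtract(struct range *r, size_t nr, uint64_t start, uint64_t end)
{
	size_t i;

	assert(start < end);

	for (i = 0; i < nr; i++) {
		uint64_t s = r[i].start, e = r[i].end;

		if (!e || end <= s || start >= e)
			continue;			/* unused slot or no overlap */

		if (start <= s && end >= e) {
			r[i].start = 0;			/* entry fully covered */
			r[i].end = 0;
		} else if (start <= s) {
			r[i].start = end;		/* trim the front */
		} else if (end >= e) {
			r[i].end = start;		/* trim the tail */
		} else {
			r[i].end = start;		/* keep the low part */
			if (range_add(r, nr, end, e))	/* high part goes to a free slot */
				return -1;
		}
	}
	return 0;
}
```

With a single aperture of [0xc0000000, 0x100000000) and an MMCONFIG region of [0xe0000000, 0xf0000000) (both made-up values), the subtraction leaves two apertures, one on each side of the hole. That is why the patch loops over all RANGE_NUM entries afterwards and calls update_res() once per surviving entry instead of once for the whole window.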

2010-01-19 19:43:23

by Jeff Garrett

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On Fri, Jan 15, 2010 at 10:14:17AM -0800, Jesse Barnes wrote:
> On Thu, 14 Jan 2010 16:39:13 -0800
> Yinghai Lu <[email protected]> wrote:
>
> > On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
> > > On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
> > >> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
> > >>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
> > >>>>
> > >>>> Bjorn pointed out we need to remove mmconf range
> > >>>>
> > >>>> Signed-off-by: Yinghai Lu <[email protected]>
> > >>>>
> > >>>> ---
...
> > >>>
> > >>> This can't be right, can it? Let's say the kernel was built with
> > >>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
> > >>> or the kernel decides not to use MMCONFIG for some other reason.
> > >>>
> > >>> In that case, the hardware may still be configured to support
> > >>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
> > >>> leave the window alone. We might assign some of that MMCONFIG
> > >>> space to a device, but the hardware will route it to MMCONFIG,
> > >>> not to the device.
> > >>
> > >> so if there is mmconf specified, we just skip the whole function?
> > >
> > > No, I'm saying that intel-bus.c must ALWAYS remove the MMCONFIG
> > > region from the host bridge apertures, even if Linux isn't using
> > > MMCONFIG.
> > >
> > > That means intel-bus.c has to be smart enough to figure out on its
> > > own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
> > > to do it, because mmconfig-shared.c might not be there.
> >
> > That seems to go too far...
> >
> > Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range
> >
> > Bjorn pointed out we need to remove mmconf range
> >
> > -v2: if mmconf is not there, get out early.
> >
> > Signed-off-by: Yinghai Lu <[email protected]>
> >
> > ---
...
>
> This goes against the real intent of intel_bus.c doesn't it? When we
> first added it, the thought was that it would be a purely native way of
> getting at bridge window information and not rely on firmware. If
> you're going to make it dependent on MMCONFIG now, why not trust other
> firmware tables as well, like _CRS?
>
> The MMCONFIG ranges are pretty easy to get at, the public docs have
> info about the registers that control the MMCONFIG decode ranges, so
> you should be able to read them out and add them to this file,
> preserving the original intent.

I did attempt a bisection last week, but my pared down config kept
hitting a sysfs_create_file panic. I didn't succeed.

Should I try the v2 patch above? What tree is it against?

-Jeff

2010-01-19 19:58:23

by Yinghai Lu

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On 01/19/2010 11:42 AM, Jeff Garrett wrote:
> On Fri, Jan 15, 2010 at 10:14:17AM -0800, Jesse Barnes wrote:
>> On Thu, 14 Jan 2010 16:39:13 -0800
>> Yinghai Lu <[email protected]> wrote:
>>
>>> On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
>>>> On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
>>>>> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
>>>>>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
>>>>>>>
>>>>>>> Bjorn pointed out we need to remove mmconf range
>>>>>>>
>>>>>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>>>>>
>>>>>>> ---
> ...
>>>>>>
>>>>>> This can't be right, can it? Let's say the kernel was built with
>>>>>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
>>>>>> or the kernel decides not to use MMCONFIG for some other reason.
>>>>>>
>>>>>> In that case, the hardware may still be configured to support
>>>>>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
>>>>>> leave the window alone. We might assign some of that MMCONFIG
>>>>>> space to a device, but the hardware will route it to MMCONFIG,
>>>>>> not to the device.
>>>>>
>>>>> so if there is mmconf specified, we just skip the whole function?
>>>>
>>>> No, I'm saying that intel-bus.c must ALWAYS remove the MMCONFIG
>>>> region from the host bridge apertures, even if Linux isn't using
>>>> MMCONFIG.
>>>>
>>>> That means intel-bus.c has to be smart enough to figure out on its
>>>> own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
>>>> to do it, because mmconfig-shared.c might not be there.
>>>
>>> That seems to go too far...
>>>
>>> Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range
>>>
>>> Bjorn pointed out we need to remove mmconf range
>>>
>>> -v2: if mmconf is not there, get out early.
>>>
>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>
>>> ---
> ...
>>
>> This goes against the real intent of intel_bus.c doesn't it? When we
>> first added it, the thought was that it would be a purely native way of
>> getting at bridge window information and not rely on firmware. If
>> you're going to make it dependent on MMCONFIG now, why not trust other
>> firmware tables as well, like _CRS?
>>
>> The MMCONFIG ranges are pretty easy to get at, the public docs have
>> info about the registers that control the MMCONFIG decode ranges, so
>> you should be able to read them out and add them to this file,
>> preserving the original intent.
>
> I did attempt a bisection last week, but my pared down config kept
> hitting a sysfs_create_file panic. I didn't succeed.
>
> Should I try the v2 patch above? What tree is it against?

Maybe later, with the -tip tree + pci/linux-next.

YH

2010-01-19 22:52:13

by Bjorn Helgaas

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On Tuesday 19 January 2010 12:57:39 pm Yinghai Lu wrote:
> On 01/19/2010 11:42 AM, Jeff Garrett wrote:
> > On Fri, Jan 15, 2010 at 10:14:17AM -0800, Jesse Barnes wrote:
> >> On Thu, 14 Jan 2010 16:39:13 -0800
> >> Yinghai Lu <[email protected]> wrote:
> >>
> >>> On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
> >>>> On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
> >>>>> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
> >>>>>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
> >>>>>>>
> >>>>>>> Bjorn pointed out we need to remove mmconf range
> >>>>>>>
> >>>>>>> Signed-off-by: Yinghai Lu <[email protected]>
> >>>>>>>
> >>>>>>> ---
> > ...
> >>>>>>
> >>>>>> This can't be right, can it? Let's say the kernel was built with
> >>>>>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
> >>>>>> or the kernel decides not to use MMCONFIG for some other reason.
> >>>>>>
> >>>>>> In that case, the hardware may still be configured to support
> >>>>>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
> >>>>>> leave the window alone. We might assign some of that MMCONFIG
> >>>>>> space to a device, but the hardware will route it to MMCONFIG,
> >>>>>> not to the device.
> >>>>>
> >>>>> so if there is mmconf specified, we just skip the whole function?
> >>>>
> >>>> No, I'm saying that intel-bus.c must ALWAYS remove the MMCONFIG
> >>>> region from the host bridge apertures, even if Linux isn't using
> >>>> MMCONFIG.
> >>>>
> >>>> That means intel-bus.c has to be smart enough to figure out on its
> >>>> own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
> >>>> to do it, because mmconfig-shared.c might not be there.
> >>>
> >>> That seems to go too far...
> >>>
> >>> Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range
> >>>
> >>> Bjorn pointed out we need to remove mmconf range
> >>>
> >>> -v2: if mmconf is not there, get out early.
> >>>
> >>> Signed-off-by: Yinghai Lu <[email protected]>
> >>>
> >>> ---
> > ...
> >>
> >> This goes against the real intent of intel_bus.c doesn't it? When we
> >> first added it, the thought was that it would be a purely native way of
> >> getting at bridge window information and not rely on firmware. If
> >> you're going to make it dependent on MMCONFIG now, why not trust other
> >> firmware tables as well, like _CRS?
> >>
> >> The MMCONFIG ranges are pretty easy to get at, the public docs have
> >> info about the registers that control the MMCONFIG decode ranges, so
> >> you should be able to read them out and add them to this file,
> >> preserving the original intent.
> >
> > I did attempt a bisection last week, but my pared down config kept
> > hitting a sysfs_create_file panic. I didn't succeed.

I don't think there's any need to bisect this; sorry I didn't
mention this earlier.

2.6.32 didn't include intel_bus.c, so the kernel just assumed that
all non-RAM addresses got routed to the PCI bus. This would have
included the [mem 0xd0000000-0xdfffffff] used by your Radeon device,
which explains why it would work there.

After 2.6.32, we added intel_bus.c, which reads some of the host
bridge aperture information from the chipset. This is apparently
missing something, because intel_bus.c didn't find that region,
so Linux thought the Radeon resource was wrong and disabled it,
and that broke the radeon driver.

> > Should I try the v2 patch above? What tree is it against?
>
> maybe later with -tip tree + pci/linux-next.

Yinghai, did you figure out how to discover the [mem 0xd0000000-0xdfffffff]
region in intel_bus.c? Jeff's video isn't going to work without that.

I don't think we have anything that's worth testing yet.

Bjorn
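
The failure mode described above can be modelled as a simple containment check. The following is a rough sketch, not the kernel's resource-claiming code: the type and function names are hypothetical, windows are inclusive as in the dmesg notation, and it assumes a device resource is only accepted when it fits entirely inside a single host bridge window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address window, written like "[mem 0xd0000000-0xdfffffff]". */
struct window {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical check: true if the device resource lies entirely inside
 * one of the host bridge windows that were discovered. */
bool resource_in_windows(const struct window *win, size_t nr,
			 struct window res)
{
	size_t i;

	assert(res.start <= res.end);

	for (i = 0; i < nr; i++) {
		if (res.start >= win[i].start && res.end <= win[i].end)
			return true;
	}
	return false;
}
```

If the list of discovered windows has no entry covering 0xd0000000-0xdfffffff, the check fails for the Radeon resource even though the hardware does route that region to the device, so the resource is treated as invalid. When no window list exists at all, as in 2.6.32, nothing is checked and the resource is accepted.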

2010-01-19 22:55:25

by Yinghai Lu

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On 01/19/2010 02:52 PM, Bjorn Helgaas wrote:
> On Tuesday 19 January 2010 12:57:39 pm Yinghai Lu wrote:
>> On 01/19/2010 11:42 AM, Jeff Garrett wrote:
>>> On Fri, Jan 15, 2010 at 10:14:17AM -0800, Jesse Barnes wrote:
>>>> On Thu, 14 Jan 2010 16:39:13 -0800
>>>> Yinghai Lu <[email protected]> wrote:
>>>>
>>>>> On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
>>>>>> On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
>>>>>>> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
>>>>>>>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
>>>>>>>>>
>>>>>>>>> Bjorn pointed out we need to remove mmconf range
>>>>>>>>>
>>>>>>>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>>>>>>>
>>>>>>>>> ---
>>> ...
>>>>>>>>
>>>>>>>> This can't be right, can it? Let's say the kernel was built with
>>>>>>>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
>>>>>>>> or the kernel decides not to use MMCONFIG for some other reason.
>>>>>>>>
>>>>>>>> In that case, the hardware may still be configured to support
>>>>>>>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
>>>>>>>> leave the window alone. We might assign some of that MMCONFIG
>>>>>>>> space to a device, but the hardware will route it to MMCONFIG,
>>>>>>>> not to the device.
>>>>>>>
>>>>>>> so if there is mmconf specified, we just skip the whole function?
>>>>>>
>>>>>> No, I'm saying that intel-bus.c must ALWAYS remove the MMCONFIG
>>>>>> region from the host bridge apertures, even if Linux isn't using
>>>>>> MMCONFIG.
>>>>>>
>>>>>> That means intel-bus.c has to be smart enough to figure out on its
>>>>>> own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
>>>>>> to do it, because mmconfig-shared.c might not be there.
>>>>>
>>>>> That seems to go too far...
>>>>>
>>>>> Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range
>>>>>
>>>>> Bjorn pointed out we need to remove mmconf range
>>>>>
>>>>> -v2: if mmconf is not there, get out early.
>>>>>
>>>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>>>
>>>>> ---
>>> ...
>>>>
>>>> This goes against the real intent of intel_bus.c doesn't it? When we
>>>> first added it, the thought was that it would be a purely native way of
>>>> getting at bridge window information and not rely on firmware. If
>>>> you're going to make it dependent on MMCONFIG now, why not trust other
>>>> firmware tables as well, like _CRS?
>>>>
>>>> The MMCONFIG ranges are pretty easy to get at, the public docs have
>>>> info about the registers that control the MMCONFIG decode ranges, so
>>>> you should be able to read them out and add them to this file,
>>>> preserving the original intent.
>>>
>>> I did attempt a bisection last week, but my pared down config kept
>>> hitting a sysfs_create_file panic. I didn't succeed.
>
> I don't think there's any need to bisect this; sorry I didn't
> mention this earlier.
>
> 2.6.32 didn't include intel-bus.c, so the kernel just assumed that
> all non-RAM addresses got routed to the PCI bus. This would have
> included the [mem 0xd0000000-0xdfffffff] used by your Radeon device,
> which explains why it would work there.
>
> After 2.6.32, we added intel-bus.c, which reads some of the host
> bridge aperture information from the chipset. This is apparently
> missing something, because intel-bus.c didn't find that region,
> so Linux thought the Radeon resource was wrong and disabled it,
> which broke it.
>
>>> Should I try the v2 patch above? What tree is it against?
>>
>> maybe later with -tip tree + pci/linux-next.
>
> Yinghai, did you figure out how to discover the [mem 0xd0000000-0xdfffffff]
> region in intel-bus.c? Jeff's video isn't going to work without that.
>

I haven't gotten the info from the vendor yet.

It looks like there is some bit that enables those registers; otherwise those registers should not be used.

YH

2010-01-15 18:14:19

by Jesse Barnes

Subject: Re: [PATCH] x86/pci: intel ioh need to subtract mmconf range

On Thu, 14 Jan 2010 16:39:13 -0800
Yinghai Lu <[email protected]> wrote:

> On 01/14/2010 03:49 PM, Bjorn Helgaas wrote:
> > On Thursday 14 January 2010 04:38:08 pm Yinghai Lu wrote:
> >> On 01/14/2010 03:09 PM, Bjorn Helgaas wrote:
> >>> On Thursday 14 January 2010 03:46:35 pm Yinghai Lu wrote:
> >>>>
> >>>> Bjorn pointed out we need to remove mmconf range
> >>>>
> >>>> Signed-off-by: Yinghai Lu <[email protected]>
> >>>>
> >>>> ---
> >>>> arch/x86/pci/intel_bus.c | 29 +++++++++++++++++++++++++++--
> >>>> 1 file changed, 27 insertions(+), 2 deletions(-)
> >>>>
> >>>> Index: linux-2.6/arch/x86/pci/intel_bus.c
> >>>> ===================================================================
> >>>> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
> >>>> +++ linux-2.6/arch/x86/pci/intel_bus.c
> >>>> @@ -46,6 +46,20 @@ static inline void print_ioh_resources(s
> >>>>
> >>>> #define RANGE_NUM 16
> >>>>
> >>>> +static void __devinit subtract_mmconf(struct range *range, int nr)
> >>>> +{
> >>>> +#ifdef CONFIG_PCI_MMCONFIG
> >>>> + struct pci_mmcfg_region *cfg;
> >>>> +
> >>>> + if (list_empty(&pci_mmcfg_list))
> >>>> + return;
> >>>> +
> >>>> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
> >>>> + subtract_range(range, nr, cfg->res.start,
> >>>> + cfg->res.end + 1);
> >>>> +#endif
> >>>
> >>> This can't be right, can it? Let's say the kernel was built with
> >>> CONFIG_PCI_MMCONFIG turned off, or the user used "pci=nommconf",
> >>> or the kernel decides not to use MMCONFIG for some other reason.
> >>>
> >>> In that case, the hardware may still be configured to support
> >>> MMCONFIG, but the pci_mmcfg_list will be empty, so your code will
> >>> leave the window alone. We might assign some of that MMCONFIG
> >>> space to a device, but the hardware will route it to MMCONFIG,
> >>> not to the device.
> >>
> >> so if there is mmconf specified, we just skip the whole function?
> >
> > No, I'm saying that intel-bus.c must ALWAYS remove the MMCONFIG
> > region from the host bridge apertures, even if Linux isn't using
> > MMCONFIG.
> >
> > That means intel-bus.c has to be smart enough to figure out on its
> > own what the MMCONFIG area is. It can't depend on mmconfig-shared.c
> > to do it, because mmconfig-shared.c might not be there.
>
> That seems to go too far...
>
> Subject: [PATCH -v2] x86/pci: intel ioh need to subtract mmconf range
>
> Bjorn pointed out we need to remove mmconf range
>
> -v2: if mmconf is not there, get out early.
>
> Signed-off-by: Yinghai Lu <[email protected]>
>
> ---
> arch/x86/pci/Makefile | 3 ++-
> arch/x86/pci/intel_bus.c | 30 ++++++++++++++++++++++++++++--
> 2 files changed, 30 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/arch/x86/pci/intel_bus.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/intel_bus.c
> +++ linux-2.6/arch/x86/pci/intel_bus.c
> @@ -46,6 +46,18 @@ static inline void print_ioh_resources(s
>
> #define RANGE_NUM 16
>
> +static void __devinit subtract_mmconf(struct range *range, int nr)
> +{
> + struct pci_mmcfg_region *cfg;
> +
> + if (list_empty(&pci_mmcfg_list))
> + return;
> +
> + list_for_each_entry(cfg, &pci_mmcfg_list, list)
> + subtract_range(range, nr, cfg->res.start,
> + cfg->res.end + 1);
> +}
> +
> static void __devinit pci_root_bus_res(struct pci_dev *dev)
> {
> u16 word;
> @@ -58,6 +70,9 @@ static void __devinit pci_root_bus_res(s
> struct range range[RANGE_NUM];
> int i;
>
> + if (list_empty(&pci_mmcfg_list))
> + return;
> +
> /* some sys doesn't get mmconf enabled */
> if (dev->cfg_size < 0x200)
> return;
> @@ -96,6 +111,7 @@ static void __devinit pci_root_bus_res(s
>
> subtract_range(range, RANGE_NUM, vt_base, vt_end + 1);
> }
> + subtract_mmconf(range, RANGE_NUM);
> for (i = 0; i < RANGE_NUM; i++) {
> if (!range[i].end)
> continue;
> @@ -112,8 +128,18 @@ static void __devinit pci_root_bus_res(s
> mmioh_base |= ((u64)(dword & 0x7ffff)) << 32;
> pci_read_config_dword(dev, IOH_LMMIOH_LIMITU, &dword);
> mmioh_end |= ((u64)(dword & 0x7ffff)) << 32;
> - update_res(info, cap_resource(mmioh_base), cap_resource(mmioh_end),
> - IORESOURCE_MEM, 0);
> + memset(range, 0, sizeof(range));
> + add_range(range, RANGE_NUM, 0, mmioh_base, mmioh_end + 1);
> + /* mmconf could be above 4g */
> + subtract_mmconf(range, RANGE_NUM);
> + for (i = 0; i < RANGE_NUM; i++) {
> + if (!range[i].end)
> + continue;
> +
> + update_res(info, cap_resource(range[i].start),
> + cap_resource(range[i].end - 1),
> + IORESOURCE_MEM, 0);
> + }
>
> print_ioh_resources(info);
> }
> Index: linux-2.6/arch/x86/pci/Makefile
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/Makefile
> +++ linux-2.6/arch/x86/pci/Makefile
> @@ -14,7 +14,8 @@ obj-$(CONFIG_X86_VISWS) += visws.o
> obj-$(CONFIG_X86_NUMAQ) += numaq_32.o
>
> obj-y += common.o early.o
> -obj-y += amd_bus.o bus_numa.o intel_bus.o
> +obj-y += amd_bus.o bus_numa.o
> +obj-$(CONFIG_PCI_MMCONFIG) += intel_bus.o
>
> ifeq ($(CONFIG_PCI_DEBUG),y)
> EXTRA_CFLAGS += -DDEBUG
>

This goes against the real intent of intel_bus.c doesn't it? When we
first added it, the thought was that it would be a purely native way of
getting at bridge window information and not rely on firmware. If
you're going to make it dependent on MMCONFIG now, why not trust other
firmware tables as well, like _CRS?

The MMCONFIG ranges are pretty easy to get at, the public docs have
info about the registers that control the MMCONFIG decode ranges, so
you should be able to read them out and add them to this file,
preserving the original intent.

--
Jesse Barnes, Intel Open Source Technology Center