Hello.
Some time ago I bisected a commit that was making my OKI Anima 3300 laptop
hang during boot.
The offending commit was:
------------
commit 12c22d6ef299ccf0955e5756eb57d90d7577ac68
Author: Linus Torvalds <[email protected]>
Date: Wed Mar 26 11:22:40 2008 -0700
Revert "PCI: remove transparent bridge sizing"
This reverts commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f, which
caused various interesting problems for people, including wrong resource
allocations. See for example bugzilla entry "2.6.25-rc2: ohci1394
problem (MMIO broken)" at
http://bugzilla.kernel.org/show_bug.cgi?id=10080
[...]
------------
It happened sometime after 2.6.25-rc7; since then I've been living on
hand-patched kernels; I can't boot any official kernel from Ubuntu or
Gentoo without patching them. For Ubuntu this would mean rebuilding the
LiveCD and then manually patching every kernel update and restricted
modules distribution...
I filed this bug:
http://bugzilla.kernel.org/show_bug.cgi?id=11054
that seems to be getting no attention at all. It includes detailed lspci
output.
The specific bridge that fails (as far as I can tell from putting
printk() calls into the kernel) is:
00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2)
(prog-if 01 [Subtractive decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Bus: primary=00, secondary=04, subordinate=08, sec-latency=64
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: c3000000-c30fffff
Prefetchable memory behind bridge: fff00000-000fffff
Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr+ DiscTmrStat- DiscTmrSERREn-
Capabilities: [b8] Subsystem: Gammagraphx, Inc. Device 0000
Capabilities: [8c] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
One odd detail of this machine (I don't know whether it's
related at all) is that the onboard Ethernet controller is reported as
some kind of bridge:
00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)
Subsystem: CLEVO/KAPOK Computer Device 5403
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0 (250ns min, 5000ns max)
Interrupt: pin A routed to IRQ 23
Region 0: Memory at c0007000 (32-bit, non-prefetchable)
[size=4K]
Region 1: I/O ports at 30b8 [size=8]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable+ DSel=0 DScale=0 PME-
Kernel driver in use: forcedeth
Since this issue means that I either scrap my laptop or suffer a forced
life of kernel micromanagement no matter which distro I choose, I'm very
interested in helping with this. I'll try to find the problem myself if
someone points me at what to look for.
Regards, and thanks in advance,
Juan Jesus.
On 11/10/2008 10:12 AM, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> Hello.
>
> Some time ago I bisected a commit that was making my OKI Anima 3300 laptop
> hang during boot.
Doesn't pci=norom help in your case? There was a patch which tried to resolve
this issue in a different manner, but it was reverted too. This boot parameter
was introduced as a replacement IIRC.
> The offending commit was:
>
> ------------
> commit 12c22d6ef299ccf0955e5756eb57d90d7577ac68
> Author: Linus Torvalds <[email protected]>
> Date: Wed Mar 26 11:22:40 2008 -0700
>
> Revert "PCI: remove transparent bridge sizing"
>
> This reverts commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f, which
> caused various interesting problems for people, including wrong
> resource
> allocations. See for example bugzilla entry "2.6.25-rc2: ohci1394
> problem (MMIO broken)" at
>
> http://bugzilla.kernel.org/show_bug.cgi?id=10080
>
> [...]
On Mon, Nov 10, 2008 at 11:14:11AM +0100, Jiri Slaby wrote:
> On 11/10/2008 10:12 AM, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > Hello.
> >
> > Some time ago I bisected a commit that was making my OKI Anima 3300 laptop
> > hang during boot.
>
> Doesn't pci=norom help in your case? There was a patch which tried to resolve
> this issue in a different manner, but it was reverted too. This boot parameter
> was introduced as a replacement IIRC.
As the unfortunate author of both of the reverted patches and author
of the pci=norom patch I can confirm that Jiri is correct. The issues
that the patches addressed (with some unintended side effects caused
by the reverted attempts) were PCI resource allocation failures observed
during PCI hotplug. We were not seeing or trying to address boot-time
PCI resource allocation failures or hangs.
Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc
Hi again.
> -----Original Message-----
> From: Gary Hade [mailto:[email protected]]
>
> > Doesn't pci=norom help in your case? There was a patch
> which tried to
> > resolve this issue in a different manner, but it was reverted too.
> > This boot parameter was introduced as a replacement IIRC.
>
> As the unfortunate author of both of the reverted patches and
> author of the pci=norom patch I can confirm that Jiri is
> correct. The issues that the patches addressed (with some
> unintended side effects caused by the reverted attempts) were
> PCI resource allocation failures observed during PCI hotplug.
> We were not seeing or trying to address boot-time PCI
> resource allocation failures or hangs.
>
> Gary
I've tested pci=norom with the Ubuntu 8.10 AMD64 kernel, with no
effect. I'll try to download and test the 32 bit version too, to check
whether it has anything to do with the size of the resource_size_t type
(defined in linux/types.h) being u64. Perhaps it's u32 in a 32 bit
architecture.
A problem with this bridge in my lspci info is that both the I/O and the
prefetchable memory ranges behind the bridge have an end address BELOW the
start address.
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: c3000000-c30fffff
Prefetchable memory behind bridge: fff00000-000fffff
I don't know whether these ranges are supposed to encompass the BIOS ROM
or whatever (and thus your pci=norom option). Another explanation for
the hang may lie in the definition of resource_size() (defined in
linux/ioport.h):
static inline resource_size_t resource_size(struct resource *res)
{
	return res->end - res->start + 1;
}
As you can see, the subtraction will overflow due to the end address
being BELOW the start address, and the resulting size will be different
when resource_size_t is u64 than it would be if it's u32.
Moreover, I suppose that the intended width of the size for the I/O range
would be u16.
I mean: see this table of calculations:
                    Size, current u64     Size, u16/u32 clamp
I/O                 0xFFFFFFFFFFFF2000    0x2000
Prefetchable mem    0xFFFFFFFF00200000    0x200000
Where "clamping" means masking (AND-ing) with 0xffff for the u16 clamp or
0xffffffff for the u32 clamp (at least when the end address is below the
start address).
You can see that, regardless of whether there's a ROM in the address range
or not, the size calculation (at least judging from the simple arithmetic
in resource_size()) seems to be wrong. It produces gigantic
address range sizes, whereas the clamped values (8KB I/O, 2MB
prefetchable mem) seem to be far more sensible.
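The arithmetic in the table can be reproduced in user space. This is only a minimal sketch, not kernel code: size_u64(), size_u16(), size_u32() and check_table() are hypothetical names, with size_u64() mirroring the expression in resource_size() for a 64-bit resource_size_t and the narrower casts playing the role of the "clamp".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for resource_size() with a 64-bit resource_size_t. */
static uint64_t size_u64(uint64_t start, uint64_t end)
{
	return end - start + 1;	/* wraps modulo 2^64 when end < start */
}

/* Same expression truncated to 16 bits (the "u16 clamp"). */
static uint16_t size_u16(uint64_t start, uint64_t end)
{
	return (uint16_t)size_u64(start, end);
}

/* Same expression truncated to 32 bits (the "u32 clamp"). */
static uint32_t size_u32(uint64_t start, uint64_t end)
{
	return (uint32_t)size_u64(start, end);
}

/* Checks the two rows of the table against the lspci ranges. */
static void check_table(void)
{
	/* I/O behind bridge: 0000f000-00000fff */
	assert(size_u64(0xf000, 0x0fff) == 0xFFFFFFFFFFFF2000ULL);
	assert(size_u16(0xf000, 0x0fff) == 0x2000);

	/* Prefetchable memory behind bridge: fff00000-000fffff */
	assert(size_u64(0xfff00000ULL, 0x000fffffULL) == 0xFFFFFFFF00200000ULL);
	assert(size_u32(0xfff00000ULL, 0x000fffffULL) == 0x00200000);
}
```

Calling check_table() from a small test program passes silently, which matches the 8KB and 2MB figures above.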
I tried some preliminary tests by putting a condition inside
resource_size() to detect "reversed ranges" and clamp the size by
AND-ing it with (res->flags & IORESOURCE_IO) ? 0xffffUL : 0xffffffffUL, but
to no avail so far. The kernel printed the I/O mapping in its messages, but
it seems to have choked on the memory range. I have to investigate
more.
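For clarity, the experiment has roughly the following shape. This is a user-space sketch under assumptions, not the patch I actually booted: struct mock_resource is a cut-down stand-in for struct resource, resource_size_masked() and check_masked() are hypothetical names, and the IORESOURCE_* values are the ones I believe linux/ioport.h uses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match linux/ioport.h; treat the values as an assumption. */
#define IORESOURCE_IO  0x00000100UL
#define IORESOURCE_MEM 0x00000200UL

/* Cut-down mock of struct resource with only the fields used here. */
struct mock_resource {
	uint64_t start;
	uint64_t end;
	unsigned long flags;
};

/*
 * For a "reversed" range (end < start), mask the wrapped result down to
 * the width of the address space the range lives in. Normal ranges are
 * returned unchanged.
 */
static uint64_t resource_size_masked(const struct mock_resource *res)
{
	uint64_t size = res->end - res->start + 1;

	if (res->end < res->start)
		size &= (res->flags & IORESOURCE_IO) ? 0xffffULL : 0xffffffffULL;
	return size;
}

/* Exercises the three bridge ranges from the lspci output. */
static void check_masked(void)
{
	struct mock_resource io   = { 0xf000, 0x0fff, IORESOURCE_IO };
	struct mock_resource mem  = { 0xc3000000ULL, 0xc30fffffULL, IORESOURCE_MEM };
	struct mock_resource pref = { 0xfff00000ULL, 0x000fffffULL, IORESOURCE_MEM };

	assert(resource_size_masked(&io) == 0x2000);
	assert(resource_size_masked(&mem) == 0x100000);
	assert(resource_size_masked(&pref) == 0x200000);
}
```

Note that this only changes the computed size; it says nothing about whether a reversed range should be sized at all.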
I don't know if what I've written above gives you any clue about my
issue.
Regards, and thanks in advance,
Juan Jesus.
GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> resource_size_t type
> (defined in linux/types.h) being u64. Perhaps it's u32 in a 32 bit
> architecture.
Unless you have RESOURCES_64BIT=y which is the default on x86_32 now.
> -----Original Message-----
> From: Jiri Slaby [mailto:[email protected]]
>
> GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > resource_size_t type
> > (defined in linux/types.h) being u64. Perhaps it's u32 in a 32 bit
> > architecture.
>
> Unless you have RESOURCES_64BIT=y which is the default on x86_32 now.
Ugh. Knowing this will save me from downloading, burning and testing the
32 bit Ubuntu distro, whose 2.6.27 kernel will surely use that default
configuration.
I'll perform more tests, with more debug printk() calls, anyway.
Do you remember if the cases Gary's patches were trying to fix included
ranges rolling past the 32-bit address limit (end address below start
address)?
Regards,
Juan Jesus.
On Tue, Nov 11, 2008 at 12:43:28PM +0100, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > -----Original Message-----
> > From: Jiri Slaby [mailto:[email protected]]
> >
> > GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > > resource_size_t type
> > > (defined in linux/types.h) being u64. Perhaps it's u32 in a 32 bit
> > > architecture.
> >
> > Unless you have RESOURCES_64BIT=y which is the default on x86_32 now.
>
> Ugh. Knowing this will save me from downloading, burning and testing the
> 32 bit Ubuntu distro, whose 2.6.27 kernel will surely use that default
> configuration.
>
> I'll perform more tests, with more debug printk() calls, anyway.
>
> Do you remember if the cases Gary's patches were trying to fix included
> ranges rolling past the 32-bit address limit (end address below start
> address)?
No, my patches were not trying to fix anything like that.
I looked at the `lspci -vvv` output for some transparent bridges on
one of our systems and found that the messed up looking ranges are
not unique to your system. We also see that on our systems. I checked
the lspci code and found that the displayed ranges are based on base
and limit register values obtained directly from PCI config space
for the device. I also noticed that with -vvv it will display values even
if they do not represent valid ranges such as you might expect for the
contents of base and limit registers on transparent bridges. This
caused me to peek at the lspci man page which says:
"-vvv Be even more verbose and display everything we are able
to parse, even if it doesn’t look interesting at all
(e.g., undefined memory regions)."
You must be getting some of that "even if it doesn't look interesting
at all" stuff. :)
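To illustrate how such ranges come straight out of the registers, here is a minimal user-space sketch, assuming the standard PCI-to-PCI bridge layout for the base and limit registers and ignoring the upper 16/32-bit extension registers. The names struct window, decode_io(), decode_mem() and window_enabled() are hypothetical, and the register values in check_lspci_ranges() are inferred from the lspci output quoted in this thread, not read from the hardware.

```c
#include <assert.h>
#include <stdint.h>

struct window {
	uint32_t base;
	uint32_t limit;
};

/* I/O base/limit are 8-bit registers; bits 7:4 hold address bits 15:12. */
static struct window decode_io(uint8_t base_reg, uint8_t limit_reg)
{
	struct window w;

	w.base  = (uint32_t)(base_reg & 0xf0) << 8;
	w.limit = ((uint32_t)(limit_reg & 0xf0) << 8) | 0x0fff;
	return w;
}

/* Memory base/limit are 16-bit registers; bits 15:4 hold address bits 31:20. */
static struct window decode_mem(uint16_t base_reg, uint16_t limit_reg)
{
	struct window w;

	w.base  = (uint32_t)(base_reg & 0xfff0) << 16;
	w.limit = ((uint32_t)(limit_reg & 0xfff0) << 16) | 0x000fffff;
	return w;
}

/* A window whose limit lies below its base forwards nothing. */
static int window_enabled(struct window w)
{
	return w.limit >= w.base;
}

/* Register values here are inferred from the quoted lspci output. */
static void check_lspci_ranges(void)
{
	struct window io   = decode_io(0xf0, 0x00);
	struct window mem  = decode_mem(0xc300, 0xc300);
	struct window pref = decode_mem(0xfff0, 0x0000);

	assert(io.base == 0xf000 && io.limit == 0x0fff);
	assert(mem.base == 0xc3000000 && mem.limit == 0xc30fffff);
	assert(pref.base == 0xfff00000 && pref.limit == 0x000fffff);

	/* Only the non-prefetchable memory window is a valid range. */
	assert(!window_enabled(io));
	assert(window_enabled(mem));
	assert(!window_enabled(pref));
}
```

With those inferred values, the odd-looking I/O and prefetchable ranges are just a base register set above the limit register.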
Gary
On Mon, Nov 10, 2008 at 09:58:16AM -0800, Gary Hade wrote:
> On Mon, Nov 10, 2008 at 11:14:11AM +0100, Jiri Slaby wrote:
> > On 11/10/2008 10:12 AM, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > > Hello.
> > >
> > > Some time ago I bisected a commit that was making my OKI Anima 3300 laptop
> > > hang during boot.
> >
> > Doesn't pci=norom help in your case? There was a patch which tried to resolve
> > this issue in a different manner, but it was reverted too. This boot parameter
> > was introduced as a replacement IIRC.
>
> As the unfortunate author of both of the reverted patches and author
> of the pci=norom patch I can confirm that Jiri is correct. The issues
> that the patches addressed (with some unintended side effects caused
> by the reverted attempts) were PCI resource allocation failures observed
> during PCI hotplug. We were not seeing or trying to address boot-time
> PCI resource allocation failures or hangs.
Correction. We were not trying to address boot-time hangs but I
believe we may have been trying to address expansion ROM related
PCI resource allocation failures that we were seeing during boot
with certain PCI cards. However, I doubt that this is relevant
to your problem. The transparent bridge sizing removal change is
probably a red herring.
Gary
> -----Original Message-----
> From: Gary Hade [mailto:[email protected]]
>
> Correction. We were not trying to address boot-time hangs
> but I believe we may have been trying to address expansion
> ROM related PCI resource allocation failures that we were
> seeing during boot with certain PCI cards. However, I doubt
> that this is relevant to your problem. The transparent
> bridge sizing removal change is probably a red herring.
After reading a little bit on how PCI and PCI-to-PCI bridges work, and
how they're handled in Linux (http://tldp.org/LDP/tlk/dd/pci.html), I
now know that ranges in the bridge (either I/O, MMIO or prefetchable
mem) are simply disabled when end < start, and that's the original
configuration that the BIOS enforces when the bridge is not sized by
Linux.
After inserting printk() calls, I see that the hang actually happens while
the positive decoding of the I/O range in the bridge is being activated
in pci_setup_bridge(), sometime in between the writes to the I/O
base/limit registers of the bridge; I don't remember exactly which was
the last pci_write_config_dword() that allowed the next printk() to
succeed. I'll look at it tonight again, but I suspect that the final
enabling write (the one that updates the PCI_IO_BASE_UPPER16 register
with its final value) was the one hanging the machine.
The I/O range being activated was the one in the range 0x1000-0x1fff,
apparently correctly sized to accommodate the two I/O ranges
(0x1000-0x10ff, 0x1400-0x14ff) assigned to the CardBus bridge on the
secondary bus.
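For reference, the register encoding of that window can be worked out by hand. The following is a minimal user-space sketch, assuming the standard PCI-to-PCI bridge layout (the I/O base and limit bytes each carry address bits 15:12 in their upper nibble, so the window has 4KB granularity). encode_io_window(), window_covers() and check_io_window() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Low 16 bits of the dword at PCI_IO_BASE: the I/O base byte in bits 7:0
 * and the I/O limit byte in bits 15:8.
 */
static uint16_t encode_io_window(uint32_t start, uint32_t end)
{
	uint16_t base_reg  = (start >> 8) & 0x00f0;	/* address bits 15:12 */
	uint16_t limit_reg = end & 0xf000;		/* address bits 15:12 */

	return base_reg | limit_reg;
}

/* Does [start, end] lie completely inside the window [wstart, wend]? */
static int window_covers(uint32_t wstart, uint32_t wend,
			 uint32_t start, uint32_t end)
{
	return start >= wstart && end <= wend;
}

/* Checks the 0x1000-0x1fff window against the two CardBus I/O ranges. */
static void check_io_window(void)
{
	/* Base byte 0x10, limit byte 0x10. */
	assert(encode_io_window(0x1000, 0x1fff) == 0x1010);

	assert(window_covers(0x1000, 0x1fff, 0x1000, 0x10ff));
	assert(window_covers(0x1000, 0x1fff, 0x1400, 0x14ff));
}
```

So a single 4KB window is the smallest one the bridge can express that still contains both CardBus ranges.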
One theory is that my system has something actually mapped to that I/O
range in the root PCI bus. When only subtractive decoding is in place
(the I/O range isn't activated), access to the secondary bus behind the
PCI-to-PCI bridge is done when the transaction isn't claimed by any
device in the root bus, after what the PCI docs describe as a 4-cycle
timeout. When the I/O range is activated, that range is positively
decoded by the bridge, which tries to claim the transaction before the
timeout. Perhaps two devices (the bridge and the unknown device on the
root bus) conflict when claiming the same transaction?
Another possibility could be that activating the I/O range disables the
negative decoding in the secondary-to-primary direction of the bridge for
that I/O range. Perhaps some device behind the bridge depends on being
able to forward transactions to the primary bus on that I/O range, but
it's disallowed after the range is configured. For me this seems rather
unlikely, because of the nature of the devices behind the bridge.
I'll look at it more closely again, and I will test whether commenting
out the I/O range sizing (leaving the other ranges to be sized) is
enough to allow the system to run. If so, is there any way to use a
system-specific quirk in order to remove the PCI-to-PCI bridge I/O range
from being sized/activated?
Best regards,
Juan Jesus.
On Wed, Nov 12, 2008 at 09:50:14AM +0100, GARCIA DE SORIA LUCENA, JUAN JESUS wrote:
> > -----Original Message-----
> > From: Gary Hade [mailto:[email protected]]
> >
> > Correction. We were not trying to address boot-time hangs
> > but I believe we may have been trying to address expansion
> > ROM related PCI resource allocation failures that we were
> > seeing during boot with certain PCI cards. However, I doubt
> > that this is relevant to your problem. The transparent
> > bridge sizing removal change is probably a red herring.
>
> After reading a little bit on how PCI and PCI-to-PCI bridges work, and
> how they're handled in Linux (http://tldp.org/LDP/tlk/dd/pci.html), I
> now know that ranges in the bridge (either I/O, MMIO or prefetchable
> mem) are simply disabled when end < start, and that's the original
> configuration that the BIOS enforces when the bridge is not sized by
> Linux.
>
> After inserting printk() calls, I see that the hang actually happens while
> the positive decoding of the I/O range in the bridge is being activated
> in pci_setup_bridge(), sometime in between the writes to the I/O
> base/limit registers of the bridge; I don't remember exactly which was
> the last pci_write_config_dword() that allowed the next printk() to
> succeed. I'll look at it tonight again, but I suspect that the final
> enabling write (the one that updates the PCI_IO_BASE_UPPER16 register
> with its final value) was the one hanging the machine.
>
> The I/O range being activated was the one in the range 0x1000-0x1fff,
> apparently correctly sized to accommodate the two I/O ranges
> (0x1000-0x10ff, 0x1400-0x14ff) assigned to the CardBus bridge on the
> secondary bus.
>
> One theory is that my system has something actually mapped to that I/O
> range in the root PCI bus. When only subtractive decoding is in place
> (the I/O range isn't activated), access to the secondary bus behind the
> PCI-to-PCI bridge is done when the transaction isn't claimed by any
> device in the root bus, after what the PCI docs describe as a 4-cycle
> timeout. When the I/O range is activated, that range is positively
> decoded by the bridge, which tries to claim the transaction before the
> timeout. Perhaps two devices (the bridge and the unknown device on the
> root bus) conflict when claiming the same transaction?
>
> Another possibility could be that activating the I/O range disables the
> negative decoding in the secondary-to-primary sense of the bridge for
> that I/O range. Perhaps some device behind the bridge depends on being
> able to forward transactions to the primary bus on that I/O range, but
> it's disallowed after the range is configured. For me this seems rather
> unlikely, because of the nature of the devices behind the bridge.
>
> I'll look at it more closely again, and I will test whether commenting
> out the I/O range sizing (leaving the other ranges to be sized) is
> enough to allow the system to run. If so, is there any way to use a
> system-specific quirk in order to remove the PCI-to-PCI bridge I/O range
> from being sized/activated?
Yes, there are already examples of this sort of thing in the PCI core.
See drivers/pci/quirks.c. Of course, you will likely have to present
a very good argument as to why such a change is really necessary.
BTW, I just noticed that you have not been copying [email protected]
or the PCI maintainer Jesse Barnes <[email protected]> where you
will probably get more seasoned advice.
Gary