2005-12-02 21:02:08

by Jeff Garzik

Subject: [2.6.15-rc4] oops in acpiphp

Bootdata ok (command line is ro root=LABEL=/ nogui console=ttyS0,38400)
Linux version 2.6.15-rc4acpiphp1-gdeda4987 ([email protected]) (gcc version 4.0.2 20051125 (Red Hat 4.0.2-8)) #1 SMP Fri Dec 2 15:35:47 EST 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000007fff9300 (usable)
BIOS-e820: 000000007fff9300 - 0000000080000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 0-40000000
SRAT: Node 1 PXM 1 40000000-80000000
Bootmem setup node 0 0000000000000000-0000000040000000
Bootmem setup node 1 0000000040000000-000000007fff9000
Nvidia board detected. Ignoring ACPI timer override.
ACPI: PM-Timer IO Port: 0xf808
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Processor #2 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Processor #3 15:1 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x09] address[0xf2700000] gsi_base[24])
IOAPIC[1]: apic_id 9, version 17, address 0xf2700000, GSI 24-27
ACPI: IOAPIC (id[0x0a] address[0xf2701000] gsi_base[28])
IOAPIC[2]: apic_id 10, version 17, address 0xf2701000, GSI 28-31
ACPI: IOAPIC (id[0x0b] address[0xf2800000] gsi_base[32])
IOAPIC[3]: apic_id 11, version 17, address 0xf2800000, GSI 32-55
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 88000000 (gap: 80000000:7ec00000)
Checking aperture...
CPU 0: aperture @ 42a0000000 size 32 MB
Aperture from northbridge cpu 0 too small (32 MB)
No AGP bridge found
Built 2 zonelists
[...]
ACPI: bus type pci registered
PCI: Using configuration type 1
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ACPI: Subsystem revision 20050902
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Transparent bridge - 0000:00:09.0
ACPI: PCI Root Bridge [PCI1] (0000:40)
ACPI: PCI Root Bridge [PCI2] (0000:80)
ACPI: PCI Interrupt Link [LNKA] (IRQs 16 17 18 19) *5
ACPI: PCI Interrupt Link [LNKB] (IRQs 16 17 18 19) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 16 17 18 19) *5
ACPI: PCI Interrupt Link [LNKD] (IRQs 16 17 18 19) *10
ACPI: PCI Interrupt Link [LNKW] (IRQs *48), disabled.
ACPI: PCI Interrupt Link [LNKX] (IRQs *49), disabled.
ACPI: PCI Interrupt Link [LNKY] (IRQs *50), disabled.
ACPI: PCI Interrupt Link [LNKZ] (IRQs *51), disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 20 21 22 23) *11
ACPI: PCI Interrupt Link [LSB0] (IRQs 20 21 22 23) *5
ACPI: PCI Interrupt Link [LSB2] (IRQs 20 21 22 23) *10
ACPI: PCI Interrupt Link [LMAC] (IRQs 20 21 22 23) *11
ACPI: PCI Interrupt Link [LACI] (IRQs 20 21 22 23) *11
ACPI: PCI Interrupt Link [LMCI] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LIDE] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [LSA0] (IRQs 20 21 22 23) *5
ACPI: PCI Interrupt Link [LSA1] (IRQs 20 21 22 23) *10
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 12 devices
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-DMA: Disabling IOMMU.
pnp: 00:09: ioport range 0x4d0-0x4d1 has been reserved
pnp: 00:0a: ioport range 0x400-0x47f could not be reserved
pnp: 00:0a: ioport range 0x480-0x48f has been reserved
pnp: 00:0a: ioport range 0xf800-0xf87f could not be reserved
pnp: 00:0a: ioport range 0xf880-0xf8ff has been reserved
pnp: 00:0a: ioport range 0xf900-0xf97f has been reserved
pnp: 00:0a: ioport range 0xf980-0xf9ff has been reserved
pnp: 00:0a: ioport range 0xfa00-0xfa7f has been reserved
pnp: 00:0a: ioport range 0xfa80-0xfaff has been reserved
PCI: Bridge: 0000:00:09.0
IO window: disabled.
MEM window: f2000000-f20fffff
PREFETCH window: disabled.
PCI: Failed to allocate mem resource #6:20000@e0000000 for 0000:0a:00.0
PCI: Bridge: 0000:00:0e.0
IO window: disabled.
MEM window: f0000000-f1ffffff
PREFETCH window: d0000000-dfffffff
PCI: Bridge: 0000:40:01.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:40:02.0
IO window: 1000-1fff
MEM window: f2200000-f26fffff
PREFETCH window: 88000000-882fffff
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1133556420.240:1): initialized
Total HugeTLB memory allocated, 0
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
PCI: MSI quirk detected. pci_msi_quirk set.
PCI: MSI quirk detected. pci_msi_quirk set.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Badness in kref_get at lib/kref.c:32

Call Trace:<ffffffff801eebae>{kref_get+46} <ffffffff801eddd2>{kobject_get+18}
<ffffffff80274de4>{get_bus+20} <ffffffff80275446>{bus_add_driver+38}
<ffffffff8045b0dc>{pcied_init+188} <ffffffff8010b232>{init+482}
<ffffffff8010eb9e>{child_rip+8} <ffffffff8010b050>{init+0}
<ffffffff8010eb96>{child_rip+0}
Badness in kref_get at lib/kref.c:32

Call Trace:<ffffffff801eebae>{kref_get+46} <ffffffff801eddd2>{kobject_get+18}
<ffffffff801ee22b>{kobject_init+43} <ffffffff801ee310>{kobject_register+32}
<ffffffff80275485>{bus_add_driver+101} <ffffffff8045b0dc>{pcied_init+188}
<ffffffff8010b232>{init+482} <ffffffff8010eb9e>{child_rip+8}
<ffffffff8010b050>{init+0} <ffffffff8010eb96>{child_rip+0}

BUG: soft lockup detected on CPU#3!
CPU 3:
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.15-rc4acpiphp1-gdeda4987 #1
RIP: 0010:[<ffffffff80310439>] <ffffffff80310439>{.text.lock.spinlock+0}
RSP: 0000:ffff81007ffdbea0 EFLAGS: 00000286
RAX: ffffffff80388000 RBX: ffffffff80147f5a RCX: 00000000000029e5
RDX: 00000000ffffff01 RSI: 0000000000000246 RDI: ffffffff80388020
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000034 R11: 000000000000000a R12: 0000000000000000
R13: ffffffff8033d171 R14: ffffffff801522ea R15: 0000000000000292
FS: 0000000000587850(0000) GS:ffffffff80437980(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000583438 CR3: 0000000000101000 CR4: 00000000000006e0

Call Trace:<ffffffff801ee02e>{kobject_add+94} <ffffffff801ee318>{kobject_register+40}
<ffffffff80275485>{bus_add_driver+101} <ffffffff8045b0dc>{pcied_init+188}
<ffffffff8010b232>{init+482} <ffffffff8010eb9e>{child_rip+8}
<ffffffff8010b050>{init+0} <ffffffff8010eb96>{child_rip+0}

BUG: soft lockup detected on CPU#3!
[repeats]


Attachments:
oops.txt (7.26 kB)
dmesg.dead.txt.bz2 (3.58 kB)
dmesg.alive.txt.bz2 (5.63 kB)
config.txt.bz2 (6.85 kB)
lspci.txt.bz2 (804.00 B)
lspciv.txt.bz2 (3.48 kB)

2005-12-02 21:41:08

by Greg KH

Subject: Re: [2.6.15-rc4] oops in acpiphp

On Fri, Dec 02, 2005 at 04:01:58PM -0500, Jeff Garzik wrote:
>
> Booting with acpiphp on this dual core, dual package (2x2) box causes an
> oops.

Hm, this is oopsing in the pci express hotplug driver, not the acpi
hotplug driver. Did you mean to load that driver?

I'll look at your other attachments for more details...

thanks,

greg k-h

2005-12-02 21:46:00

by Rajesh Shah

Subject: Re: [2.6.15-rc4] oops in acpiphp

On Fri, Dec 02, 2005 at 04:01:58PM -0500, Jeff Garzik wrote:
>
> Booting with acpiphp on this dual core, dual package (2x2) box causes an
> oops.
>
The oops is actually in pciehp, not acpiphp. It's been around
for a while; I saw a similar report dating back to April. It
triggers only if pciehp (CONFIG_HOTPLUG_PCI_PCIE) is set to
"y" in .config; building it as a module makes it go away. I
looked into this very briefly last week, and it looked like a
race related to the PCIe core registering the pciehp driver.
I'll be working on some PCIe improvements soon and will
look into this as part of that.

Rajesh
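
The workaround described above amounts to building pciehp as a module instead of into the kernel image. As a sketch, the relevant .config line would look like the one below. Only CONFIG_HOTPLUG_PCI_PCIE is named in the thread; the rest of the configuration is assumed to stay unchanged.

```
CONFIG_HOTPLUG_PCI_PCIE=m
```

With "y" the driver's init function runs during boot, which is where the call trace in the report shows pcied_init; with "m" it runs only when the module is loaded.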

2005-12-02 21:48:33

by Greg KH

Subject: Re: [2.6.15-rc4] oops in acpiphp

On Fri, Dec 02, 2005 at 04:01:58PM -0500, Jeff Garzik wrote:
>
> Booting with acpiphp on this dual core, dual package (2x2) box causes an
> oops.

Ok, I've heard about this one before, in looking closer at this. Rajesh
is aware of this one (right Rajesh? I'll bounce you the original
message). I'll let him take it from here...

thanks,

greg k-h

2005-12-02 21:49:04

by Jeff Garzik

Subject: Re: [2.6.15-rc4] oops in acpiphp

Greg KH wrote:
> On Fri, Dec 02, 2005 at 04:01:58PM -0500, Jeff Garzik wrote:
>
>>Booting with acpiphp on this dual core, dual package (2x2) box causes an
>>oops.
>
>
> Hm, this is oopsing in the pci express hotplug driver, not the acpi
> hotplug driver. Did you mean to load that driver?

Yes. I didn't look too closely beyond 'pci hotplug oops' TBH, since my
machine wouldn't boot.


> I'll look at your other attachments for more details...

Thanks,

Jeff


2005-12-02 21:57:26

by Rajesh Shah

Subject: Re: [2.6.15-rc4] oops in acpiphp

On Fri, Dec 02, 2005 at 01:46:55PM -0800, Greg KH wrote:
> On Fri, Dec 02, 2005 at 04:01:58PM -0500, Jeff Garzik wrote:
> >
> > Booting with acpiphp on this dual core, dual package (2x2) box causes an
> > oops.
>
> Ok, I've heard about this one before, in looking closer at this. Rajesh
> is aware of this one (right Rajesh? I'll bounce you the original
> message). I'll let him take it from here...
>
Yes, I've been aware of this since last week; I had never tried
with the config option set to "y" before then. I'll look into
this as part of the pciehp rework.

thanks,
Rajesh
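
For readers puzzled by the "Badness in kref_get" lines in the original report: that message comes from a sanity check that fires when a reference is taken on an object whose reference count is still zero, which usually means the object was never initialised or registered. The toy below is a userspace sketch of that pattern. All names in it are hypothetical and it is not the kernel's code. Whether the pciehp failure is exactly this kind of use-before-register ordering is an assumption consistent with the "race related to the pcie core registering the pciehp driver" remark, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a reference-counted kernel object. */
struct toy_kref {
    int refcount;
};

/* A properly initialised object starts life holding one reference. */
static void toy_kref_init(struct toy_kref *k)
{
    assert(k != NULL);
    k->refcount = 1;
}

/*
 * Take a reference. Returns 1 if the sanity check fired (the count was
 * zero before the increment), 0 otherwise. The check only warns; the
 * count is incremented either way.
 */
static int toy_kref_get(struct toy_kref *k)
{
    int warned = 0;

    assert(k != NULL);
    if (k->refcount == 0) {
        fprintf(stderr, "Badness in toy_kref_get: refcount is 0\n");
        warned = 1;
    }
    k->refcount++;
    return warned;
}
```

Calling toy_kref_get() on an object that went through toy_kref_init() is silent, while calling it on a zero-filled object prints the warning. In this sketch that second case plays the role of a driver being added to a bus that has not been set up yet.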