2006-01-20 12:32:22

by Carlo E. Prelz

Subject: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Best regards to all and everyone. I just purchased a new RS480 (Radeon
XPress200)-based motherboard. It is a Sapphire Pure Performance
PP-A9RS480. Mine is equipped with an Athlon64 3200+. I am attaching
the output of lspci -v

When booting with kernels from 2.6.15-rc1 up (tested with 2.6.15-rc1,
2.6.15-rc5, 2.6.15 and 2.6.16-rc1), the boot process freezes after
displaying messages related to registering io schedulers:

...
...
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered

There is no oops of any kind. The disk activity LED remains on. With
both 2.6.14 and 2.6.14.6, boot continues normally with:

Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
pktcdvd: v0.2.0a 2004-07-14 Jens Axboe ([email protected]) and [email protected]
nbd: registered device at major 43
...
...

I tried to enable all debug options in configure, but no new message
appears.

Important: I obtain the same result (frozen after "io scheduler cfq
registered") when booting with a 100MB netinst Debian sid bootdisk,
downloaded last night. An older (9-month-old) Ubuntu bootdisk boots
perfectly. Both CDs are AMD64-specific.

I will be very happy to provide any further info, or to make any
test. Please CC me in your answers if you can.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)


Attachments:
(No filename) (1.53 kB)
lspciout (7.23 kB)

2006-01-21 09:10:03

by Andrew Morton

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

"Carlo E. Prelz" <[email protected]> wrote:
>
> Best regards to all and everyone. I just purchased a new RS480 (Radeon
> XPress200)-based motherboard. It is a Sapphire Pure Performance
> PP-A9RS480. Mine is equipped with an Athlon64 3200+. I am attaching
> the output of lspci -v
>
> When booting with kernels from 2.6.15-rc1 up (tested with 2.6.15-rc1,
> 2.6.15-rc5, 2.6.15 and 2.6.16-rc1), the boot process freezes after
> displaying messages related to registering io schedulers:
>
> ...
> ...
> io scheduler noop registered
> io scheduler anticipatory registered
> io scheduler deadline registered
> io scheduler cfq registered
>
> There is no OOPS of any kind. The disk activity led remains on. With
> both 2.6.14 and 2.6.14.6, boot regularly continues with:
>
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is a post-1991 82077
> loop: loaded (max 8 devices)
> pktcdvd: v0.2.0a 2004-07-14 Jens Axboe ([email protected]) and [email protected]
> nbd: registered device at major 43
> ...
> ...
>
> I tried to enable all debug options in configure, but no new message
> appears.
>
> Important: I obtain the same result (frozen after "io scheduler cfq
> registered") when booting with a 100MB netinst debian sid bootdisk,
> downloaded last night. An older (9 month old) Ubuntu bootdisk boots
> perfectly. Both cd's are AMD64-specific.
>

Can you please add `initcall_debug' to the kernel boot command line?
That'll tell us which function got stuck.
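
With initcall_debug, the kernel prints one "Calling initcall" line before each initcall, so the last such line in a captured log names the function that did not return. The following user-space sketch pulls that name out of a transcribed log. It assumes the line format seen in this thread (`Calling initcall 0x<addr>: <symbol>+0x<off>/0x<size>()`); the function names `parse_initcall_line` and `last_initcall` are hypothetical helpers, not kernel APIs.

```c
#include <stdio.h>
#include <string.h>

/* Extract the symbol name from one initcall_debug line such as
 *   Calling initcall 0xffffffff8026836c: pci_init+0x0/0x2b()
 * Returns 1 and fills `name` on a match, 0 otherwise (name is untouched). */
int parse_initcall_line(const char *line, char *name, size_t name_len)
{
    unsigned long long addr;   /* parsed but not otherwise used */
    char sym[64];

    if (sscanf(line, "Calling initcall 0x%llx: %63[^+]", &addr, sym) != 2)
        return 0;
    if (strlen(sym) >= name_len)
        return 0;
    strcpy(name, sym);
    return 1;
}

/* Scan a log and leave the name of the last initcall seen in `name`.
 * Returns 1 if at least one initcall line was found, 0 otherwise. */
int last_initcall(const char *const *lines, size_t n,
                  char *name, size_t name_len)
{
    int found = 0;

    for (size_t i = 0; i < n; i++)
        if (parse_initcall_line(lines[i], name, name_len))
            found = 1;
    return found;
}
```

Lines that are ordinary driver output (for example "Scheduler cfq registered") simply do not match and are skipped.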

2006-01-21 12:57:50

by Carlo E. Prelz

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Sat 21 Jan 06 01:09:32 -0800

Quoting Andrew Morton ([email protected]):

> Can you please add `initcall_debug' to the kernel boot command line?
> That'll tell us which function got stuck.

I photographed the screen. I am copying here the last few lines. I
hope I make no errors in copying...

...
...
Installing knfsd (copyright (C) 1996 [email protected]).
Calling initcall 0xffffffff806de042: init_nlm+0x0/0x21()
Calling initcall 0xffffffff806de063: init_nls_cp437+0x0/0xc()
Calling initcall 0xffffffff806de06f: init_nls_cp850+0x0/0xc()
Calling initcall 0xffffffff806de07b: init_nls_cp852+0x0/0xc()
Calling initcall 0xffffffff806de087: init_nls_iso8859_1+0x0/0xc()
Calling initcall 0xffffffff806de093: init_nls_iso8859_15+0x0/0xc()
Calling initcall 0xffffffff806de09f: init_nls_utf8+0x0/0x1f()
Calling initcall 0xffffffff806de0be: init_autofs4_fs+0x0/0xc()
Calling initcall 0xffffffff806de0ca: init_udf_fs+0x0/0x53()
Calling initcall 0xffffffff806de11d: ipc_init+0x0/0x14()
Calling initcall 0xffffffff806de2ea: init_mqueue_fs+0x0/0xc7()
Calling initcall 0xffffffff806de51d: key_proc_init+0x0/0x52()
Calling initcall 0xffffffff806de67c: init_crypto+0x0/0x18()
Initializing Cryptographic API
Calling initcall 0xffffffff806de6b4: init+0x0/0xc()
Calling initcall 0xffffffff806de6c0: init+0x0/0xc()
Calling initcall 0xffffffff806de6cc: init+0x0/0xc()
Calling initcall 0xffffffff806de6d8: init+0x0/0xc()
Calling initcall 0xffffffff806de6e4: init+0x0/0x35()
Calling initcall 0xffffffff806de719: init+0x0/0x5a()
Calling initcall 0xffffffff806de773: init+0x0/0x5a()
Calling initcall 0xffffffff806de7cd: init+0x0/0x35()
Calling initcall 0xffffffff806de802: init+0x0/0xc()
Calling initcall 0xffffffff806de80e: init+0x0/0xc()
Calling initcall 0xffffffff806de81a: init+0x0/0x35()
Calling initcall 0xffffffff806de84f: init+0x0/0xc()
Calling initcall 0xffffffff806de85b: init+0x0/0xc()
Calling initcall 0xffffffff806de867: arc4_init+0x0/0xc()
Calling initcall 0xffffffff806de873: init+0x0/0x5a()
Calling initcall 0xffffffff806de8cd: init+0x0/0xc()
Calling initcall 0xffffffff806de8d9: init+0x0/0xc()
Calling initcall 0xffffffff806de8e5: init+0x0/0xc()
Calling initcall 0xffffffff806de8f1: michael_mic_init+0x0/0xc()
Calling initcall 0xffffffff806de8fd: init+0x0/0xc()
Calling initcall 0xffffffff806dea0a: noop_init+0x0/0xc()
Scheduler noop registered
Calling initcall 0xffffffff806dea16: as_init+0x0/0x4f()
Scheduler anticipatory registered
Calling initcall 0xffffffff806dea65: deadline_init+0x0/0x4f()
Scheduler deadline registered
Calling initcall 0xffffffff806deab4: cfq_init+0x0/0xc4()
Scheduler cfq registered
Calling initcall 0xffffffff8026836c: pci_init+0x0/0x2b()


And then the machine freezes. I may add that, with 2.6.14.6, I am
getting errors like:

APIC error on CPU0: 04(40)

or, more often

APIC error on CPU0: 40(40)

I tried both passing noapic and disabling the APIC in the BIOS. In
either case the kernel entered a slow process of testing and refusing
all addresses of my SCSI card (I tried both an Advansys and an NCR: the
result was the same) and then eventually gave an oops in ifconfig. I
haven't photographed those messages, but I will do so if needed.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-01-21 20:58:53

by Andrew Morton

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

"Carlo E. Prelz" <[email protected]> wrote:
>
> Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
> Date: Sat 21 Jan 06 01:09:32 -0800
>
> Quoting Andrew Morton ([email protected]):
>
> > Can you please add `initcall_debug' to the kernel boot command line?
> > That'll tell us which function got stuck.
>
> I photographed the screen. I am copying here the last few lines. I
> hope I make no errors in copying...

Thanks. That's probably more lines than we needed ;)

If you have a web server somewhere, just upload the .jpg file. Or send it
to me and I can do that.

> ...
> ...
> Installing knfsd (copyright (C) 1996 [email protected]).
> Calling initcall 0xffffffff806de042: init_nlm+0x0/0x21()
> Calling initcall 0xffffffff806de063: init_nls_cp437+0x0/0xc()
> Calling initcall 0xffffffff806de06f: init_nls_cp850+0x0/0xc()
> Calling initcall 0xffffffff806de07b: init_nls_cp852+0x0/0xc()
> Calling initcall 0xffffffff806de087: init_nls_iso8859_1+0x0/0xc()
> Calling initcall 0xffffffff806de093: init_nls_iso8859_15+0x0/0xc()
> Calling initcall 0xffffffff806de09f: init_nls_utf8+0x0/0x1f()
> Calling initcall 0xffffffff806de0be: init_autofs4_fs+0x0/0xc()
> Calling initcall 0xffffffff806de0ca: init_udf_fs+0x0/0x53()
> Calling initcall 0xffffffff806de11d: ipc_init+0x0/0x14()
> Calling initcall 0xffffffff806de2ea: init_mqueue_fs+0x0/0xc7()
> Calling initcall 0xffffffff806de51d: key_proc_init+0x0/0x52()
> Calling initcall 0xffffffff806de67c: init_crypto+0x0/0x18()
> Initializing Cryptographic API
> Calling initcall 0xffffffff806de6b4: init+0x0/0xc()
> Calling initcall 0xffffffff806de6c0: init+0x0/0xc()
> Calling initcall 0xffffffff806de6cc: init+0x0/0xc()
> Calling initcall 0xffffffff806de6d8: init+0x0/0xc()
> Calling initcall 0xffffffff806de6e4: init+0x0/0x35()
> Calling initcall 0xffffffff806de719: init+0x0/0x5a()
> Calling initcall 0xffffffff806de773: init+0x0/0x5a()
> Calling initcall 0xffffffff806de7cd: init+0x0/0x35()
> Calling initcall 0xffffffff806de802: init+0x0/0xc()
> Calling initcall 0xffffffff806de80e: init+0x0/0xc()
> Calling initcall 0xffffffff806de81a: init+0x0/0x35()
> Calling initcall 0xffffffff806de84f: init+0x0/0xc()
> Calling initcall 0xffffffff806de85b: init+0x0/0xc()
> Calling initcall 0xffffffff806de867: arc4_init+0x0/0xc()
> Calling initcall 0xffffffff806de873: init+0x0/0x5a()
> Calling initcall 0xffffffff806de8cd: init+0x0/0xc()
> Calling initcall 0xffffffff806de8d9: init+0x0/0xc()
> Calling initcall 0xffffffff806de8e5: init+0x0/0xc()

Oh dear. We have 394 functions all called 'init()' in the kernel.
netfilter is a prime source.

> Calling initcall 0xffffffff806de8f1: michael_mic_init+0x0/0xc()
> Calling initcall 0xffffffff806de8fd: init+0x0/0xc()
> Calling initcall 0xffffffff806dea0a: noop_init+0x0/0xc()
> Scheduler noop registered
> Calling initcall 0xffffffff806dea16: as_init+0x0/0x4f()
> Scheduler anticipatory registered
> Calling initcall 0xffffffff806dea65: deadline_init+0x0/0x4f()
> Scheduler deadline registered
> Calling initcall 0xffffffff806deab4: cfq_init+0x0/0xc4()
> Scheduler cfq registered
> Calling initcall 0xffffffff8026836c: pci_init+0x0/0x2b()
>
>
> And then the machine freezes. I may add that, with 2.6.14.6, I am
> getting errors like:

OK, it looks like a PCI initcall went South. Can you please add this, then
when it hangs, record the last few lines then send us those, as well as the
output of `lspci -vx'?

--- devel/drivers/pci/pci.c~a	2006-01-21 12:55:38.000000000 -0800
+++ devel-akpm/drivers/pci/pci.c	2006-01-21 12:56:51.000000000 -0800
@@ -886,6 +886,7 @@ static int __devinit pci_init(void)
 	struct pci_dev *dev = NULL;

 	while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+		printk("pci_init: %04x%04x\n", dev->vendor, dev->device);
 		pci_fixup_device(pci_fixup_final, dev);
 	}
 	return 0;
_


> APIC error on CPU0: 04(40)
>
> or, more often
>
> APIC error on CPU0: 40(40)
>
> I tried either to pass noapic, or to disable apic in bios, and the
> kernel entered a slow process of testing and refusing all addresses of
> my SCSI card (I tried both an Advansys and a NCR: the result has been
> the same) and then eventually gave an oops in ifconfig. I haven't
> photographed those messages, but I will do so if needed.
>

Is this new behaviour? If so, are you able to pinpoint the latest kernel
which didn't have such problems?

2006-01-21 21:29:20

by Erwin Rol

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

I also had the problem that my Shuttle ST20G5 (an RS480 IGP based
system) hung in pci_init. This was with one of the Fedora Rawhide
kernels. After I reported it, Dave Jones apparently fixed it, because
the next rawhide kernel worked again. Maybe he could explain what it
was and where the fix is (if it is the same thing, but it really looks
like it).

- Erwin



2006-01-21 22:04:50

by Dave Jones

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Sat, Jan 21, 2006 at 10:28:46PM +0100, Erwin Rol wrote:
> I had also had the problem that my Shuttle ST20G5 (a RS480 IGP based
> system) hung in pci_init. This was with one of the Fedora Rawhide
> kernels, after reporting it Dave Jones fixed it cause the next rawhide
> kernel worked again, maybe he could explain what it was, and where the
> fix is (if it is the same thing, but it really looks like it).

That was due to us carrying one of the 'make the clock not tick
at twice the speed on ati chipsets' patches. Matthew Garrett's variant iirc.
It worked fine in .14, but caused havoc in .15+

I put it down to the problem being fixed in other ways upstream.

Dave

2006-01-22 04:08:20

by Kurt Wall

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Sat, Jan 21, 2006 at 05:04:02PM -0500, Dave Jones took 21 lines to write:
> On Sat, Jan 21, 2006 at 10:28:46PM +0100, Erwin Rol wrote:
> > I had also had the problem that my Shuttle ST20G5 (a RS480 IGP based
> > system) hung in pci_init. This was with one of the Fedora Rawhide
> > kernels, after reporting it Dave Jones fixed it cause the next rawhide
> > kernel worked again, maybe he could explain what it was, and where the
> > fix is (if it is the same thing, but it really looks like it).
>
> That was due to us carrying one of the 'make the clock not tick
> at twice the speed on ati chipsets' patches. Matthew Garrett's variant iirc.
> It worked fine in .14, but caused havoc in .15+
>
> I put it down to the problem being fixed in other ways upstream.

Heh. I think you mean s/fixed/worked around/.

The bugme tracking this, 3927, is a joyous muddle of patches,
patches to patches, workarounds, and dueling command line options.
I ported this patch,
http://www.ussg.iu.edu/hypermail/linux/kernel/0504.0/1625.html,
to 2.6.16-rc1 to fix my pain, but 1) it didn't work without the
disable_timer_pin_1 kernel option and, 2) I don't think it's a fix
so much as a workaround.

Kurt
--
I can read your mind, and you should be ashamed of yourself.

2006-01-22 07:40:37

by Carlo E. Prelz

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Sat 21 Jan 06 12:58:22 -0800

Quoting Andrew Morton ([email protected]):

> If you have a web server somewhere, just upload the .jpg file. Or send it
> to me and I can do that.

The latest screenshot can be found at
http://www.fluido.as/files/images/screenshot.jpg

> > Calling initcall 0xffffffff8026836c: pci_init+0x0/0x2b()
> >
> >
> > And then the machine freezes. I may add that, with 2.6.14.6, I am
> > getting errors like:
>
> OK, it looks like a PCI initcall went South. Can you please add this, then
> when it hangs, record the last few lines then send us those, as well as the
> output of `lspci -vx'?

These lines appear at the end of the logfile:

pci_init: 10025950
pci_init: 10025a3f
pci_init: 1002437a
pci_init: 10024379
pci_init: 10024374
pci_init: 10024375
pci_init: 10024373

and 1002:4373 is the USB2 (EHCI) controller. I attach the output of
lspci -vx. Even with 2.6.14.6, I have problems with USB. It did not
work at all, then I downloaded the latest bios, and now, right after
boot, usb works OK. But after some time (possibly after the first APIC
error message), newly inserted USB disks are not detected anymore, and
I had one case in which a mounted disk was not accessible anymore.
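
The debug printk prints the vendor and device IDs as two 4-digit hex fields with no separator, so the last line, `10024373`, reads as vendor 0x1002 (ATI) and device 0x4373. A minimal user-space sketch of that split follows; `struct pci_id`, `parse_pci_init_line` and `format_pci_id` are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Parse a line of the form "pci_init: VVVVDDDD" (hex, no separator).
 * Returns 1 on success, 0 if the line does not match. */
int parse_pci_init_line(const char *line, struct pci_id *id)
{
    unsigned int vendor, device;

    if (sscanf(line, "pci_init: %4x%4x", &vendor, &device) != 2)
        return 0;
    id->vendor = (uint16_t)vendor;
    id->device = (uint16_t)device;
    return 1;
}

/* Format the ID the way lspci shows it, "vvvv:dddd".
 * `buf` needs room for at least 10 bytes including the terminator. */
void format_pci_id(const struct pci_id *id, char *buf, size_t len)
{
    snprintf(buf, len, "%04x:%04x", id->vendor, id->device);
}
```

The formatted result can then be matched directly against the bracketed IDs in `lspci -nn` style listings.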

I just bought the motherboard - I did not make too many tests.

> Is this new behaviour? If so, are you able to pinpoint the latest kernel
> which didn't have such problems?

The motherboard is new. With the previous motherboard (via k8t88
based), using the same processor and most of the same boards, I did
not have any of these problems.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)


Attachments:
(No filename) (1.84 kB)
lspciout2 (11.72 kB)

2006-01-22 07:56:12

by Andrew Morton

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

"Carlo E. Prelz" <[email protected]> wrote:
>
> Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
> Date: Sat 21 Jan 06 12:58:22 -0800
>
> Quoting Andrew Morton ([email protected]):
>
> > If you have a web server somewhere, just upload the .jpg file. Or send it
> > to me and I can do that.
>
> The latest screenshot can be found at
> http://www.fluido.as/files/images/screenshot.jpg
>
> > > Calling initcall 0xffffffff8026836c: pci_init+0x0/0x2b()
> > >
> > >
> > > And then the machine freezes. I may add that, with 2.6.14.6, I am
> > > getting errors like:
> >
> > OK, it looks like a PCI initcall went South. Can you please add this, then
> > when it hangs, record the last few lines then send us those, as well as the
> > output of `lspci -vx'?
>
> These lines appear at the end of the logfile:
>
> pci_init: 10025950
> pci_init: 10025a3f
> pci_init: 1002437a
> pci_init: 10024379
> pci_init: 10024374
> pci_init: 10024375
> pci_init: 10024373
>
> and 1002:4373 is the USB2 (EHCI) controller. I attach the output of
> lspci -vx. Even with 2.6.14.6, I have problems with USB. It did not
> work at all, then I downloaded the latest bios, and now, right after
> boot, usb works OK.

OK, so it sounds like quirk_usb_disable_ehci() caused your machine to hang
with the old BIOS. That's fairly bad behaviour from the kernel, even
though the BIOS presumably had some problems.

> But after some time (possibly after the first APIC
> error message), newly inserted USB disks are not detected anymore, and
> I had one case in which a mounted disk was not accessible anymore.

Can you please gather some more details on this and prepare a new report?
The full dmesg output, machine description, etc. It might be best to do
this via a new bugzilla.kernel.org record so we know where to find it.

Thanks.

> I just bought the motherboard - I did not make too many tests.
>
> > Is this new behaviour? If so, are you able to pinpoint the latest kernel
> > which didn't have such problems?
>
> The motherboard is new. With the previous motherboard (via k8t88
> based), using the same processor and most of the same boards, I did
> not have any of these problems.

2006-01-22 08:30:07

by Carlo E. Prelz

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Sat 21 Jan 06 11:55:46 -0800

Quoting Andrew Morton ([email protected]):

> > and 1002:4373 is the USB2 (EHCI) controller. I attach the output of
> > lspci -vx. Even with 2.6.14.6, I have problems with USB. It did not
> > work at all, then I downloaded the latest bios, and now, right after
> > boot, usb works OK.
>
> OK, so it sounds like quirk_usb_disable_ehci() caused your machine to hang
> with the old BIOS. That's fairly bad behaviour from the kernel, even
> though the BIOS presumably had some problems.

It did not hang. When inserting USB disks, I got messages like:

ehci_hcd 0000:00:13.2: Unlink after no-IRQ? Controller is probably using the wrong IRQ.
usb 1-1: device not accepting address 2, error -110

and the usb-storage module was not loaded. But the kernel went on
working. After upgrading the BIOS, with the same kernel, the USB disk
was recognized. But I now notice that the log file, with both old and
new BIOS, contains these lines:

ehci_hcd 0000:00:13.2: BIOS handoff failed (160, 01010001)
ehci_hcd 0000:00:13.2: continuing after BIOS bug...
ehci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:13.2: irq 18, io mem 0xfe02b000
ehci_hcd 0000:00:13.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004

(only, the IRQ number changed from 10 to 18 after changing BIOS)

> Can you please gather some more details on this and prepare a new report?
> The full dmesg output, machine description, etc. It might be best to do
> this via a new bugzilla.kernel.org record so we know where to find
> it.

I will try that. I saw in apic.c that APIC error 40 should mean
'received illegal vector', but I have no clue how to interpret
this.
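
For reference, the two hex values in "APIC error on CPUn: xx(yy)" are readings of the local APIC Error Status Register, and each set bit names one error class. The sketch below decodes such a value. The bit names follow the comment block in apic.c as I recall it, so treat the table as an assumption to check against the source; `decode_apic_esr` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed meaning of ESR bits 0..7 (to be verified against apic.c). */
static const char *const esr_bit_names[8] = {
    "send checksum error",
    "receive checksum error",
    "send accept error",
    "receive accept error",
    "reserved",
    "send illegal vector",
    "received illegal vector",
    "illegal register address",
};

/* Store the names of all set bits in `out`, lowest bit first.
 * Returns the number of names stored (0..8). */
size_t decode_apic_esr(uint8_t esr, const char *out[8])
{
    size_t n = 0;

    for (int bit = 0; bit < 8; bit++)
        if (esr & (1u << bit))
            out[n++] = esr_bit_names[bit];
    return n;
}
```

Under this table, 0x40 has only bit 6 set ("received illegal vector") and 0x04 has only bit 2 set ("send accept error").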

I will now play a bit with BIOS USB parameters.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-01-22 11:11:15

by Carlo E. Prelz

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Sat 21 Jan 06 11:55:46 -0800

Quoting Andrew Morton ([email protected]):

> Can you please gather some more details on this and prepare a new report?
> The full dmesg output, machine description, etc. It might be best to do
> this via a new bugzilla.kernel.org record so we know where to find it.

I filed bug #5935.

The BIOS contains 4 fields related to USB. In the 'Integrated
peripherals' menu:

USB EHCI Controller
Onchip USB Controller
Onchip USB KBC Controller

In the 'PnP/PCI configurations' menu:

Assign IRQ for USB

All four only allow a choice between enable/disable. As expected,
disabling the first one makes the 0000:00:13.2 PCI device
disappear. With it disabled, the boot process completes. I am
writing this from a working 2.6.15 kernel, and I can mount my USB hard
disk. It probably works in USB 1.1 mode.

I noticed that the new motherboard's USB device is OHCI-based, while
the previous one was UHCI-based. The OHCI driver was previously not
compiled in. EHCI was functional even without a working OHCI module
(in 2.6.14). Compiling in the OHCI driver has had no effect on the
boot hang with EHCI enabled - the machine still hangs, and the output
is identical.

Disabling the second field (Onchip USB Controller) makes all USB
devices disappear - boot is OK, but there is no usb activity at all.

The third field has no evident effect. This is true of the fourth one
too (the one about IRQ assignment). I saw no change in boot logs.

Complete dmesg output for the successful 2.6.15 boot with EHCI
disabled can be downloaded from http://www.fluido.as/files/dmesg.txt.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-01-23 19:20:55

by David Brownell

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1


> OK, so it sounds like quirk_usb_disable_ehci() caused your machine to hang
> with the old BIOS. That's fairly bad behaviour from the kernel, even
> though the BIOS presumably had some problems.

I think what happened is the "always run quirks" code got turned into
the default too early, before the EHCI "quirk" version of the handoff
code got checked against what most systems have been using for the past
several years.

I noticed at least one suspicious thing: it enables an SMI IRQ.
Even in cases when the boot firmware says it's not using EHCI ...
easy to imagine that causing hangage.


Maybe this time it'd help to tell your BIOS "yes, DO use USB".
Or, the attached patch might help. Please try both experiments.

- Dave



Attachments:
(No filename) (735.00 B)
ehci-handoff.patch (7.08 kB)

2006-01-23 21:47:43

by Carlo E. Prelz

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Mon 23 Jan 06 11:01:25 -0800

Quoting David Brownell ([email protected]):

> Maybe this time it'd help to tell your BIOS "yes, DO use USB".

That I am doing. And I now have the appropriate OHCI module
loaded. USB 1.1 works apparently quite OK.

> Or, the attached patch might help.

I applied the patch. The three changes to the second file applied with
an offset of 6 lines (to 2.6.15 vanilla). Nothing changed: the booting
process hung at the same place, generating the same printout as
before. I have now booted the new kernel with EHCI disabled, and saved
the dmesg output to http://www.fluido.as/files/dmesg2.txt (here,
USB1.1 is active).

It is time for sleep for me. I will perform any new test tomorrow
morning.

Carlo


--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-01-24 04:42:40

by Greg KH

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Mon, Jan 23, 2006 at 11:01:25AM -0800, David Brownell wrote:
> This moves the previously widely-used ehci-pci.c BIOS handoff
> code into the pci-quirks.c file, replacing the less widely used
> "early handoff" version that seems to cause problems lately.
>
> One notable change: the "early handoff" version always enabled
> an SMI IRQ ... and did so even if the pre-Linux code said it was
> not using EHCI (and not expecting EHCI SMIs). Looks like a goof
> in a workaround for some unknown BIOS version.
>
> This merged version only forcibly enables those IRQs when pre-Linux
> code says it's using EHCI. And now it always forces them off "just
> in case".

Thanks for posting this, it fixes my EHCI + APIC error, and makes my
laptop work just fine.

Turns out that 2.6.14 worked for it, but 2.6.15 didn't. git bisect a
zillion times later narrowed it down to the usb early handoff stuff but
due to merge issues, it was tough to track down the exact patch.

For fun I tried this one on top of the latest -mm, and it works!

So, care to clean it up to make it feel better to you and send it to me
again so I can add it to my tree? I know the next SuSE kernel will need
it :)

thanks,

greg k-h

2006-01-24 15:15:33

by David Brownell

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Monday 23 January 2006 8:42 pm, Greg KH wrote:
> On Mon, Jan 23, 2006 at 11:01:25AM -0800, David Brownell wrote:
> > This moves the previously widely-used ehci-pci.c BIOS handoff
> > code into the pci-quirks.c file, replacing the less widely used
> > "early handoff" version that seems to cause problems lately.
> >
> > One notable change: the "early handoff" version always enabled
> > an SMI IRQ ... and did so even if the pre-Linux code said it was
> > not using EHCI (and not expecting EHCI SMIs). Looks like a goof
> > in a workaround for some unknown BIOS version.
> >
> > This merged version only forcibly enables those IRQs when pre-Linux
> > code says it's using EHCI. And now it always forces them off "just
> > in case".
>
> Thanks for posting this, it fixes my EHCI + APIC error, and makes my
> laptop work just fine.

OK, here's a version with a Signed-Off-By; against current GIT.

I'm mildly surprised it helps that laptop, but not surprised that
it helps _some_ of those "ehci init goofs" cases. :)

- Dave




Attachments:
(No filename) (1.01 kB)
ehci-handoff.patch (7.13 kB)

2006-02-05 10:33:53

by Carlo E. Prelz

Subject: Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Time has passed without further mention of this problem. Today I took
some time to discover exactly where the boot process was hanging. I
found the place.

In drivers/usb/host/pci-quirks.c, in function quirk_usb_disable_ehci
(should start around line 211) there is a stanza that reads:

	/* always say Linux will own the hardware
	 * by setting EHCI_USBLEGSUP_OS.
	 */
	pci_write_config_byte(pdev, offset + 3, 1);

On my Sapphire Athlon64 motherboard (see the thread for more details),
this call never returns (and generates no output). I commented
it out, and now the EHCI subsystem works OK (currently running
2.6.16-rc2).
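
The byte write at `offset + 3` with value 1 may look odd next to a flag named EHCI_USBLEGSUP_OS. PCI configuration space is little-endian, so byte 3 of the 32-bit USBLEGSUP register holds bits 31:24, and writing 1 there sets bit 24 without touching the lower three bytes. The sketch below models that arithmetic in user space. It assumes EHCI_USBLEGSUP_BIOS is bit 16 and EHCI_USBLEGSUP_OS is bit 24, as in the EHCI specification; `write_config_byte_model` is a hypothetical helper and does not touch real hardware.

```c
#include <stdint.h>

#define EHCI_USBLEGSUP_BIOS (1u << 16)  /* assumed: BIOS owned semaphore */
#define EHCI_USBLEGSUP_OS   (1u << 24)  /* assumed: OS owned semaphore */

/* Model of a one-byte write into a little-endian 32-bit config register:
 * replace byte `byte_index` (0..3) of `reg` with `value`. */
uint32_t write_config_byte_model(uint32_t reg, unsigned byte_index,
                                 uint8_t value)
{
    unsigned shift = 8u * (byte_index & 3u);

    reg &= ~(0xffu << shift);          /* clear the target byte */
    reg |= (uint32_t)value << shift;   /* insert the new byte */
    return reg;
}
```

Applied to a register reading 0x00010001 (BIOS-owned bit set), a write of 1 to byte 3 yields 0x01010001, with both ownership bits set.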

I do not know what the right patch should be.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-02-05 19:45:27

by David Brownell

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Sunday 05 February 2006 2:33 am, Carlo E. Prelz wrote:
> In drivers/usb/host/pci-quirks.c, in function quirk_usb_disable_ehci
> (should start around line 211) there is a stanza that reads:
>
> /* always say Linux will own the hardware
> * by setting EHCI_USBLEGSUP_OS.
> */
> pci_write_config_byte(pdev, offset + 3, 1);
>
> On my sapphire athlon64 motherboard (see the thread for more details),
> this call never returns (without generating any output). I commented
> it out, and now the EHCI subsystem works OK (currently running
> 2.6.16rc2).

Interesting ... feels like a BIOS problem. If you want to experiment,
there's a right bracket -- "}" -- immediately before that. Try moving
it right after that write, so that write_config_byte is covered by the
preceding "if LEGSUP_BIOS" test; or copying the much later "disable SMI"
clause into an "else" for that "if".

- Dave

2006-02-06 08:03:03

by Carlo E. Prelz

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Sun 05 Feb 06 11:45:22 -0800

Quoting David Brownell ([email protected]):

> Interesting ... feels like a BIOS problem. If you want to experiment,
> there's a right bracket -- "}" -- immediately before that. Try moving
> it right after that write, so that write_config_byte is covered by the
> preceding "if LEGSUP_BIOS" test; or copying the much later "disable SMI"
> clause into an "else" for that "if".

The first one would be useless - I inserted lots of printouts to find
out where the freeze took place, and I know that the
EHCI_USBLEGSUP_BIOS flag is on (cap is 0x10001). The value remains the
same after the 'spin till it hands it over' loop - so that this
printout appears:

0000:00:13.2 EHCI: BIOS handoff failed (BIOS bug ?)

About the second thing you suggest: do you refer to this call?

	/* just in case, always disable EHCI SMIs */
	pci_write_config_dword(pdev,
			offset + EHCI_USBLEGCTLSTS,
			0);

In my machine, the write takes place without apparent ill effects. If
I add it as an else clause to the "if LEGSUP_BIOS" test, it won't
execute, because the EHCI_USBLEGSUP_BIOS flag is on.

In case you need it: hcc_params is 0xa012.
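
The two values quoted here can be decoded by hand. The sketch below does it in user space, assuming the field layout from the EHCI specification: bits 15:8 of HCCPARAMS give the config-space offset of the first extended capability (EECP), and in USBLEGSUP bits 7:0 are the capability ID, bits 15:8 the next pointer, bit 16 the BIOS-owned semaphore and bit 24 the OS-owned semaphore. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded fields of the USBLEGSUP extended capability (assumed layout). */
struct ehci_legsup {
    uint8_t cap_id;     /* bits 7:0, 0x01 means legacy support */
    uint8_t next_eecp;  /* bits 15:8, next capability offset or 0 */
    int bios_owned;     /* bit 16 */
    int os_owned;       /* bit 24 */
};

/* Offset of the first extended capability, taken from HCCPARAMS. */
uint8_t ehci_eecp_offset(uint32_t hcc_params)
{
    return (uint8_t)((hcc_params >> 8) & 0xffu);
}

/* Split a raw USBLEGSUP value into its fields. */
struct ehci_legsup ehci_decode_legsup(uint32_t cap)
{
    struct ehci_legsup f;

    f.cap_id = (uint8_t)(cap & 0xffu);
    f.next_eecp = (uint8_t)((cap >> 8) & 0xffu);
    f.bios_owned = (int)((cap >> 16) & 1u);
    f.os_owned = (int)((cap >> 24) & 1u);
    return f;
}
```

With hcc_params 0xa012 the offset comes out as 0xa0 (160 decimal), and cap 0x10001 decodes to capability ID 1 with the BIOS-owned bit set and the OS-owned bit clear. If that reading is right, it would also match the "(160, 01010001)" pair in the earlier "BIOS handoff failed" log line, where the OS-owned bit has additionally been set.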

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-02-06 16:34:13

by David Brownell

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

On Monday 06 February 2006 12:02 am, Carlo E. Prelz wrote:
>
> > Interesting ... feels like a BIOS problem. If you want to experiment,
> > there's a right bracket -- "}" -- immediately before that. Try moving
> > it right after that write, so that write_config_byte is covered by the
> > preceding "if LEGSUP_BIOS" test; or copying the much later "disable SMI"
> > clause into an "else" for that "if".
>
> The first one would be useless - I inserted lots of printouts to find
> out where the freeze took place, and I know that the
> EHCI_USBLEGSUP_BIOS flag is on (cap is 0x10001). The value remains the
> same after the 'spin till it hands it over' loop - so that this
> printout appears:
>
> 0000:00:13.2 EHCI: BIOS handoff failed (BIOS bug ?)

If it printed that, then how is it possible that it hung _before_ printing
that message??? Your reports are not making any sense to me.

Maybe that whole "if" block that turns that SMI _on_ is the problem; it
was part of the "early handoff" code, which came from who knows where,
was clearly buggy, and was never widely used until recently. Enabling
the SMI seemed pretty dubious to me, but I suspect that some undescribed
buggy BIOS really does need it ... maybe whoever provided that "early"
handoff version could report what they were trying to do by enabling
the SMI?

- Dave

2006-02-06 16:50:21

by Carlo E. Prelz

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Mon 06 Feb 06 08:24:04AM -0800

Quoting David Brownell ([email protected]):

> If it printed that, then how is it possible that it hung _before_ printing
> that message???

I already wrote that I had commented out the line that caused the
hangup:

// pci_write_config_byte(pdev, offset + 3, 1);

After commenting out this line, the machine boots OK and EHCI works
fine. It does print the BIOS handoff failed message.

If I do not comment out the above line, the machine hangs, and,
obviously, no BIOS handoff failed message is printed.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)

2006-02-06 17:31:18

by David Brownell

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1


> > If it printed that, then how is it possible that it hung _before_ printing
> > that message???
>
> I already wrote that I had commented out the line that caused the
> hangup:
>
> // pci_write_config_byte(pdev, offset + 3, 1);
>
> After commenting out this line, the machine boots OK and EHCI works
> fine. It does print the BIOS handoff failed message.

Then if disabling that code which enables the SMI doesn't work,
you have only one real option other than telling your BIOS not
to support USB keyboards/mice/disks: replace your BIOS.

The reason it prints the BIOS handoff message is because you
completely disabled the handoff, so your BIOS still thinks it
owns that controller. Commenting out that line is not good.
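
The handoff being discussed is a two-semaphore protocol: the OS sets its "owned" bit, then polls until the firmware clears its own. The following user-space model is only a sketch of that idea under the bit layout assumed from the EHCI specification (BIOS-owned is bit 16, OS-owned is bit 24). It is not the kernel's quirk code, and the firmware callbacks are hypothetical stand-ins for what a real BIOS would do in its SMI handler.

```c
#include <stdint.h>

#define LEGSUP_BIOS_OWNED (1u << 16)  /* assumed bit position */
#define LEGSUP_OS_OWNED   (1u << 24)  /* assumed bit position */

/* Hypothetical firmware model: called once per poll, may edit the register. */
typedef void (*firmware_step_fn)(uint32_t *legsup);

/* Request ownership and poll up to `max_polls` times.
 * Returns 0 once the BIOS-owned bit is clear, -1 if it never clears. */
int ehci_handoff_model(uint32_t *legsup, firmware_step_fn firmware,
                       int max_polls)
{
    if (!(*legsup & LEGSUP_BIOS_OWNED))
        return 0;                      /* firmware never claimed it */

    *legsup |= LEGSUP_OS_OWNED;        /* the write Carlo commented out */
    for (int i = 0; i < max_polls; i++) {
        firmware(legsup);
        if (!(*legsup & LEGSUP_BIOS_OWNED))
            return 0;                  /* firmware released the controller */
    }
    return -1;                         /* "BIOS handoff failed" */
}

/* Firmware that releases the controller once the OS asks for it. */
void cooperative_firmware(uint32_t *legsup)
{
    if (*legsup & LEGSUP_OS_OWNED)
        *legsup &= ~LEGSUP_BIOS_OWNED;
}

/* Firmware that never reacts to the request. */
void unresponsive_firmware(uint32_t *legsup)
{
    (void)legsup;
}
```

In this model, skipping the OS-owned write means even cooperative firmware is never asked to let go, which is the situation Dave describes: the BIOS-owned bit stays set and the timeout path is taken.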

2006-02-06 17:45:33

by Carlo E. Prelz

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1

Subject: Re: [linux-usb-devel] Re: ATI RS480-based motherboard: stuck while booting with kernel >= 2.6.15 rc1
Date: Mon 06 Feb 06 09:31:14AM -0800

Quoting David Brownell ([email protected]):

> Then if disabling that code which enables the SMI doesn't work,
> you have only one real option other than telling your BIOS not
> to support USB keyboards/mice/disks: replace your BIOS.

Sapphire has no newer bios than the one I am using. But I am saying
that USB works with that line commented out. I tried a couple of USB
disks, a USB mouse and my palm pilot - all seem to work quite OK. No
angry messages.

Carlo

--
* If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - [email protected] what need would there be
* to talk so much of love and righteousness? (Chuang-Tzu)