2009-06-27 15:52:52

by Zbigniew Luszpinski

Subject: OHCI USB hangs during intensive use, nVidia MCP78S chipset, few questions to help me find the problem.

Hello,

After upgrading my mainboard and my CPU (from single core to multicore) I
started to have problems with the OHCI USB controller. It hangs during large
data transfers (USB hdd, pendrive, IrDA dongle or ADSL modem). The result of
the hang is that the USB device stops responding. This is the only error I
found. I would like to learn why it hangs and fix it. I tried kernels 2.6.27
through 2.6.30 and the most popular Linux distributions. Adding the noapic or
acpi=noirq parameter to the kernel boot line almost works around the problem;
disabling the tickless kernel (CONFIG_NO_HZ) also makes the OHCI hangs a
little less frequent. USB 1.1 devices still hang, but seldom: only
transferring a file over 100MiB to/from the USB hdd or downloading the CentOS
DVD iso from ftp or torrent can hang USB. I do not want to be left with a
workaround. I want to fix it. Can you tell me how to find the issue and where
to look for the faulty code? Judging by the kernel parameters which relieve
the OHCI bug, something must be wrong with IRQ handling.
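
One way to check the IRQ suspicion is to compare the ohci_hcd line of
/proc/interrupts between a normal boot and a noapic boot. Below is a minimal
Python sketch, assuming the usual /proc/interrupts layout (a header row with
one column per CPU, then "IRQ: counts... controller device"). The function
name is made up for illustration and is not part of any kernel tool.

```python
def ohci_irq_lines(text, driver="ohci_hcd"):
    """Return (irq, per_cpu_counts, description) for each line naming driver.

    Assumes the common /proc/interrupts layout: the first row lists the CPUs,
    and every device row has one interrupt count per CPU after the colon.
    """
    lines = text.splitlines()
    if not lines:
        return []
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    found = []
    for line in lines[1:]:
        if driver not in line:
            continue
        irq, _, rest = line.partition(":")  # split at the first colon only
        fields = rest.split()
        counts = [int(c) for c in fields[:ncpus]]
        desc = " ".join(fields[ncpus:])     # controller type and device names
        found.append((irq.strip(), counts, desc))
    return found
```

Feeding it the contents of /proc/interrupts before and after a hang, under
both boots, would show whether the counter for the OHCI interrupt stops
increasing and which interrupt controller type it is routed through.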

Questions which will help me narrow down the source of the problem:
Can you confirm or deny that the Linux OHCI driver works perfectly on a
multicore Phenom CPU during large data transfers (Phenom 9550 reports are very
welcome)?
Can you confirm or deny that the Linux OHCI driver works perfectly on the
nVidia MCP78S (Geforce 8200) chipset during large data transfers?
If you have a working or non-working OHCI controller on a Geforce 8200
mainboard, please include detailed information about your system:
-cpu (manufacturer, model)
-mainboard (manufacturer, model, bios version)

I have changed:
mainboard: Asus A8N-VM CSM (nVidia nForce 6150/430) to:
mainboard: ASRock K10N78FullHD-hSLI R3.0 (nVidia MCP78S (Geforce8200))
cpu: Athlon64 3000+ Venice (1 core) to:
cpu: Phenom 9550 (4 cores)
mem: 4x256MiB DDR400 (twinbank) to 2x1GiB DDR2 800 CL4 (twinbank)

The previous mainboard works perfectly with USB 1.1 devices. The new one does
not. I tried M$ Win XP Pro SP3 and OHCI USB works perfectly there, so the new
mainboard is not broken - this is a software bug in the BIOS or in Linux
OHCI/IRQ handling on this mainboard.
All my USB 1.1 devices work perfectly with all my previous mainboards, so this
must be an incompatibility between the new mainboard and Linux up to and
including 2.6.30.

Here is my bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=13405

have a nice day,
Zbigniew Luszpinski


2009-06-27 17:27:20

by Robert Hancock

Subject: Re: OHCI USB hangs during intensive use, nVidia MCP78S chipset, few questions to help me find the problem.

On 06/27/2009 09:51 AM, Zbigniew Luszpinski wrote:
> Hello,
>
> After upgrading my mainboard and my CPU (from single core to multicore) I
> started to have problems with the OHCI USB controller. It hangs during large
> data transfers (USB hdd, pendrive, IrDA dongle or ADSL modem). The result of
> the hang is that the USB device stops responding. This is the only error I
> found. I would like to learn why it hangs and fix it. I tried kernels 2.6.27
> through 2.6.30 and the most popular Linux distributions. Adding the noapic or
> acpi=noirq parameter to the kernel boot line almost works around the problem;
> disabling the tickless kernel (CONFIG_NO_HZ) also makes the OHCI hangs a
> little less frequent. USB 1.1 devices still hang, but seldom: only
> transferring a file over 100MiB to/from the USB hdd or downloading the CentOS
> DVD iso from ftp or torrent can hang USB. I do not want to be left with a
> workaround. I want to fix it. Can you tell me how to find the issue and where
> to look for the faulty code? Judging by the kernel parameters which relieve
> the OHCI bug, something must be wrong with IRQ handling.

Are you getting any kernel errors/messages in dmesg? In any case you
should post your dmesg output from bootup.

2009-06-28 00:39:19

by Zbigniew Luszpinski

Subject: Re: OHCI USB hangs during intensive use, nVidia MCP78S chipset, few questions to help me find the problem.

On Saturday, 27 June 2009 at 19:27:58 Robert Hancock wrote:
> Are you getting any kernel errors/messages in dmesg? In any case you
> should post your dmesg output from bootup.

Thanks for the response :)
Here are all my earlier dmesg dumps, irq dumps, acpi dumps...
http://bugzilla.kernel.org/show_bug.cgi?id=13405

OK. See the attached files (current dmesg dumps I made today):
dmesg.copyFile - dmesg from when the usb hdd froze while copying the CentOS DVD.
How I tested:
USB 2.0 disabled in bios
boot 2.6.30
wait till speedtouch 330 usb modem finishes initialization
log in as root on tty1, tty2, tty3
power on usb hdd
wait 10 seconds for hdd to finish self-init
plug it into a usb port
fdisk -l /dev/sdb
fdisk -l /dev/sdb
mount /dev/sdb3 /mnt/pendrive
ls /mnt/pendrive
cp /mnt/pendrive/CentOS-5.3-i386-bin-DVD.iso /root
usb hangs
-- cancelling URB appears in dmesg
cp is hanging on tty1
Ctrl+c terminates cp
ls -lh /mnt/pendrive | grep CentOS
-rwxr-xr-x 1 root root 3,7G 05-08 20:18 CentOS-5.3-i386-bin-DVD.iso
ls -lh /root | grep CentOS
-rwxr-xr-x 1 root root 101M 06-27 21:59 CentOS-5.3-i386-bin-DVD.iso
Only 101M of 3,7G was copied when usb hung
trying cp again: nothing happens, dmesg silent.
fdisk -l /dev/sdb
hangs, Ctrl+c is unable to kill it
from tty3 I do umount /mnt/pendrive
hangs, Ctrl+c is unable to kill it
trying to remove ohci_hcd - rmmod hangs.
reboot - shutdown procedure begins, then stops after a few seconds. Whole
machine frozen. Unable to restart. Keyboard (PS/2) frozen. Cannot type. Hard
reset.
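
The copy step in the procedure above can be made easier to compare between
runs by logging how far the copy got before it stalled (101M in the run
described here). Below is a minimal Python sketch. The function name, chunk
size and reporting interval are arbitrary choices for illustration and were
not part of the original test. It cannot break out of a read that is blocked
inside the kernel; it only leaves a record of the last progress point.

```python
import os
import time


def copy_with_progress(src, dst, chunk=1 << 20, report_every=16 << 20,
                       log=print):
    """Copy src to dst in chunks, logging progress; return bytes copied."""
    copied = 0
    next_report = report_every
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)       # blocks here if the usb device hangs
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
            if copied >= next_report:
                # Push data to disk so the logged figure reflects real progress.
                fout.flush()
                os.fsync(fout.fileno())
                log("%d MiB after %.1f s"
                    % (copied >> 20, time.monotonic() - start))
                next_report += report_every
    return copied
```

For the test above it would be called as
copy_with_progress("/mnt/pendrive/CentOS-5.3-i386-bin-DVD.iso",
"/root/CentOS-5.3-i386-bin-DVD.iso"); the last line printed before the hang
gives the amount transferred and the elapsed time.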

After rebooting with noapic, the above procedure finished successfully: 3,7GB
of 3,7GB copied, fdisk ok, umount ok, reboot ok.

dmesg.apic - dmesg with apic enabled. This time the usb hdd hung immediately
after plugging in. The usb hdd hung the usb bus, causing the adsl modem to stop
responding; see the ATM errors at the end of the log.

ADSL modem errors are reported by pppd (I have a 1Mbit connection). I can only
fetch mail from a single mailbox at a time so as not to overload usb. I can
only browse simple web pages for the same reason. Opening Akregator, which
checks 13 RSS feeds at once, hangs usb within 1 second. Opening 2 mailboxes
together hangs usb immediately. However, light usb traffic like pinging one
site works perfectly.
All these problems are resolved by the noapic or acpi=noirq kernel parameter
(but downloading the CentOS dvd iso is still not possible). This log shows how
the usb modem behaves when the kernel is booted without any parameters:
Mar 19 20:47:16 localhost pppd[5193]: Plugin pppoatm.so loaded.
Mar 19 20:47:16 localhost pppd[5193]: PPPoATM plugin_init
Mar 19 20:47:16 localhost pppd[5193]: PPPoATM setdevname_pppoatm - SUCCESS:0.35
Mar 19 20:47:16 localhost pppd[5193]: pppd 2.4.4 started by root, uid 0
Mar 19 20:47:16 localhost pppd[5193]: Using interface ppp0
Mar 19 20:47:16 localhost pppd[5193]: Connect: ppp0 <--> 0.35
Mar 19 20:47:19 localhost pppd[5193]: CHAP authentication succeeded
Mar 19 20:47:19 localhost pppd[5193]: CHAP authentication succeeded
Mar 19 20:47:19 localhost pppd[5193]: kernel does not support PPP filtering
Mar 19 20:47:19 localhost pppd[5193]: local IP address xx.xxx.xxx.xxx
Mar 19 20:47:19 localhost pppd[5193]: remote IP address xx.xxx.xxx.xxx
Mar 19 20:47:19 localhost pppd[5193]: primary DNS address xx.xxx.xxx.xx
Mar 19 20:47:19 localhost pppd[5193]: secondary DNS address xxx.xxx.xxx.xx
Mar 19 20:49:10 localhost acpid: client connected from 14701[0:0]
Mar 19 20:49:10 localhost acpid: 1 client rule loaded
Mar 19 20:50:29 localhost acpid: client connected from 20859[0:0]
Mar 19 20:50:29 localhost acpid: 1 client rule loaded
Mar 19 20:59:30 localhost acpid: client connected from 20970[0:0]
Mar 19 20:59:30 localhost acpid: 1 client rule loaded
Mar 19 20:59:31 localhost acpid: client connected from 20970[0:0]
Mar 19 20:59:31 localhost acpid: 1 client rule loaded
Mar 19 21:10:30 localhost acpid: client connected from 21368[0:0]
Mar 19 21:10:30 localhost acpid: 1 client rule loaded
Mar 19 21:10:30 localhost acpid: client connected from 21368[0:0]
Mar 19 21:10:30 localhost acpid: 1 client rule loaded
Mar 19 21:14:40 localhost pppd[5205]: No response to 3 echo-requests <--- usb hangs from here on
Mar 19 21:14:40 localhost pppd[5205]: Serial link appears to be disconnected.
Mar 19 21:14:40 localhost pppd[5205]: Connect time 27.4 minutes.
Mar 19 21:14:40 localhost pppd[5205]: Sent 28932 bytes, received 716806 bytes.
Mar 19 21:14:46 localhost pppd[5205]: Connection terminated.
Mar 19 21:14:46 localhost pppd[5205]: Modem hangup
Mar 19 21:14:50 localhost pppd[5205]: Using interface ppp0
Mar 19 21:14:50 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:15:20 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:15:20 localhost pppd[5205]: Connection terminated.
Mar 19 21:15:20 localhost pppd[5205]: Modem hangup
Mar 19 21:15:24 localhost pppd[5205]: Using interface ppp0
Mar 19 21:15:24 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:15:54 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:15:54 localhost pppd[5205]: Connection terminated.
Mar 19 21:15:54 localhost pppd[5205]: Modem hangup
Mar 19 21:15:58 localhost pppd[5205]: Using interface ppp0
Mar 19 21:15:58 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:16:29 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:16:29 localhost pppd[5205]: Connection terminated.
Mar 19 21:16:29 localhost pppd[5205]: Modem hangup
Mar 19 21:16:33 localhost pppd[5205]: Using interface ppp0
Mar 19 21:16:33 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:17:03 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:17:03 localhost pppd[5205]: Connection terminated.
Mar 19 21:17:03 localhost pppd[5205]: Modem hangup
Mar 19 21:17:07 localhost pppd[5205]: Using interface ppp0
Mar 19 21:17:07 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:17:37 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:17:37 localhost pppd[5205]: Connection terminated.
Mar 19 21:17:37 localhost pppd[5205]: Modem hangup
Mar 19 21:17:41 localhost pppd[5205]: Using interface ppp0
Mar 19 21:17:41 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:18:11 localhost pppd[5205]: LCP: timeout sending Config-Requests
Mar 19 21:18:11 localhost pppd[5205]: Connection terminated.
Mar 19 21:18:11 localhost pppd[5205]: Modem hangup
Mar 19 21:18:15 localhost pppd[5205]: Using interface ppp0
Mar 19 21:18:15 localhost pppd[5205]: Connect: ppp0 <--> 0.35
Mar 19 21:18:35 localhost init: Switching to runlevel: 6 <-- I try to reboot
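
To compare several such logs (for example booted with and without noapic), the
moment of the hang and the number of failed reconnects can be pulled out
automatically. Below is a minimal Python sketch, assuming syslog lines in
exactly the format shown above; the function name and the regular expression
are illustrative, not part of pppd.

```python
import re

# "Mar 19 21:14:40 localhost pppd[5205]: message"
LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ pppd\[(\d+)\]: (.*)$")


def summarize_pppd(log_text):
    """Return (timestamp of first lost echo, number of LCP timeouts)."""
    first_loss = None
    lcp_timeouts = 0
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue                    # acpid, init and other daemons
        stamp, _pid, msg = m.groups()
        if first_loss is None and msg.startswith("No response to"):
            first_loss = stamp          # link stopped answering here
        elif msg.startswith("LCP: timeout sending Config-Requests"):
            lcp_timeouts += 1           # each one is a failed reconnect
    return first_loss, lcp_timeouts
```

On the log above this gives ("Mar 19 21:14:40", 6): the link went silent at
21:14:40 and all six reconnect attempts before the reboot timed out.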


Attachments:
(No filename) (6.41 kB)
dmesg.copyFile.bz2 (2.26 kB)
dmesg.apic.bz2 (12.12 kB)
dmesg.noapic.bz2 (14.02 kB)