2012-05-18 20:36:32

by Martin Mokrejs

Subject: linux-3.4-rc7: rcu_sched self-detected stall on CPU

Hi,
I was converting some data from large files placed on an ext4 filesystem and writing them back to the same
directory. It happened while reading a 1.6GB file, I think. The disk is connected via a USB3.0 port
provided by the Texas Instruments host controller in my Dell Vostro 3550 laptop. The laptop has 16GB of RAM.
It seems it was coping with the CPU stall for an hour or so.

I have attached dmesg with xhci_hcd lines removed.

I can also provide the dmesg from when the machine booted up, and the later log bloated with xhci_hcd debug output (huge):

-rw-r--r-- 1 root root 172013 May 18 21:44 messages-head
-rw-r--r-- 1 root root 13604880 May 18 21:53 messages-tail

The machine was alive: two cores were fully used by the system, one was used by user processes, and one was in
wait state, as reported by vmstat (I have a 2-core i7 processor with HT enabled).
But after a while (20 min), when I was trying to close my apps and one remote ssh connection over a 3G modem (attached
to the same USB3.0 controller as the external disk), my mouse just disappeared and the keyboard stopped working.
The CPU fan had already been quiet for a while, so there was nothing else notable. Alt+SysRq+S did not work.
Only a 5-second press of the power button helped. :(

Now that the laptop is booted up again, I see that the last lines in /var/log/messages are xhci_hcd transfers.
I should note that when the laptop stopped responding, I unplugged an external USB hub (with my mouse and keyboard
connected to it) that was connected to the USB2.0 controller. Re-connecting the hub did not turn its lights on, so the USB2.0
port was dead. I also unplugged the 3G modem, which had at least power from the laptop, and the external disk.
So maybe some of this was logged in /var/log/messages?

But as it did not respond to keystrokes, I turned it off as I already said.

Please let me know if I can provide some more details.
Martin
P.S.: Maybe related to http://lkml.org/lkml/2012/3/24/86 ?


Attachments:
dmesg_without_xhci_hcd_lines.txt (85.09 kB)

2012-05-19 09:19:09

by Dan Carpenter

Subject: Re: linux-3.4-rc7: rcu_sched self-detected stall on CPU

I don't think this is the same bug that I saw earlier. That one was
fixed for me, although I couldn't say in which patch.

regards,
dan carpenter