2005-01-21 06:15:22

by Klaus Muth

Subject: kernel panic with 2.4.26

Hi.
Every now and then (maybe twice a week) my server panics. This
is a dual Xeon system with 5 GB of memory. I did my best to copy the
full oops from the screen and double-checked it. Sorry, but I can't
make sense of the ksymoops output.
Any help will be appreciated.

ksymoops 2.4.5 on i686 2.4.26-msi1. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.26-msi1/ (default)
-m System.map-2.4.26-msi1.nogood (specified)

f893281d
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f893281d>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010256
eax: fffc43fc ebx: 00000002 ecx: f703b000 edx: 0000000d
esi: f187d000 edi: 00000000 ebp: f7005c1c esp: c0353ed4
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c0353000)
Stack: 00000000 f7040d00 00000000 f778a480 00000040 f7005c00 00000000 f703b000
00040d00 f7007e80 f8921982 f7040d00 f7030b08 f778a480 00000002 00000000
00000000 00000000 f703b200 00000000 f8921a9c f778a480 f7040d08 f77dc680
Call Trace: [<f8921982>] [<f8921a9c>] [<c010a041>] [<c010a236>] [<c0106d60>]
[<c0106d60>] [<c0106d60>] [<c0106d60>] [<c0106d89>] [<c0106df2>] [<c0105000>]
[<c010504f>]
Code: 88 08 8b 86 58 01 00 00 ff 86 5c 01 00 00 88 10 ff 86 58 01


>>EIP; f893281d <_end+3851dc61/385fa444> <=====

>>eax; fffc43fc <END_OF_CODE+74ee74d/????>
>>ecx; f703b000 <_end+36c26444/385fa444>
>>esi; f187d000 <_end+31468444/385fa444>
>>ebp; f7005c1c <_end+36bf1060/385fa444>
>>esp; c0353ed4 <init_task_union+1ed4/2000>

Trace; f8921982 <_end+3850cdc6/385fa444>
Trace; f8921a9c <_end+3850cee0/385fa444>
Trace; c010a041 <handle_IRQ_event+5d/88>
Trace; c010a236 <do_IRQ+a6/ec>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d89 <default_idle+29/34>
Trace; c0106df2 <cpu_idle+3e/54>
Trace; c0105000 <_stext+0/0>
Trace; c010504f <rest_init+4f/50>

Code; f893281d <_end+3851dc61/385fa444>
00000000 <_EIP>:
Code; f893281d <_end+3851dc61/385fa444> <=====
0: 88 08 mov %cl,(%eax) <=====
Code; f893281f <_end+3851dc63/385fa444>
2: 8b 86 58 01 00 00 mov 0x158(%esi),%eax
Code; f8932825 <_end+3851dc69/385fa444>
8: ff 86 5c 01 00 00 incl 0x15c(%esi)
Code; f893282b <_end+3851dc6f/385fa444>
e: 88 10 mov %dl,(%eax)
Code; f893282d <_end+3851dc71/385fa444>
10: ff 86 58 01 00 00 incl 0x158(%esi)

<0>Kernel panic: Aiee, killing interrupt handler!

Could you please help me out?

klaus
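
For anyone reading the oops above: on i386 the number after "Oops:" is the page-fault error code, and its low three bits say what kind of access failed. Below is a minimal sketch in Python of how those bits can be read, assuming the standard i386 layout (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name is made up for illustration and is not part of ksymoops.

```python
def decode_oops_error_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code.

    code_hex is the value printed after "Oops:", e.g. "0002".
    """
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection fault
        "cause": "protection fault" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    print(decode_oops_error_code("0002"))
```

For 0002 this reads as a kernel-mode write to a page that is not present, which fits the faulting instruction mov %cl,(%eax) with eax = fffc43fc.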


2005-02-11 09:16:14

by Klaus Muth

Subject: Re: kernel panic with 2.4.26

On Friday, 21 January 2005 07:15, Klaus Muth wrote:
> Every now and then (maybe twice a week) my server panics. [...]
> Any help will be appreciated.

I helped myself. It seems to work.

> ksymoops 2.4.5 on i686 2.4.26-msi1. Options used

Updated to 2.4.29, keeping my kernel config. No panic since then (2 weeks)
which seems to be a drastic decrease of the panics/week ratio ;).

klaus

2005-02-16 11:15:07

by Jonathan Sambrook

Subject: Re: kernel panic with 2.4.26

Klaus Muth wrote:
> On Friday, 21 January 2005 07:15, Klaus Muth wrote:
>
>>Every now and then (maybe twice a week) my server panics. [...]
>>Any help will be appreciated.
>
>
> I helped myself. It seems to work.
>
>
>>ksymoops 2.4.5 on i686 2.4.26-msi1. Options used
>
>
> Updated to 2.4.29, keeping my kernel config. No panic since then (2 weeks)
> which seems to be a drastic decrease of the panics/week ratio ;).

Sorry, didn't spot your previous email.

I've not set aside time to investigate further, but turning HT off made
the problem go away. Would be interested to hear further reports.

Regards,
Jonathan

2005-02-16 13:02:43

by Klaus Muth

Subject: Re: kernel panic with 2.4.26

On Wednesday, 16 February 2005 12:14, Jonathan Sambrook wrote:
> >>Every now and then (maybe twice a week) my server panics. [...]
> >>Any help will be appreciated.
> >
> Sorry, didn't spot your previous email.
No problem ;).

> I've not set aside time to investigate further, but turning HT off made
> the problem go away. Would be interested to hear further reports.

Server oopsed again 10 minutes ago. Same symptoms. The kernel upgrade did not
help... Would an update to a 2.6 kernel help, or should I rather turn
hyperthreading off?

klaus

2005-02-16 15:16:45

by Jonathan Sambrook

Subject: Re: kernel panic with 2.4.26

Klaus Muth wrote:
> On Wednesday, 16 February 2005 12:14, Jonathan Sambrook wrote:

> Server oopsed again 10 minutes ago. Same symptoms.

<sigh> too bad


> The kernel upgrade did not
> help... Would an update to a 2.6 kernel help, or should I rather turn
> hyperthreading off?

My experience is running _modified_ 2.4 kernels. Turning HT off solved
the problem here. Of course YMMV if the root cause is different.

I have no experience of running HT on 2.6. My hunch is that more
HT users run 2.6 than 2.4 nowadays, so a problem there would have been
raised by now. If so, your choice depends on whether the joint benefits
of HT and of 2.6 outweigh the effort of moving to 2.6.

Jonathan

2005-02-21 11:06:30

by Klaus Muth

Subject: Re: kernel panic with 2.4.26

On Wednesday, 16 February 2005 16:16, Jonathan Sambrook wrote:
> Klaus Muth wrote:
> > On Wednesday, 16 February 2005 12:14, Jonathan Sambrook wrote:
> >
> > Server oopsed again 10 minutes ago. Same symptoms.
>
> <sigh> too bad
Your help is really appreciated.

> > The kernel upgrade did not
> > help... Would an update to a 2.6 kernel help, or should I rather turn
> > hyperthreading off?
>
> My experience is running _modified_ 2.4 kernels. Turning HT off solved
> the problem here. Of course YMMV if the root cause is different.
With hyperthreading turned off, the machine crashed again within minutes. But
I took a photograph again and ran ksymoops over the result:
---------------------------------------
Unable to handle kernel paging request at virtual address ffffffff
f8a3589d
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<f8a3589d>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010256
eax: ffffffff ebx: 00000002 ecx: f5942000 edx: 0000000d
esi: e162f000 edi: 00000000 ebp: f60d701c esp: c0379ee4
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c0379000)
Stack: 00000000 f593d580 00000000 f77ba480 00000040 f60d7000 00000000 f5942000
0093d580 f593fd80 f8a24982 f593d580 f6347188 f77ba480 00000002 00000000
00000000 00000000 f5942200 00000000 f8a24a61 f77ba480 f593d588 f621ec40
Call Trace: [<f8a24982>] [<f8a24a61>] [<c010a041>] [<c010a236>] [<c0106d60>]
[<c0106d60>] [<c0106d60>] [<c0106d60>] [<c0106d89>] [<c0106df2>] [<c0105000>]
[<c010504f>]
<0>Kernel panic: Aiee, killing interrupt handler!
Warning (Oops_read): Code line not seen, dumping what data is available


>>EIP; f8a3589d <[ftdi_sio]ftdi_read_bulk_callback+361/438> <=====

>>eax; ffffffff <END_OF_CODE+754d54c/????>
>>ecx; f5942000 <_end+35507244/385d6244>
>>esi; e162f000 <_end+211f4244/385d6244>
>>ebp; f60d701c <_end+35c9c260/385d6244>
>>esp; c0379ee4 <init_task_union+1ee4/2000>

Trace; f8a24982 <[usb-uhci]process_urb+1e6/230>
Trace; f8a24a61 <[usb-uhci]uhci_interrupt+95/fc>
Trace; c010a041 <handle_IRQ_event+5d/88>
Trace; c010a236 <do_IRQ+a6/ec>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d60 <default_idle+0/34>
Trace; c0106d89 <default_idle+29/34>
Trace; c0106df2 <cpu_idle+3e/54>
Trace; c0105000 <_stext+0/0>
Trace; c010504f <rest_init+4f/50>
---------------------------------------

This strongly suggests a problem in the ftdi_sio USB-to-serial driver,
and I have already filed it in the SourceForge bug tracker.

klaus
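
The two ksymoops runs in this thread differ mainly in how the module addresses were resolved. The first shows them as _end+offset, presumably because no module symbols were available in that run. The second shows names such as [ftdi_sio]ftdi_read_bulk_callback+361/438. The name+offset/size notation can be reproduced by finding the nearest symbol at or below an address. Below is a minimal sketch in Python, not ksymoops' actual code. The start addresses in the table are back-computed from the oops (EIP minus offset, and that plus the size), "next_symbol" is a placeholder name, and the function name is invented for illustration.

```python
import bisect


def resolve(addr, symbols):
    """Return 'name+offset/size' for the nearest symbol at or below addr.

    symbols is a list of (address, name) tuples sorted by address.
    The size is the distance to the next symbol, or '????' if there is none.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[i]
    offset = addr - start
    if i + 1 < len(symbols):
        size = symbols[i + 1][0] - start
        return "%s+%x/%x" % (name, offset, size)
    return "%s+%x/????" % (name, offset)


# Illustrative table only: addresses derived from the oops output above,
# "next_symbol" is a hypothetical name for whatever follows in the module.
SYMBOLS = [
    (0xF8A3553C, "[ftdi_sio]ftdi_read_bulk_callback"),
    (0xF8A35974, "next_symbol"),
]

if __name__ == "__main__":
    print(resolve(0xF8A3589D, SYMBOLS))
```

With a table that lacks the module entries, the same lookup falls back to the last kernel symbol, which is how an address inside a module can end up reported as _end plus a large offset.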