2003-08-05 12:23:58

by Robert L. Harris

Subject: FINALLY caught a panic



I've got a machine which has been dying repeatedly with nothing logged to
the serial console that we could see. The problem was that the console would
kick you off if idle for too long. I managed to keep it from
disconnecting for quite some time and finally got a panic. I caught
this:

Unable to handle kernel paging request at virtual address 8011c560
printing eip:
8011c560
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010:[<8011c560>] Not tainted
EFLAGS: 00010286
eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c1e19000)
Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
[<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
[<c0117d8e>]

Code: Bad EIP value.
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

Can someone please tell me what this means or how to figure it out? The
machine is offline for the next 12 hours unfortunately due to lack of
remote help. I do have similar machines with the same kernel/hardware
if I need to run any commands against this output. I don't have access
to any specific files on the machine though.

Thanks,
Robert


:wq!
---------------------------------------------------------------------------
Robert L. Harris | GPG Key ID: E344DA3B
@ x-hkp://pgp.mit.edu
DISCLAIMER:
These are MY OPINIONS ALONE. I speak for no-one else.

Life is not a destination, it's a journey.
Microsoft produces 15 car pileups on the highway.
Don't stop traffic to stand and gawk at the tragedy.



2003-08-05 12:31:22

by Matti Aarnio

Subject: Re: FINALLY caught a panic

On Tue, Aug 05, 2003 at 08:23:54AM -0400, Robert L. Harris wrote:
> I've got a machine which has been dying repeatedly with nothing logged to
> the serial console we could see.
....
> Can someone please tell me what this means or how to figure it out? The

See FAQ at: http://www.tux.org/lkml/#s4-3

Feeding that Oops to ksymoops at your running machine _might_ help,
but if you use modules, it is somewhat less likely to succeed, as
module loading may end up at different memory locations.

Plain numeric Oops definitely does not help.

...
> Thanks,
> Robert

/Matti Aarnio

2003-08-05 12:37:48

by Denis Vlasenko

Subject: Re: FINALLY caught a panic

On 5 August 2003 15:23, Robert L. Harris wrote:
>
>
> I've got a machine which has been dying repeatedly with nothing logged to
> the serial console we could see. The problem was the console would kick
> you off if idle for too long. I managed to keep it from
> disconnecting for quite some time and finally got a panic. I caught
> this:
>
> Unable to handle kernel paging request at virtual address 8011c560
> printing eip:
> 8011c560
> *pde = 00000000
> Oops: 0000
> CPU: 1
> EIP: 0010:[<8011c560>] Not tainted
> EFLAGS: 00010286
> eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
> esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
> ds: 0018 es: 0018 ss: 0018
> Process swapper (pid: 0, stackpage=c1e19000)
> Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
> c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
> c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
> Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
> [<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
> [<c0117d8e>]
>
> Code: Bad EIP value.
> <0>Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing
>
> Can someone please tell me what this means or how to figure it out? The
> machine is offline for the next 12 hours unfortunately due to lack of
> remote help. I do have similar machines with the same kernel/hardware
> if I need to run any commands against this output. I don't have access
> to any specific files on the machine though.

This document is mailed to lkml regularly and will be modified
whenever a new victim wishes to be listed in it or someone can
no longer devote his time to maintainer work.

If you want your entry added/updated/removed, contact me.

BTW, requests to move your entry to the top of the list
without actually changing the text are fine too: that
will indicate that entry is not outdated, so don't be shy ;-)
--
vda
------- cut here ------ cut here ------ cut here ------ cut here ------

So, you are new to Linux kernel hacking and want to submit a kernel bug
report or a patch but don't know how to do it and _where_ to report it?

Preparing bug report:
=====================
*** Remember: bad/incomplete bug report ONLY wastes bandwidth! ***
How To Ask Questions The Smart Way:
http://www.catb.org/~esr/faqs/smart-questions.html
Anybody who has written software for public use will
probably have received at least one bad bug report.
Reports that say nothing ("It doesn't work!");
reports that make no sense; reports that don't give
enough information; reports that give wrong information.
How to Report Bugs Effectively:
http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
Before asking a technical question by email, or in
a newsgroup, or on a website chat board, do the following:
* Try to find an answer by searching the Web.
* Try to find an answer by reading the manual.
* Try to find an answer by reading a FAQ.
* Try to find an answer by inspection or experimentation.
* Try to find an answer by reading the source code.
Compile problems: report GCC output and result of
"grep '^CONFIG_' .config"
Oops: decode it with ksymoops (or use 2.5 with kksymoops enabled ;).
Unkillable process: press Alt-SysRq-T and run the relevant part through ksymoops.
Yes, it means you should have ksymoops installed and tested,
which is easy to get wrong. I've done that too often.

Sending bug report/patch:
=========================
* Some device drivers have active developers, try to contact them first.
* Otherwise find a subsystem maintainer to which your report pertains
and send report to his address.
* Small fixes and device driver updates are best directed to subsystem
maintainers and "small bits" integrators.
* It never hurts to CC: Linux kernel mailing list, but without a specific
maintainer address in the To: field there is a high probability that your
patch won't be noticed. You have been warned.
* Do not send it to all addresses at once! This will annoy lots of people
and isn't useful at all. It's spam.
* Do NOT send small fixes to Linus, he just can't handle _everything_.
He will eventually receive it from maintainers/integrators, send it
their way.
* If your patch is something big and new, announce it on lkml and try
to attract testers. After it has been tested and discussed, you can
expect Linus to consider inclusion in mainline.


Current Linux kernel people

Note that this list is sorted in reverse date order, most recent
entries first. This means that entries at the bottom can be outdated :-(


Linux kernel mailing list <[email protected]>
Post anything related to Linux kernel here, but nothing else :-)

Bartlomiej Zolnierkiewicz <[email protected]> [21 may 2003]
IDE SUBSYSTEM

Andre Hedrick <[email protected]> [15 apr 2003]
ATA/ATAPI Storage Architect [2.0,2.2,2.4,2.5]
HBA interface developer
Serial ATA Architect [released][backported]

George Anzinger <[email protected]> [19 mar 2003]
I maintain the posix-timers and related code.

Miles Bader <[email protected]> [13 mar 2003]
I'm maintainer of the v850 port (uClinux).
There's a more generic mailing address that might be better though:
<[email protected]>

Jesse Barnes <[email protected]> [28 feb 2003]
I maintain arch/ia64/sn (SGI SN support for IA64) in the 2.5 tree,
and John Hesterberg <[email protected]> does the same for 2.4

http://bugzilla.kernel.org [13 feb 2003]
Database of 2.5 bugs.

Martin J. Bligh <[email protected]> [13 feb 2003]
I am the maintainer for the kernel bugtracker (bugzilla.kernel.org)
I'm interested in kernel issues with:
* NUMA / discontigmem
* VM issues with lots (>4Gb) of RAM
* Scalability issues with > 2 CPUs
See also:
Andrea Arcangeli <[email protected]>

John Bradford <[email protected]> [13 feb 2003]
I maintain an unofficial kernel bug database at
http://grabjohn.com/kernelbugdatabase
and I'm also happy to help people who are trying
to run Linux usefully on old and/or low spec
machines (4 MB 486s, etc).

Dave Olien <[email protected]> [12 feb 2003]
I maintain DAC960 RAID controller driver
Visit http://www.osdl.org/archive/dmo/DAC960

Benjamin Herrenschmidt <[email protected]> [27 jan 2003]
My duty is to try to make sure Apple's PowerMacs
happily run the Linux kernel.
I also do various things related to the PPC port (more
specifically PPC32), so I'd appreciate being CC'ed on any
comment, patch or bug report regarding the PPC architecture

Adam Belay <[email protected]> [17 dec 2002]
I am Plug and Play maintainer.

Andrew Morton <[email protected]> [10 dec 2002]
- VM
- The "data" part of the VFS: pagecache, buffer layer, etc.
- memory management
- ext2 and ext3
- 3c59x.c
- direct-IO

James Simmons <[email protected]> [28 Nov 2002]
Console and framebuffer subsystems.
I also play around with the input layer.

Petko Manolov <[email protected]> [27 nov 2002]
pegasus and rtl8150 usb-ethernet drivers maintainer.
Interested in any bugs or new devices related to those drivers.
string-486.h code maintainer.

Greg Ungerer <[email protected]> [14 nov 2002]
uClinux (MMU-less support) maintainer. I'll take anything
specifically related to MMU-less support or any of the
MMU-less architecture branches (m68knommu, v850, etc).
I would highly recommend sending to [email protected]
mailing list as well.

Jeff Garzik <[email protected]> [24 sep 2002]
I am the network-card-drivers guy (8139 for instance).
CC me and Andrew Morton <[email protected]> on network driver patches.

Jan-Benedict Glaw <[email protected]> [18 sep 2002]
I'm responsible for Alpha's srm_env driver, providing access to
SRM's firmware variables.

Stuart MacDonald <[email protected]> [13 sep 2002]
Connect Tech's linux kernel guy. Currently includes hacking on
drivers/char/serial.c (Blue Heat, Xtreme, Dflex) and maintaining
drivers/usb/serial/whiteheat.c (WhiteHEAT)

Vojtech Pavlik <[email protected]> [13 sep 2002]
Feel free to send me bug reports and patches to input device drivers
(drivers/input/*, drivers/char/joystick/*)
I also want to receive bug reports and patches for following
USB drivers: printer, acm, catc, hid*, usbmouse, usbkbd, wacom.
All other (not in the list) USB driver changes should go to USB
maintainer (hopefully there is one listed here :-).
Also CC me if you are posting VIA IDE driver related message
(although I am not IDE subsystem maintainer).

Robert Love <[email protected]> [12 sep 2002]
Preemptible kernel maintainer.
I am also interested in anything related to scheduling or locking
primitives.

Jan Kara <[email protected]> [22 aug 2002]
quota subsystem maintainer

Paul Larson <[email protected]> [20 aug 2002]
I'm a maintainer for the Linux Test Project and it would be nice
if people knew to send their test programs, etc. to me. I see
a lot of them flying around on lkml and try to catch them when
I can, but it's a lot to keep up with. It would be even better
if people just knew to send them our way so we could clean
them up and put them in LTP for regression testing.

Dave Engebretsen <[email protected]> [15 aug 2002]
PPC64 architecture maintainer. Please send PPC64 patches to me
and our mailing list at <[email protected]>

Ingo Molnar <[email protected]> [30 jul 2002]
Ingo wrote the new scheduler for 2.5.

Ralf Baechle <[email protected]> [30 jul 2002]
I am maintainer of the AX.25 code

Victor Yodaiken <[email protected]> [30 jul 2002]
RTLinux patches, updates, contributions, drivers.
Please send first to the list: [email protected]

Pavel Machek <[email protected]> [27 jul 2002]
I am network block device maintainer. Visit http://nbd.sf.net.
(see Steven Whitehouse <[email protected]> entry)
I am working on software suspend.

William Irwin <[email protected]> [02 jul 2002]
Send bug reports and/or feature requests related to many tasks,
rmap, space consumption, or allocators to me. I'm involved in
* rmap
* memory allocators
* reducing space consumed by data structures (e.g. struct page)
* issues arising in workloads with many tasks
* kernel janitoring
See also:
Rik van Riel <[email protected]>
Andrea Arcangeli <[email protected]>
Martin Bligh <[email protected]>
Andrew Morton <[email protected]>

Dave Jones <[email protected]> [23 apr 2002]
I collect various bits and pieces for inclusion in 2.5,
especially small and trivial ones and driver updates.
I'll feed them to Linus when (and if) they
are proved to be worthy.

Andrea Arcangeli <[email protected]> [28 mar 2002]
Send VM related bug reports and patches to me.
I'm especially interested in VM issues with:
* lots of RAM and CPUs
* NUMA
* heavy swap scenarios
* performance of I/O intensive workloads (in particular
with lots of async buffer flushing involved)
See also Martin J. Bligh <[email protected]> entry
Mail also:
Arjan van de Ven <[email protected]>

Steven Whitehouse <[email protected]> [27 mar 2002]
I am the Linux DECnet network stack maintainer
Visit http://www.chygwyn.com/decnet/

Arnaldo Carvalho de Melo <[email protected]> [26 mar 2002]
IPX, 802.2 LLC, NetBEUI, http://kerneljanitors.org,
cyclom2x sync card driver

John Cagle <[email protected]> [19 mar 2002]
The current maintainer of devices.txt, the list of
assigned device numbers for LANANA. Consult the web
site (http://www.lanana.org) for instructions on submitting
requests for new device numbers. Send all device
related email to <[email protected]>.

Tigran Aivazian <[email protected]>
I am author and maintainer of BFS filesystem and IA32
microcode update driver.

Rogier Wolff <[email protected]> [12 mar 2002]
I do "specialix serial ports":
drivers/char/specialix.c (IO8+)
drivers/char/sx.c (SX, SI, SIO)
drivers/char/rio/*.c (RIO)

Ed Vance <[email protected]> [05 mar 2002]
Maintainer for the generic serial driver, serial.c,
for 2.2 and 2.4 kernels. Please post patches to
[email protected] for tested bug
fixes or to add support for a new serial device.
Limited to time available. If I have not responded
in a week, yell at [email protected]

netfilter/iptables <[email protected]> [23 feb 2002]
Please report all netfilter/iptables related problems
to this mailinglist, where all netfilter developers are present.
See also http://www.netfilter.org/contact.html

Hans Reiser <[email protected]> [16 feb 2002]
Send me all reiserfs related patches with a cc to
[email protected], send bug reports to
[email protected], send paid support requests to
[email protected] after going to http://www.namesys.com/support.html
to pay, send discussions (not bug reports unless they are
interesting to most persons) to [email protected].
If we sit on your patch for a week without responding,
yell at us, we deserve it. Look at our web page
at http://www.namesys.com for more about sending us code,
working with us, and our patch submission and tracking system.

Paul Bristow <[email protected]> [16 feb 2002]
I am an ide-floppy driver maintainer
(ATAPI ZIP, LS-120/240 Superdisk, Clik! drives).

Mike Phillips <[email protected]> [15 feb 2002]
Token ring subsystem and drivers.

Anton Altaparmakov <[email protected]> [15 feb 2002]
I am the NTFS guy.

https://bugzilla.redhat.com/bugzilla [14 feb 2002]
Reports of problems with the Red Hat shipped kernels.

Alan Cox <[email protected]> [14 feb 2002]
Linux 2.2 maintainer (maintenance fixes only).
Collator of patches for unmaintained things in 2.2/2.4.
Maintainer of the 2.4-ac (2.4 plus stuff being tested) tree.
I2O, sound, 3c501 maintainer for 2.2/2.4.

ALSA development <[email protected]> [12 feb 2002]
Jaroslav Kysela <[email protected]> [12 feb 2002]
Advanced Linux Sound Architecture
ALSA patches are available at
ftp://ftp.alsa-project.org/pub/kernel-patches/*

Neil Brown <[email protected]> [08 feb 2002]
I am interested in any issues with the code in:
NFS server (fs/nfsd/*)
software RAID (drivers/md/{md,raid,linear}*)
or related include files.

Maksim Krasnyanskiy <[email protected]> [08 feb 2002]
I'm author and maintainer of the Bluetooth subsystem
and Universal TUN/TAP device driver.
These days mostly working on Bluetooth stuff.

Rik van Riel <[email protected]> [07 feb 2002]
Send me VM related stuff, please CC to [email protected]

Geert Uytterhoeven <[email protected]> [07 feb 2002]
I work on the frame buffer subsystem, the m68k port (Amiga part),
and the PPC port (CHRP LongTrail part).
Unfortunately I barely have spare time to really work on these
things. My job is not Linux-related (so far :-). I can not
promise anything about my maintainership performance.

H. Peter Anvin <[email protected]> [07 feb 2002]
i386 boot and feature code, i386 boot protocol, autofs3,
compressed iso9660 (but I'll accept all iso9660-related
changes). kernel.org site manager; please contact me
for sponsorship-related issues.

kernel.org admins <[email protected]> [07 feb 2002]
Kernel.org sysadmins. Contact us if you notice something breaks,
or if you want a change make sure you give us at least 1-2 weeks.
Please note that we got a lot of feature requests, a lot of
which conflict or simply aren't practical; we don't have time to
respond to all requests.

Greg KH <[email protected]> [07 feb 2002]
I am USB and PCI Hotplug maintainer.

Trond Myklebust <[email protected]> [07 feb 2002]
I am NFS client maintainer.

Richard Gooch <[email protected]> [07 feb 2002]
I maintain devfs. I want people to Cc: me when reporting devfs
problems, since I don't read all messages on linux-kernel.
Send devfs related patches to me directly, rather than
bypassing me and sending to Linus/Marcelo/Alan/Dave etc.

Russell King <[email protected]> [06 feb 2002]
ARM architecture maintainer. Please send all ARM patches through
the patch system at http://www.arm.linux.org.uk/developer/patches/
New serial drivers maintainer for 2.5. Submit patches to
[email protected]

Petr Vandrovec <[email protected]> [05 feb 2002]
ncpfs filesystem, matrox framebuffer driver, problems related
to VMware - in all of 2.2.x, 2.4.x and 2.5.x.

Reiserfs developers list <[email protected]> [05 feb 2002]
Send all reiserfs-related stuff here including but not limited to bug
reports, fixes, suggestions.

Oleg Drokin <[email protected]> [05 feb 2002]
SA11x0 USB-ethernet and SA11x0 watchdog are mine.

======= These entries are suggested by lkml folks ========

Ralf Baechle <[email protected]> [27 mar 2002]
I am mips/mips64 maintainer.

David S. Miller <[email protected]> [07 feb 2002]
I am Sparc64 and networking core maintainer.

======= These ones I made myself ========
======= I am waiting confirmation/correction from these people ========

Urban Widmark <[email protected]> [13 feb 2002]
smbfs

video4linux list <[email protected]> [12 feb 2002]
Gerd Knorr <[email protected]> [12 feb 2002]
video4linux

Tim Waugh <[email protected]> [08 feb 2002]
> Who is maintaining the linux iomega stuff?
For 2.4.x, me (in theory). I don't have time for 2.5.x at the moment.

Alexander Viro <[email protected]> [5 feb 2002]
I am NOT a fs subsystem maintainer. But I won't kill
you if you send me some generic fs bug reports and (hopefully) patches.

Gérard Roudier <[email protected]> [5 feb 2002]
I am SCSI guy.

Jens Axboe <[email protected]> [5 feb 2002]
I am block device subsystem maintainer.

Linus Torvalds <[email protected]> [5 feb 2002]
Do not send anything to me unless it is for 2.5, well tested,
discussed on lkml and is used by significant number of people.
In general it is a bad idea to send me small fixes and driver
updates, send them to subsystem maintainers and/or
"small stuff" integrator (currently Dave Jones <[email protected]>,
see his entry). Sorry, I can't do everything.

Marcelo Tosatti <[email protected]> [5 feb 2002]
Do not send anything to me unless it is for 2.4 and well tested.
If you are sending me small fixes and driver updates, send
a copy to subsystem maintainers and/or "small stuff" integrators:
- Alan Cox <[email protected]>,
- Rusty Russell <[email protected]>.

Rusty Russell <[email protected]> [5 feb 2002]
> Here are some cleanups of whitespace in .....
Want me to add this to the trivial patch collection for tracking?
If so just send (or cc:) it to [email protected].

======= Entries which were valid sometime ago. Not valid anymore ========
======= Retained for historic (and hall-of-fame) purposes ===============
Eric S. Raymond <[email protected]> [5 feb 2002]
Send kernel configuration bug reports and suggestions to me.
Also I'll be more than happy to accept help entries for kernel config
options (Configure.help).

Martin Dalecki <[email protected]> [11 mar 2002]
IDE subsystem maintainer for 2.5

2003-08-05 12:38:10

by Zwane Mwaikambo

Subject: Re: FINALLY caught a panic

On Tue, 5 Aug 2003, Robert L. Harris wrote:

> Unable to handle kernel paging request at virtual address 8011c560
> printing eip:
> 8011c560
> *pde = 00000000
> Oops: 0000
> CPU: 1
> EIP: 0010:[<8011c560>] Not tainted
> EFLAGS: 00010286
> eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
> esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
> ds: 0018 es: 0018 ss: 0018
> Process swapper (pid: 0, stackpage=c1e19000)
> Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
> c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
> c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
> Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
> [<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
> [<c0117d8e>]
>
> Code: Bad EIP value.
> <0>Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing

Could have been someone removing a module without unregistering an
interrupt handler. But that's just guessing.

> Can someone please tell me what this means or how to figure it out? The
> machine is offline for the next 12 hours unfortunately due to lack of
> remote help. I do have similar machines with the same kernel/hardware
> if I need to run any commands against this output. I don't have access
> to any specific files on the machine though.

You don't specify which kernel this is, it appears to be 2.4 something.
Please run this through ksymoops (man ksymoops)

--
function.linuxpower.ca

2003-08-05 13:40:01

by Robert L. Harris

Subject: Re: FINALLY caught a panic

Thus spake Zwane Mwaikambo ([email protected]):

> On Tue, 5 Aug 2003, Robert L. Harris wrote:
>
> > Unable to handle kernel paging request at virtual address 8011c560
> > printing eip:
> > 8011c560
> > *pde = 00000000
> > Oops: 0000
> > CPU: 1
> > EIP: 0010:[<8011c560>] Not tainted
> > EFLAGS: 00010286
> > eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
> > esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
> > ds: 0018 es: 0018 ss: 0018
> > Process swapper (pid: 0, stackpage=c1e19000)
> > Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
> > c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
> > c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
> > Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
> > [<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
> > [<c0117d8e>]
> >
> > Code: Bad EIP value.
> > <0>Kernel panic: Aiee, killing interrupt handler!
> > In interrupt handler - not syncing
>
> Could have been someone removing a module without unregistering an
> interrupt handler. But that's just guessing.
>
> > Can someone please tell me what this means or how to figure it out? The
> > machine is offline for the next 12 hours unfortunately due to lack of
> > remote help. I do have similar machines with the same kernel/hardware
> > if I need to run any commands against this output. I don't have access
> > to any specific files on the machine though.
>
> You don't specify which kernel this is, it appears to be 2.4 something.
> Please run this through ksymoops (man ksymoops)
>
> --
> function.linuxpower.ca


This is 2.4.18 with the 2.6.5 i2c modules. The only modules enabled are
for the i2c and everything else is compiled in statically. There was no-one
on the box at the times when it went down and no messages about modules
being messed with either. I have about 30 of the servers out in the
wild, all 99.9% identical, and there is only 1 having this issue, which is
why my first thought is hardware.

This week I'm upgrading the puppies to 2.4.21-ac4 with the 2.8 i2c.

Robert

:wq!
---------------------------------------------------------------------------
Robert L. Harris | GPG Key ID: E344DA3B
@ x-hkp://pgp.mit.edu
DISCLAIMER:
These are MY OPINIONS ALONE. I speak for no-one else.

Life is not a destination, it's a journey.
Microsoft produces 15 car pileups on the highway.
Don't stop traffic to stand and gawk at the tragedy.



2003-08-05 13:50:56

by Robert L. Harris

Subject: Re: FINALLY caught a panic

Thus spake Zwane Mwaikambo ([email protected]):

> On Tue, 5 Aug 2003, Robert L. Harris wrote:
>
> > Unable to handle kernel paging request at virtual address 8011c560
> > printing eip:
> > 8011c560
> > *pde = 00000000
> > Oops: 0000
> > CPU: 1
> > EIP: 0010:[<8011c560>] Not tainted
> > EFLAGS: 00010286
> > eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
> > esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
> > ds: 0018 es: 0018 ss: 0018
> > Process swapper (pid: 0, stackpage=c1e19000)
> > Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
> > c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
> > c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
> > Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
> > [<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
> > [<c0117d8e>]
> >
> > Code: Bad EIP value.
> > <0>Kernel panic: Aiee, killing interrupt handler!
> > In interrupt handler - not syncing
>


Running the above through ksymoops on an identical machine with the same
kernel, etc., I get this:

8011c560
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010:[<8011c560>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 8011c560 ebx: c037f754 ecx: 00000040 edx: c0357980
esi: 00000040 edi: c037f740 ebp: c037ef40 esp: c1e19f28
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c1e19000)
Stack: c011c47d 00000001 c0358180 00000001 fffffffe 00000040 c011c1ff c0358180
c037ef40 c0351800 00000000 c1e19f74 00000046 c0108bdb c0105400 c1e18000
c0105400 00000040 c02f5b44 00000000 c010ae78 c0105400 c1e18000 c1e18000
Call Trace: [<c011c47d>] [<c011c1ff>] [<c0108bdb>] [<c0105400>] [<c0105400>]
[<c010ae78>] [<c0105400>] [<c0105400>] [<c010542c>] [<c01054a2>] [<c0117e7f>]
[<c0117d8e>]
Code: Bad EIP value.

>>EIP; 8011c560 Before first symbol <=====
Trace; c011c47d <tasklet_hi_action+5d/90>
Trace; c011c1ff <do_softirq+6f/d0>
Trace; c0108bdb <do_IRQ+db/f0>
Trace; c0105400 <default_idle+0/40>
Trace; c0105400 <default_idle+0/40>
Trace; c010ae78 <call_do_IRQ+5/d>
Trace; c0105400 <default_idle+0/40>
Trace; c0105400 <default_idle+0/40>
Trace; c010542c <default_idle+2c/40>
Trace; c01054a2 <cpu_idle+42/60>
Trace; c0117e7f <release_console_sem+8f/a0>
Trace; c0117d8e <printk+11e/140>

<0>Kernel panic: Aiee, killing interrupt handler!

1 warning issued. Results may not be reliable.
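
The decoded trace above pairs each raw address with a symbol name and an offset. As a rough illustration of that kind of lookup (not ksymoops' actual implementation), the sketch below finds the nearest symbol at or below an address in a sorted table. The start addresses are back-computed from the trace itself; the table and the function name resolve are assumptions made for this example, and a real System.map would be far larger.

```python
import bisect

# Symbol start addresses back-computed from the decoded trace above:
# c011c47d is tasklet_hi_action+5d, so that symbol starts at c011c420, etc.
# A real System.map holds thousands of entries; this table is illustrative.
SYMBOLS = sorted([
    (0xc0105400, "default_idle"),
    (0xc0105460, "cpu_idle"),
    (0xc0108b00, "do_IRQ"),
    (0xc010ae73, "call_do_IRQ"),
    (0xc0117c70, "printk"),
    (0xc0117df0, "release_console_sem"),
    (0xc011c190, "do_softirq"),
    (0xc011c420, "tasklet_hi_action"),
])
STARTS = [start for start, _ in SYMBOLS]

def resolve(addr):
    """Return 'symbol+offset' for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        # Address lies below every known symbol, like the faulting EIP here.
        return "Before first symbol"
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start:x}"

if __name__ == "__main__":
    for addr in (0x8011c560, 0xc011c47d, 0xc011c1ff, 0xc0108bdb, 0xc010542c):
        print(f"{addr:08x} {resolve(addr)}")
```

Because the table is tiny, an address far above the last entry would still be attributed to that last symbol with a large offset. A real decoder also knows each symbol's size and can reject such matches.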




This is a dual 1.5GHz AMD, 1GB of RAM, kernel 2.4.18 with i2c version
2.6.5. Everything but the i2c is compiled in statically. Out of 35ish
identical machines this is the only one with the problem, leading me to
believe it is something hardware related. Any reliable guesses as to what/why?

And thanks for those taking the time to help,
Robert





:wq!
---------------------------------------------------------------------------
Robert L. Harris | GPG Key ID: E344DA3B
@ x-hkp://pgp.mit.edu
DISCLAIMER:
These are MY OPINIONS ALONE. I speak for no-one else.

Life is not a destination, it's a journey.
Microsoft produces 15 car pileups on the highway.
Don't stop traffic to stand and gawk at the tragedy.



2003-08-05 14:20:13

by Zwane Mwaikambo

Subject: Re: FINALLY caught a panic

On Tue, 5 Aug 2003, Robert L. Harris wrote:

> Code: Bad EIP value.
>
> >>EIP; 8011c560 Before first symbol <=====
> Trace; c011c47d <tasklet_hi_action+5d/90>
> Trace; c011c1ff <do_softirq+6f/d0>
> Trace; c0108bdb <do_IRQ+db/f0>
> Trace; c0105400 <default_idle+0/40>
> Trace; c0105400 <default_idle+0/40>
> Trace; c010ae78 <call_do_IRQ+5/d>
> Trace; c0105400 <default_idle+0/40>
> Trace; c0105400 <default_idle+0/40>
> Trace; c010542c <default_idle+2c/40>
> Trace; c01054a2 <cpu_idle+42/60>
> Trace; c0117e7f <release_console_sem+8f/a0>
> Trace; c0117d8e <printk+11e/140>
>
> <0>Kernel panic: Aiee, killing interrupt handler!

It looks like someone freed a tasklet without removing it. But your
kernel cocktail (imported i2c code) makes it harder for us to debug.
Perhaps you could try a newer kernel (I know it'd be hard to
do if it's production).

--
function.linuxpower.ca

2003-08-05 14:23:11

by Robert L. Harris

Subject: Re: FINALLY caught a panic



It is production. This week these systems go to 2.4.21-ac4 with the
2.8.0 i2c. I only think it's not a straight-up kernel bug because there
are 35ish identical systems running all the same software with 0
problems. In addition, for some odd reason this one machine was giving
some very odd temp readings.


Thus spake Zwane Mwaikambo ([email protected]):

> On Tue, 5 Aug 2003, Robert L. Harris wrote:
>
> > Code: Bad EIP value.
> >
> > >>EIP; 8011c560 Before first symbol <=====
> > Trace; c011c47d <tasklet_hi_action+5d/90>
> > Trace; c011c1ff <do_softirq+6f/d0>
> > Trace; c0108bdb <do_IRQ+db/f0>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c010ae78 <call_do_IRQ+5/d>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c010542c <default_idle+2c/40>
> > Trace; c01054a2 <cpu_idle+42/60>
> > Trace; c0117e7f <release_console_sem+8f/a0>
> > Trace; c0117d8e <printk+11e/140>
> >
> > <0>Kernel panic: Aiee, killing interrupt handler!
>
> It looks like someone freed a tasklet without removing it. But considering
> your kernel cocktail (imported i2c code) it makes it harder for us to
> debug, perhaps if you could try on a newer kernel (i know it'd be hard to
> do if it's production)
>
> --
> function.linuxpower.ca

:wq!
---------------------------------------------------------------------------
Robert L. Harris | GPG Key ID: E344DA3B
@ x-hkp://pgp.mit.edu
DISCLAIMER:
These are MY OPINIONS ALONE. I speak for no-one else.

Life is not a destination, it's a journey.
Microsoft produces 15 car pileups on the highway.
Don't stop traffic to stand and gawk at the tragedy.



2003-08-05 19:01:47

by Krzysztof Halasa

Subject: Re: FINALLY caught a panic

Zwane Mwaikambo <[email protected]> writes:

> > Code: Bad EIP value.
> >
> > >>EIP; 8011c560 Before first symbol <=====
> > Trace; c011c47d <tasklet_hi_action+5d/90>
> > Trace; c011c1ff <do_softirq+6f/d0>
> > Trace; c0108bdb <do_IRQ+db/f0>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c010ae78 <call_do_IRQ+5/d>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c0105400 <default_idle+0/40>
> > Trace; c010542c <default_idle+2c/40>
> > Trace; c01054a2 <cpu_idle+42/60>
> > Trace; c0117e7f <release_console_sem+8f/a0>
> > Trace; c0117d8e <printk+11e/140>
> >
> > <0>Kernel panic: Aiee, killing interrupt handler!
>
> It looks like someone freed a tasklet without removing it.

Not sure, note that EIP=0x80xxx and not 0xC0xxx.
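
To make the 0x80 versus 0xC0 observation concrete, here is a small sketch that compares the faulting EIP with the same address carrying a c0 top byte, as every address in the Call Trace does. The comparison address c011c560 is a hypothetical chosen for illustration. Nothing in the thread shows that a valid function lives there, and the function name differing_bits is likewise an assumption.

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

faulting_eip = 0x8011c560   # EIP reported in the oops
kernel_text = 0xc011c560    # hypothetical: same address with top byte c0

# The two values differ only in bit 30 (0x40000000).
print(differing_bits(faulting_eip, kernel_text))
```

A one-bit difference is compatible with a corrupted function pointer, but it does not prove one. The thread does not establish the cause.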
--
Krzysztof Halasa
Network Administrator