There is a lot of information on the web about Linux and oops/crash
dumps. Some positive, some quite negative. There are lots of pages
about tools, patches, best practices, etc. The problem is that they
all seem to be out of date. I think the latest patch I have seen
available was for 2.6.10. All of this raises the question:

What is the current state and roadmap for Linux kernel oops/crash dump
capabilities?

I ask because the topic has always interested me, because it is a
favorite "barb" from the "true unix" zealots, and, more importantly,
because I have a machine or two that keep panicking. While I know I
can set up a serial console to capture said oops, the hardware I am
using is not very conducive to such a setup in our environment.
(Blades requiring front side dongles in a very tight cage provided by
our colo.)

The FAQ for this mailing list even states that x86 hardware is not
conducive to collecting crash dumps. The FreeBSD camp seems to think
otherwise. What are they doing that is so unique? I burned some
midnight oil last night reading about their boot process. While I am
certainly a little out of my depth, I didn't see anything that seems
groundbreaking in their boot/hardware management process that should
allow them to collect/dump data on an x86 platform during an oops. Am
I missing something?

If this is covered somewhere please do whack me with a link or at
least a search phrase. I have spent many hours, both recently and over
the last couple of years, crawling the search engines with no real
success to speak of.
On Tue, 4 Mar 2008, R H wrote:
> What is the current state and roadmap for Linux kernel oops/crash dump
> capabilities?
Good starting point for you might be Documentation/kdump/*
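As a rough sketch of what that documentation leads to: kdump needs
memory reserved at boot with a crashkernel= parameter and a capture
kernel loaded via kexec. The Python below only checks for those two
things on a running machine. The function names are made up for this
example, and it assumes /proc/cmdline and
/sys/kernel/kexec_crash_loaded are present, which may not hold on
older kernels.

```python
from pathlib import Path


def crashkernel_param(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    prefix = "crashkernel="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


def kdump_status(cmdline_path="/proc/cmdline",
                 loaded_path="/sys/kernel/kexec_crash_loaded"):
    """Report whether memory is reserved and a capture kernel is loaded.

    Both paths are assumptions about the running system; a missing
    file is treated as "not configured" rather than as an error.
    """
    cmdline = Path(cmdline_path)
    loaded = Path(loaded_path)
    reserved = None
    if cmdline.exists():
        reserved = crashkernel_param(cmdline.read_text())
    is_loaded = loaded.exists() and loaded.read_text().strip() == "1"
    return {"crashkernel": reserved, "capture_kernel_loaded": is_loaded}
```

Calling kdump_status() with no arguments inspects the live system; the
paths can be overridden to test against saved copies of those files.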
--
Jiri Kosina
SUSE Labs
R H wrote:
> There is a lot of information on the web about Linux and oops/crash
> dumps. Some positive, some quite negative. There are lots of pages
> about tools, patches, best practices, etc. The problem is that they
> all seem to be out of date. I think the latest patch I have seen
> available was for 2.6.10. All this data begs the question:
>
> What is the current state and roadmap for Linux kernel oops/crash dump
> capabilities?
>
I'm sure that you've been looking at the Documentation/ tree
in the kernel sources? You should've come across netconsole
then. It will do just about the same as a serial console except
that it uses the network.
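As a rough sketch of the receiving end: netconsole sends kernel
messages as plain UDP datagrams, so any UDP listener on the logging
host can collect them. The Python below is illustrative only and not
part of the kernel tree. The function names are made up for this
example, and port 6666 is assumed because it is netconsole's
documented default target port.

```python
import socket


def make_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for netconsole traffic (6666 is an assumed default)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, count):
    """Receive `count` datagrams and return (sender_ip, text) pairs."""
    lines = []
    for _ in range(count):
        data, (addr, _port) = sock.recvfrom(4096)
        # Kernel messages may contain stray bytes; never fail on decoding.
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        lines.append((addr, text))
    return lines


def serve_forever(sock):
    """Print every received message until interrupted."""
    while True:
        for addr, text in receive_lines(sock, 1):
            print(f"[{addr}] {text}", flush=True)
```

Running serve_forever(make_listener()) on the logging host would then
print whatever the panicking machine manages to send before it dies.
UDP is lossy, so the tail of an oops can still go missing.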
With newer kernels I don't think there's much need for
patches or tools, although there are some things that you can
do to improve the stack traces (enabling frame pointers with
CONFIG_FRAME_POINTER, for example).
-- Jan Evert
Have you ever looked at a problem so long that you missed the obvious
answer right under your nose?
Thank you for your responses. I have some (more) reading to do.