Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1761125AbYAHQPE (ORCPT ); Tue, 8 Jan 2008 11:15:04 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754786AbYAHQOr (ORCPT ); Tue, 8 Jan 2008 11:14:47 -0500
Received: from rgminet01.oracle.com ([148.87.113.118]:19167 "EHLO
	rgminet01.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1760028AbYAHQOq (ORCPT );
	Tue, 8 Jan 2008 11:14:46 -0500
Date: Tue, 8 Jan 2008 08:14:01 -0800
From: Randy Dunlap
To: Linus Torvalds
Cc: Kevin Winchester, "J. Bruce Fields", Al Viro, Arjan van de Ven,
	Linux Kernel Mailing List, Andrew Morton, NetDev
Subject: Re: Top 10 kernel oopses for the week ending January 5th, 2008
Message-Id: <20080108081401.d9576ac5.randy.dunlap@oracle.com>
In-Reply-To:
References: <477FF149.4070609@linux.intel.com>
	<20080105213935.GN27894@ZenIV.linux.org.uk>
	<20080107174431.GC27741@fieldses.org>
	<4782CF9C.6000508@gmail.com>
Organization: Oracle Linux Eng.
X-Mailer: Sylpheed 2.4.7 (GTK+ 2.8.10; x86_64-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Whitelist: TRUE
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3056
Lines: 72

On Mon, 7 Jan 2008 19:26:12 -0800 (PST) Linus Torvalds wrote:

> On Mon, 7 Jan 2008, Kevin Winchester wrote:
>
> > J. Bruce Fields wrote:
> > >
> > > Is there any good basic documentation on this to point people at?
> >
> > I would second this question.  I see people "decode" oops on lkml often
> > enough, but I've never been entirely sure how it's done.  Is it somewhere
> > in Documentation?
>
> It's actually not necessarily at all that trivial, unless you have a deep
> understanding of the code generated for the architecture in question (and
> even then, some oopses take more time to figure out than others, thanks
> to inlining and tailcalls etc).
>
> If the oops happened with a kernel you generated yourself, it's usually
> rather easy. Especially if you said "y" to the "generate debugging info"
> question at configuration time. Because, in that case, you really just do
> a simple
>
> 	gdb vmlinux
>
> and then you can do (for example) something like setting a breakpoint at
> the EIP that was reported for the oops, and it will tell you what line it
> came from.
>
> However, if you don't have the exact binary - which is the common case for
> random oopses reported on lkml - you will generally have to disassemble
> the hex sequence given in the oops (the "Code:" line), and try to match it
> up against the source code to try to figure out what is going on.
>
> Even just the disassembly is not entirely trivial, since the oops will
> give you the EIP that it happened at, but you often want to also
> disassemble *backwards* in order to get more of a context (the "Code:"
> line will mark the particular EIP that starts the oopsing instruction by
> enclosing it in <>), but with non-constant instruction lengths, you need
> to use a bit of trial-and-error to figure it out.
>
> I usually just compile a small program like
>
> 	#include <stdio.h>
>
> 	/* the \xnn values stand for the bytes from the "Code:" line */
> 	const char array[]="\xnn\xnn\xnn...";
>
> 	int main(int argc, char **argv)
> 	{
> 		printf("%p\n", array);
> 		*(int *)0=0;
> 	}
>
> and run it under gdb, and then when it gets the SIGSEGV (due to the
> obvious NULL pointer dereference), I can just ask gdb to disassemble
> around the array that contains the code[] stuff. Try a few offsets, to see
> when the disassembly makes sense (and gives the reported EIP as the
> beginning of one of the disassembled instructions).
>
> (You can do it other and smarter ways too, I'm not claiming that's a
> particularly good way to do it, and the old "ksymoops" program used to do
> a pretty good job of this, but I'm used to that particular idiotic way
> myself, since it's how I've basically always done it)

One other way to do it (at least for x86-32/64) is to use
$kerneltree/scripts/decodecode.  It may work on other $arches also, but I
haven't tested it on others.

---
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
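
A footnote on the manual approach quoted above: the mechanical part is
turning the bytes of the "Code:" line into a \xnn string and noting which
byte is enclosed in <>. The following is only a minimal sketch of that step,
assuming the x86 layout of space-separated two-digit hex bytes with at most
one marked byte. The function name code_line_to_escapes is hypothetical; it
is not part of the kernel tree and is not how scripts/decodecode works.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): convert the hex bytes of an
 * x86 oops "Code:" line into the body of a C string literal, e.g.
 * \x8b\x45..., written into out.
 *
 * Returns the number of bytes parsed, or -1 on malformed input or if
 * out is too small.  *marked receives the index of the byte enclosed
 * in <> (the first byte of the oopsing instruction), or -1 if none.
 */
int code_line_to_escapes(const char *line, char *out, size_t outsz,
			 int *marked)
{
	int count = 0;
	size_t used = 0;

	assert(line && out && marked && outsz > 0);
	*marked = -1;
	out[0] = '\0';

	/* Accept the line with or without its "Code:" prefix. */
	if (strncmp(line, "Code:", 5) == 0)
		line += 5;

	while (*line) {
		int is_marked = 0;

		if (isspace((unsigned char)*line)) {
			line++;
			continue;
		}
		if (*line == '<') {
			is_marked = 1;
			line++;
		}
		/* Each byte must be exactly two hex digits. */
		if (!isxdigit((unsigned char)line[0]) ||
		    !isxdigit((unsigned char)line[1]))
			return -1;
		/* Need room for four characters plus the trailing NUL. */
		if (used + 4 >= outsz)
			return -1;

		out[used++] = '\\';
		out[used++] = 'x';
		out[used++] = (char)tolower((unsigned char)line[0]);
		out[used++] = (char)tolower((unsigned char)line[1]);
		out[used] = '\0';
		line += 2;

		if (is_marked) {
			if (*line != '>')
				return -1;
			line++;
			*marked = count;
		}
		count++;
	}
	return count;
}
```

Given the made-up input "Code: 8b 45 <8b> 40 04", this returns 5, sets
*marked to 2, and leaves \x8b\x45\x8b\x40\x04 in out, ready to paste into the
array[] initializer. The marked index is the offset into the array at which
the disassembly should show an instruction boundary.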