Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755282AbYHSNFp (ORCPT ); Tue, 19 Aug 2008 09:05:45 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753379AbYHSNFh (ORCPT ); Tue, 19 Aug 2008 09:05:37 -0400 Received: from tomts43.bellnexxia.net ([209.226.175.110]:38394 "EHLO tomts43-srv.bellnexxia.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753320AbYHSNFg (ORCPT ); Tue, 19 Aug 2008 09:05:36 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtIFAHdcqkhMRKxB/2dsb2JhbACBYrMlgVg Date: Tue, 19 Aug 2008 09:05:33 -0400 From: Mathieu Desnoyers To: Eran Liberty Cc: Steven Rostedt , Steven Rostedt , linux-kernel@vger.kernel.org, "Paul E. McKenney" , linuxppc-dev@ozlabs.org Subject: Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Message-ID: <20080819130533.GA18611@Krystal> References: <48591941.4070408@extricom.com> <48A92E15.2080709@extricom.com> <48A9901B.1080900@redhat.com> <48A9BEA3.10906@extricom.com> <48AAB7FF.4070502@extricom.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline In-Reply-To: <48AAB7FF.4070502@extricom.com> X-Editor: vi X-Info: http://krystal.dyndns.org:8080 X-Operating-System: Linux/2.6.21.3-grsec (i686) X-Uptime: 09:04:03 up 75 days, 17:44, 8 users, load average: 0.90, 0.99, 0.78 User-Agent: Mutt/1.5.16 (2007-06-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8987 Lines: 251 * Eran Liberty (liberty@extricom.com) wrote: > Steven Rostedt wrote: >> On Mon, 18 Aug 2008, Eran Liberty wrote: >> >> >>> Steven Rostedt wrote: >>> >>>> Eran Liberty wrote: >>>> >>>>> After compiling a kernel with ftrace I started to experience all sorts >>>>> of >>>>> crashes. >>>>> >>>> Just to make sure... >>>> >>>> ftrace enables markers too, and RCU has tracing with the markers. This >>>> may >>>> not be the problem, but I just want to eliminate as many variables as >>>> possible. >>>> Could you disable ftrace, but keep the markers on too. Also, could you >>>> enable ftrace again and turn on the FTRACE_STARTUP_TEST. >>>> >>> for the fun of it I took out all my propriety modules; so now its a non >>> tainted kernel. >>> >>> Here is the matrix: >>> >>> !FTRACE x !MARKERS => stable >>> !FTRACE x MARKERS => stable >>> FTRACE x !MARKERS => n/a (FTRACE forces MARKERS) >>> FTRACE x MARKERS => unstable >>> FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed >>> >> >> Thanks >> >> >>> Testing tracer sched_switch: PASSED >>> Testing tracer ftrace: PASSED >>> Testing dynamic ftrace: PASSED >>> >>> Oops: Exception in kernel mode, sig: 11 [#1] >>> Exsw1600 >>> Modules linked in: >>> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000 >>> >> >> Could you load this into gdb for me and show me the output of: >> >> gdb> li *0xc00bbb20 >> >> (Assuming you compiled with debuginfo on) >> >> and... >> >> gdb> disass 0xc00bbb20 >> >> >>> REGS: dd5b1c50 TRAP: 0700 Not tainted (2.6.27-rc2) >>> MSR: 00029000 CR: 24082282 XER: 20000000 >>> TASK = ddcce060[1707] 'find' THREAD: dd5b0000 >>> GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b >>> 100234ec >>> GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 >>> 1008a1d0 >>> GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 >>> dd802084 >>> GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 >>> dd5b1d00 >>> NIP [c00bbb20] d_lookup+0x40/0x90 >>> LR [c00bbb20] d_lookup+0x40/0x90 >>> Call Trace: >>> [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable) >>> [dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220 >>> [dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54 >>> [dd5b1dc0] [c00b1288] path_walk+0x58/0xe0 >>> [dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c >>> [dd5b1e20] [c00b20f4] user_path_at+0x64/0xac >>> [dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74 >>> [dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48 >>> [dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c >>> [dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c >>> Instruction dump: >>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0 >>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0 >>> >> >> Ouch! we have a 00000000 instruction. I'm almost done with the new mcount >> record for PPC (I have it done for ppc64, I'm just porting it to 32). I'm >> thinking this new code may solve your issues too. I hate the daemon. >> >> -- Steve >> >> >> > > I have attached ddd. Up on crashing it points on this line > > struct dentry * d_lookup(struct dentry * parent, struct qstr * name) > { > struct dentry * dentry = NULL; > unsigned long seq; > do { > seq = read_seqbegin(&rename_lock); > dentry = __d_lookup(parent, name); <==== ddd marker > if (dentry) > break; > } while (read_seqretry(&rename_lock, seq)); > return dentry; > } > > (gdb) bt > #0 d_lookup (parent=0xdd801180, name=0xd99b3d68) at fs/dcache.c:1361 > #1 0xc00ae7c8 in real_lookup (nd=, name= optimized out>, parent=0xdd801180) at fs/namei.c:507 > #2 do_lookup (nd=0xd99b3e28, name=0xd99b3d68, path=0xd99b3d58) at > fs/namei.c:825 > #3 0xc00b0684 in __link_path_walk (name=0xddc23009 "", nd=0xd99b3e28) at > fs/namei.c:982 > #4 0xc00b0e8c in link_path_walk (nd=, name= optimized out>) at fs/namei.c:570 > #5 path_walk (name=0xddc23000 "/proc/mtd", nd=0xd99b3e28) at > fs/namei.c:1041 > #6 0xc00b0ffc in do_path_lookup (dfd=, > name=0xddc23000 "/proc/mtd", flags=, nd=0xd99b3e28) at > fs/namei.c:1091 > #7 0xc00b1cf8 in user_path_at (dfd=-100, name=, > flags=0, path=0xd99b3e98) at fs/namei.c:1334 > #8 0xc00a8c98 in vfs_lstat_fd (dfd=-578809472, name=0xd99b3d68 "", > stat=0xd99b3ed8) at fs/stat.c:83 > #9 0xc00a8d6c in vfs_lstat (name=0xd99b3d68 "", stat=0xd99b3d58) at > fs/stat.c:93 > #10 0xc00a8db4 in sys_lstat64 (filename=0xdd801180 "", statbuf=0xbfe285d8) > at fs/stat.c:381 > #11 0xc0010554 in syscall_dotrace_cont () > > both cp & lr points to 0xC00BB724 > (gdb) info registers > r0 0x0 0 > r1 0xd99b3d00 3650829568 > r2 0xddccf2e0 3721196256 > r3 0xdd801180 3716157824 > r4 0xd99b3d68 3650829672 > r5 0xd99b3d58 3650829656 > r6 0xdd822a5b 3716295259 > r7 0x100234ec 268580076 > r8 0xc0800000 3229614080 > r9 0x19330 103216 > r10 0xffff 65535 > r11 0xd99b3d20 3650829600 > r12 0x24000288 603980424 > r13 0x100ad874 269146228 > r14 0x100936f8 269039352 > r15 0x1008a1d0 269001168 > r16 0x10083f80 268976000 > r17 0xd99b3e2c 3650829868 > r18 0xd99b3d68 3650829672 > r19 0xfffffff4 4294967284 > r20 0xc0380000 3224895488 > r21 0xd99b3d60 3650829664 > r22 0xd99b3d58 3650829656 > r23 0xdd802084 3716161668 > r24 0xdc3fb280 3695161984 > r25 0xdd802018 3716161560 > r26 0xd99b3d68 3650829672 > r27 0xc0380000 3224895488 > r28 0xdd801180 3716157824 > r29 0xd99b3d68 3650829672 > r30 0x0 0 > r31 0xd99b3d00 3650829568 > pc 0xc00bb724 3221993252 > cr 0x24082282 604512898 > lr 0xc00bb724 3221993252 > ctr 0x0 0 > xer 0x20000000 536870912 > > (gdb) li *0xC00BB724 > Line 1361 of "fs/dcache.c" starts at address 0xc00bb724 and > ends at 0xc00bb728 . > > (gdb) disassemble 0xC00BB724 > Dump of assembler code for function d_lookup: > 0xc00bb6e4 : mflr r0 > 0xc00bb6e8 : stw r0,4(r1) > 0xc00bb6ec : nop > 0xc00bb6f0 : stwu r1,-32(r1) > 0xc00bb6f4 : mflr r0 > 0xc00bb6f8 : stmw r27,12(r1) > 0xc00bb6fc : lis r27,-16328 > 0xc00bb700 : mr r31,r1 > 0xc00bb704 : stw r0,36(r1) > 0xc00bb708 : mr r28,r3 > 0xc00bb70c : mr r29,r4 > 0xc00bb710 : lwz r30,12960(r27) > 0xc00bb714 : andi. r0,r30,1 > 0xc00bb718 : mr r3,r28 > 0xc00bb71c : mr r4,r29 > 0xc00bb720 : bnela- 0x2c > 0xc00bb724 : .long 0x0 > 0xc00bb728 : cmpwi cr7,r3,0 > 0xc00bb72c : bne- cr7,0xc00bb75c > 0xc00bb730 : lwz r0,12960(r27) > 0xc00bb734 : cmpw cr7,r0,r30 > 0xc00bb738 : beq- cr7,0xc00bb75c > 0xc00bb73c : mr r30,r0 > 0xc00bb740 : andi. r0,r30,1 > 0xc00bb744 : mr r3,r28 > 0xc00bb748 : mr r4,r29 > 0xc00bb74c : beq+ 0xc00bb724 > 0xc00bb750 : lwz r0,12960(r27) > 0xc00bb754 : mr r30,r0 > 0xc00bb758 : b 0xc00bb740 > 0xc00bb75c : lwz r11,0(r1) > 0xc00bb760 : lwz r0,4(r11) > 0xc00bb764 : lmw r27,-20(r11) > 0xc00bb768 : mtlr r0 > 0xc00bb76c : mr r1,r11 > 0xc00bb770 : blr > End of assembler dump. > (gdb) > > Hope this is helpfull > Can you also give us objdump -S --start-address=0xC00BB724 vmlinux | head 20 ? Then we could compare the result with the OOPS instruction dump : 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0 Mathieu > Liberty > > > -- Mathieu Desnoyers OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/