Message-ID: <48AC23F4.80900@extricom.com>
Date: Wed, 20 Aug 2008 17:02:28 +0300
From: Eran Liberty
User-Agent: Thunderbird 2.0.0.14 (X11/20080502)
To: Steven Rostedt
CC: Benjamin Herrenschmidt, "Paul E. McKenney", Mathieu Desnoyers,
    linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org, Steven Rostedt,
    Scott Wood, Alan Modra, Segher Boessenkool
Subject: Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
References: <48591941.4070408@extricom.com> <48A92E15.2080709@extricom.com>
    <48A9901B.1080900@redhat.com> <20080818154746.GA26835@Krystal>
    <48A9AFA7.8080508@freescale.com> <1219110814.8062.2.camel@pasglop>
    <1219113549.8062.13.camel@pasglop> <1219114600.8062.15.camel@pasglop>
    <1219119431.8062.35.camel@pasglop> <1219216705.21386.46.camel@pasglop>
    <48AC1DD8.9080702@extricom.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Steven Rostedt wrote:
> On Wed, 20 Aug 2008, Eran Liberty wrote:
>
>> Steven Rostedt wrote:
>>
>>> On Wed, 20 Aug 2008, Steven Rostedt wrote:
>>>
>>>> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
>>>>
>>>>> Found the problem (or at least -a- problem), it's a gcc bug.
>>>>>
>>>>> Well, first I must say the code generated by -pg is just plain
>>>>> horrible :-)
>>>>>
>>>>> Apart from that, look at the exit of, for example, __d_lookup, as
>>>>> generated by gcc when ftrace is enabled:
>>>>>
>>>>> c00c0498:   38 60 00 00   li      r3,0
>>>>> c00c049c:   81 61 00 00   lwz     r11,0(r1)
>>>>> c00c04a0:   80 0b 00 04   lwz     r0,4(r11)
>>>>> c00c04a4:   7d 61 5b 78   mr      r1,r11
>>>>> c00c04a8:   bb 0b ff e0   lmw     r24,-32(r11)
>>>>> c00c04ac:   7c 08 03 a6   mtlr    r0
>>>>> c00c04b0:   4e 80 00 20   blr
>>>>>
>>>>> As you can see, it restores r1 -before- it pops r24..r31 off
>>>>> the stack ! I let you imagine what happens if an interrupt happens
>>>>> just in between those two instructions (mr and lmw). We don't do
>>>>> redzones on our ABI, so basically, the registers end up corrupted
>>>>> by the interrupt.
>>>>>
>>>> Ouch! You've disassembled this without -pg too, and it does not have this
>>>> bug? What version of gcc do you have?
>>>>
>>> I have:
>>> gcc (Debian 4.3.1-2) 4.3.1
>>>
>>> c00c64c8:   81 61 00 00   lwz     r11,0(r1)
>>> c00c64cc:   7f 83 e3 78   mr      r3,r28
>>> c00c64d0:   80 0b 00 04   lwz     r0,4(r11)
>>> c00c64d4:   ba eb ff dc   lmw     r23,-36(r11)
>>> c00c64d8:   7d 61 5b 78   mr      r1,r11
>>> c00c64dc:   7c 08 03 a6   mtlr    r0
>>> c00c64e0:   4e 80 00 20   blr
>>>
>>> My version looks fine. I'm thinking that this is a separate issue from
>>> what Eran is seeing.
>>>
>>> Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and
>>> print out the end of the function dump.
>>>
>>> Thanks,
>>>
>>> -- Steve
>>>
>> powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head -n 100
>>
>> vmlinux: file format elf32-powerpc
>>
>> Disassembly of section .text:
>>
>> c00bb584 <__d_lookup>:
>>
>
> [...]
>
>> c00bb670:   41 9e 00 50   beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
>> c00bb674:   83 de 00 00   lwz     r30,0(r30)
>> c00bb678:   2f 9e 00 00   cmpwi   cr7,r30,0
>> c00bb67c:   40 9e ff 98   bne+    cr7,c00bb614 <__d_lookup+0x90>
>> c00bb680:   38 60 00 00   li      r3,0
>> c00bb684:   81 61 00 00   lwz     r11,0(r1)
>> c00bb688:   80 0b 00 04   lwz     r0,4(r11)
>> c00bb68c:   7d 61 5b 78   mr      r1,r11
>>
>
> [ BUG HERE IF INTERRUPT HAPPENS ]
>
>> c00bb690:   bb 0b ff e0   lmw     r24,-32(r11)
>> c00bb694:   7c 08 03 a6   mtlr    r0
>> c00bb698:   4e 80 00 20   blr
>>
>
> Yep, you have the same bug in your compiler.
>
> -- Steve
>

Hmm... so what now?
Is there a way to prove this scenario is indeed the one that caused the oops?

-- Liberty
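
One rough way to see how widespread the bad epilogue is would be to scan the
objdump output for functions that copy r11 into r1 and only afterwards load
saved registers from negative offsets off r11, as in the __d_lookup dump
above. The sketch below is a heuristic under stated assumptions, not something
posted in this thread: the function name find_early_sp_restores is made up for
illustration, it assumes the 32-bit PowerPC "objdump -d" line layout (address,
four opcode bytes, mnemonic, operands), and it only recognises the exact
"mr r1,r11" followed by "lmw"/"lwz rN,-off(r11)" shape. A hit shows that the
compiler emitted the pattern; it does not by itself prove that this pattern
caused a particular oops.

```python
import re
import sys

# "c00bb584 <__d_lookup>:" starts a new function in objdump -d output.
FUNC = re.compile(r'^([0-9a-f]+) <([^>]+)>:\s*$')
# "c00bb68c: 7d 61 5b 78  mr  r1,r11": address, 4 opcode bytes, mnemonic, operands.
INSN = re.compile(r'^\s*([0-9a-f]+):\s+(?:[0-9a-f]{2}\s+){4}(\S+)\s*(.*)$')
# A load from a negative offset off r11, e.g. "r24,-32(r11)".
LOAD_BELOW_R11 = re.compile(r'^r\d+,-\d+\(r11\)$')


def find_early_sp_restores(lines):
    """Return (function, mr_address, load_address) for each suspicious epilogue.

    Suspicious means: "mr r1,r11" was seen, and before the next "blr" a
    lmw/lwz still reads from below r11, i.e. from below the new stack pointer.
    """
    hits = []
    func = None
    mr_addr = None  # address of the pending "mr r1,r11", if any
    for line in lines:
        m = FUNC.match(line)
        if m:
            func, mr_addr = m.group(2), None
            continue
        m = INSN.match(line)
        if not m:
            continue  # relocation lines, blank lines, section headers
        addr, mnemonic, operands = m.group(1), m.group(2), m.group(3).strip()
        if mnemonic == 'mr' and operands == 'r1,r11':
            mr_addr = addr
        elif mnemonic in ('lmw', 'lwz') and mr_addr and LOAD_BELOW_R11.match(operands):
            hits.append((func, mr_addr, addr))
        elif mnemonic == 'blr':
            mr_addr = None  # end of this epilogue
    return hits


if __name__ == '__main__' and len(sys.argv) > 1:
    # Hypothetical usage: the path of a saved disassembly is the only argument.
    with open(sys.argv[1]) as dump:
        for func, mr_addr, load_addr in find_early_sp_restores(dump):
            print(f'{func}: r1 restored at {mr_addr}, registers loaded at {load_addr}')
```

To try it, save the disassembly first (for example
"powerpc-linux-gnu-objdump -dr vmlinux > dump.txt", where dump.txt is just a
placeholder name) and pass that file to the script. Comparing the hit list for
a kernel built with and without -pg would at least show whether the pattern
only appears when ftrace is enabled.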