Message-ID: <19f34abd0807181328m7d484b00had3250f42728dba0@mail.gmail.com>
In-Reply-To: <19f34abd0807180458i2749f5d0jc5c07f691931954e@mail.gmail.com>
Date: Fri, 18 Jul 2008 22:28:08 +0200
From: "Vegard Nossum"
To: "Josef Bacik"
Cc: "Andreas Dilger", "Josef Bacik", linux-ext4@vger.kernel.org,
    sct@redhat.com, akpm@linux-foundation.org, "Johannes Weiner",
    "Duane Griffin", "Linux Kernel Mailing List"
Subject: Re: ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference at 0000000c

On Fri, Jul 18, 2008 at 1:58 PM, Vegard Nossum wrote:
> On Fri, Jul 18, 2008 at 1:20 PM, Josef Bacik wrote:
>>> You can see the full log at
>>> http://folk.uio.no/vegardno/linux/log-1216380709.txt which shows that
>>> it already survived a lot of failures, so I'm guessing your patch was
>>> correct and we just hit a different case. What do you think?
>>>
>>
>> Yeah you are right, it's like a shitty game of whack-a-mole. Here's another patch,
>> same thing as last time, pull the other one out put this one on. Thanks,
>
> It seems to hold up -- no stacktraces, but lots of IO failures.
>
> I would leave it in testing for a bit more, but I've got to run; I'll
> give it another go when I get home.

Ok, we still got this:

BUG: unable to handle kernel NULL pointer dereference at 0000000c
IP: [] journal_dirty_metadata+0xb8/0x1b0
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC

Pid: 4770, comm: rm Not tainted (2.6.26-03421-g253a722 #49)
EIP: 0060:[] EFLAGS: 00210246 CPU: 1
EIP is at journal_dirty_metadata+0xb8/0x1b0
EAX: 00000000 EBX: f3d70c90 ECX: 00000001 EDX: f3e12000
ESI: 00000000 EDI: f21118f0 EBP: f3e13d94 ESP: f3e13d6c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rm (pid: 4770, ti=f3e12000 task=f62cdfa0 task.ti=f3e12000)
Stack: f3d70430 f578047c f578047c f3e13d94 c0222cdb f779c000 f6ff2e70 f21118f0
       f779c000 f21118f0 f3e13db4 c02345ef 0000001c 00001499 c0760bc4 f21118f0
       00000000 ef36d004 f3e13de4 c0228e6f 0000147e 0000001c ef36d004 ef36d400
Call Trace:
 [] ? ext3_free_blocks+0x6b/0xa0
 [] ? __ext3_journal_dirty_metadata+0x1f/0x50
 [] ? ext3_free_data+0x9f/0x100
 [] ? ext3_free_branches+0x213/0x220
 [] ? ext3_free_blocks+0x6b/0xa0
 [] ? ext3_free_branches+0xae/0x220
 [] ? ext3_truncate+0x58c/0x940
 [] ? trace_hardirqs_on_caller+0x116/0x170
 [] ? journal_start+0xd3/0x110
 [] ? journal_start+0xb0/0x110
 [] ? ext3_delete_inode+0xd7/0xe0
 [] ? ext3_delete_inode+0x0/0xe0
 [] ? generic_delete_inode+0x81/0x120
 [] ? generic_drop_inode+0x127/0x180
 [] ? iput+0x47/0x50
 [] ? do_unlinkat+0xec/0x170
 [] ? vfs_readdir+0x6b/0xa0
 [] ? filldir64+0x0/0xf0
 [] ? trace_hardirqs_on_thunk+0xc/0x10
 [] ? trace_hardirqs_on_caller+0x116/0x170
 [] ? sys_unlinkat+0x23/0x50
 [] ? sysenter_past_esp+0x78/0xc5
 =======================
Code: b8 01 00 00 00 e8 c9 3f ed ff 89 e0 25 00 e0 ff ff f6 40 08 08 74 05 e8 47 98 4e 00 83 c4 1c 31 c0 5b 5e 5f 5d c3 90 8d 74 26 00 <8b> 46 0c 85 c0 0f 84 8d 00 00 00 8b 45 f0 39 46 18 74 66 8d 47
EIP: [] journal_dirty_metadata+0xb8/0x1b0 SS:ESP 0068:f3e13d6c
Kernel panic - not syncing: Fatal exception

It looks similar to one of the others we saw. Are you sure I should
back out all your previous patches? My stack looks like this:

Duane Griffin (1):
      ext3: validate directory entry

Josef Bacik (1):
      ext3 on latest -git: BUG: unable to handle kernel NULL pointer dereference

And I am using errors=continue.

Now I've modified my scripts to also save the bad image, so I (or
whoever) can re-test a specific crash easily. For instance, this one
can be downloaded from
http://folk.uio.no/vegardno/linux/ext3-crash-fs.bin.bz2 and mounted.
Then you run rm -rf mnt/* and it should crash.

Log is also available at http://folk.uio.no/vegardno/linux/log-1216412153.txt


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
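
Editorial note on reading the oops: the byte in angle brackets on the Code: line marks the start of the faulting instruction, so the fault address can be cross-checked against the register dump. Below is a rough Python sketch of that check. The helper names (faulting_bytes, decode_load) are made up for illustration, and the decoder deliberately handles only the single encoding that appears here (opcode 8b with an 8-bit displacement and no SIB byte); it is not a general x86 disassembler.

```python
# Hex dump copied from the "Code:" line of the oops.
CODE = ("b8 01 00 00 00 e8 c9 3f ed ff 89 e0 25 00 e0 ff ff f6 40 08 08 74 05 "
        "e8 47 98 4e 00 83 c4 1c 31 c0 5b 5e 5f 5d c3 90 8d 74 26 00 <8b> 46 "
        "0c 85 c0 0f 84 8d 00 00 00 8b 45 f0 39 46 18 74 66 8d 47")

# Register values copied from the oops register dump.
REGS = {"eax": 0x00000000, "ebx": 0xf3d70c90, "ecx": 0x00000001,
        "edx": 0xf3e12000, "esi": 0x00000000, "edi": 0xf21118f0,
        "ebp": 0xf3e13d94, "esp": 0xf3e13d6c}

# 32-bit register names in x86 encoding order (0..7).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onward as a list of ints."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            tokens[i] = tok[1:-1]
            return [int(t, 16) for t in tokens[i:]]
    raise ValueError("no <xx> marker found on the Code: line")


def decode_load(insn):
    """Decode `8b /r` (mov r32, r/m32) with mod=01; reject everything else."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:  # rm == 4 would mean a SIB byte
        raise ValueError("encoding not handled by this sketch")
    if disp >= 0x80:                           # disp8 is sign-extended
        disp -= 0x100
    return REG32[reg], REG32[rm], disp


dest, base, disp = decode_load(faulting_bytes(CODE))
fault_addr = (REGS[base] + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dest} -> faults at {fault_addr:08x}")
# prints: mov 0xc(%esi),%eax -> faults at 0000000c
```

The computed address agrees with the "NULL pointer dereference at 0000000c" line: ESI is zero and the instruction reads a 32-bit field at offset 0xc from it. Which structure that pointer belongs to cannot be determined from the bytes alone; that needs the source or a disassembly of journal_dirty_metadata around offset 0xb8.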