From: Dmitry Monakhov
To: Tony Luck, eunb.song@samsung.com
Cc: "Theodore Ts'o", "linux-ext4@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
References: <6719519.5821368147110937.JavaMail.weblogic@epml17>
Date: Sat, 11 May 2013 11:52:38 +0400
Message-ID: <871u9e6ji1.fsf@openvz.org>

On Fri, 10 May 2013 10:27:58 -0700, Tony Luck wrote:
> I think I have the same (or highly similar) thing happening on ia64.

What were the page size and the filesystem block size?

>
> Similarities: seeing assertions fail for b_transaction
> Differences: I only have ext3 filesystems mounted, no ext4
>
> See attached trace. I'm pretty certain that the highly unhelpful
>
> bugcheck! 0 [1]
>
> comes from the
>
> J_ASSERT_JH(jh, jh->b_transaction == NULL);
>
> from disassembling __journal_remove_journal_head(). The instruction
> pointer points to the 2nd "break" instruction
> in the function.
>
> The problem shows up after 30 minutes to a couple of hours of stress (kernel
> builds with "make -j32").

I can't reproduce this one yet, but the changes to fs/{ext3,jbd} since v3.9
are minimal:

# git log --oneline v3.9.. fs/{ext3,jbd}
5af43c2 Merge branch 'akpm' (incoming from Andrew)
a27bb33 aio: don't include aio.h in sched.h
4385bab make blkdev_put() return void
14a9e5c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
e760040 fs/buffer.c: remove unnecessary init operation after allocating buffer_head.
713685111 mm: make snapshotting pages for stable writes a per-bio operation
8bb9da9 jbd: use kmem_cache_zalloc for allocating journal head
e162b2f jbd: use kmem_cache_zalloc instead of kmem_cache_alloc/memset
e678a4f jbd: don't wait (forever) for stale tid caused by wraparound
e643692 ext3: fix data=journal fast mount/umount hang

So this looks very strange. I have an ia64 machine and am now working on
reproducing it.

>
> I'm pretty sure this problem didn't occur in plain v3.9 (it can run for
> a full 24 hours).
>
> Trying to bisect - but it takes a while to be convinced that a good kernel
> is actually good (since I don't have a clear picture of how long to run
> before deciding that the bug isn't going to show)
>
> -Tony

Attachment: bug (application/octet-stream)