From: Andreas Dilger
Subject: Re: ext4 deadlocks
Date: Tue, 07 Oct 2008 15:46:37 -0600
Message-ID: <20081007214637.GA2009@webber.adilger.int>
References: <48EBB9E9.4030105@goop.org> <48EBC6B9.7000608@sandeen.net>
In-reply-to: <48EBC6B9.7000608@sandeen.net>
To: Eric Sandeen
Cc: Jeremy Fitzhardinge, linux-ext4@vger.kernel.org, Linux Kernel Mailing List, Kalpak Shah

On Oct 07, 2008 15:29 -0500, Eric Sandeen wrote:
> Jeremy Fitzhardinge wrote:
> > I tried giving ext4 a spin on my rawhide system, and it appears to
> > deadlock pretty quickly: lots of processed blocked in either ext4 or jbd2.
> >
> > From the look of the sysrq-t dumps I captured, I think beagled is
> > what's triggering it by doing something with EAs. I haven't had any
> > lockups since I killed it off.
> >
> > beagled D 0000000000000000 0 3477 1
> >  ffff880125809778 0000000000000082 0000000000000000 ffff88013b078150
> >  ffffffff8187a780 ffffffff8187a780 ffff88010f8445c0 ffff88013badc5c0
> >  ffff88010f844970 0000000100000001 0000000000000000 ffff88010f844970
> > Call Trace:
> >  [] ? scsi_sg_alloc+0x48/0x4a
> >  [] __down_write_nested+0xa3/0xbd
> >  [] __down_write+0xb/0xd
> >  [] down_write+0x2f/0x33
> >  [] ext4_expand_extra_isize_ea+0x67/0x6f2 [ext4dev]
>
> At first glance it seems that we're trying a down_write on the xattr_sem
> here...
>
> >  [] ? jbd2_journal_add_journal_head+0x113/0x1b0 [jbd2]
> >  [] ? jbd2_journal_put_journal_head+0x1a/0x56 [jbd2]
> >  [] ? jbd2_journal_get_write_access+0x31/0x38 [jbd2]
> >  [] ? jbd2_journal_extend+0x1af/0x1ca [jbd2]
> >  [] ext4_mark_inode_dirty+0x119/0x18b [ext4dev]
> >  [] ext4_dirty_inode+0xab/0xc3 [ext4dev]
> >  [] __mark_inode_dirty+0x38/0x194
> >  [] ext4_mb_new_blocks+0x700/0x70f [ext4dev]
> >  [] ? mark_page_accessed+0x5f/0x6b
> >  [] ? __find_get_block+0x1af/0x1c1
> >  [] ? __wait_on_bit+0x6f/0x7e
> >  [] ? sync_buffer+0x0/0x44
> >  [] do_blk_alloc+0x9d/0xb3 [ext4dev]
> >  [] ext4_new_meta_blocks+0x34/0x76 [ext4dev]
> >  [] ext4_new_meta_block+0x24/0x26 [ext4dev]
> >  [] ext4_xattr_block_set+0x50e/0x6d6 [ext4dev]
> >  [] ext4_xattr_set_handle+0x281/0x3f0 [ext4dev]
>
> Having already downed it here?
>
> I'll look into it, not 100% sure what path gets us here (between
> in-inode EAs and external block EAs) but I'll see.

This looks suspiciously like a similar bug fixed in the past by Kalpak,
related to trying to grow large-inode space in ext4_expand_extra_isize_ea().

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.