From: tytso@mit.edu Subject: Re: 2.6.33-rc1: kernel BUG at fs/ext4/inode.c:1063 (sparc) Date: Sun, 27 Dec 2009 22:51:59 -0500 Message-ID: <20091228035159.GC4429@thunk.org> References: <87hbrfdylc.fsf@openvz.org> <87eimir4yk.fsf@openvz.org> <20091227225216.GB4429@thunk.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Dmitry Monakhov , Linux Kernel Mailing List , linux-ext4@vger.kernel.org, Jens Axboe , dmitry.torokhov@gmail.com, Jan Kara To: Alexander Beregalov Return-path: Received: from thunk.org ([69.25.196.29]:36308 "EHLO thunker.thunk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750823AbZL1DwG (ORCPT ); Sun, 27 Dec 2009 22:52:06 -0500 Content-Disposition: inline In-Reply-To: Sender: linux-ext4-owner@vger.kernel.org List-ID: kernel BUG at fs/ext4/inode.c:1063! > >> Here is output of 2.6.33-rc2 plus Ted's patch > >> > >> EXT4-fs (sda1): inode #1387643: mdb_free (1) < mdb_claim (2) BUG > > OK, i've been able to reproduce the problem using xfsqa test #74 (fstest) when an ext3 file system is mounted the ext4 file system driver. I was then able to bisect it down to commit d21cd8f6, which was introduced between 2.6.33-rc1 and 2.6.33-rc2, as part of quota/ext4 patch series pushed by Jan. I then tested v2.6.33-rc2 with commit d21cd8 reverted, and I was not able to replicate the BUG. More investigation is needed, but if we compare the potential quota deadlock with the apparently fairly-easy-to-replicate BUG, if we can't find a better fix fairly quickly we should probably just revert this commit for now. - Ted commit d21cd8f163ac44b15c465aab7306db931c606908 Author: Dmitry Monakhov AuthorDate: Thu Dec 10 03:31:45 2009 +0000 Commit: Jan Kara CommitDate: Wed Dec 23 13:44:12 2009 +0100 ext4: Fix potential quota deadlock We have to delay vfs_dq_claim_space() until allocation context destruction. Currently we have following call-trace: ext4_mb_new_blocks() /* task is already holding ac->alloc_semp */ ->ext4_mb_mark_diskspace_used ->vfs_dq_claim_space() /* acquire dqptr_sem here. Possible deadlock */ ->ext4_mb_release_context() /* drop ac->alloc_semp here */ Let's move quota claiming to ext4_da_update_reserve_space() ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-rc7 #18 ------------------------------------------------------- write-truncate-/3465 is trying to acquire lock: (&s->s_dquot.dqptr_sem){++++..}, at: [] dquot_claim_space+0x3b/0x1b0 but task is already holding lock: (&meta_group_info[i]->alloc_sem){++++..}, at: [] ext4_mb_load_buddy+0xb2/0x370 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&meta_group_info[i]->alloc_sem){++++..}: [] __lock_acquire+0xd7b/0x1260 [] lock_acquire+0xba/0xd0 [] down_read+0x51/0x90 [] ext4_mb_load_buddy+0xb2/0x370 [] ext4_mb_free_blocks+0x46c/0x870 [] ext4_free_blocks+0x73/0x130 [] ext4_ext_truncate+0x76c/0x8d0 [] ext4_truncate+0x187/0x5e0 [] vmtruncate+0x6b/0x70 [] inode_setattr+0x62/0x190 [] ext4_setattr+0x25a/0x370 [] notify_change+0x151/0x340 [] do_truncate+0x6d/0xa0 [] may_open+0x1d4/0x200 [] do_filp_open+0x1eb/0x910 [] do_sys_open+0x6d/0x140 [] sys_open+0x2e/0x40 [] sysenter_do_call+0x12/0x32 -> #2 (&ei->i_data_sem){++++..}: [] __lock_acquire+0xd7b/0x1260 [] lock_acquire+0xba/0xd0 [] down_read+0x51/0x90 [] ext4_get_blocks+0x47/0x450 [] ext4_getblk+0x61/0x1d0 [] ext4_bread+0x1f/0xa0 [] ext4_quota_write+0x12c/0x310 [] qtree_write_dquot+0x93/0x120 [] v2_write_dquot+0x28/0x30 [] dquot_commit+0xab/0xf0 [] ext4_write_dquot+0x77/0x90 [] ext4_mark_dquot_dirty+0x2f/0x50 [] dquot_alloc_inode+0x101/0x180 [] ext4_new_inode+0x602/0xf00 [] ext4_create+0x89/0x150 [] vfs_create+0xa2/0xc0 [] do_filp_open+0x7a7/0x910 [] do_sys_open+0x6d/0x140 [] sys_open+0x2e/0x40 [] sysenter_do_call+0x12/0x32 -> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}: [] __lock_acquire+0xd7b/0x1260 [] lock_acquire+0xba/0xd0 [] mutex_lock_nested+0x65/0x2d0 [] vfs_load_quota_inode+0x4bd/0x5a0 [] vfs_quota_on_path+0x5f/0x70 [] ext4_quota_on+0x112/0x190 [] sys_quotactl+0x44a/0x8a0 [] sysenter_do_call+0x12/0x32 -> #0 (&s->s_dquot.dqptr_sem){++++..}: [] __lock_acquire+0x1091/0x1260 [] lock_acquire+0xba/0xd0 [] down_read+0x51/0x90 [] dquot_claim_space+0x3b/0x1b0 [] ext4_mb_mark_diskspace_used+0x36f/0x380 [] ext4_mb_new_blocks+0x34a/0x530 [] ext4_ext_get_blocks+0x122b/0x13c0 [] ext4_get_blocks+0x226/0x450 [] mpage_da_map_blocks+0xc3/0xaa0 [] ext4_da_writepages+0x506/0x790 [] do_writepages+0x22/0x50 [] __filemap_fdatawrite_range+0x6d/0x80 [] filemap_flush+0x2b/0x30 [] ext4_alloc_da_blocks+0x5c/0x60 [] ext4_release_file+0x75/0xb0 [] __fput+0xf9/0x210 [] fput+0x27/0x30 [] filp_close+0x4c/0x80 [] put_files_struct+0x6e/0xd0 [] exit_files+0x47/0x60 [] do_exit+0x144/0x710 [] do_group_exit+0x38/0xa0 [] get_signal_to_deliver+0x2ac/0x410 [] do_notify_resume+0xb9/0x890 [] work_notifysig+0x13/0x21 other info that might help us debug this: 3 locks held by write-truncate-/3465: #0: (jbd2_handle){+.+...}, at: [] start_this_handle+0x38f/0x5c0 #1: (&ei->i_data_sem){++++..}, at: [] ext4_get_blocks+0xb6/0x450 #2: (&meta_group_info[i]->alloc_sem){++++..}, at: [] ext4_mb_load_buddy+0xb2/0x370 stack backtrace: Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18 Call Trace: [] ? printk+0x1d/0x22 [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x1091/0x1260 [] ? sched_clock_local+0xd2/0x170 [] ? trace_hardirqs_off_caller+0x20/0xd0 [] lock_acquire+0xba/0xd0 [] ? dquot_claim_space+0x3b/0x1b0 [] down_read+0x51/0x90 [] ? dquot_claim_space+0x3b/0x1b0 [] dquot_claim_space+0x3b/0x1b0 [] ext4_mb_mark_diskspace_used+0x36f/0x380 [] ext4_mb_new_blocks+0x34a/0x530 [] ? ext4_ext_find_extent+0x25d/0x280 [] ext4_ext_get_blocks+0x122b/0x13c0 [] ? sched_clock_local+0xd2/0x170 [] ? sched_clock_cpu+0x120/0x160 [] ? cpu_clock+0x4f/0x60 [] ? trace_hardirqs_off_caller+0x20/0xd0 [] ? down_write+0x8c/0xa0 [] ext4_get_blocks+0x226/0x450 [] ? sched_clock_cpu+0x120/0x160 [] ? cpu_clock+0x4f/0x60 [] ? trace_hardirqs_off+0xb/0x10 [] mpage_da_map_blocks+0xc3/0xaa0 [] ? find_get_pages_tag+0x16c/0x180 [] ? find_get_pages_tag+0x0/0x180 [] ? __mpage_da_writepage+0x16d/0x1a0 [] ? pagevec_lookup_tag+0x2e/0x40 [] ? write_cache_pages+0xdb/0x3d0 [] ? __mpage_da_writepage+0x0/0x1a0 [] ext4_da_writepages+0x506/0x790 [] ? cpu_clock+0x4f/0x60 [] ? sched_clock_local+0xd2/0x170 [] ? sched_clock_cpu+0x120/0x160 [] ? sched_clock_cpu+0x120/0x160 [] ? ext4_da_writepages+0x0/0x790 [] do_writepages+0x22/0x50 [] __filemap_fdatawrite_range+0x6d/0x80 [] filemap_flush+0x2b/0x30 [] ext4_alloc_da_blocks+0x5c/0x60 [] ext4_release_file+0x75/0xb0 [] __fput+0xf9/0x210 [] fput+0x27/0x30 [] filp_close+0x4c/0x80 [] put_files_struct+0x6e/0xd0 [] exit_files+0x47/0x60 [] do_exit+0x144/0x710 [] ? lock_release_holdtime+0x33/0x210 [] ? _spin_unlock_irq+0x27/0x30 [] do_group_exit+0x38/0xa0 [] ? trace_hardirqs_on+0xb/0x10 [] get_signal_to_deliver+0x2ac/0x410 [] do_notify_resume+0xb9/0x890 [] ? trace_hardirqs_off_caller+0x20/0xd0 [] ? lock_release_holdtime+0x33/0x210 [] ? autoremove_wake_function+0x0/0x50 [] ? trace_hardirqs_on_caller+0x134/0x190 [] ? trace_hardirqs_on+0xb/0x10 [] ? security_file_permission+0x14/0x20 [] ? vfs_write+0x131/0x190 [] ? do_sync_write+0x0/0x120 [] ? sysenter_do_call+0x27/0x32 [] work_notifysig+0x13/0x21 CC: Theodore Ts'o Signed-off-by: Dmitry Monakhov Signed-off-by: Jan Kara