From: Tao Ma
Subject: Re: [URGENT PATCH] ext4: fix potential deadlock in ext4_evict_inode()
Date: Fri, 26 Aug 2011 15:42:11 +0800
Message-ID: <4E574E53.80104@tao.ma>
References: <20110826073507.GZ3162@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Theodore Ts'o, Jiaying Zhang, linux-ext4@vger.kernel.org
To: Dave Chinner
Return-path:
Received: from oproxy3-pub.bluehost.com ([69.89.21.8]:45855 "HELO
	oproxy3-pub.bluehost.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with SMTP id S1750704Ab1HZHmS (ORCPT );
	Fri, 26 Aug 2011 03:42:18 -0400
In-Reply-To: <20110826073507.GZ3162@dastard>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

Hi Dave,
On 08/26/2011 03:35 PM, Dave Chinner wrote:
> On Thu, Aug 25, 2011 at 11:33:44PM -0400, Theodore Ts'o wrote:
>>
>> Note: this will probably need to be sent to Linus as an emergency
>> bugfix ASAP, since it was introduced in 3.1-rc1, so it represents a
>> regression.
>
> It doesn't appear to be a bug. All of the new ext4 lockdep reports
> in 3.1 I've seen (except for the mmap_sem/i_mutex one) are false
> positives....
>
> .....
>> =======================================================
>> [ INFO: possible circular locking dependency detected ]
>> 3.1.0-rc3-00012-g2a22fc1 #1839
>> -------------------------------------------------------
>> dd/7677 is trying to acquire lock:
>>  (&type->s_umount_key#18){++++..}, at: [] writeback_inodes_sb_if_idle+0x26/0x3d
>>
>> but task is already holding lock:
>>  (&sb->s_type->i_mutex_key#3){+.+.+.}, at: [] generic_file_aio_write+0x52/0xba
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&sb->s_type->i_mutex_key#3){+.+.+.}:
>>        [] lock_acquire+0x99/0xbd
>>        [] __mutex_lock_common+0x33/0x2fb
>>        [] mutex_lock_nested+0x26/0x2f
>>        [] ext4_evict_inode+0x3e/0x2bd
>>        [] evict+0x8e/0x131
>>        [] dispose_list+0x36/0x40
>>        [] evict_inodes+0xcd/0xd5
>>        [] generic_shutdown_super+0x3d/0xaa
>>        [] kill_block_super+0x22/0x5e
>>        [] deactivate_locked_super+0x22/0x4e
>>        [] deactivate_super+0x3d/0x43
>>        [] mntput_no_expire+0xda/0xdf
>>        [] sys_umount+0x286/0x2ab
>>        [] sys_oldumount+0x12/0x14
>>        [] syscall_call+0x7/0xb
>>
>> -> #0 (&type->s_umount_key#18){++++..}:
>>        [] __lock_acquire+0x967/0xbd2
>>        [] lock_acquire+0x99/0xbd
>>        [] down_read+0x28/0x65
>>        [] writeback_inodes_sb_if_idle+0x26/0x3d
>>        [] ext4_nonda_switch+0xd0/0xe1
>>        [] ext4_da_write_begin+0x3c/0x1cf
>>        [] generic_file_buffered_write+0xc0/0x1b4
>>        [] __generic_file_aio_write+0x254/0x285
>>        [] generic_file_aio_write+0x6a/0xba
>>        [] ext4_file_write+0x1d6/0x227
>>        [] do_sync_write+0x8f/0xca
>>        [] vfs_write+0x85/0xe3
>>        [] sys_write+0x40/0x65
>>        [] syscall_call+0x7/0xb
>
> That's definitely a false positive - sys_write() will have an active
> reference to the inode, and evict is only called on inodes without
> active references. Hence you can never get a deadlock between an
> inode context with an active reference and the same inode in the
> evict/dispose path because inode cannot be in both places at once...
Yeah, the fact is that lockdep isn't that smart. ;)
>
> This is why XFS changes the lockdep context for the its iolock as
> soon as .evict is called on the inode - to stop these false
> positives from being emitted whenever memory reclaim or unmount
> evicts inodes.
I don't think ext4 can change the lockdep context here, since
ext4_end_io_work can still run and take mutex_lock(i_mutex) on an
inode whose i_count is 0. So it isn't safe for us to abruptly change
the lockdep context here, I guess.
What I am trying to do here is to avoid running ext4_end_io_work when
i_count is 0, so that we can either change the lockdep context or
remove the mutex_lock here completely.

Thanks,
Tao