Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755451AbXJWVxx (ORCPT ); Tue, 23 Oct 2007 17:53:53 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753174AbXJWVxp (ORCPT ); Tue, 23 Oct 2007 17:53:45 -0400
Received: from anchor-post-37.mail.demon.net ([194.217.242.87]:37616 "EHLO
	anchor-post-37.mail.demon.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1753028AbXJWVxo (ORCPT );
	Tue, 23 Oct 2007 17:53:44 -0400
Subject: Re: mm: soft lockup in 2.6.23-6636. caused by drop_caches ?
From: richard kennedy
To: Peter Zijlstra
Cc: lkml, linux-mm, Andrew Morton
In-Reply-To: <1193148210.16944.1.camel@twins>
References: <1193147728.3044.18.camel@castor.rsk.org>
	<1193148210.16944.1.camel@twins>
Content-Type: text/plain
Date: Tue, 23 Oct 2007 22:53:41 +0100
Message-Id: <1193176421.3126.14.camel@castor.rsk.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.10.3 (2.10.3-4.fc7)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4972
Lines: 121

On Tue, 2007-10-23 at 16:03 +0200, Peter Zijlstra wrote:
> On Tue, 2007-10-23 at 14:55 +0100, richard kennedy wrote:
> > on git v2.6.23-6636-g557ebb7 I'm getting a soft lockup when running a
> > simple disk write test case on AMD64X2, sata hd & ext3.
> >
> > the test does this
> > sync
> > echo 3 > /proc/sys/vm/drop_caches
> > for (( i=0; $i < $count; i=$i+1 )) ; do
> > 	dd if=large_file of=copy_file_$i bs=4k &
> > done
>
> have you tried with lockdep enabled?
>
> Also, doesn't really surprise me since its known that drop_caches has a
> deadlock in it.
>
Thanks for the suggestion. Of course, it took a lot longer to fail with all
the debug turned on.
But lockdep reports a possible circular lock dependency between
journal->j_list_lock and inode_lock.

drop_pagecache_sb takes inode_lock and calls down into
journal_try_to_free_buffers, which takes journal->j_list_lock,
while in kjournald, journal_commit_transaction takes j_list_lock and
calls __journal_unfile_buffer, which takes inode_lock (via
__set_page_dirty and __mark_inode_dirty in the trace below).

I'm not sure how to fix this, but I hope this helps someone else ;)

Here's the full info.

Cheers
Richard

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.23 #2
-------------------------------------------------------
bash/3535 is trying to acquire lock:
 (&journal->j_list_lock){--..}, at: [] journal_try_to_free_buffers+0x7e/0x131 [jbd]

but task is already holding lock:
 (inode_lock){--..}, at: [] drop_pagecache+0x4d/0xeb

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (inode_lock){--..}:
       [] __lock_acquire+0xac8/0xcf0
       [] __mark_inode_dirty+0xe2/0x174
       [] lock_acquire+0x84/0xa8
       [] __mark_inode_dirty+0xe2/0x174
       [] _spin_lock+0x1e/0x27
       [] __mark_inode_dirty+0xe2/0x174
       [] __set_page_dirty+0x118/0x124
       [] __journal_unfile_buffer+0x9/0x13 [jbd]
       [] journal_commit_transaction+0xbd1/0xde8 [jbd]
       [] kjournald+0xc6/0x1f1 [jbd]
       [] autoremove_wake_function+0x0/0x2e
       [] trace_hardirqs_on+0x11e/0x149
       [] kjournald+0x0/0x1f1 [jbd]
       [] kthread+0x47/0x73
       [] trace_hardirqs_on_thunk+0x35/0x3a
       [] child_rip+0xa/0x12
       [] restore_args+0x0/0x30
       [] kthread+0x0/0x73
       [] child_rip+0x0/0x12
       [] 0xffffffffffffffff

-> #0 (&journal->j_list_lock){--..}:
       [] print_circular_bug_header+0xcc/0xd3
       [] __lock_acquire+0x9cd/0xcf0
       [] journal_try_to_free_buffers+0x7e/0x131 [jbd]
       [] lock_acquire+0x84/0xa8
       [] journal_try_to_free_buffers+0x7e/0x131 [jbd]
       [] _spin_lock+0x1e/0x27
       [] journal_try_to_free_buffers+0x7e/0x131 [jbd]
       [] __invalidate_mapping_pages+0x81/0x103
       [] drop_pagecache+0x76/0xeb
       [] drop_caches_sysctl_handler+0x1a/0x2e
       [] proc_sys_write+0x7c/0xa4
       [] vfs_write+0xc6/0x16f
       [] sys_write+0x45/0x6e
       [] tracesys+0xdc/0xe1
       [] 0xffffffffffffffff

other info that might help us debug this:

2 locks held by bash/3535:
 #0:  (&type->s_umount_key#17){----}, at: [] drop_pagecache+0x3a/0xeb
 #1:  (inode_lock){--..}, at: [] drop_pagecache+0x4d/0xeb

stack backtrace:

Call Trace:
 [] print_circular_bug_tail+0x69/0x72
 [] print_circular_bug_header+0xcc/0xd3
 [] __lock_acquire+0x9cd/0xcf0
 [] :jbd:journal_try_to_free_buffers+0x7e/0x131
 [] lock_acquire+0x84/0xa8
 [] :jbd:journal_try_to_free_buffers+0x7e/0x131
 [] _spin_lock+0x1e/0x27
 [] :jbd:journal_try_to_free_buffers+0x7e/0x131
 [] __invalidate_mapping_pages+0x81/0x103
 [] drop_pagecache+0x76/0xeb
 [] drop_caches_sysctl_handler+0x1a/0x2e
 [] proc_sys_write+0x7c/0xa4
 [] vfs_write+0xc6/0x16f
 [] sys_write+0x45/0x6e
 [] tracesys+0xdc/0xe1
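
For anyone who wants to see why the two paths in the report conflict, here is a toy model of the ordering check. It is only a sketch in Python and is not how lockdep is implemented: LockOrderTracker, the task names and the lock name strings are made up for illustration, and the kjournald ordering (j_list_lock held while inode_lock is taken) is assumed from the description in this mail. It records "held -> acquired" edges and reports an acquisition that would close a cycle.

```python
from collections import defaultdict


class LockOrderTracker:
    """Toy lock-order validator (hypothetical, for illustration only)."""

    def __init__(self):
        self.edges = defaultdict(set)  # lock -> locks taken while holding it
        self.held = defaultdict(list)  # task -> locks currently held, in order

    def _reachable(self, src, dst):
        # Depth-first search over recorded edges: can dst be reached from src?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, task, lock):
        """Record an acquisition; return (held, new) pairs that close a cycle."""
        conflicts = []
        for h in self.held[task]:
            # The new edge h -> lock closes a cycle if h is already
            # reachable from lock through earlier recorded orderings.
            if self._reachable(lock, h):
                conflicts.append((h, lock))
            self.edges[h].add(lock)
        self.held[task].append(lock)
        return conflicts

    def release(self, task, lock):
        self.held[task].remove(lock)


tracker = LockOrderTracker()

# kjournald path (chain entry #1): j_list_lock held, then inode_lock taken.
tracker.acquire("kjournald", "j_list_lock")
tracker.acquire("kjournald", "inode_lock")
tracker.release("kjournald", "inode_lock")
tracker.release("kjournald", "j_list_lock")

# drop_caches path (chain entry #0): s_umount, inode_lock, then j_list_lock.
tracker.acquire("bash", "s_umount")
tracker.acquire("bash", "inode_lock")
report = tracker.acquire("bash", "j_list_lock")
print(report)
```

The last acquire is the one that gets flagged: inode_lock is already known to be taken after j_list_lock, so taking j_list_lock while holding inode_lock inverts the order. No real deadlock has to happen for the check to fire, which matches the "possible" wording in the lockdep report.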