Date: Wed, 29 Oct 2008 13:58:40 -0700
From: Andrew Morton
To: Mariusz Kozlowski
Cc: linux-kernel@vger.kernel.org, kernel-testers@vger.kernel.org,
	Christoph Lameter, KOSAKI Motohiro, Heiko Carstens,
	Nick Piggin, Hugh Dickins, linux-mm@kvack.org
Subject: Re: 2.6.28-rc2-mm1: possible circular locking
Message-Id: <20081029135840.0a50e19c.akpm@linux-foundation.org>
In-Reply-To: <200810292146.03967.m.kozlowski@tuxland.pl>
References: <20081028233836.8b1ff9ae.akpm@linux-foundation.org>
	<200810292146.03967.m.kozlowski@tuxland.pl>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 29 Oct 2008 21:46:03 +0100 Mariusz Kozlowski wrote:

> Hello,
>
> Happens on every startup when psi starts as well.

Thanks.

> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.28-rc2-mm1 #1
> -------------------------------------------------------
> psi/4733 is trying to acquire lock:
>  (events){--..}, at: [] flush_work+0x2d/0xcb
>
> but task is already holding lock:
>  (&mm->mmap_sem){----}, at: [] sys_mlock+0x2c/0xb6
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #4 (&mm->mmap_sem){----}:
>        [] validate_chain+0xacb/0xfe0
>        [] __lock_acquire+0x26e/0x98d
>        [] lock_acquire+0x5c/0x74
>        [] might_fault+0x74/0x90
>        [] copy_to_user+0x28/0x40
>        [] filldir64+0xaa/0xcd
>        [] sysfs_readdir+0x118/0x1dd
>        [] vfs_readdir+0x70/0x85
>        [] sys_getdents64+0x60/0xa8
>        [] sysenter_do_call+0x12/0x35
>        [] 0xffffffff
>
> -> #3 (sysfs_mutex){--..}:
>        [] validate_chain+0xacb/0xfe0
>        [] __lock_acquire+0x26e/0x98d
>        [] lock_acquire+0x5c/0x74
>        [] mutex_lock_nested+0x8d/0x297
>        [] sysfs_addrm_start+0x26/0x9e
>        [] create_dir+0x3a/0x83
>        [] sysfs_create_dir+0x2b/0x43
>        [] kobject_add_internal+0xb4/0x186
>        [] kobject_add_varg+0x41/0x4d
>        [] kobject_add+0x2f/0x57
>        [] device_add+0xa4/0x58f
>        [] netdev_register_kobject+0x65/0x6a
>        [] register_netdevice+0x209/0x2fb
>        [] register_netdev+0x32/0x3f
>        [] loopback_net_init+0x33/0x6f
>        [] register_pernet_operations+0x13/0x15
>        [] register_pernet_device+0x1f/0x4c
>        [] loopback_init+0xd/0xf
>        [] _stext+0x27/0x147
>        [] kernel_init+0x7d/0xd6
>        [] kernel_thread_helper+0x7/0x14
>        [] 0xffffffff
>
> -> #2 (rtnl_mutex){--..}:
>        [] validate_chain+0xacb/0xfe0
>        [] __lock_acquire+0x26e/0x98d
>        [] lock_acquire+0x5c/0x74
>        [] mutex_lock_nested+0x8d/0x297
>        [] rtnl_lock+0xf/0x11
>        [] linkwatch_event+0x8/0x27
>        [] run_workqueue+0x15c/0x1e3
>        [] worker_thread+0x71/0xa4
>        [] kthread+0x37/0x59
>        [] kernel_thread_helper+0x7/0x14
>        [] 0xffffffff
>
> -> #1 ((linkwatch_work).work){--..}:
>        [] validate_chain+0xacb/0xfe0
>        [] __lock_acquire+0x26e/0x98d
>        [] lock_acquire+0x5c/0x74
>        [] run_workqueue+0x157/0x1e3
>        [] worker_thread+0x71/0xa4
>        [] kthread+0x37/0x59
>        [] kernel_thread_helper+0x7/0x14
>        [] 0xffffffff
>
> -> #0 (events){--..}:
>        [] validate_chain+0x5aa/0xfe0
>        [] __lock_acquire+0x26e/0x98d
>        [] lock_acquire+0x5c/0x74
>        [] flush_work+0x59/0xcb
>        [] schedule_on_each_cpu+0x65/0x7f
>        [] lru_add_drain_all+0xd/0xf
>        [] __mlock_vma_pages_range+0x44/0x206
>        [] mlock_fixup+0x15d/0x1c9
>        [] do_mlock+0x96/0xc8
>        [] sys_mlock+0xb2/0xb6
>        [] sysenter_do_call+0x12/0x35
>        [] 0xffffffff
>
> other info that might help us debug this:
>
> 1 lock held by psi/4733:
>  #0:  (&mm->mmap_sem){----}, at: [] sys_mlock+0x2c/0xb6
>
> stack backtrace:
> Pid: 4733, comm: psi Not tainted 2.6.28-rc2-mm1 #1
> Call Trace:
>  [] print_circular_bug_tail+0x78/0xb5
>  [] ? print_circular_bug_entry+0x43/0x4b
>  [] validate_chain+0x5aa/0xfe0
>  [] ? hrtick_update+0x23/0x25
>  [] __lock_acquire+0x26e/0x98d
>  [] ? default_wake_function+0xb/0xd
>  [] lock_acquire+0x5c/0x74
>  [] ? flush_work+0x2d/0xcb
>  [] flush_work+0x59/0xcb
>  [] ? flush_work+0x2d/0xcb
>  [] ? trace_hardirqs_on+0xb/0xd
>  [] ? __queue_work+0x26/0x2b
>  [] ? queue_work_on+0x37/0x47
>  [] ? lru_add_drain_per_cpu+0x0/0xa
>  [] ? lru_add_drain_per_cpu+0x0/0xa
>  [] schedule_on_each_cpu+0x65/0x7f
>  [] lru_add_drain_all+0xd/0xf
>  [] __mlock_vma_pages_range+0x44/0x206
>  [] ? vma_adjust+0x17e/0x384
>  [] ? split_vma+0xe1/0xf7
>  [] mlock_fixup+0x15d/0x1c9
>  [] do_mlock+0x96/0xc8
>  [] ? down_write+0x42/0x68
>  [] sys_mlock+0xb2/0xb6
>  [] sysenter_do_call+0x12/0x35

This is similar to the problem which
mm-move-migrate_prep-out-from-under-mmap_sem.patch was supposed to fix.

We've been calling schedule_on_each_cpu() from within lru_add_drain_all()
for ages.  What changed to cause all this to start happening?