Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752086Ab3CUTG7 (ORCPT ); Thu, 21 Mar 2013 15:06:59 -0400 Received: from mx1.redhat.com ([209.132.183.28]:20521 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751073Ab3CUTG5 (ORCPT ); Thu, 21 Mar 2013 15:06:57 -0400 Date: Thu, 21 Mar 2013 15:06:53 -0400 From: Dave Jones To: Al Viro Cc: Linux Kernel , Linus Torvalds Subject: VFS deadlock ? Message-ID: <20130321190653.GA15479@redhat.com> Mail-Followup-To: Dave Jones , Al Viro , Linux Kernel , Linus Torvalds MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 12311 Lines: 242 Al, Linus, I've seen this happen a few times recently.. I have a bunch of trinity processes that run for a few hours, and then eventually all those processes end up spinning waiting on each other. The stack traces always seem to look like this.. Does this look like a real kernel deadlock, or more likely that trinity was unlucky enough to do something really dumb that I need to protect itself against ? I've only seen it happening the last week or so (but that may be because of all the crap that I ended up finding last week preventing me running long enough to hit this until now..) Tree is Linus' latest with Peter Hurley's tty patchset on top (which afair doesn't touch anything in these paths). lockdep hasn't picked up anything, which makes me wonder if it's my problem.. Dave trinity-child2 D ffff880110b3a7e0 5448 7669 1 0x00000004 ffff8801272a7d20 0000000000000086 ffff880101874920 ffff8801272a7fd8 ffff8801272a7fd8 ffff8801272a7fd8 ffff880128ca8000 ffff880101874920 ffff8801039a0250 0000000000000246 ffff880101874920 ffff8801272a7fd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lock_rename+0x114/0x120 [] ? lock_rename+0x114/0x120 [] lock_rename+0x114/0x120 [] sys_renameat+0x1f7/0x3b0 [] ? get_lock_stats+0x22/0x70 [] ? trace_hardirqs_on_caller+0x115/0x1a0 [] ? trace_hardirqs_on_caller+0x115/0x1a0 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] system_call_fastpath+0x16/0x1b trinity-child1 D ffff880110b3a7e0 5200 7901 1 0x00000004 ffff8801015ddc50 0000000000000086 ffff880123fca490 ffff8801015ddfd8 ffff8801015ddfd8 ffff8801015ddfd8 ffff880128cac920 ffff880123fca490 ffff8801015dde60 0000000000000246 ffff880123fca490 ffff8801015ddfd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? native_sched_clock+0x24/0x80 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? native_sched_clock+0x24/0x80 [] ? trace_hardirqs_off_caller+0x28/0xc0 [] ? sysret_check+0x1b/0x56 [] user_path_at+0x11/0x20 [] sys_fchmodat+0x38/0xa0 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] sys_chmod+0x19/0x20 [] system_call_fastpath+0x16/0x1b trinity-child1 D ffff880110b3a7e0 5528 8651 29705 0x00000004 ffff8801187cfc60 0000000000000082 ffff880126a28000 ffff8801187cffd8 ffff8801187cffd8 ffff8801187cffd8 ffffffff81c10440 ffff880126a28000 ffff8801187cfe70 0000000000000246 ffff880126a28000 ffff8801187cffd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? trace_hardirqs_off_caller+0x28/0xc0 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? trace_hardirqs_off_caller+0x28/0xc0 [] ? sysret_check+0x1b/0x56 [] user_path_at+0x11/0x20 [] sys_removexattr+0x34/0xb0 [] system_call_fastpath+0x16/0x1b trinity-child3 D ffff880110b3a7e0 5312 9067 1 0x00000004 ffff880100d29c60 0000000000000082 ffff8801271a0000 ffff880100d29fd8 ffff880100d29fd8 ffff880100d29fd8 ffff880128cac920 ffff8801271a0000 ffff880100d29e70 0000000000000246 ffff8801271a0000 ffff880100d29fd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? put_lock_stats.isra.27+0xe/0x40 [] ? trace_hardirqs_on+0xd/0x10 [] ? _raw_spin_unlock_irq+0x37/0x60 [] ? sysret_check+0x1b/0x56 [] user_path_at+0x11/0x20 [] sys_fchmodat+0x38/0xa0 [] system_call_fastpath+0x16/0x1b trinity-child0 D ffff880110b3a7e0 5320 9112 1 0x00000004 ffff8801014fbc50 0000000000000086 ffff880123a94920 ffff8801014fbfd8 ffff8801014fbfd8 ffff8801014fbfd8 ffffffff81c10440 ffff880123a94920 ffff8801014fbe60 0000000000000246 ffff880123a94920 ffff8801014fbfd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? native_sched_clock+0x24/0x80 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? native_sched_clock+0x24/0x80 [] ? trace_hardirqs_off_caller+0x28/0xc0 [] ? sysret_check+0x1b/0x56 [] user_path_at+0x11/0x20 [] do_sys_truncate+0x35/0x90 [] sys_truncate+0xe/0x10 [] system_call_fastpath+0x16/0x1b trinity-child3 D ffff880110b3a7e0 5304 9414 29705 0x00000004 ffff880127b23c40 0000000000000082 ffff880128d38000 ffff880127b23fd8 ffff880127b23fd8 ffff880127b23fd8 ffffffff81c10440 ffff880128d38000 ffff880127b23e50 0000000000000246 ffff880128d38000 ffff880127b23fd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? creds_are_invalid.part.8+0x12/0x50 [] ? prepare_creds+0x3a/0x1f0 [] user_path_at+0x11/0x20 [] sys_faccessat+0xa3/0x230 [] system_call_fastpath+0x16/0x1b trinity-child2 D ffff880110b3a7e0 5704 10021 29705 0x00000004 ffff8801228a7be0 0000000000000086 ffff880128c18000 ffff8801228a7fd8 ffff8801228a7fd8 ffff8801228a7fd8 ffffffff81c10440 ffff880128c18000 ffff8801228a7df0 0000000000000246 ffff880128c18000 ffff8801228a7fd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] ? native_sched_clock+0x24/0x80 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? might_fault+0x9c/0xb0 [] user_path_at+0x11/0x20 [] vfs_fstatat+0x49/0x90 [] sys_newlstat+0x27/0x40 [] ? trace_hardirqs_on_caller+0x115/0x1a0 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] system_call_fastpath+0x16/0x1b trinity-child0 D ffff880110b3a7e0 5624 10222 29705 0x00000004 ffff8801005ebc50 0000000000000086 ffff880101872490 ffff8801005ebfd8 ffff8801005ebfd8 ffff8801005ebfd8 ffffffff81c10440 ffff880101872490 ffff8801005ebe60 0000000000000246 ffff880101872490 ffff8801005ebfd8 Call Trace: [] schedule_preempt_disabled+0x33/0x80 [] mutex_lock_nested+0x1a1/0x3d0 [] ? lookup_slow+0x13f/0x1b8 [] ? lookup_slow+0x13f/0x1b8 [] lookup_slow+0x13f/0x1b8 [] ? lookup_fast+0x16a/0x2d0 [] path_lookupat+0x79c/0x7c0 [] ? kmem_cache_alloc+0xf7/0x340 [] ? getname_flags+0x5a/0x1a0 [] filename_lookup+0x34/0xc0 [] user_path_at_empty+0x63/0xa0 [] ? get_lock_stats+0x22/0x70 [] ? trace_hardirqs_on+0xd/0x10 [] ? _raw_spin_unlock_irq+0x37/0x60 [] ? sysret_check+0x1b/0x56 [] user_path_at+0x11/0x20 [] sys_fchownat+0x66/0x110 [] system_call_fastpath+0x16/0x1b runnable tasks: task PID tree-key switches prio exec-runtime sum-exec sum-sleep ---------------------------------------------------------------------------------------------------------- Showing all locks held in the system: 4 locks on stack by trinity-child2/7669: #0: blocked: (sb_writers#4){.+.+.+}, instance: ffff8801292d17d8, at: [] mnt_want_write+0x24/0x50 #1: held: (&type->s_vfs_rename_key){+.+.+.}, instance: ffff8801292d1928, at: [] lock_rename+0x3e/0x120 #2: held: (&type->i_mutex_dir_key#2/1){+.+.+.}, instance: ffff880110b3a858, at: [] lock_rename+0xfe/0x120 #3: blocked: (&type->i_mutex_dir_key#2/2){+.+.+.}, instance: ffff880110b3a858, at: [] lock_rename+0x114/0x120 1 lock on stack by trinity-child1/7901: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child1/8651: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child3/9067: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child0/9112: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child3/9414: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child2/10021: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 1 lock on stack by trinity-child0/10222: #0: blocked: (&type->i_mutex_dir_key#2){+.+.+.}, instance: ffff880110b3a858, at: [] lookup_slow+0x13f/0x1b8 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/