Subject: Re: fasync race in fs/fcntl.c
From: Russ Dill
To: Al Viro
Cc: linux-kernel
Date: Sat, 2 Mar 2013 16:09:45 -0800

On Sat, Mar 2, 2013 at 11:49 AM, Al Viro wrote:
> On Sat, Mar 02, 2013 at 03:00:28AM -0800, Russ Dill wrote:
>> I'm seeing a race in fs/fcntl.c. I'm not sure exactly how the race is
>> occurring, but the following is my best guess. A kernel log is
>> attached.
>
> [snip the analysis - it's a different lock anyway]
>
> The traces below are essentially sys_execve() getting to get_random_bytes(),
> to kill_fasync(), to send_sigio(), which spins on tasklist_lock.
>
> Could you rebuild it with lockdep enabled and try to reproduce that?
> I very much doubt that this execve() is a part of deadlock - it's
> getting caught on one, but it shouldn't be holding any locks that
> nest inside tasklist_lock at that point, so even if it hadn't been there,
> the process holding tasklist_lock probably wouldn't have progressed any
> further...

Ok, I did screw up the analysis quite badly; luckily, lockdep got it right away.
[  199.411236]
[  199.411261] ======================================================
[  199.411268] [ INFO: possible circular locking dependency detected ]
[  199.411277] 3.8.0-9-generic #18 Not tainted
[  199.411283] -------------------------------------------------------
[  199.411290] slirp-wrapper.s/5574 is trying to acquire lock:
[  199.411296]  (&(&new->fa_lock)->rlock){-.-...}, at: [] kill_fasync+0x69/0xe0
[  199.411332]
[  199.411332] but task is already holding lock:
[  199.411338]  (&nonblocking_pool.lock){......}, at: [] account+0x39/0x1d0
[  199.411364]
[  199.411364] which lock already depends on the new lock.
[  199.411364]
[  199.411371]
[  199.411371] the existing dependency chain (in reverse order) is:
[  199.411379]
[  199.411379] -> #2 (&nonblocking_pool.lock){......}:
[  199.411400]        [] lock_acquire+0x98/0x120
[  199.411416]        [] _raw_spin_lock_irqsave+0x4e/0x70
[  199.411430]        [] mix_pool_bytes.constprop.20+0x43/0xb0
[  199.411442]        [] add_device_randomness+0x64/0x90
[  199.411454]        [] posix_cpu_timers_exit+0x1e/0x50
[  199.411469]        [] release_task+0xe2/0x470
[  199.411484]        [] do_exit+0x5d7/0x9c0
[  199.411495]        [] __module_put_and_exit+0x1a/0x20
[  199.411509]        [] cryptomgr_test+0x25/0x30
[  199.411526]        [] kthread+0xea/0xf0
[  199.411537]        [] ret_from_fork+0x7c/0xb0
[  199.411550]
[  199.411550] -> #1 (&(&sighand->siglock)->rlock){-.-...}:
[  199.411689]        [] lock_acquire+0x98/0x120
[  199.411702]        [] _raw_spin_lock_irqsave+0x4e/0x70
[  199.411713]        [] vblank_disable_fn+0x60/0xd0 [drm]
[  199.411768]        [] call_timer_fn+0x7a/0x180
[  199.411781]        [] run_timer_softirq+0x1fc/0x2a0
[  199.411794]        [] __do_softirq+0xe0/0x220
[  199.411807]        [] call_softirq+0x1c/0x30
[  199.411819]        [] do_softirq+0xa5/0xe0
[  199.411833]        [] irq_exit+0xb5/0xc0
[  199.411846]        [] smp_apic_timer_interrupt+0x6e/0x99
[  199.411859]        [] apic_timer_interrupt+0x72/0x80
[  199.411871]        [] cpuidle_enter_tk+0x10/0x20
[  199.411885]        [] cpuidle_idle_call+0xa5/0x270
[  199.411897]        [] cpu_idle+0xb5/0x120
[  199.411910]        [] rest_init+0xb4/0xc0
[  199.411924]        [] start_kernel+0x3ea/0x3f6
[  199.411939]        [] x86_64_start_reservations+0x130/0x133
[  199.411952]        [] x86_64_start_kernel+0x100/0x10f
[  199.411966]
[  199.411966] -> #0 (&(&new->fa_lock)->rlock){-.-...}:
[  199.411984]        [] __lock_acquire+0x1279/0x1aa0
[  199.411997]        [] lock_acquire+0x98/0x120
[  199.412103]        [] _raw_spin_lock_irqsave+0x4e/0x70
[  199.412115]        [] kill_fasync+0x69/0xe0
[  199.412127]        [] account+0x113/0x1d0
[  199.412138]        [] extract_entropy+0x65/0x140
[  199.412149]        [] get_random_bytes+0x20/0x30
[  199.412159]        [] create_elf_tables+0xaa/0x615
[  199.412172]        [] load_elf_binary+0xae4/0xe00
[  199.412185]        [] search_binary_handler+0x171/0x390
[  199.412197]        [] load_script+0x265/0x2b0
[  199.412208]        [] search_binary_handler+0x171/0x390
[  199.412219]        [] do_execve_common.isra.23+0x405/0x4c0
[  199.412230]        [] do_execve+0x18/0x20
[  199.412240]        [] sys_execve+0x3d/0x60
[  199.412251]        [] stub_execve+0x69/0xc0
[  199.412263]
[  199.412263] other info that might help us debug this:
[  199.412263]
[  199.412271] Chain exists of:
[  199.412271]   &(&new->fa_lock)->rlock --> &(&sighand->siglock)->rlock --> &nonblocking_pool.lock
[  199.412271]
[  199.412298]  Possible unsafe locking scenario:
[  199.412298]
[  199.412305]        CPU0                    CPU1
[  199.412310]        ----                    ----
[  199.412316]   lock(&nonblocking_pool.lock);
[  199.412328]                                lock(&(&sighand->siglock)->rlock);
[  199.412339]                                lock(&nonblocking_pool.lock);
[  199.412452]   lock(&(&new->fa_lock)->rlock);
[  199.412464]
[  199.412464]  *** DEADLOCK ***
[  199.412464]
[  199.412473] 2 locks held by slirp-wrapper.s/5574:
[  199.412478]  #0:  (&nonblocking_pool.lock){......}, at: [] account+0x39/0x1d0
[  199.412504]  #1:  (rcu_read_lock){.+.+..}, at: [] kill_fasync+0x21/0xe0
[  199.412531]
[  199.412531] stack backtrace:
[  199.412540] Pid: 5574, comm: slirp-wrapper.s Not tainted 3.8.0-9-generic #18
[  199.412546] Call Trace:
[  199.412560]  [] print_circular_bug+0x1f9/0x208
[  199.412572]  [] __lock_acquire+0x1279/0x1aa0
[  199.412584]  [] lock_acquire+0x98/0x120
[  199.412596]  [] ? kill_fasync+0x69/0xe0
[  199.412606]  [] _raw_spin_lock_irqsave+0x4e/0x70
[  199.412618]  [] ? kill_fasync+0x69/0xe0
[  199.412628]  [] kill_fasync+0x69/0xe0
[  199.412639]  [] ? kill_fasync+0x21/0xe0
[  199.412649]  [] account+0x113/0x1d0
[  199.412658]  [] extract_entropy+0x65/0x140
[  199.412668]  [] get_random_bytes+0x20/0x30
[  199.412678]  [] create_elf_tables+0xaa/0x615
[  199.412689]  [] load_elf_binary+0xae4/0xe00
[  199.412700]  [] ? load_elf_library+0x240/0x240
[  199.412710]  [] search_binary_handler+0x171/0x390
[  199.412720]  [] ? search_binary_handler+0x7f/0x390
[  199.412730]  [] ? compat_sys_ioctl+0x320/0x320
[  199.412740]  [] load_script+0x265/0x2b0
[  199.412750]  [] ? search_binary_handler+0x18f/0x390
[  199.412760]  [] ? compat_sys_ioctl+0x320/0x320
[  199.412770]  [] search_binary_handler+0x171/0x390
[  199.412779]  [] ? search_binary_handler+0x7f/0x390
[  199.412789]  [] do_execve_common.isra.23+0x405/0x4c0
[  199.412799]  [] ? do_execve_common.isra.23+0x114/0x4c0
[  199.412809]  [] do_execve+0x18/0x20
[  199.412922]  [] sys_execve+0x3d/0x60
[  199.412933]  [] stub_execve+0x69/0xc0