Subject: Re: splice vs execve lockdep trace.
From: Linus Torvalds
To: Dave Jones, Linux Kernel, Linus Torvalds, Peter Zijlstra, Alexander Viro, Oleg Nesterov, Ben Myers
Date: Mon, 15 Jul 2013 19:32:51 -0700
In-Reply-To: <20130716015305.GB30569@redhat.com>

Hmm. I don't have a lot of ideas; I'm just adding the lockdep, splice and FS people (and Oleg, just because) to the cc to see if anybody else does. Al, Peter?

So the problematic op *seems* to be the splice into /proc/<pid>/attr/ files, which causes that "pipe -> cred_guard_mutex" locking, while the *normal* ordering would be the other way around, coming from execve(), which has the cred_guard_mutex -> VFS-locks ordering for reading the executable headers.

Al, can we break either of those? Do we need to hold on to the cred mutex that long? We get it fairly early (prepare_bprm_creds) and we drop it very late (install_exec_creds), which means that it covers a lot. But that seems pretty basic.

The splice into /proc/<pid>/attr/* seems to be the more annoying one, and maybe we just shouldn't allow splicing into or from /proc?

Dave, is this new (it doesn't *smell* new to me), or is it just that trinity is doing new splice things? Or is the XFS i_iolock required for this thing to happen at all? Adding Ben Myers to the cc just for luck/completeness.
              Linus

On Mon, Jul 15, 2013 at 6:53 PM, Dave Jones wrote:
>
> [  696.047396] ======================================================
> [  696.049036] [ INFO: possible circular locking dependency detected ]
> [  696.050689] 3.11.0-rc1+ #53 Tainted: G        W
> [  696.052182] -------------------------------------------------------
> [  696.053846] trinity-child2/14017 is trying to acquire lock:
> [  696.055429]  (&sig->cred_guard_mutex){+.+.+.}, at: [] proc_pid_attr_write+0xf5/0x140
> [  696.057652] but task is already holding lock:
> [  696.060171]  (&pipe->mutex/1){+.+.+.}, at: [] pipe_lock+0x26/0x30
> [  696.062097] which lock already depends on the new lock.
> [  696.065723] the existing dependency chain (in reverse order) is:
> [  696.068295] -> #2 (&pipe->mutex/1){+.+.+.}:
> [  696.070650]        [] lock_acquire+0x91/0x1f0
> [  696.072258]        [] mutex_lock_nested+0x7a/0x410
> [  696.073916]        [] pipe_lock+0x26/0x30
> [  696.075464]        [] generic_file_splice_write+0x64/0x170
> [  696.077192]        [] xfs_file_splice_write+0xb0/0x230 [xfs]
> [  696.078914]        [] SyS_splice+0x24a/0x7e0
> [  696.080455]        [] tracesys+0xdd/0xe2
> [  696.082000] -> #1 (&(&ip->i_iolock)->mr_lock){++++++}:
> [  696.084428]        [] lock_acquire+0x91/0x1f0
> [  696.086017]        [] down_read_nested+0x52/0xa0
> [  696.087643]        [] xfs_ilock+0x1d0/0x280 [xfs]
> [  696.089305]        [] xfs_file_aio_read+0x112/0x3e0 [xfs]
> [  696.091037]        [] do_sync_read+0x80/0xb0
> [  696.092571]        [] vfs_read+0xa1/0x170
> [  696.094124]        [] kernel_read+0x41/0x60
> [  696.095693]        [] prepare_binprm+0xb3/0x130
> [  696.097307]        [] do_execve_common.isra.29+0x599/0x6c0
> [  696.099036]        [] SyS_execve+0x36/0x50
> [  696.100577]        [] stub_execve+0x69/0xa0
> [  696.102157] -> #0 (&sig->cred_guard_mutex){+.+.+.}:
> [  696.104536]        [] __lock_acquire+0x1786/0x1af0
> [  696.106189]        [] lock_acquire+0x91/0x1f0
> [  696.107813]        [] mutex_lock_interruptible_nested+0x75/0x4c0
> [  696.109616]        [] proc_pid_attr_write+0xf5/0x140
> [  696.111243]        [] __kernel_write+0x72/0x150
> [  696.112870]        [] write_pipe_buf+0x49/0x70
> [  696.114503]        [] splice_from_pipe_feed+0x84/0x140
> [  696.116202]        [] __splice_from_pipe+0x6e/0x90
> [  696.117836]        [] splice_from_pipe+0x51/0x70
> [  696.119419]        [] default_file_splice_write+0x19/0x30
> [  696.121147]        [] SyS_splice+0x24a/0x7e0
> [  696.122719]        [] tracesys+0xdd/0xe2
> [  696.124272] other info that might help us debug this:
>
> [  696.127604] Chain exists of: &sig->cred_guard_mutex --> &(&ip->i_iolock)->mr_lock --> &pipe->mutex/1
>
> [  696.131445]  Possible unsafe locking scenario:
>
> [  696.133763]        CPU0                    CPU1
> [  696.135108]        ----                    ----
> [  696.136420]   lock(&pipe->mutex/1);
> [  696.137745]                                lock(&(&ip->i_iolock)->mr_lock);
> [  696.139469]                                lock(&pipe->mutex/1);
> [  696.141030]   lock(&sig->cred_guard_mutex);
> [  696.142409]
>                 *** DEADLOCK ***
>
> [  696.145237] 2 locks held by trinity-child2/14017:
> [  696.146532]  #0:  (sb_writers#4){.+.+.+}, at: [] SyS_splice+0x731/0x7e0
> [  696.148435]  #1:  (&pipe->mutex/1){+.+.+.}, at: [] pipe_lock+0x26/0x30
> [  696.150357] stack backtrace:
> [  696.152287] CPU: 2 PID: 14017 Comm: trinity-child2 Tainted: G        W    3.11.0-rc1+ #53
> [  696.156695]  ffffffff825270a0 ffff8801fee2db90 ffffffff8170164e ffffffff824f84c0
> [  696.158392]  ffff8801fee2dbd0 ffffffff816fda92 ffff8801fee2dc20 ffff8802331446d8
> [  696.160300]  ffff880233143fc0 0000000000000002 0000000000000002 ffff8802331446d8
> [  696.162217] Call Trace:
> [  696.163292]  [] dump_stack+0x4e/0x82
> [  696.164695]  [] print_circular_bug+0x200/0x20f
> [  696.166194]  [] __lock_acquire+0x1786/0x1af0
> [  696.167677]  [] lock_acquire+0x91/0x1f0
> [  696.169111]  [] ? proc_pid_attr_write+0xf5/0x140
> [  696.170633]  [] ? proc_pid_attr_write+0xf5/0x140
> [  696.172151]  [] mutex_lock_interruptible_nested+0x75/0x4c0
> [  696.173765]  [] ? proc_pid_attr_write+0xf5/0x140
> [  696.175280]  [] ? proc_pid_attr_write+0xf5/0x140
> [  696.176793]  [] ? alloc_pages_current+0xa9/0x170
> [  696.178301]  [] proc_pid_attr_write+0xf5/0x140
> [  696.179799]  [] ? splice_direct_to_actor+0x1f0/0x1f0
> [  696.181346]  [] __kernel_write+0x72/0x150
> [  696.182796]  [] ? splice_direct_to_actor+0x1f0/0x1f0
> [  696.184354]  [] write_pipe_buf+0x49/0x70
> [  696.185779]  [] splice_from_pipe_feed+0x84/0x140
> [  696.187293]  [] ? splice_direct_to_actor+0x1f0/0x1f0
> [  696.188850]  [] __splice_from_pipe+0x6e/0x90
> [  696.190325]  [] ? splice_direct_to_actor+0x1f0/0x1f0
> [  696.191878]  [] splice_from_pipe+0x51/0x70
> [  696.193335]  [] default_file_splice_write+0x19/0x30
> [  696.194896]  [] SyS_splice+0x24a/0x7e0
> [  696.196312]  [] tracesys+0xdd/0xe2