Date: Fri, 22 Mar 2013 14:20:11 +0100
From: Oleg Nesterov
To: Andrew Morton
Cc: Tejun Heo, Dave Jones, Linux Kernel, cgroups@vger.kernel.org, Li Zefan
Subject: Re: [PATCH] do not abuse ->cred_guard_mutex in threadgroup_lock()
Message-ID: <20130322132011.GA13261@redhat.com>
In-Reply-To: <20130321150626.a7934d989fb80d835fa92255@linux-foundation.org>

On 03/21, Andrew Morton wrote:
>
> On Thu, 21 Mar 2013 17:21:38 +0100 Oleg Nesterov wrote:
>
> > threadgroup_lock() takes signal->cred_guard_mutex to ensure that
> > thread_group_leader() is stable. This doesn't look nice; the scope
> > of this lock in do_execve() is huge.
> >
> > And, as Dave pointed out, this can lead to deadlock. We have the
> > following dependencies:
> >
> >     do_execve:          cred_guard_mutex -> i_mutex
> >     cgroup_mount:       i_mutex -> cgroup_mutex
> >     attach_task_by_pid: cgroup_mutex -> cred_guard_mutex
> >
> > Change de_thread() to take threadgroup_change_begin() around the
> > switch-the-leader code, and change threadgroup_lock() to avoid
> > ->cred_guard_mutex.
> >
> > Note that de_thread() can't sleep with ->group_rwsem held; this
> > can obviously deadlock with the exiting leader if the writer is
> > active, so it does threadgroup_change_end() before schedule().
>
> When writing a changelog, please describe the end-user-visible effects
> of the bug, so that others can more easily decide which kernel
> version(s) should be fixed, and so that downstream kernel maintainers
> can more easily work out whether this patch will fix a problem which
> they or their customers are observing.
>
> > Reported-by: Dave Jones
>
> Perhaps Dave's report provides the needed info? trinity went titsup?

Yes, trinity. Please see the original report below. I tried to
translate lockdep's output into human-readable form. Two illustrative
sketches, one of the lock cycle and one of the drop-before-schedule
wait loop, follow the trace at the end of this mail.

Oleg.

-------------------------------------------------------------------------------
Looks like this happens when my fuzzer tries to look up garbage in
/sys/fs/cgroup/freezer/

    trinity -c execve -V /sys/fs/cgroup/freezer/

will reproduce it very quickly.

This isn't a new trace. I've seen it in the past from iknowthis also.

Dave
[ 943.971541] ======================================================
[ 943.972451] [ INFO: possible circular locking dependency detected ]
[ 943.973370] 3.9.0-rc1+ #69 Not tainted
[ 943.973927] -------------------------------------------------------
[ 943.974838] trinity-child0/1301 is trying to acquire lock:
[ 943.975650] blocked: (&sb->s_type->i_mutex_key#9){+.+.+.}, instance: ffff880127ea1680, at: [] do_last+0x35c/0xe30
[ 943.977522] but task is already holding lock:
[ 943.978371] held: (&sig->cred_guard_mutex){+.+.+.}, instance: ffff880123937578, at: [] prepare_bprm_creds+0x36/0x80
[ 943.980260] which lock already depends on the new lock.
[ 943.981434] the existing dependency chain (in reverse order) is:
[ 943.982499] -> #2 (&sig->cred_guard_mutex){+.+.+.}:
[ 943.983280]        [] lock_acquire+0x92/0x1d0
[ 943.984196]        [] mutex_lock_nested+0x73/0x3b0
[ 943.985173]        [] attach_task_by_pid+0x122/0x8d0
[ 943.986151]        [] cgroup_tasks_write+0x13/0x20
[ 943.987127]        [] cgroup_file_write+0x130/0x2f0
[ 943.988118]        [] vfs_write+0xaf/0x180
[ 943.988985]        [] sys_write+0x55/0xa0
[ 943.989853]        [] system_call_fastpath+0x16/0x1b
[ 943.990853] -> #1 (cgroup_mutex){+.+.+.}:
[ 943.991616]        [] lock_acquire+0x92/0x1d0
[ 943.992527]        [] mutex_lock_nested+0x73/0x3b0
[ 943.993492]        [] cgroup_mount+0x2e7/0x520
[ 943.994423]        [] mount_fs+0x43/0x1b0
[ 943.995275]        [] vfs_kern_mount+0x61/0x100
[ 943.996220]        [] do_mount+0x211/0xa00
[ 943.997103]        [] sys_mount+0x8e/0xe0
[ 943.997965]        [] system_call_fastpath+0x16/0x1b
[ 943.998972] -> #0 (&sb->s_type->i_mutex_key#9){+.+.+.}:
[ 943.999886]        [] __lock_acquire+0x1b86/0x1c80
[ 944.000864]        [] lock_acquire+0x92/0x1d0
[ 944.001771]        [] mutex_lock_nested+0x73/0x3b0
[ 944.002750]        [] do_last+0x35c/0xe30
[ 944.003620]        [] path_openat+0xba/0x4f0
[ 944.004517]        [] do_filp_open+0x41/0xa0
[ 944.005427]        [] open_exec+0x53/0x130
[ 944.006296]        [] do_execve_common.isra.26+0x31d/0x710
[ 944.007373]        [] do_execve+0x18/0x20
[ 944.008222]        [] sys_execve+0x3d/0x60
[ 944.009093]        [] stub_execve+0x69/0xa0
[ 944.009983] other info that might help us debug this:
[ 944.011126] Chain exists of:
  &sb->s_type->i_mutex_key#9 --> cgroup_mutex --> &sig->cred_guard_mutex
[ 944.012745]  Possible unsafe locking scenario:
[ 944.013617]        CPU0                    CPU1
[ 944.014280]        ----                    ----
[ 944.014942]   lock(&sig->cred_guard_mutex);
[ 944.021332]                                lock(cgroup_mutex);
[ 944.028094]                                lock(&sig->cred_guard_mutex);
[ 944.035007]   lock(&sb->s_type->i_mutex_key#9);
[ 944.041602]  *** DEADLOCK ***
[ 944.059241] 1 lock on stack by trinity-child0/1301:
[ 944.065496]  #0: held: (&sig->cred_guard_mutex){+.+.+.}, instance: ffff880123937578, at: [] prepare_bprm_creds+0x36/0x80
[ 944.073100] stack backtrace:
[ 944.085269] Pid: 1301, comm: trinity-child0 Not tainted 3.9.0-rc1+ #69
[ 944.091788] Call Trace:
[ 944.097633]  [] print_circular_bug+0x1fe/0x20f
[ 944.104041]  [] __lock_acquire+0x1b86/0x1c80
[ 944.110223]  [] ? trace_hardirqs_off+0xd/0x10
[ 944.116282]  [] lock_acquire+0x92/0x1d0
[ 944.122293]  [] ? do_last+0x35c/0xe30
[ 944.128287]  [] mutex_lock_nested+0x73/0x3b0
[ 944.134460]  [] ? do_last+0x35c/0xe30
[ 944.140497]  [] ? do_last+0x35c/0xe30
[ 944.146446]  [] do_last+0x35c/0xe30
[ 944.152303]  [] ? inode_permission+0x18/0x50
[ 944.158260]  [] ? link_path_walk+0x245/0x880
[ 944.164165]  [] path_openat+0xba/0x4f0
[ 944.169934]  [] do_filp_open+0x41/0xa0
[ 944.175834]  [] ? do_execve_common.isra.26+0x30e/0x710
[ 944.181817]  [] ? get_lock_stats+0x22/0x70
[ 944.187828]  [] ? put_lock_stats.isra.23+0xe/0x40
[ 944.193892]  [] ? lock_release_holdtime.part.24+0xcb/0x130
[ 944.200099]  [] open_exec+0x53/0x130
[ 944.206046]  [] do_execve_common.isra.26+0x31d/0x710
[ 944.212123]  [] ? do_execve_common.isra.26+0x122/0x710
[ 944.218275]  [] do_execve+0x18/0x20
[ 944.224206]  [] sys_execve+0x3d/0x60
[ 944.230155]  [] stub_execve+0x69/0xa0
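
For illustration: the three edges quoted in the changelog form a
textbook lock-ordering cycle. Below is a minimal, hedged userspace
reduction in C with pthreads. The mutex names only mirror the kernel
locks; none of this is kernel code, and the usleep() calls just widen
the race window. Build with gcc -pthread; on an unlucky interleaving
all three threads block forever, which is the cycle lockdep reports.

/*
 * Hedged sketch, not kernel code: three threads each take one edge of
 * the cred_guard_mutex -> i_mutex -> cgroup_mutex -> cred_guard_mutex
 * cycle described in the changelog above.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t cred_guard = PTHREAD_MUTEX_INITIALIZER; /* ->cred_guard_mutex */
static pthread_mutex_t i_mutex    = PTHREAD_MUTEX_INITIALIZER; /* inode ->i_mutex    */
static pthread_mutex_t cgroup     = PTHREAD_MUTEX_INITIALIZER; /* cgroup_mutex       */

/* do_execve: cred_guard_mutex -> i_mutex */
static void *execve_path(void *arg)
{
	pthread_mutex_lock(&cred_guard);
	usleep(1000);				/* widen the race window */
	pthread_mutex_lock(&i_mutex);
	pthread_mutex_unlock(&i_mutex);
	pthread_mutex_unlock(&cred_guard);
	return NULL;
}

/* cgroup_mount: i_mutex -> cgroup_mutex */
static void *mount_path(void *arg)
{
	pthread_mutex_lock(&i_mutex);
	usleep(1000);
	pthread_mutex_lock(&cgroup);
	pthread_mutex_unlock(&cgroup);
	pthread_mutex_unlock(&i_mutex);
	return NULL;
}

/* attach_task_by_pid: cgroup_mutex -> cred_guard_mutex */
static void *attach_path(void *arg)
{
	pthread_mutex_lock(&cgroup);
	usleep(1000);
	pthread_mutex_lock(&cred_guard);
	pthread_mutex_unlock(&cred_guard);
	pthread_mutex_unlock(&cgroup);
	return NULL;
}

int main(void)
{
	pthread_t t[3];

	pthread_create(&t[0], NULL, execve_path, NULL);
	pthread_create(&t[1], NULL, mount_path, NULL);
	pthread_create(&t[2], NULL, attach_path, NULL);
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	puts("no deadlock this run; the interleaving is timing-dependent");
	return 0;
}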
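
And a second hedged sketch for the de_thread() note in the changelog:
the wait loop must not sleep holding ->group_rwsem, so it drops the
lock before blocking and retakes it after waking, rechecking the
condition each time. This userspace analogue shows only the shape of
that pattern; all names are invented and pthread primitives stand in
for the kernel's. It is not the patch itself.

/*
 * Hedged analogue, not the actual patch: never sleep holding the
 * rwlock (standing in for ->group_rwsem), or a blocked writer and the
 * sleeping reader deadlock.  Drop it, sleep, retake, recheck.
 */
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

static pthread_rwlock_t group_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  wait_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   wait_cv     = PTHREAD_COND_INITIALIZER;
static bool leader_exited;		/* stands in for leader->exit_state */

static void *waiter(void *arg)
{
	for (;;) {
		pthread_rwlock_rdlock(&group_rwsem);	/* threadgroup_change_begin() */
		pthread_mutex_lock(&wait_lock);
		if (leader_exited) {
			pthread_mutex_unlock(&wait_lock);
			break;				/* proceed, rwsem still held */
		}
		pthread_mutex_unlock(&wait_lock);
		pthread_rwlock_unlock(&group_rwsem);	/* threadgroup_change_end() before sleeping */

		pthread_mutex_lock(&wait_lock);		/* the schedule() analogue */
		while (!leader_exited)
			pthread_cond_wait(&wait_cv, &wait_lock);
		pthread_mutex_unlock(&wait_lock);
	}
	/* ... the switch-the-leader work would run here ... */
	pthread_rwlock_unlock(&group_rwsem);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, waiter, NULL);
	sleep(1);				/* let the waiter block */
	pthread_mutex_lock(&wait_lock);
	leader_exited = true;			/* the "leader exits" event */
	pthread_cond_broadcast(&wait_cv);
	pthread_mutex_unlock(&wait_lock);
	pthread_join(t, NULL);
	return 0;
}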