Message-ID: <1432575832.6866.29.camel@odin.com>
Subject: [PATCH RFC 00/13] Reduction globality of tasklist_lock
From: Kirill Tkhai
CC: Oleg Nesterov, Andrew Morton, Ingo Molnar, Peter Zijlstra, Michal Hocko, Rik van Riel, Ionut Alexa, Peter Hurley, Kirill Tkhai
Date: Mon, 25 May 2015 20:43:52 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

tasklist_lock is used to protect many different task links: a task's
place in init_task.tasks, the parent-child relationship, ptrace waiting,
a task's place in the PID lists, SID/PGID leadership, prohibition of
creation of new tasks, etc. It resembles the now-removed BKL, and this
hurts the locking granularity of the system.

The series aims to make the lock less global and introduces a new lock
for protecting task-to-task relationships. The lock's name is kin_lock.

First, it protects the parent-child relationship. A child's sibling and
real_parent fields are protected by its father's kin_lock. To transfer a
child from one parent to another, you should take both fathers' locks.

A child's parent field is protected by the ptracer's kin_lock. We should
take both the real_parent's and the ptracer's locks to attach a child to
a ptracer.

We should use the parent's lock to notify it about exiting. From now on,
tasklist_lock is not used in exit/wait notifications.

Also, the real_parent's kin_lock protects the child's threads (if the
child is multithreaded) and multithreaded exit. The task's sighand is
protected by it too.

Finally, I nested tasklist_lock under kin_lock, so the lock order is now:

	kin_lock
	  tasklist_lock
	    sighand->lock

However, sighand, task_struct::tasks and the thread group are still safe
under tasklist_lock. We could change __exit_signal() a little more, but
that requires changing all tasklist_lock users and would make the series
bigger (8-9 more patches). I don't think it's necessary right now. We
may do that in the future if we want.

From now on, new users should avoid tasklist_lock where possible (most
current users can easily be rewritten using RCU; in a couple of places
we will still have to take tasklist_lock to stop process creation).

Besides that, tasklist_lock still protects the init_task.tasks list, the
PID lists, and SID and PGID leadership.

The series is in RFC format because I haven't added exhaustive comments
to the code yet. Also, the nesting of rwlock_t isn't reflected in
lockdep, and arch code still uses tasklist_lock (I haven't analyzed it
yet).

I'd like to hear people's opinions about this. Comments, ideas and
suggestions are welcome.

Thanks!

---

Kirill Tkhai (13):
      exit: Clarify choice of new parent in forget_original_parent()
      rwlock_t: Implement double_write_{,un}lock()
      pid_ns: Implement rwlock_t pid_ns::cr_lock for locking child_reaper
      exit: Small refactoring mm_update_next_owner()
      fs: Refactoring in get_children_pid()
      core: Add rwlock_t task_list::kin_lock
      kin_lock: Implement helpers for kin_lock locking.
      core: Use kin_lock synchronizations between parent and child and for thread group
      exit: Use for_each_thread() in do_wait()
      exit: Add struct wait_opts's member held_lock and use it for tasklist_lock
      exit: Syncronize on kin_lock while do_notify_parent()
      exit: Delete write dependence on tasklist_lock in exit_notify()
      core: Nest tasklist_lock into task_struct::kin_lock

 fs/exec.c                     |   26 ++--
 fs/proc/array.c               |   28 ++--
 include/linux/init_task.h     |    1
 include/linux/pid_namespace.h |    1
 include/linux/sched.h         |  303 +++++++++++++++++++++++++++++++++++++++++
 include/linux/spinlock.h      |   19 +++
 kernel/exit.c                 |  178 +++++++++++++++++-------
 kernel/fork.c                 |   13 +-
 kernel/pid.c                  |   10 +
 kernel/pid_namespace.c        |    5 -
 kernel/ptrace.c               |   53 +++++--
 kernel/signal.c               |   20 +--
 kernel/sys.c                  |   19 +--
 mm/oom_kill.c                 |    9 +
 security/keys/keyctl.c        |    4 -
 security/selinux/hooks.c      |    4 -
 16 files changed, 570 insertions(+), 123 deletions(-)

--
Signed-off-by: Kirill Tkhai