Date: Wed, 11 Oct 2017 11:48:33 +0200
From: Peter Zijlstra
To: Prateek Sood
Cc: tj@kernel.org, lizefan@huawei.com, cgroups@vger.kernel.org,
    mingo@kernel.org, longman@redhat.com, boqun.feng@gmail.com,
    tglx@linutronix.de, linux-kernel@vger.kernel.org, sramana@codeaurora.org
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock
Message-ID: <20171011094833.pdp4torvotvjdmkt@hirez.programming.kicks-ass.net>
References: <1504792583-10424-1-git-send-email-prsood@codeaurora.org>
 <20170907175107.GG17526@worktop.programming.kicks-ass.net>
 <4668d1ec-dc43-8a9c-4f94-a421683d3c17@codeaurora.org>
In-Reply-To: <4668d1ec-dc43-8a9c-4f94-a421683d3c17@codeaurora.org>
User-Agent: NeoMutt/20170609 (1.8.3)
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 09, 2017 at 06:57:46PM +0530, Prateek Sood wrote:
> On 09/07/2017 11:21 PM, Peter Zijlstra wrote:
> > But if you invert these locks, the need for cpuset_hotplug_workfn() goes
> > away, at least for the CPU part, and we can make it synchronous again.
> > Yay!!
> The callback that calls cpuset_hotplug_workfn() in the hotplug path is:
>
>	[CPUHP_AP_ACTIVE] = {
>		.name			= "sched:active",
>		.startup.single		= sched_cpu_activate,
>		.teardown.single	= sched_cpu_deactivate,
>	},
>
> If we make cpuset_hotplug_workfn() synchronous, a deadlock might happen:
>
>	_cpu_down()
>	  cpus_write_lock()		// held
>	  cpuhp_kick_ap_work()
>	    cpuhp_kick_ap()
>	      __cpuhp_kick_ap()
>	        wake_up_process()	// wakes cpuhp_thread_fun
>	        wait_for_ap_thread()	// waits for completion from cpuhp_thread_fun()
>
>	cpuhp_thread_fun()
>	  cpuhp_invoke_callback()
>	    sched_cpu_deactivate()
>	      cpuset_cpu_inactive()
>	        cpuset_update_active_cpus()
>	          cpuset_hotplug_work()
>	            rebuild_sched_domains()
>	              cpus_read_lock()	// waits: already acquired for write in _cpu_down()

Well, duh, don't use rebuild_sched_domains(), 'obviously' :-). Use
rebuild_sched_domains_cpuslocked() instead and it works just fine.
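To spell out the resulting lock order, here is a sketch of the two entry
points after your patch (names taken from your patch; illustrative only,
not the exact code):

/* For callers outside the hotplug path: take both locks ourselves. */
void rebuild_sched_domains(void)
{
	cpus_read_lock();		/* hotplug lock first ... */
	mutex_lock(&cpuset_mutex);	/* ... cpuset_mutex second */
	rebuild_sched_domains_cpuslocked();
	mutex_unlock(&cpuset_mutex);
	cpus_read_unlock();
}

/*
 * Hotplug path: _cpu_down()/_cpu_up() already hold the hotplug lock for
 * writing, so only cpuset_mutex is taken; doing cpus_read_lock() here
 * again is exactly the deadlock quoted above.
 */
void cpuset_update_active_cpus(void)
{
	mutex_lock(&cpuset_mutex);
	rebuild_sched_domains_cpuslocked();
	mutex_unlock(&cpuset_mutex);
}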
After applying your patch, the below boots and survives a hotplug.

---
 include/linux/cpuset.h |  6 ------
 kernel/cgroup/cpuset.c | 30 +++++++++---------------------
 kernel/power/process.c |  2 --
 kernel/sched/core.c    |  1 -
 4 files changed, 9 insertions(+), 30 deletions(-)

--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -51,9 +51,7 @@ static inline void cpuset_dec(void)
 
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
-extern void cpuset_force_rebuild(void);
 extern void cpuset_update_active_cpus(void);
-extern void cpuset_wait_for_hotplug(void);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
 extern void cpuset_cpus_allowed_fallback(struct task_struct *p);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
@@ -166,15 +164,11 @@ static inline bool cpusets_enabled(void)
 static inline int cpuset_init(void) { return 0; }
 static inline void cpuset_init_smp(void) {}
 
-static inline void cpuset_force_rebuild(void) { }
-
 static inline void cpuset_update_active_cpus(void)
 {
 	partition_sched_domains(1, NULL, NULL);
 }
 
-static inline void cpuset_wait_for_hotplug(void) { }
-
 static inline void cpuset_cpus_allowed(struct task_struct *p,
 				       struct cpumask *mask)
 {
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -833,7 +833,12 @@ static void rebuild_sched_domains_cpuslo
 	cpumask_var_t *doms;
 	int ndoms;
 
+	/*
+	 * When called during hotplug, this lock is held by the calling
+	 * thread, not cpuhp_thread_fun :/
+	 * lockdep_assert_cpus_held();
+	 */
 	lockdep_assert_held(&cpuset_mutex);
 
 	/*
@@ -2281,13 +2286,6 @@ static void cpuset_hotplug_update_tasks(
 	mutex_unlock(&cpuset_mutex);
 }
 
-static bool force_rebuild;
-
-void cpuset_force_rebuild(void)
-{
-	force_rebuild = true;
-}
-
 /**
  * cpuset_hotplug_workfn - handle CPU/memory hotunplug for a cpuset
  *
@@ -2362,25 +2360,15 @@ static void cpuset_hotplug_workfn(struct
 	}
 
 	/* rebuild sched domains if cpus_allowed has changed */
-	if (cpus_updated || force_rebuild) {
-		force_rebuild = false;
+	if (cpus_updated)
 		rebuild_sched_domains();
-	}
 }
 
 void cpuset_update_active_cpus(void)
 {
-	/*
-	 * We're inside cpu hotplug critical region which usually nests
-	 * inside cgroup synchronization. Bounce actual hotplug processing
-	 * to a work item to avoid reverse locking order.
-	 */
-	schedule_work(&cpuset_hotplug_work);
-}
-
-void cpuset_wait_for_hotplug(void)
-{
-	flush_work(&cpuset_hotplug_work);
+	mutex_lock(&cpuset_mutex);
+	rebuild_sched_domains_cpuslocked();
+	mutex_unlock(&cpuset_mutex);
 }
 
 /*
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -203,8 +203,6 @@ void thaw_processes(void)
 	__usermodehelper_set_disable_depth(UMH_FREEZING);
 	thaw_workqueues();
 
-	cpuset_wait_for_hotplug();
-
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, p) {
 		/* No other threads should have PF_SUSPEND_TASK set */
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5598,7 +5598,6 @@ static void cpuset_cpu_active(void)
 		 * restore the original sched domains by considering the
 		 * cpuset configurations.
 		 */
-		cpuset_force_rebuild();
 	}
 	cpuset_update_active_cpus();
 }
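As an aside, the quoted deadlock is the classic "reader waits on a
writer that is waiting on the reader" shape, and it is easy to reproduce
in userspace with a plain rwlock standing in for the hotplug
percpu-rwsem. A toy analogy (made-up names, not kernel code):

#include <pthread.h>
#include <stdio.h>

/* Stands in for cpu_hotplug_lock (really a percpu-rwsem). */
static pthread_rwlock_t hotplug_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Plays cpuhp_thread_fun(): runs the teardown callbacks. */
static void *cpuhp_thread(void *arg)
{
	/*
	 * A synchronous rebuild_sched_domains() would do
	 * cpus_read_lock() here; it blocks forever, because the writer
	 * below is waiting for *us* to finish.
	 */
	pthread_rwlock_rdlock(&hotplug_lock);
	puts("never reached");
	pthread_rwlock_unlock(&hotplug_lock);
	return NULL;
}

/* Plays _cpu_down(). */
int main(void)
{
	pthread_t ap;

	pthread_rwlock_wrlock(&hotplug_lock);	/* cpus_write_lock() */
	pthread_create(&ap, NULL, cpuhp_thread, NULL);
	pthread_join(ap, NULL);			/* wait_for_ap_thread() */
	pthread_rwlock_unlock(&hotplug_lock);
	return 0;
}

Build with gcc -pthread and it hangs, as expected. It also shows why the
lockdep_assert_cpus_held() in the hunk above stays commented out: the
hotplug lock is held by the thread that initiated the hotplug, not by
the thread running the callbacks, so a held-lock assertion in the
callback thread cannot see it.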