Date: Thu, 7 Sep 2017 09:28:48 +0200
From: Peter Zijlstra
To: Prateek Sood
Cc: tj@kernel.org, lizefan@huawei.com, cgroups@vger.kernel.org,
	mingo@kernel.org, longman@redhat.com, linux-kernel@vger.kernel.org,
	sramana@codeaurora.org, Thomas Gleixner
Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock
Message-ID: <20170907072848.2sjjddwincaeplju@hirez.programming.kicks-ass.net>
References: <1504764252-29091-1-git-send-email-prsood@codeaurora.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1504764252-29091-1-git-send-email-prsood@codeaurora.org>
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 07, 2017 at 11:34:12AM +0530, Prateek Sood wrote:
> Remove circular dependency deadlock in a scenario where hotplug of CPU is
> being done while there is updation in cgroup and cpuset triggered from
> userspace.
>
> Example scenario:
>  kworker/0:0 => kthreadd => init:729 => init:1 => kworker/0:0
>
>  kworker/0:0 - percpu_down_write(&cpu_hotplug_lock) [held]
>                flush(work) [no high prio workqueue available on CPU]
>                wait_for_completion()
>
>  kthreadd    - percpu_down_read(cgroup_threadgroup_rwsem) [waiting]
>
>  init:729    - percpu_down_write(cgroup_threadgroup_rwsem) [held]
>                lock(cpuset_mutex) [waiting]
>
>  init:1      - lock(cpuset_mutex) [held]
>                percpu_down_read(&cpu_hotplug_lock) [waiting]

That's both unreadable and useless :/ You want to tell what code paths
those were, not which random tasks happened to run them.

> Eliminate this dependecy by reordering locking of cpuset_mutex
> and cpu_hotplug_lock in following order
> 1. Acquire cpu_hotplug_lock (read)
> 2. Acquire cpuset_mutex
>
> Signed-off-by: Prateek Sood
> ---
>  kernel/cgroup/cpuset.c | 70 +++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 61 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 2f4039b..687be57 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -843,10 +843,41 @@ static void rebuild_sched_domains_locked(void)
>  out:
>  	put_online_cpus();
>  }
> +
> +/*
> + * Rebuild scheduler domains.
> + * Call with following lock held in the order
> + * 1. cpu_hotplug_lock (read)
> + * 2. cpuset_mutex

Do not put that in comments, nobody ever reads comments.

> + */
> +static void rebuild_sched_domains_unlocked(void)

The common postfix for a function called with the cpu hotplug lock held
is: _cpuslocked()

> +{
> +	struct sched_domain_attr *attr;
> +	cpumask_var_t *doms;
> +	int ndoms;

	lockdep_assert_cpus_held();
	lockdep_assert_held(&cpuset_mutex);

> +
> +	/*
> +	 * We have raced with CPU hotplug. Don't do anything to avoid
> +	 * passing doms with offlined cpu to partition_sched_domains().
> +	 * Anyways, hotplug work item will rebuild sched domains.
> +	 */
> +	if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
> +		return;
> +
> +	/* Generate domain masks and attrs */
> +	ndoms = generate_sched_domains(&doms, &attr);
> +
> +	/* Have scheduler rebuild the domains */
> +	partition_sched_domains(ndoms, doms, attr);
> +}

And you couldn't come up with a way to share _anything_ with the
existing rebuild_sched_domains_locked() function?

*sigh*.. please try again.
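
For illustration, a minimal sketch of the shape being asked for above: a
single worker with the _cpuslocked() postfix that carries lockdep
assertions instead of a locking-order comment, reused by the unlocked
entry point so nothing is duplicated. The function name and wrapper
structure here are assumptions for the sketch, not the patch that was
eventually merged; it only reuses the helpers already visible in the
quoted diff (generate_sched_domains(), partition_sched_domains(),
get_online_cpus()/put_online_cpus()).

	/*
	 * Sketch only: called with cpu_hotplug_lock (read) and cpuset_mutex
	 * held. Callers that already hold both locks (hotplug work, cgroup
	 * write paths) call this directly.
	 */
	static void rebuild_sched_domains_cpuslocked(void)
	{
		struct sched_domain_attr *attr;
		cpumask_var_t *doms;
		int ndoms;

		lockdep_assert_cpus_held();
		lockdep_assert_held(&cpuset_mutex);

		/* Raced with CPU hotplug; the hotplug work will rebuild domains. */
		if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
			return;

		/* Generate domain masks and attrs, then rebuild. */
		ndoms = generate_sched_domains(&doms, &attr);
		partition_sched_domains(ndoms, doms, attr);
	}

	/* Unlocked entry point: take the locks in the order the changelog states. */
	void rebuild_sched_domains(void)
	{
		get_online_cpus();
		mutex_lock(&cpuset_mutex);
		rebuild_sched_domains_cpuslocked();
		mutex_unlock(&cpuset_mutex);
		put_online_cpus();
	}

This keeps the required lock order (cpu_hotplug_lock before
cpuset_mutex) in exactly one place and shares the domain-rebuild body
between the locked and unlocked variants.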