Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751435AbdIENWu (ORCPT ); Tue, 5 Sep 2017 09:22:50 -0400
Received: from mail-qt0-f196.google.com ([209.85.216.196]:36327 "EHLO
        mail-qt0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751149AbdIENWs (ORCPT );
        Tue, 5 Sep 2017 09:22:48 -0400
X-Google-Smtp-Source: ADKCNb71UhcyEiV3RAweKcZtkGErhOQnmTx7XF4/z5pNf9xNT5AtY+7qAHx6FV7UcHAw5QN9So/ZXQ==
Date: Tue, 5 Sep 2017 06:22:43 -0700
From: Tejun Heo
To: Prateek Sood
Cc: lizefan@huawei.com, cgroups@vger.kernel.org,
        linux-kernel@vger.kernel.org, sramana@codeaurora.org,
        mingo@kernel.org, longman@redhat.com, apkm@linux-foundation.org
Subject: Re: [PATCH] Workqueue lockup: Circular dependency in threads
Message-ID: <20170905132242.GA1774378@devbig577.frc2.facebook.com>
References: <1504101538-20075-1-git-send-email-prsood@codeaurora.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1480
Lines: 44

Hello,

On Thu, Aug 31, 2017 at 06:43:56PM +0530, Prateek Sood wrote:
> > 6) cpuset_mutex is acquired by task init:1 and is waiting for cpuhotplug lock.

Yeah, this is the problematic one.

> > We can reorder the sequence of locks as in the below diff to avoid this
> > deadlock. But I am looking for inputs/better solution to fix this deadlock.
> >
> > ---
> > diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> >  /**
> >   * update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
> >   * @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
> > @@ -930,7 +946,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
> >          rcu_read_unlock();
> >
> >          if (need_rebuild_sched_domains)
> > -                rebuild_sched_domains_locked();
> > +                rebuild_sched_domains_unlocked()(without taking cpuhotplug.lock)
> >  }
> >
> >  /**
> > @@ -1719,6 +1735,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
> > +        get_online_cpus();
> >          mutex_lock(&cpuset_mutex);
> >          if (!is_cpuset_online(cs))
> >                  goto out_unlock;
> > @@ -1744,6 +1761,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
> >          mutex_unlock(&cpuset_mutex);
> > +        put_online_cpus();
> >          kernfs_unbreak_active_protection(of->kn);
> >          css_put(&cs->css);
> >          flush_workqueue(cpuset_migrate_mm_wq);
> >

And the patch looks good to me.  Can you please format the patch with
proper description and sob?

Thanks.

-- 
tejun
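
For illustration only, the lock-ordering idea in the quoted diff can be
sketched in userspace C with pthreads. This is not kernel code and not the
patch itself. hotplug_lock and cpuset_lock are hypothetical stand-ins for
the CPU hotplug lock and cpuset_mutex, and write_resmask(), hotplug_event(),
rebuild_domains_locked() and run_demo() are made-up names. The assumption is
that every path takes the hotplug lock first and the cpuset lock second, and
that the rebuild helper takes no locks of its own, so no thread ever waits
for the hotplug lock while holding the cpuset lock.

```c
#include <assert.h>
#include <pthread.h>

enum { ITERATIONS = 1000 };

/* Hypothetical stand-ins for the CPU hotplug lock and cpuset_mutex. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cpuset_lock = PTHREAD_MUTEX_INITIALIZER;
static int rebuild_count;

/* Caller must already hold both locks; this helper takes neither. */
static void rebuild_domains_locked(void)
{
    rebuild_count++;
}

/* Write path: hotplug lock first, then the cpuset lock. */
static void *write_resmask(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&hotplug_lock);
        pthread_mutex_lock(&cpuset_lock);
        rebuild_domains_locked();
        pthread_mutex_unlock(&cpuset_lock);
        pthread_mutex_unlock(&hotplug_lock);
    }
    return NULL;
}

/* Hotplug path: same order, so the two paths cannot deadlock each other. */
static void *hotplug_event(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&hotplug_lock);
        pthread_mutex_lock(&cpuset_lock);
        rebuild_domains_locked();
        pthread_mutex_unlock(&cpuset_lock);
        pthread_mutex_unlock(&hotplug_lock);
    }
    return NULL;
}

/* Runs both paths concurrently and returns the number of rebuilds done. */
static int run_demo(void)
{
    pthread_t writer, hotplug;
    int rc;

    rebuild_count = 0;
    rc = pthread_create(&writer, NULL, write_resmask, NULL);
    assert(rc == 0);
    rc = pthread_create(&hotplug, NULL, hotplug_event, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(writer, NULL);
    pthread_join(hotplug, NULL);
    return rebuild_count;
}
```

Calling run_demo() from a main() should return 2 * ITERATIONS once both
threads finish. Swapping the two lock calls in only one of the paths would
reintroduce the kind of inversion described in step 6 of the report.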