Subject: Re: [PATCH] cgroup/cpuset: remove circular dependency deadlock
To: Boqun Feng, Peter Zijlstra
References: <1504764252-29091-1-git-send-email-prsood@codeaurora.org> <20170907072848.2sjjddwincaeplju@hirez.programming.kicks-ass.net> <20170907085534.GA30135@tardis>
Cc: tj@kernel.org, lizefan@huawei.com, cgroups@vger.kernel.org, mingo@kernel.org, longman@redhat.com, linux-kernel@vger.kernel.org, sramana@codeaurora.org, Thomas Gleixner
From: Prateek Sood
Message-ID: <88abfcfd-3d85-b903-e404-d101c34613a8@codeaurora.org>
Date: Thu, 7 Sep 2017 14:37:15 +0530
In-Reply-To: <20170907085534.GA30135@tardis>

On 09/07/2017 02:26 PM, Boqun Feng wrote:
> On Thu, Sep 07, 2017 at 09:28:48AM +0200, Peter Zijlstra wrote:
>> On Thu, Sep 07, 2017 at 11:34:12AM +0530, Prateek Sood wrote:
>>> Remove circular dependency deadlock in a scenario where hotplug of CPU is
>>> being done while there is updation in cgroup and cpuset triggered from
>>> userspace.
>>>
>>> Example scenario:
>>> kworker/0:0 => kthreadd => init:729 => init:1 => kworker/0:0
>>>
>>> kworker/0:0 - percpu_down_write(&cpu_hotplug_lock) [held]
>>>               flush(work) [no high prio workqueue available on CPU]
>>>               wait_for_completion()
>
> Hi Prateek,
>
> so this is:
>
>   _cpu_down():
>     cpus_write_lock(); // percpu_down_write(&cpu_hotplug_lock)
>     cpuhp_invoke_callbacks():
>       workqueue_offline_cpu():
>         wq_update_unbound_numa():
>           alloc_unbound_pool():
>             get_unbound_pool():
>               create_worker():
>                 kthread_create_on_node():
>                   wake_up_process(kthreadd_task);
>                   wait_for_completion(); // create->done
>
> , right?
>
> Wonder running in a kworker is necessary to trigger this, I mean running
> a cpu_down() in a normal process context could also trigger this, no?
> Just ask out of curiosity.
>
> Regards,
> Boqun

Hi Boqun,

Yes, cpu_down() called from normal process context can also trigger this;
running in a kworker is not required.

Regards
Prateek
>
>>>
>>> kthreadd - percpu_down_read(cgroup_threadgroup_rwsem) [waiting]
>>>
>>> init:729 - percpu_down_write(cgroup_threadgroup_rwsem) [held]
>>>            lock(cpuset_mutex) [waiting]
>>>
>>> init:1 - lock(cpuset_mutex) [held]
>>>          percpu_down_read(&cpu_hotplug_lock) [waiting]
>>
>> That's both unreadable and useless :/ You want to tell what code paths
>> that were, not which random tasks happened to run them.
>>
>>
> [...]
>

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.,
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
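
The four-task scenario quoted in this thread can be read as a wait-for graph: each task holds one resource and waits on another, and the deadlock is a cycle in that graph. The sketch below is illustrative only and is not kernel code. The `tasks` table and the `find_cycle` helper are hypothetical names, and treating the kthread creation completion as a resource "held" by kthreadd is a modelling assumption (kthreadd is the task that must make progress for it to complete).

```python
# Illustrative model only, not kernel code. Each task maps to the resource it
# holds and the resource it waits for, as listed in the quoted scenario.
# Assumption: the create->done completion is modelled as "held" by kthreadd.
tasks = {
    "kworker/0:0": {"holds": "cpu_hotplug_lock", "waits": "create->done"},
    "kthreadd": {"holds": "create->done", "waits": "cgroup_threadgroup_rwsem"},
    "init:729": {"holds": "cgroup_threadgroup_rwsem", "waits": "cpuset_mutex"},
    "init:1": {"holds": "cpuset_mutex", "waits": "cpu_hotplug_lock"},
}


def find_cycle(tasks, start):
    """Follow waits-for edges from `start`; return the cycle as a list, or None."""
    # Map each resource to the task that currently holds it.
    owner = {info["holds"]: name for name, info in tasks.items()}
    path = []
    current = start
    while current is not None and current not in path:
        path.append(current)
        # The next task is whoever holds the resource `current` waits on.
        current = owner.get(tasks[current]["waits"])
    if current is None:
        return None  # the chain ended at an unowned resource: no deadlock
    # Close the loop by repeating the task where the cycle starts.
    return path[path.index(current):] + [current]


cycle = find_cycle(tasks, "kworker/0:0")
print(" => ".join(cycle) if cycle else "no cycle")
```

Removing any one edge makes `find_cycle` return `None`, for example if init:1 did not need `cpu_hotplug_lock` while holding `cpuset_mutex`. Breaking the circular dependency in this sense is what the patch sets out to do.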