Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932627AbeAOMCc (ORCPT + 1 other);
        Mon, 15 Jan 2018 07:02:32 -0500
Received: from smtp.codeaurora.org ([198.145.29.96]:56038
        "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752404AbeAOMCa (ORCPT );
        Mon, 15 Jan 2018 07:02:30 -0500
DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org CD83E60854
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
        dmarc=none (p=none dis=none) header.from=codeaurora.org
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
        spf=none smtp.mailfrom=prsood@codeaurora.org
Subject: Re: [PATCH] cgroup/cpuset: fix circular locking dependency
To: Tejun Heo
Cc: Peter Zijlstra, avagin@gmail.com, mingo@kernel.org,
        linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
        sramana@codeaurora.org, "Paul E. McKenney"
References: <1511868946-23959-1-git-send-email-prsood@codeaurora.org>
        <623f214b-8b9a-f967-7a3d-ca9c06151267@codeaurora.org>
        <20171204202219.GF2421075@devbig577.frc2.facebook.com>
        <20171204225825.GP2421075@devbig577.frc2.facebook.com>
        <20171204230117.GF20227@worktop.programming.kicks-ass.net>
        <20171211152059.GH2421075@devbig577.frc2.facebook.com>
        <20171213160617.GQ3919388@devbig577.frc2.facebook.com>
        <9843d982-d201-8702-2e4e-0541a4d96b53@codeaurora.org>
        <20180102161656.GD3668920@devbig577.frc2.facebook.com>
From: Prateek Sood
Message-ID: <3c9b2a2d-ede4-1242-418a-353ec9f78db3@codeaurora.org>
Date: Mon, 15 Jan 2018 17:32:18 +0530
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
        Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <20180102161656.GD3668920@devbig577.frc2.facebook.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path:

On 01/02/2018 09:46 PM, Tejun Heo wrote:
> Hello,
>
> On Fri, Dec 29, 2017 at 02:07:16AM +0530, Prateek Sood wrote:
>>
>> task T is waiting for cpuset_mutex acquired
>> by kworker/2:1
>>
>> sh ==> cpuhp/2 ==> kworker/2:1 ==> sh
>>
>> kworker/2:3 ==> kthreadd ==> Task T ==> kworker/2:1
>>
>> It seems that my earlier patch set should fix this scenario:
>> 1) Inverting locking order of cpuset_mutex and cpu_hotplug_lock.
>> 2) Make cpuset hotplug work synchronous.
>>
>> Could you please share your feedback.
>
> Hmm... this can also be resolved by adding WQ_MEM_RECLAIM to the
> synchronize rcu workqueue, right?  Given the wide-spread usages of
> synchronize_rcu and friends, maybe that's the right solution, or at
> least something we also need to do, for this particular deadlock?
>
> Again, I don't have anything against making the domain rebuilding part
> of cpuset operations synchronous and these tricky deadlock scenarios
> do indicate that doing so would probably be beneficial.  That said,
> tho, these scenarios seem more of manifestations of other problems
> exposed through kthreadd dependency than anything else.
>
> Thanks.
>

Hi TJ,

Thanks for suggesting the WQ_MEM_RECLAIM solution.

My understanding of WQ_MEM_RECLAIM was that it needs to be used in
cases where memory pressure could cause deadlocks. This case does not
seem to be a memory pressure issue. Is overloading WQ_MEM_RECLAIM to
solve a different problem the correct approach?

This scenario can be resolved by using WQ_MEM_RECLAIM and a separate
workqueue for rcu. But the same kind of deadlock could recur in the
future if any cpu hotplug callback uses another predefined workqueue
that does not have the WQ_MEM_RECLAIM option.

Please let me know your feedback on this.

Thanks

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project
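
The two wait chains quoted in this message can be checked mechanically
for a cycle. The following is only a rough sketch, not kernel code: it
assumes that "A ==> B" means "A is blocked waiting on B", and the names
WAITS_FOR and find_cycle are made up for illustration.

```python
# Wait-for edges transcribed from the chains quoted in this thread.
# Assumption: "A ==> B" means "A is blocked waiting on B".
WAITS_FOR = {
    "sh": ["cpuhp/2"],
    "cpuhp/2": ["kworker/2:1"],
    "kworker/2:1": ["sh"],
    "kworker/2:3": ["kthreadd"],
    "kthreadd": ["Task T"],
    "Task T": ["kworker/2:1"],
}


def find_cycle(graph, start):
    """Depth-first walk of wait-for edges from start.

    Returns the first cycle found as a list of task names (first and
    last entries are the same task), or None if no cycle is reachable.
    """
    path = []        # tasks on the current DFS path, in order
    on_path = set()  # same tasks, for O(1) membership checks
    done = set()     # tasks fully explored without finding a cycle

    def visit(task):
        if task in on_path:
            # Reached a task already on the current path: a cycle.
            return path[path.index(task):] + [task]
        if task in done:
            return None
        on_path.add(task)
        path.append(task)
        for blocker in graph.get(task, []):
            cycle = visit(blocker)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(task)
        done.add(task)
        return None

    return visit(start)


if __name__ == "__main__":
    for task in ("sh", "kworker/2:3"):
        print(task, "->", find_cycle(WAITS_FOR, task))
```

Starting from kworker/2:3, the walk passes through kthreadd and Task T
and then enters the sh / cpuhp/2 / kworker/2:1 loop, so under the
assumption above the second chain is stuck behind the first one rather
than being an independent deadlock.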