Date: Tue, 29 Aug 2017 21:34:16 +0200
From: Peter Zijlstra
To: Sebastian Andrzej Siewior
Cc: Borislav Petkov, Byungchul Park, Thomas Gleixner, lkml
Subject: Re: WARNING: possible circular locking dependency detected
Message-ID: <20170829193416.GC32112@worktop.programming.kicks-ass.net>
In-Reply-To: <20170825144755.ms2h2j2xe6gznnqi@linutronix.de>
References: <20170825100304.5cwrlrfwi7f3zcld@pd.tnic>
 <20170825144755.ms2h2j2xe6gznnqi@linutronix.de>

On Fri, Aug 25, 2017 at 04:47:55PM +0200, Sebastian Andrzej Siewior wrote:
> On 2017-08-25 12:03:04 [+0200], Borislav Petkov wrote:
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.13.0-rc6+ #1 Not tainted
> > ------------------------------------------------------
>
> While looking at this, I stumbled upon another one, also enabled by the
> "completion annotation" in the TIP tree:
>
> | ======================================================
> | WARNING: possible circular locking dependency detected
> | 4.13.0-rc6-00758-gd80d4177391f-dirty #112 Not tainted
> | ------------------------------------------------------
> | cpu-off.sh/426 is trying to acquire lock:
> |  ((complete)&st->done){+.+.}, at: [] takedown_cpu+0x84/0xf0
> |
> | but task is already holding lock:
> |  (sparse_irq_lock){+.+.}, at: [] irq_lock_sparse+0x12/0x20
> |
> | which lock already depends on the new lock.
> |
> | the existing dependency chain (in reverse order) is:
> |
> | -> #1 (sparse_irq_lock){+.+.}:
> |        __mutex_lock+0x88/0x9a0
> |        mutex_lock_nested+0x16/0x20
> |        irq_lock_sparse+0x12/0x20
> |        irq_affinity_online_cpu+0x13/0xd0
> |        cpuhp_invoke_callback+0x4a/0x130
> |
> | -> #0 ((complete)&st->done){+.+.}:
> |        check_prev_add+0x351/0x700
> |        __lock_acquire+0x114a/0x1220
> |        lock_acquire+0x47/0x70
> |        wait_for_completion+0x5c/0x180
> |        takedown_cpu+0x84/0xf0
> |        cpuhp_invoke_callback+0x4a/0x130
> |        cpuhp_down_callbacks+0x3d/0x80
> …
> |
> | other info that might help us debug this:
> |
> |  Possible unsafe locking scenario:
> |
> |        CPU0                    CPU1
> |        ----                    ----
> |   lock(sparse_irq_lock);
> |                                lock((complete)&st->done);
> |                                lock(sparse_irq_lock);
> |   lock((complete)&st->done);
> |
> |  *** DEADLOCK ***
>
> We hold the sparse_irq_lock lock while waiting for the completion in the
> CPU-down case, and in the CPU-up case we acquire the sparse_irq_lock lock
> while the other CPU is waiting for the completion. If my interpretation
> of lockdep here is correct, this is not a real deadlock.
>
> How do we annotate this?

The below is the nicest thing I could come up with. It results in
_cpu_down() and _cpu_up() having a different lockdep class for st->done.
It's a bit weird, and would probably need a wee comment to explain
things, but it boots and avoids the splat on hotplug.
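For reference, the mechanics that make this work (a simplified sketch of
the cross-release annotation, from memory; the exact completion.h text
may differ): with the completion annotations, init_completion() is a
macro that declares a static lock_class_key at its expansion site, so
every call site gets its own lockdep class:

	#define init_completion(x)					\
	do {								\
		/* one static key per expansion => one class per call site */ \
		static struct lock_class_key __key;			\
		lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
				"(complete)" #x, &__key, 0);		\
		__init_completion(x);					\
	} while (0)

Moving the init_completion() calls out of cpuhp_create() and into
_cpu_down()/_cpu_up() therefore splits st->done into two classes, one
per path, and the cross-path chain above no longer closes into a cycle.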
---
 kernel/cpu.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index acf5308fad51..d93df21c5cfb 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -375,13 +375,6 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
 /*
  * The cpu hotplug threads manage the bringup and teardown of the cpus
  */
-static void cpuhp_create(unsigned int cpu)
-{
-	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
-
-	init_completion(&st->done);
-}
-
 static int cpuhp_should_run(unsigned int cpu)
 {
 	struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
@@ -520,7 +513,6 @@ static int cpuhp_kick_ap_work(unsigned int cpu)
 
 static struct smp_hotplug_thread cpuhp_threads = {
 	.store			= &cpuhp_state.thread,
-	.create			= &cpuhp_create,
 	.thread_should_run	= cpuhp_should_run,
 	.thread_fn		= cpuhp_thread_fun,
 	.thread_comm		= "cpuhp/%u",
@@ -687,7 +679,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 			   enum cpuhp_state target)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
-	int prev_state, ret = 0;
+	int i, prev_state, ret = 0;
 
 	if (num_online_cpus() == 1)
 		return -EBUSY;
@@ -697,6 +689,9 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 
 	cpus_write_lock();
 
+	for_each_possible_cpu(i)
+		init_completion(&per_cpu_ptr(&cpuhp_state, i)->done);
+
 	cpuhp_tasks_frozen = tasks_frozen;
 
 	prev_state = st->state;
@@ -802,10 +797,13 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
 	struct task_struct *idle;
-	int ret = 0;
+	int i, ret = 0;
 
 	cpus_write_lock();
 
+	for_each_possible_cpu(i)
+		init_completion(&per_cpu_ptr(&cpuhp_state, i)->done);
+
 	if (!cpu_present(cpu)) {
 		ret = -EINVAL;
 		goto out;
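Note that this re-initialises the completion of every possible CPU on
each hotplug operation. That should be safe, if I'm not mistaken:
cpus_write_lock() serialises hotplug, so nobody can still be blocked in
wait_for_completion() on any st->done by the time the loops run.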