Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750911AbdHaH4D (ORCPT ); Thu, 31 Aug 2017 03:56:03 -0400 Received: from Galois.linutronix.de ([146.0.238.70]:59088 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750733AbdHaH4C (ORCPT ); Thu, 31 Aug 2017 03:56:02 -0400 Date: Thu, 31 Aug 2017 09:55:57 +0200 (CEST) From: Thomas Gleixner To: Peter Zijlstra cc: Borislav Petkov , Sebastian Andrzej Siewior , lkml Subject: Re: WARNING: possible circular locking dependency detected In-Reply-To: <20170831073739.ytexc7omldyb5lgy@hirez.programming.kicks-ass.net> Message-ID: References: <20170825100304.5cwrlrfwi7f3zcld@pd.tnic> <20170828145808.btuqpe2bvxymljyg@hirez.programming.kicks-ass.net> <20170829194948.GD32112@worktop.programming.kicks-ass.net> <20170830054730.GF32112@worktop.programming.kicks-ass.net> <20170831073739.ytexc7omldyb5lgy@hirez.programming.kicks-ass.net> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1986 Lines: 55 On Thu, 31 Aug 2017, Peter Zijlstra wrote: > On Thu, Aug 31, 2017 at 09:08:05AM +0200, Thomas Gleixner wrote: > > On Wed, 30 Aug 2017, Peter Zijlstra wrote: > > > On offline it basically does perf_event_disable() for all CPU context > > > events, and then adds HOTPLUG_OFFSET (-32) to arrive at: OFF + > > > HOTPLUG_OFFSET = -33. > > > > > > That's smaller than ERROR and thus perf_event_enable() no-ops on events > > > for offline CPUs (maybe we should try and plumb an error return for > > > IOC_ENABLE). > > > > > > On online we subtract the HOTPLUG_OFFSET again and the event becomes a > > > regular OFF, after which perf_event_enable() should work again. > > > > I haven't come around to test that as I was busy cleaning up the unholy > > mess in the watchdog code. > > > > One other thing I stumbled over is: > > > > perf_event_create() > > .... > > x86_hw_reserve(event) > > > > if (__x86_pmu_event_init(event) < 0) > > event->destroy(event); > > x86_hw_release() > > .... > > cpus_read_lock(); > > > > If that happens from a hotplug function, we are doomed. > > > > I mean, that particular watchdog event won't fail if the watchdog code > > would verify that already at init time (which it does soon), but in general > > event creation during hotplug is dangerous. > > Arghh!!! > > And allowing us to create events for offline CPUs (possible I think, but > maybe slightly tricky) won't solve that, because we're already holding > the hotplug_lock during PREPARE. There are two ways to cure that: 1) Have a pre cpus_write_lock() stage which is serialized via cpus_add_remove_lock, which is the outer lock for hotplug. There we can sanely create stuff and fail with all consequences. 2) Have some deferred mechanism, which is destroying the event after failure, but that might be tricky as RCU and workqueues might end up being flushed during hotplug, which creates the same mess again. Thanks, tglx