Date: Wed, 30 Aug 2017 07:47:30 +0200
From: Peter Zijlstra
To: Thomas Gleixner
Cc: Borislav Petkov, Sebastian Andrzej Siewior, lkml
Subject: Re: WARNING: possible circular locking dependency detected
Message-ID: <20170830054730.GF32112@worktop.programming.kicks-ass.net>
References: <20170825100304.5cwrlrfwi7f3zcld@pd.tnic>
 <20170828145808.btuqpe2bvxymljyg@hirez.programming.kicks-ass.net>
 <20170829194948.GD32112@worktop.programming.kicks-ass.net>

On Tue, Aug 29, 2017 at 10:10:37PM +0200, Thomas Gleixner wrote:
> On Tue, 29 Aug 2017, Peter Zijlstra wrote:
> > So I have a patch _somewhere_ that preserves the event<->cpu relation
> > across hotplug, so that disable/enable would be sufficient. If you want
> > I can try and dig that out and make it work again.
> >
> > That would avoid having to do the destroy/create cycle of the watchdog
> > events.
>
> Yes, that would solve the x86_release_hw() issue, but lots of the other
> rework is still required in one way or the other.
>
> I'm currently trying to avoid that extra lock mess in the cpu hotplug
> code, which would just open the door for everybody to add his extra
> locks there, so we end up taking a gazillion locks before we can
> hotplug :)
>
> I think I have an idea how to solve that cleanly, but certainly your
> offer of preserving the event<->cpu relation across hotplug would help
> tremendously.

I think something like the below ought to work. Compile tested only.

On offline it basically does perf_event_disable() for all CPU context
events, and then adds HOTPLUG_OFFSET (-32) to arrive at:

  OFF + HOTPLUG_OFFSET = -33

That's smaller than ERROR and thus perf_event_enable() no-ops on events
for offline CPUs (maybe we should try and plumb an error return for
IOC_ENABLE).

On online we subtract the HOTPLUG_OFFSET again and the event becomes a
regular OFF, after which perf_event_enable() should work again.
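To make the arithmetic concrete, here is a stand-alone user-space sketch
(illustration only, not kernel code: ERROR = -2 is taken from the full
enum in <linux/perf_event.h>, and the check is modeled on the early bail
in _perf_event_enable()):

/*
 * User-space sketch of the HOTPLUG_OFFSET state arithmetic.
 * Illustration only: ERROR = -2 comes from the full enum in
 * <linux/perf_event.h>; only OFF/INACTIVE/ACTIVE appear in the
 * hunk below.
 */
#include <stdio.h>

enum perf_event_active_state {
	PERF_EVENT_STATE_ERROR		= -2,
	PERF_EVENT_STATE_OFF		= -1,
	PERF_EVENT_STATE_INACTIVE	=  0,
	PERF_EVENT_STATE_ACTIVE		=  1,

	PERF_EVENT_STATE_HOTPLUG_OFFSET	= -32,
};

/* modeled on the early bail in _perf_event_enable() */
static int enable_noops(int state)
{
	return state >= PERF_EVENT_STATE_INACTIVE ||
	       state <  PERF_EVENT_STATE_ERROR;
}

int main(void)
{
	int state = PERF_EVENT_STATE_OFF;

	/* offline: disable, then shift below ERROR */
	state += PERF_EVENT_STATE_HOTPLUG_OFFSET;
	printf("offline: state=%d, enable no-ops: %d\n",
	       state, enable_noops(state));

	/* online: shift back, event is a regular OFF again */
	state -= PERF_EVENT_STATE_HOTPLUG_OFFSET;
	printf("online:  state=%d, enable no-ops: %d\n",
	       state, enable_noops(state));

	return 0;
}

This prints state=-33 with enable no-opping for the offline case, and
state=-1 (plain OFF) with enable working again for the online case.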
---
 include/linux/perf_event.h |    2 ++
 kernel/events/core.c       |   51 +++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 39 insertions(+), 14 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9bac4bfa5e1a..7b39ceeb206b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -497,6 +497,8 @@ enum perf_event_active_state {
 	PERF_EVENT_STATE_OFF		= -1,
 	PERF_EVENT_STATE_INACTIVE	=  0,
 	PERF_EVENT_STATE_ACTIVE		=  1,
+
+	PERF_EVENT_STATE_HOTPLUG_OFFSET	= -32,
 };
 
 struct file;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f77c97477e08..b277c27fd81e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11025,19 +11025,21 @@ void perf_swevent_init_cpu(unsigned int cpu)
 }
 
 #if defined CONFIG_HOTPLUG_CPU || defined CONFIG_KEXEC_CORE
-static void __perf_event_exit_context(void *__info)
+static void __perf_event_exit_cpu(void *__info)
 {
-	struct perf_event_context *ctx = __info;
-	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
+	struct perf_cpu_context *cpuctx = __info;
+	struct perf_event_context *ctx = &cpuctx->ctx;
 	struct perf_event *event;
 
 	raw_spin_lock(&ctx->lock);
-	list_for_each_entry(event, &ctx->event_list, event_entry)
-		__perf_remove_from_context(event, cpuctx, ctx, (void *)DETACH_GROUP);
+	list_for_each_entry(event, &ctx->event_list, event_entry) {
+		__perf_event_disable(event, cpuctx, ctx, NULL);
+		event->state += PERF_EVENT_STATE_HOTPLUG_OFFSET;
+	}
 	raw_spin_unlock(&ctx->lock);
 }
 
-static void perf_event_exit_cpu_context(int cpu)
+int perf_event_exit_cpu(unsigned int cpu)
 {
 	struct perf_cpu_context *cpuctx;
 	struct perf_event_context *ctx;
@@ -11049,17 +11051,43 @@ static void perf_event_exit_cpu_context(int cpu)
 		ctx = &cpuctx->ctx;
 
 		mutex_lock(&ctx->mutex);
-		smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1);
+		smp_call_function_single(cpu, __perf_event_exit_cpu, cpuctx, 1);
 		cpuctx->online = 0;
 		mutex_unlock(&ctx->mutex);
 	}
 	cpumask_clear_cpu(cpu, perf_online_mask);
 	mutex_unlock(&pmus_lock);
+
+	return 0;
+}
+
+static void __perf_event_init_cpu(void *__info)
+{
+	struct perf_cpu_context *cpuctx = __info;
+	struct perf_event_context *ctx = &cpuctx->ctx;
+	struct perf_event *event;
+
+	raw_spin_lock(&ctx->lock);
+	list_for_each_entry(event, &ctx->event_list, event_entry)
+		event->state -= PERF_EVENT_STATE_HOTPLUG_OFFSET;
+	raw_spin_unlock(&ctx->lock);
+}
+
+static void _perf_event_init_cpu(int cpu, struct perf_cpu_context *cpuctx)
+{
+	smp_call_function_single(cpu, __perf_event_init_cpu, cpuctx, 1);
 }
+
 #else
-static void perf_event_exit_cpu_context(int cpu) { }
+int perf_event_exit_cpu(unsigned int cpu)
+{
+	return 0;
+}
+static void _perf_event_init_cpu(int cpu, struct perf_cpu_context *cpuctx)
+{
+}
 #endif
 
 int perf_event_init_cpu(unsigned int cpu)
@@ -11078,6 +11106,7 @@ int perf_event_init_cpu(unsigned int cpu)
 
 		mutex_lock(&ctx->mutex);
 		cpuctx->online = 1;
+		_perf_event_init_cpu(cpu, cpuctx);
 		mutex_unlock(&ctx->mutex);
 	}
 	mutex_unlock(&pmus_lock);
@@ -11085,12 +11114,6 @@ int perf_event_init_cpu(unsigned int cpu)
 	return 0;
 }
 
-int perf_event_exit_cpu(unsigned int cpu)
-{
-	perf_event_exit_cpu_context(cpu);
-	return 0;
-}
-
 static int perf_reboot(struct notifier_block *notifier,
 		       unsigned long val, void *v)
 {