Date: Mon, 23 Feb 2015 11:32:55 -0500 (EST)
From: Nicolas Pitre
To: Peter Zijlstra
cc: linux-kernel@vger.kernel.org, mingo@kernel.org, rjw@rjwysocki.net, tglx@linutronix.de, Preeti U Murthy
Subject: Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

On Mon, 23 Feb 2015, Peter Zijlstra wrote:

> In any case, having had a second look I think I might have some ideas:
>
>  - bL_switcher_enable() -- enables the whole switcher thing and
>    disables half the cpus with hot-un-plug, creates a mapping etc..
>
>  - bL_switcher_disable() -- disabled the whole switcher thing and
>    gives us back all our cpus with hot-plug.
>
> When the switcher is enabled; we switch by this magic cpu_suspend() call
> that saves the entire cpu state and allows you to restore it on another
> cpu.
>
> You muck about with the tick; you disable it before cpu_suspend() and
> re-enable it after on the target cpu. You further reprogram the
> interrupt routing from the old to the new cpu.
>
> But that appears to be it, no more.

Exact.

> I suppose the tick is special because its the only per-cpu device?

Right.

> The reported function that fails: bL_switcher_restore_cpus() is called
> in the error paths of the former and the main path in the latter to make
> the 'stolen' cpus re-appear.
>
> The patch in question somehow makes that go boom.
>
>
> Now what all do you need to do to make it go boom? Just enable/disable
> the switcher once and it'll explode? Or does it need to do actual
> switches while it is enabled?

It gets automatically enabled during boot. Then several switches are
performed while user space is brought up. If I manually disable it via
/sys then it goes boom.

> The place where it explodes is a bit surprising, it thinks hrtimers are
> not enabled even though its calling into hrtimer code on that cpu...