Date: Mon, 31 Aug 2015 15:55:45 -0500
From: Alex Thorlton
To: Peter Zijlstra
Cc: Alex Thorlton, linux-kernel@vger.kernel.org, Ingo Molnar, John Stultz, Thomas Gleixner, Russ Anderson, Dimitri Sivanich
Subject: Re: [BUG] Boot hangs at clocksource_done_booting on large configs
Message-ID: <20150831205545.GT20615@asylum.americas.sgi.com>
References: <20150831180432.GQ20615@asylum.americas.sgi.com> <20150831203250.GH16853@twins.programming.kicks-ass.net>
In-Reply-To: <20150831203250.GH16853@twins.programming.kicks-ass.net>

On Mon, Aug 31, 2015 at 10:32:50PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 31, 2015 at 01:04:33PM -0500, Alex Thorlton wrote:
> q
> > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> > index fd643d8..8502521 100644
> > --- a/kernel/stop_machine.c
> > +++ b/kernel/stop_machine.c
> > @@ -417,8 +417,11 @@ static void cpu_stopper_thread(unsigned int cpu)
> >  {
> >  	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
> >  	struct cpu_stop_work *work;
> > +	unsigned long flags;
> >  	int ret;
> >
> > +	local_irq_save(flags);
> > +
> >  repeat:
> >  	work = NULL;
> >  	spin_lock_irq(&stopper->lock);
> > @@ -452,6 +455,8 @@ repeat:
> >  		cpu_stop_signal_done(done, true);
> >  		goto repeat;
> >  	}
> > +
> > +	local_irq_restore(flags);
> >  }
> >
>
> So I should probably just go sleep and not say anything.. _but_
> *confused*.
>
> That local_irq_save() will disable IRQs over:
>
> 	work = NULL;
>
> But that is _all_. The spin_unlock_irq() will re-enable IRQs, after
> which things will run as usual.
>
> That local_irq_restore() is a total NOP, IRQs are guaranteed enabled at
> the local_irq_save() (otherwise lockdep would've complained about
> spin_unlock_irq() unconditionally enabling them) and by the time we get
> to the restore that same unlock_irq will have enabled them already.

Ahh, right.  Well that code is worthless then :)

Either way though, I guess that means that slight change just fudged the
timing enough in that area to avoid the lockup we're seeing.

Ignoring my useless code change, does anything else jump out at you as
interesting, here?
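
For anyone following along, here is a rough user-space model of the point
above.  It is only a sketch, not kernel code: the IRQ state is a plain
bool, the model_*() helpers and struct irq_trace are made-up names, the
lock itself is left out, and nr_works just stands in for however many
items happen to be queued.  It walks the shape of cpu_stopper_thread()
with and without the patch and records whether IRQs are enabled at each
point of interest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the CPU's interrupt-enable flag (assumption, not kernel state). */
static bool irqs_enabled = true;

/* Model of local_irq_save(): remember the current state, then disable. */
static void model_local_irq_save(bool *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
}

/* Model of local_irq_restore(): put back whatever was saved. */
static void model_local_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

/* Model of spin_lock_irq(): disables IRQs unconditionally (lock omitted). */
static void model_spin_lock_irq(void)
{
	irqs_enabled = false;
}

/* Model of spin_unlock_irq(): enables IRQs unconditionally, ignoring prior state. */
static void model_spin_unlock_irq(void)
{
	irqs_enabled = true;
}

/* IRQ-enabled state observed at each point of interest. */
struct irq_trace {
	bool first_work_null;	/* at "work = NULL" on the first pass */
	bool later_work_null;	/* at "work = NULL" on later passes */
	bool during_work;	/* while a work function would run */
	bool before_restore;	/* just before the added restore */
	bool at_exit;		/* when the function returns */
};

/* Hypothetical walk through the loop in cpu_stopper_thread(). */
struct irq_trace model_stopper_thread(bool patched, int nr_works)
{
	struct irq_trace t = { true, true, true, true, true };
	bool flags = true;
	int pass = 0;

	irqs_enabled = true;		/* entered with IRQs on */
	if (patched)
		model_local_irq_save(&flags);

	for (;;) {
		/* "repeat: work = NULL;" sits here */
		if (pass == 0)
			t.first_work_null = irqs_enabled;
		else
			t.later_work_null = irqs_enabled;

		model_spin_lock_irq();
		bool have_work = pass < nr_works;	/* dequeue under the lock */
		model_spin_unlock_irq();

		if (!have_work)
			break;
		t.during_work = irqs_enabled;	/* work function runs here */
		pass++;				/* "goto repeat" */
	}

	t.before_restore = irqs_enabled;
	if (patched)
		model_local_irq_restore(flags);
	t.at_exit = irqs_enabled;
	return t;
}

#ifdef MODEL_DEMO_MAIN
int main(void)
{
	struct irq_trace p = model_stopper_thread(true, 2);
	struct irq_trace u = model_stopper_thread(false, 2);

	printf("patched:   first=%d later=%d work=%d exit=%d\n",
	       p.first_work_null, p.later_work_null, p.during_work, p.at_exit);
	printf("unpatched: first=%d later=%d work=%d exit=%d\n",
	       u.first_work_null, u.later_work_null, u.during_work, u.at_exit);
	return 0;
}
#endif
```

Building with -DMODEL_DEMO_MAIN gives a small program that prints both
traces.  In this model the only place the patched and unpatched runs
differ is the very first "work = NULL", which fits the idea that the
change could only have shifted timing slightly.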