Date: Sat, 27 Dec 2014 11:48:48 -0500
From: Dave Jones
To: Linus Torvalds
Cc: Thomas Gleixner, Chris Mason, Mike Galbraith, Ingo Molnar, Peter Zijlstra, Dâniel Fraga, Sasha Levin, "Paul E. McKenney", Linux Kernel Mailing List, Suresh Siddha, Oleg Nesterov, Peter Anvin, John Stultz
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141227164848.GA13844@codemonkey.org.uk>

On Fri, Dec 26,
2014 at 07:14:55PM -0800, Linus Torvalds wrote:
 > On Fri, Dec 26, 2014 at 4:36 PM, Dave Jones wrote:
 > > >
 > > > Oh - and have you actually seen the "TSC unstable (delta = xyz)" +
 > > > "switched to hpet" messages there yet?
 > >
 > > not yet. 3 hrs in.
 >
 > Ok, so then the
 >
 >     INFO: rcu_preempt detected stalls on CPUs/tasks:
 >
 > has nothing to do with HPET, since you'd still be running with the TSC enabled.

right. 16hrs later, that's the only thing that's spewed.

 > My googling around did find a number of "machine locks up a few hours
 > after switching to hpet" reports, so it is possible that the whole rcu
 > stall and nmi watchdog thing is independent and unrelated to the
 > actual locking up.

possible. I'm heading home in a few hours to start the wipe of that box.
This is going to be 'the one that got away', but at least we've managed
to find a number of other things that needed fixing along the way.

	Dave