Date: Fri, 26 Dec 2014 11:34:10 -0500
From: Dave Jones
To: Linus Torvalds, Thomas Gleixner, Chris Mason, Mike Galbraith,
    Ingo Molnar, Peter Zijlstra, Dâniel Fraga, Sasha Levin,
    "Paul E. McKenney", Linux Kernel Mailing List, Suresh Siddha,
    Oleg Nesterov, Peter Anvin, John Stultz
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141226163410.GA25161@codemonkey.org.uk>
References: <20141221223204.GA9618@codemonkey.org.uk>
    <20141222225725.GA8140@codemonkey.org.uk>
    <20141224030125.GA8725@codemonkey.org.uk>
In-Reply-To: <20141224030125.GA8725@codemonkey.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 23, 2014 at 10:01:25PM -0500, Dave Jones wrote:
 > On Mon, Dec 22, 2014 at 03:59:19PM -0800,
Linus Torvalds wrote:
 > >
 > > But in the meantime please do keep that thing running as long as you
 > > can. Let's see if we get bigger jumps. Or perhaps we'll get a negative
 > > result - the original softlockup bug happening *without* any bigger
 > > hpet jumps.
 >
 > So I've got this box a *little* longer than anticipated.
 > It's now been running 30 hours with not a single NMI lockup.
 > and that's with my kitchen-sink debugging kernel.
 >
 > The 'hpet off' messages continue to be spewed, and again they're
 > all in the same range of 4293198075 -> 4294967266

In case there was any doubt remaining, it's now been running
3 days, 20 hours with no lockups at all. I haven't seen it
run this long in months.

Either tomorrow or Sunday I'm finally wiping that box
to give it back on Monday, so if there's anything else you'd like
to try, the next 24hrs are pretty much the only remaining time I have.

One thing I think I'll try is to narrow down which syscalls
are triggering those "Clocksource hpet had cycles off"
messages. I'm still unclear on exactly what is doing the stomping
on the hpet.

	Dave
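
A note on the quoted range 4293198075 -> 4294967266: both endpoints sit just below 2^32 (4294967296). If the reported value is read as a signed 32-bit delta (an assumption; the message does not say how the kernel computed it), the range corresponds to small negative numbers, i.e. the counter appearing to step slightly backwards. A minimal sketch of that arithmetic, where `as_signed32` is a hypothetical helper and not anything from the kernel:

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as two's-complement signed.

    Hypothetical helper for illustration only; it assumes the logged
    'cycles off' number is a 32-bit counter delta.
    """
    value &= 0xFFFFFFFF          # keep only the low 32 bits
    if value & 0x80000000:       # top bit set means negative in two's complement
        return value - (1 << 32)
    return value


# Endpoints of the range quoted in the message.
low, high = 4293198075, 4294967266

for v in (low, high):
    print(v, "->", as_signed32(v))
```

Under that assumption the range is -1769221 to -30 cycles, which would fit a counter that is occasionally set back by a small amount rather than one jumping far forward.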