Date: Fri, 26 Dec 2014 19:14:55 -0800
Subject: Re: frequent lockups in 3.18rc4
From: Linus Torvalds
To: Dave Jones, Linus Torvalds, Thomas Gleixner, Chris Mason,
    Mike Galbraith, Ingo Molnar, Peter Zijlstra, Dâniel Fraga,
    Sasha Levin, "Paul E. McKenney", Linux Kernel Mailing List,
    Suresh Siddha, Oleg Nesterov, Peter Anvin, John Stultz
In-Reply-To: <20141227003636.GA32271@codemonkey.org.uk>
References: <20141222225725.GA8140@codemonkey.org.uk>
    <20141224030125.GA8725@codemonkey.org.uk>
    <20141226163410.GA25161@codemonkey.org.uk>
    <20141226181204.GA26527@codemonkey.org.uk>
    <20141226225744.GA30955@codemonkey.org.uk>
    <20141227003636.GA32271@codemonkey.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 26, 2014 at 4:36 PM, Dave Jones wrote:
> >
> > Oh - and have you actually seen the "TSC unstable (delta = xyz)" +
> > "switched to hpet" messages there yet?
>
> not yet. 3 hrs in.

Ok, so then the

    INFO: rcu_preempt detected stalls on CPUs/tasks:

has nothing to do with HPET, since you'd still be running with the TSC
enabled.

My googling around did find a number of "machine locks up a few hours
after switching to hpet" reports, so it is possible that the whole rcu
stall and nmi watchdog thing is independent and unrelated to the
actual locking up.

It *is* intriguing that my broken patch seemed to prevent it from
happening, though. And both NMI watchdogs and the rcu stall are
related to wall-clock time.
But hey, maybe there really is some odd loop in the kernel that stops
scheduling or RCU grace periods. It just seems to be never caught by
your backtraces..

                 Linus