Date: Wed, 3 Dec 2014 22:48:44 +0100 (CET)
From: Thomas Gleixner
To: Dave Jones
cc: John Stultz, Linus Torvalds, Chris Mason, Mike Galbraith, Ingo Molnar,
    Peter Zijlstra, Dâniel Fraga, Sasha Levin, "Paul E. McKenney",
    Linux Kernel Mailing List
Subject: Re: frequent lockups in 3.18rc4
In-Reply-To: <20141203210504.GA6361@redhat.com>

On Wed, 3 Dec 2014, Dave Jones wrote:
> On Wed, Dec 03, 2014 at 09:59:20PM +0100, Thomas Gleixner wrote:
> > Can you please provide the cpuinfo flags of that box?
>
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
> syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good
> nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64
> monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1
> sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
> lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept
> vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm
> xsaveopt

So that has nonstop_tsc and constant_tsc, which means that we switch
to sched_clock_stable, i.e. no range checks, nothing. We just take the
raw value and use it.

The clocksource code is a bit more paranoid and lets the TSC be
monitored by the watchdog. Now, if the TSC is detected as unstable we
should switch back to sched_clock_unstable, but we don't have a
mechanism for that.

That was obviously not considered when the sched_clock_stable stuff
was introduced. So sched_clock() happily uses TSC as a reliable thing
even when the clocksource code detected that it is crap.

For sure we need something here, but that sched_clock_stable mechanism
got introduced in 3.14, so it does not make any sense that you observe
that only post 3.16.

Thanks,

	tglx
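
For anyone who wants to check another box for the same combination,
here is a rough userspace sketch in Python. It is not kernel code and
not part of any fix. The helper names (parse_flags, tsc_looks_stable,
mismatch) are made up for illustration; the only real interfaces it
reads are /proc/cpuinfo and the clocksource sysfs file. It reports
whether both TSC flags are set and which clocksource is selected. It
cannot see what sched_clock() itself is doing, and a non-tsc
clocksource may just be a boot-time or admin choice rather than a
watchdog demotion.

```python
#!/usr/bin/env python3
"""Rough userspace check, not kernel code; helper names are illustrative."""

CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def parse_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tsc_looks_stable(flags):
    """True when both constant_tsc and nonstop_tsc are advertised."""
    return {"constant_tsc", "nonstop_tsc"} <= flags


def mismatch(flags, clocksource):
    """True when the flags promise a stable TSC but 'tsc' is not selected.

    Assumption: on such a box a non-tsc clocksource may mean the watchdog
    demoted the TSC, but it can also be a deliberate configuration.
    """
    return tsc_looks_stable(flags) and clocksource != "tsc"


def read_text(path):
    """Read a file, returning an empty string if it is not available."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    flags = parse_flags(read_text("/proc/cpuinfo"))
    source = read_text(CLOCKSOURCE_PATH).strip()
    print("constant_tsc + nonstop_tsc:", tsc_looks_stable(flags))
    print("current clocksource:", source or "unknown")
    if source and mismatch(flags, source):
        print("TSC flags look stable, but the clocksource is not tsc")
```

If the last message shows up, dmesg is the place to look for the
clocksource watchdog's own explanation.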