Date: Tue, 17 Sep 2013 12:33:03 -0400
From: Mathieu Desnoyers
To: Ingo Molnar
Cc: hpa@zytor.com, linux-kernel@vger.kernel.org,
    gerlando.falauto@keymile.com, john.stultz@linaro.org, minggr@gmail.com,
    tglx@linutronix.de, linux-tip-commits@vger.kernel.org,
    lttng-dev@lists.lttng.org
Subject: Re: [tip:timers/urgent] timekeeping: Fix HRTICK related deadlock from ntp lock changes
Message-ID: <20130917163303.GA10491@Krystal>
References: <1378943457-27314-1-git-send-email-john.stultz@linaro.org>
    <20130916160426.GA24669@Krystal> <20130917070728.GC20661@gmail.com>
    <20130917080941.GA4675@Krystal> <20130917082649.GE20661@gmail.com>
In-Reply-To: <20130917082649.GE20661@gmail.com>

* Ingo Molnar (mingo@kernel.org) wrote:
>
> * Mathieu Desnoyers wrote:
>
> > * Ingo Molnar (mingo@kernel.org) wrote:
> > >
> > > * Mathieu Desnoyers wrote:
> > >
> > > > Hi Ingo,
> > > >
> > > > Do you have an estimate of the time it will take for this fix to hit
> > > > mainline, stable-3.10 and stable-3.11 ? Meanwhile, I'm marking 3.10 and
> > > > 3.11 as broken for LTTng with a kernel version at compile-time, since
> > > > this kernel regression currently triggers hard system lockup when people
> > > > use LTTng on those kernels, and this is certainly something nobody
> > > > wants.
> > >
> > > So, at least as per the description of John, this should only trigger if
> > > SCHED_HRTICK is enabled in sched_features - which is disabled by default,
> > > it's a debug-only development feature. Does the bug trigger on more
> > > regular kernels as well?
> >
> > Unfortunately, it does happen on a pretty standard kernel config (giving
> > my x230 config as example below). Pasting relevant bug description from
> > http://bugs.lttng.org/issues/631 :
> >
> > "Starting from Linux kernel commit
> > 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40 "timekeeping: Hold
> > timekeepering locks in do_adjtimex and hardpps" (3.10 kernels), the
> > xtime write seqlock is held across calls to __do_adjtimex(), which
> > includes a call to notify_cmos_timer(), and hence
> > schedule_delayed_work().
> >
> > This introduces a side-effect for a set of tracepoints, including mainly
> > the workqueue tracepoints: a tracer hooking on those tracepoints and
> > reading current time with ktime_get() will cause hard system LOCKUP"
>
> It's the LTTng tracepoint 'hooking' in something that does something
> invalid in that context that is causing the hang, not the vanilla kernel
> itself, right?

Yes, that's correct. In order to ensure this kind of problem is entirely
taken care of, I've started working on a synchronization scheme proposed
by Peter Zijlstra that would allow ktime_get() to be called from any
execution context (see:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg504089.html).

>
> In that case the 'you get to keep both pieces' policy of out of tree code
> applies - but the HRTICK fix should solve your problem as well,
> incidentally.

Thanks,

Mathieu

>
> Thanks,
>
> 	Ingo

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com