Date: Tue, 30 Jul 2013 21:38:38 +0200
From: Borislav Petkov
To: Vince Weaver
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Paul Mackerras,
    Ingo Molnar, Arnaldo Carvalho de Melo, trinity@vger.kernel.org
Subject: Re: perf : fuzzer-related NMI lockup
Message-ID: <20130730193838.GC23299@pd.tnic>

On Tue, Jul 30, 2013 at 03:01:27PM -0400, Vince Weaver wrote:
> Hello
>
> so my perf_fuzzer has been causing problems again.
>
> After running a while all login shells on the system (even unrelated
> local ones) get killed. Nothing is logged when this happens and it
> doesn't appear to be OOM related.
>
> In an attempt to find out what was going on I ran the fuzzer with "nohup"
> which led to the following NMI lockup which looks perf related. The
> system became unusable after this.
>
> The first WARNING is I think a known issue but I'm including it in the
> dump in case it is related. It's the NMI lockup that is the problem.
>
> There was possibly some sort of RCU message printed to the screen also
> that didn't make it to the logs but I wasn't able to write it down in
> time.
>
> This is on a recent ivybridge mac-mini running 3.11-rc3
>
> Jul 30 11:08:28 mac-mini kernel: [ 651.209212] hrtimer: interrupt took 1152 ns
> Jul 30 11:08:50 mac-mini kernel: [ 673.441360] perf samples too long (2557 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> Jul 30 11:08:58 mac-mini kernel: [ 680.886547] perf samples too long (5003 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> Jul 30 11:08:58 mac-mini kernel: [ 681.401917] perf samples too long (10002 > 10000), lowering kernel.perf_event_max_sample_rate to 12500

Interesting, saw a similar thing today while running

perf top --stdio -a

[47314.677201] perf samples too long (2505 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[47314.686347] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.148 msecs
[47315.946675] perf samples too long (5009 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[47315.955825] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.154 msecs
[47391.116117] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[47391.122034] Do you have a strange power saving mode enabled?
[47391.127731] Dazed and confused, but trying to continue
[53627.692616] Uhhuh. NMI received for unknown reason 31 on CPU 0.
[53627.698547] Do you have a strange power saving mode enabled?
[53627.704202] Dazed and confused, but trying to continue
[64212.289657] usb 1-1.2: USB disconnect, device number 4

along with strange "forgotten" NMIs firing later. Machine is still
running normally after that though.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--