From: Vince Weaver
Date: Fri, 8 May 2015 00:22:07 -0400 (EDT)
To: Ingo Molnar
Cc: Vince Weaver, linux-kernel@vger.kernel.org, Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Paul Mackerras
Subject: Re: perf: WARNING perfevents: irq loop stuck!
In-Reply-To: <20150501070226.GB18957@gmail.com>

On Fri, 1 May 2015, Ingo Molnar wrote:

> So 0000fffffffffffe corresponds to 2 events left until overflow,
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we
> allow these super short periods.
>
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well,
> one similar to bdw_limit_period()? Something like the patch below?
>
> Totally untested and such. I picked 128 because of Broadwell, but
> lower values might work as well. You could try to increase it to 3 and
> upwards and see which one stops triggering stuck NMI loops?

I spent a lot of time trying to come up with a test case that triggers
this more reliably, but failed.

It is definitely an issue with PMC0 being -2, which causes the PMC0 bit
in the status register to get stuck and not clear. Often there is also a
PEBS event active at the same time, but that might be coincidence.

With your patch applied I can't trigger the issue. I haven't tried
narrowing down the exact value yet.
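For reference, the arithmetic behind "0000fffffffffffe is 2 events left" and the idea of a minimum-period clamp can be sketched in plain userspace C. This is only an illustration, not the patch under discussion and not the kernel's bdw_limit_period(): the 48-bit counter width and the helper names events_until_overflow() and clamp_period() are assumptions made up for this example.

```c
#include <stdint.h>

/* Assumption for this sketch: counters are 48 bits wide. */
#define CNTVAL_BITS 48
#define CNTVAL_MASK ((1ULL << CNTVAL_BITS) - 1)

/*
 * Hypothetical helper: number of events a counter holding 'raw' still
 * has to count before it wraps past the top of its 48-bit range.
 * A counter programmed to -2 (0x0000fffffffffffe) has 2 events left.
 */
static uint64_t events_until_overflow(uint64_t raw)
{
	return ((~raw) & CNTVAL_MASK) + 1;
}

/*
 * Hypothetical helper: refuse sampling periods below 'min_period' by
 * raising them to that minimum. Longer periods pass through unchanged.
 */
static uint64_t clamp_period(uint64_t left, uint64_t min_period)
{
	return left < min_period ? min_period : left;
}
```

With these definitions, events_until_overflow(0x0000fffffffffffeULL) yields 2, and clamp_period(2, 128) yields 128, so a period that short would never reach the counter. Which minimum is actually sufficient is still the open question above.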
Vince