In-Reply-To: <1366285369.19383.19.camel@laptop>
References: <1366285369.19383.19.camel@laptop>
Date: Thu, 18 Apr 2013 14:04:00 +0200
Subject: Re: [PATCH v2] NMI: fix NMI period is not correct when cpu frequency changes issue.
From: Stephane Eranian
To: Peter Zijlstra
Cc: "Pan, Zhenjie", "paulus@samba.org", "mingo@redhat.com", "acme@ghostprotocols.net", "akpm@linux-foundation.org", "dzickus@redhat.com", "tglx@linutronix.de", "Liu, Chuansheng", "linux-kernel@vger.kernel.org"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 18, 2013 at 1:42 PM, Peter Zijlstra wrote:
> On Tue, 2013-04-16 at 06:57 +0000, Pan, Zhenjie wrote:
>> Watchdog use performance monitor of cpu clock cycle to generate NMI to detect hard lockup.
>> But when cpu's frequency changes, the event period will also change.
>> It's not as expected as the configuration.
>> For example, set the NMI event handler period is 10 seconds when the cpu is 2.0GHz.
>> If the cpu changes to 800MHz, the period will be 10*(2000/800)=25 seconds.
>> So it may make hard lockup detect not work if the watchdog timeout is not long enough.
>> Now, set a notifier to listen to the cpu frequency change.
>> And dynamic re-config the NMI event to make the event period correct.
>>
>
>
> Urgh,. does this really matter.. all we really want is for that NMI to
> hit eventually in the not too distant future. Does the frequency really
> matter _that_ much?
>
I agree, it does not really matter.
Set the watchdog to a couple of minutes and it should be fine, shouldn't it?

> Also, can't we simply pick an event that's invariant to the cpufreq
> nonsense? Something like CPU_CLK_UNHALTED.REF -- or better the
> fixed_ctr2 which nobody ever uses anyway.
>
You don't want to use fixed counter 2 for NMI watchdog because it's pinned.
No other counter can count this event. And it is very useful. I use it often.
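
For readers checking the arithmetic in the quoted patch description, here is a minimal sketch of how a cycle-based period stretches when the clock slows down, and what a frequency notifier would have to reprogram. It is an illustration only, not code from the patch or the kernel: the function names are hypothetical, and expressing frequencies in kHz is an assumption made for the example.

```python
def effective_period_seconds(configured_seconds, configured_khz, current_khz):
    """Wall-clock time until a cycle-count period fires.

    The period is programmed as a number of cycles computed for
    configured_khz, but the CPU is actually running at current_khz.
    (Hypothetical helper, for illustration only.)
    """
    if configured_khz <= 0 or current_khz <= 0:
        raise ValueError("frequencies must be positive")
    cycles = configured_seconds * configured_khz * 1000  # cycles programmed
    return cycles / (current_khz * 1000)                 # seconds at current speed


def rescaled_sample_period(target_seconds, current_khz):
    """Cycle count to program so the event fires after target_seconds
    at current_khz. (Hypothetical helper, for illustration only.)
    """
    if current_khz <= 0:
        raise ValueError("frequency must be positive")
    return target_seconds * current_khz * 1000


# The example from the patch description: 10 s configured at 2.0 GHz,
# CPU then drops to 800 MHz.
print(effective_period_seconds(10, 2_000_000, 800_000))  # 25.0
# Cycle count that would restore a 10 s period at 800 MHz.
print(rescaled_sample_period(10, 800_000))                # 8000000000
```

The first function reproduces the 10*(2000/800)=25 figure; the second shows the value a re-configuration on frequency change would have to compute. An event that counts at a constant reference rate would make the first function return the configured value regardless of the current frequency.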