Date: Sat, 14 May 2011 08:26:34 +0800
Subject: Re: [RFC] x86, NMI, Treat unknown NMI as hardware error
From: huang ying
To: Cyrill Gorcunov
Cc: Huang Ying, Ingo Molnar, Don Zickus, linux-kernel@vger.kernel.org, Andi Kleen, Robert Richter, Andi Kleen
In-Reply-To: <4DCD4B85.3040702@gmail.com>
References: <1305275018-20596-1-git-send-email-ying.huang@intel.com> <4DCD4B85.3040702@gmail.com>

On Fri, May 13, 2011 at 11:17 PM, Cyrill Gorcunov wrote:
> On 05/13/2011 12:23 PM, Huang Ying wrote:
>> In general, unknown NMI is used by hardware and firmware to notify
>> fatal hardware errors to OS. So the Linux should treat unknown NMI as
>> hardware error and go panic upon unknown NMI for better error
>> containment.
>>
>> But there are some legacy machine which would randomly send unknown
>> NMIs for no good reason. To support these machines, a white list
>> mechanism is provided to treat unknown NMI as hardware error only on
>> some known working system.
>>
>> These systems are identified via the presentation of APEI HEST or
>> some PCI ID of the host bridge. The PCI ID of host bridge instead of
>> DMI ID is used, so that the checking can be done based on the platform
>> type instead of motherboard. This should be simpler and sufficient.
>>
>> The method to identify the platforms is designed by Andi Kleen.
>>
>> Signed-off-by: Huang Ying
>> Cc: Andi Kleen
>> Cc: Don Zickus
>> ---
> ...
>
> Hi Ying,
>
> just curious (regardless the concerns Don and Ingo have) -- if there still a need
> for such semi-unknown nmi handling maybe it's worth to register a *notifier* for it
> and we panic only when user *explicitly* specify how to treat this class of NMIs
> (via say "hest-nmi-panic" boot option or something like that). Maybe such partially
> modular scheme would be better? If only I don't miss anything.

Hi, Cyrill,

IMHO, pushing all policy to the user is not good either. How many users
clearly understand unknown NMIs and hardware errors? It is better if we
can determine what the right behavior is.

Best Regards,
Huang Ying
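
Note for readers of this thread: the white-list policy in the quoted patch description (treat an unknown NMI as a hardware error only when APEI HEST is present or the host bridge PCI ID is on a known-good list) can be modeled roughly as below. This is a minimal user-space sketch and is not code from the patch. Every name in it (`platform_info`, `nmi_hwerr_whitelist`, `unknown_nmi_action`, and so on) is hypothetical, and the PCI IDs are placeholders built on the invalid vendor ID 0xffff, not real white-list entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform description, for illustration only. */
struct platform_info {
    bool has_apei_hest;          /* firmware exposes an APEI HEST table */
    uint16_t host_bridge_vendor; /* PCI vendor ID of the host bridge */
    uint16_t host_bridge_device; /* PCI device ID of the host bridge */
};

struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Placeholder entries; these are NOT the IDs used by the patch. */
static const struct pci_id nmi_hwerr_whitelist[] = {
    { 0xffff, 0x0001 },
    { 0xffff, 0x0002 },
};

enum nmi_action {
    NMI_ACTION_LOG,   /* legacy behavior: report the NMI and continue */
    NMI_ACTION_PANIC, /* treat the NMI as a fatal hardware error */
};

/* A platform qualifies if it has HEST or its host bridge is listed. */
bool platform_is_whitelisted(const struct platform_info *p)
{
    size_t n = sizeof(nmi_hwerr_whitelist) / sizeof(nmi_hwerr_whitelist[0]);

    if (p->has_apei_hest)
        return true;

    for (size_t i = 0; i < n; i++) {
        if (nmi_hwerr_whitelist[i].vendor == p->host_bridge_vendor &&
            nmi_hwerr_whitelist[i].device == p->host_bridge_device)
            return true;
    }
    return false;
}

/* Decide what to do when an NMI arrives that no handler claimed. */
enum nmi_action unknown_nmi_action(const struct platform_info *p)
{
    return platform_is_whitelisted(p) ? NMI_ACTION_PANIC : NMI_ACTION_LOG;
}
```

In this model the decision depends only on platform identity, which is the point Huang Ying argues for above: the kernel picks the behavior, instead of requiring a boot option as Cyrill suggests.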