From: "Luck, Tony"
To: Borislav Petkov
CC: Ingo Molnar, Andrew Morton, Andy Lutomirski, "Williams, Dan J", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "linux-nvdimm@ml01.01.org", "x86@kernel.org"
Subject: RE: [PATCHV2 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas
Date: Tue, 15 Dec 2015 23:46:03 +0000
Message-ID: <3908561D78D1C84285E8C5FCA982C28F39F85DBE@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <20151215114314.GD25973@pd.tnic>

>> +    /* Fault was in recoverable area of the kernel */
>> +    if ((m.cs & 3) != 3 && worst == MCE_AR_SEVERITY)
>> +        if (!fixup_mcexception(regs, m.addr))
>> +            mce_panic("Failed kernel mode recovery", &m, NULL);
>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Does that always imply a failed kernel mode recovery? I don't see
>
>     (m.cs == 0 and MCE_AR_SEVERITY)
>
> MCEs always meaning that a recovery should be attempted there. I think
> this should simply say
>
>     mce_panic("Fatal machine check on current CPU", &m, msg);

I don't think this can ever happen. If we were in kernel mode and decided
that the severity was AR_SEVERITY ... then search_mcexception_table() found
an entry for the IP where the machine check happened.

The only way for fixup_mcexception() to fail is if search_mcexception_table()
now suddenly doesn't find the entry it found earlier.

But if this "can't happen" thing actually does happen ... I'd like the panic
message to be different from other mce_panic() so you'll know to blame me.

Applied all the other suggestions.

-Tony
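
The "can't happen" argument above can be sketched as a small user-space
model. This is only an illustration under stated assumptions, not the code
from the patch: the table layout, the addresses, and every model_* name are
hypothetical stand-ins for search_mcexception_table() and
fixup_mcexception(), whose real signatures are not reproduced here. The
point it tries to show is that severity grading and the fixup consult the
same table, so the "failed recovery" outcome is reachable only if the second
lookup disagrees with the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one machine check exception table entry. */
struct mc_entry {
    uint64_t insn;   /* address of an annotated instruction */
    uint64_t fixup;  /* address to resume at after recovery */
};

/* Illustrative addresses only; a real table is built at link time. */
static const struct mc_entry mc_table[] = {
    { 0x1000, 0x2000 },
    { 0x1010, 0x2040 },
};

enum severity { SEV_PANIC, SEV_AR };
enum outcome { RECOVERED, PANIC_FATAL, PANIC_FAILED_RECOVERY };

/* Model of the table search: linear scan, NULL if ip is not annotated. */
const struct mc_entry *model_search(uint64_t ip)
{
    for (size_t i = 0; i < sizeof(mc_table) / sizeof(mc_table[0]); i++)
        if (mc_table[i].insn == ip)
            return &mc_table[i];
    return NULL;
}

/* Model of kernel-mode grading: AR only when the table has an entry. */
enum severity model_kernel_severity(uint64_t ip)
{
    return model_search(ip) ? SEV_AR : SEV_PANIC;
}

/* Model of the fixup: repeat the search and redirect ip on success. */
int model_fixup(uint64_t *ip)
{
    const struct mc_entry *e = model_search(*ip);

    if (!e)
        return 0;
    *ip = e->fixup;
    return 1;
}

/* Model of the handler tail for a machine check taken in kernel mode. */
enum outcome model_handle_kernel_mce(uint64_t *ip)
{
    if (model_kernel_severity(*ip) != SEV_AR)
        return PANIC_FATAL;            /* ordinary panic path */
    if (!model_fixup(ip))
        return PANIC_FAILED_RECOVERY;  /* the distinct "can't happen" panic */
    return RECOVERED;
}

/* Usage: an annotated ip recovers; an unannotated ip takes the ordinary panic. */
void model_demo(void)
{
    uint64_t ip = 0x1000;
    uint64_t other = 0x1234;

    assert(model_handle_kernel_mce(&ip) == RECOVERED && ip == 0x2000);
    assert(model_handle_kernel_mce(&other) == PANIC_FATAL);
}
```

With an unchanging table, no input in this model reaches
PANIC_FAILED_RECOVERY, which is the case for giving that branch its own
message: if it ever fires, the two lookups disagreed.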