From: "Luck, Tony"
To: Borislav Petkov, Mauro Carvalho Chehab
CC: "Kani, Toshimitsu", "linux-kernel@vger.kernel.org", "tglx@linutronix.de", "mchehab@kernel.org", "rjw@rjwysocki.net", "srinivas.pandruvada@linux.intel.com", "lenb@kernel.org", "linux-acpi@vger.kernel.org", "linux-edac@vger.kernel.org"
Subject: RE: [PATCH 3/3] ghes_edac: add platform check to enable ghes_edac
Date: Wed, 19 Jul 2017 15:14:32 +0000
Message-ID: <3908561D78D1C84285E8C5FCA982C28F6130D126@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <20170719055838.GF26030@nazgul.tnic>

> "The module number of the memory error location.
> (NODE, CARD, and MODULE should provide the information necessary to
> identify the failing FRU)."
>
> So this tuple is sufficient to pinpoint the DIMM, IIUC.
>
> Which means, ghes_edac can have a single layer of DIMMs without channels.

The tricky part is that you have to rely on SMBIOS/DMI to know what DIMMs
are on the system when the driver initializes, so you can populate
/sys/.*/edac

Later, when GHES gives you a NODE/CARD/MODULE in an error record, you
need to match these up. But SMBIOS only gave you two strings, "Locator"
and "Bank Locator", which have no defined syntax. You are at the mercy of
the BIOS writer to put in something parseable. Some writers use
zero-based counts, others are Fortran fans and use one-based. Still
others use letters. About the one guarantee is that they will make almost
no effort to match the silkscreen labels on the motherboard itself.

E.g. my Broadwell-EX has things like:

	Locator: CHANNEL D DIMM 1
	Bank Locator: Memriser8

Channel is A,B,C,D. DIMM is 0, 1, 2. Memriser is {1..8} so this manages
to use all three counting options!

-Tony
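
To make the matching problem concrete, here is a minimal sketch of
normalizing the two Broadwell-EX strings quoted above into zero-based
indices. It is an illustration only, not what ghes_edac does: the regular
expressions, the function name parse_locators, and the (riser, channel,
dimm) ordering are all assumptions, and nothing here says which of these
fields would correspond to NODE, CARD, or MODULE in a GHES record. Any
other BIOS's strings would simply fail to parse.

```python
import re

# Hypothetical patterns fitted to the one example in this mail; other
# BIOSes use other layouts, so a failed match has to be handled.
LOCATOR_RE = re.compile(r"CHANNEL\s+([A-Z])\s+DIMM\s+(\d+)", re.IGNORECASE)
BANK_RE = re.compile(r"Memriser\s*(\d+)", re.IGNORECASE)


def parse_locators(locator, bank_locator):
    """Return zero-based (riser, channel, dimm), or None if unparseable."""
    loc = LOCATOR_RE.fullmatch(locator.strip())
    bank = BANK_RE.fullmatch(bank_locator.strip())
    if loc is None or bank is None:
        return None
    channel = ord(loc.group(1).upper()) - ord("A")  # letters: A -> 0
    dimm = int(loc.group(2))                        # already zero-based
    riser = int(bank.group(1)) - 1                  # one-based: 1 -> 0
    if riser < 0:
        return None  # "Memriser0" would contradict the one-based assumption
    return (riser, channel, dimm)


if __name__ == "__main__":
    print(parse_locators("CHANNEL D DIMM 1", "Memriser8"))
    print(parse_locators("DIMM_A1", "BANK 0"))
```

Each field needs its own conversion rule, and those rules are known here
only because the counting ranges were checked by hand on one machine.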