From: Aravind Gopalakrishnan
To: , ,
CC: , ,
Subject: [PATCH 3/3] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
Date: Tue, 9 Jun 2015 11:45:17 -0500
Message-ID: <1433868317-18417-4-git-send-email-Aravind.Gopalakrishnan@amd.com>
X-Mailer: git-send-email 2.4.0
In-Reply-To: <1433868317-18417-1-git-send-email-Aravind.Gopalakrishnan@amd.com>
References: <1433868317-18417-1-git-send-email-Aravind.Gopalakrishnan@amd.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

For bank 4 errors, the MCE is logged and reported only on node base
cores. Refer to the D18F3x44[NbMcaToMstCpuEn] field in the Fam10h and
later BKDGs.

This patch ensures that we inject the error on the node base core for
bank 4 errors. Otherwise, triggering #MC or APIC interrupts on a
non-node-base core would have no effect on the system, i.e., we would
not see any relevant output in the kernel logs for the error we just
injected.

Update the copyright info while at it.

Signed-off-by: Aravind Gopalakrishnan
---
 drivers/edac/mce_amd_inj.c | 57 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 3e1b53f..af55d49 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -6,7 +6,7 @@
  * This file may be distributed under the terms of the GNU General Public
  * License version 2.
  *
- * Copyright (c) 2010-14:	Borislav Petkov
+ * Copyright (c) 2010-15:	Borislav Petkov
  *			Advanced Micro Devices Inc.
  */
 
@@ -17,10 +17,13 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 
 #include "mce_amd.h"
+#include "amd64_edac.h"
 
 /*
  * Collect all the MCi_XXX settings
@@ -200,6 +203,44 @@ static void trigger_thr_int(void *info)
 	asm volatile("int %0" :: "i" (THRESHOLD_APIC_VECTOR));
 }
 
+static u32 amd_get_nbc_for_node(int node_id)
+{
+	struct cpuinfo_x86 *c = &boot_cpu_data;
+	u32 cores_per_node;
+
+	cores_per_node = c->x86_max_cores / amd_get_nodes_cnt();
+
+	return cores_per_node * node_id;
+}
+
+static void toggle_nb_mca_mst_cpu(u16 nid)
+{
+	struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
+	u32 val;
+	int err;
+
+	if (!F3)
+		return;
+
+	err = pci_read_config_dword(F3, NBCFG, &val);
+	if (err) {
+		pr_err("%s: Error reading F%dx%03x.\n", __func__,
+			PCI_FUNC(F3->devfn),
+			NBCFG);
+		return;
+	}
+
+	if (!(val & BIT(27))) {
+		pr_err("%s: BIOS not setting D18F3x44[NbMcaToMstCpuEn]. Doing that here\n", __func__);
+		val |= BIT(27);
+		err = pci_write_config_dword(F3, NBCFG, val);
+		if (err)
+			pr_err("%s: Error writing F%dx%03x.\n", __func__,
+				PCI_FUNC(F3->devfn),
+				NBCFG);
+	}
+}
+
 static void do_inject(void)
 {
 	u64 mcg_status = 0;
@@ -235,6 +276,20 @@ static void do_inject(void)
 	if (!(i_mce.status & MCI_STATUS_PCC))
 		mcg_status |= MCG_STATUS_RIPV;
 
+	/*
+	 * For multi node cpus, logging and reporting of bank == 4 errors
+	 * happen only on the node base core. Refer D18F3x44[NbMcaToMstCpuEn]
+	 * for Fam10h and later BKDGs
+	 */
+	if (static_cpu_has(X86_FEATURE_AMD_DCM) && b == 4) {
+		/*
+		 * BIOS sets D18F3x44[NbMcaToMstCpuEn] by default.
+		 * But make sure of it here just in case..
+		 */
+		toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
+		cpu = amd_get_nbc_for_node(amd_get_nb_id(cpu));
+	}
+
 	toggle_hw_mce_inject(cpu, true);
 
 	wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
-- 
2.4.0
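
For readers following the arithmetic, here is a minimal userspace sketch of
the two steps the patch performs: picking the node base core for a node, and
making sure bit 27 (NbMcaToMstCpuEn) is set in the NBCFG value. It is not
kernel code. The names nbc_for_node(), ensure_nb_mca_to_mst_cpu() and
NB_MCA_TO_MST_CPU_EN are hypothetical, and the sketch assumes what the patch
itself assumes: every node has the same number of cores and core numbers are
contiguous within a node.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 27 of D18F3x44 (NBCFG) is NbMcaToMstCpuEn, as tested in the patch. */
#define NB_MCA_TO_MST_CPU_EN (UINT32_C(1) << 27)

/*
 * Hypothetical userspace stand-in for amd_get_nbc_for_node().
 * Assumes equal core counts per node and contiguous core numbering,
 * so the first core of node N is N * cores_per_node.
 */
static uint32_t nbc_for_node(uint32_t max_cores, uint32_t nodes,
			     uint32_t node_id)
{
	uint32_t cores_per_node;

	/* Guard the division and reject out-of-range node ids. */
	assert(nodes != 0 && node_id < nodes);
	cores_per_node = max_cores / nodes;

	return cores_per_node * node_id;
}

/*
 * Hypothetical model of the check in toggle_nb_mca_mst_cpu(): return the
 * register value with NbMcaToMstCpuEn set, and report through *changed
 * whether the bit was clear, i.e. whether a write-back would be needed.
 */
static uint32_t ensure_nb_mca_to_mst_cpu(uint32_t nbcfg, int *changed)
{
	*changed = !(nbcfg & NB_MCA_TO_MST_CPU_EN);

	return nbcfg | NB_MCA_TO_MST_CPU_EN;
}
```

With 16 cores spread over 2 nodes, nbc_for_node(16, 2, 1) yields 8, so under
these assumptions a bank 4 injection requested on any core of node 1 would be
redirected to core 8.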