From: Aravind Gopalakrishnan
Subject: [PATCH V2 9/9] edac, mce_amd_inj: Inject errors on NBC for bank 4 errors
Date: Tue, 2 Jun 2015 15:36:02 -0500
Message-ID: <1433277362-10911-10-git-send-email-Aravind.Gopalakrishnan@amd.com>
In-Reply-To: <1433277362-10911-1-git-send-email-Aravind.Gopalakrishnan@amd.com>
References: <1433277362-10911-1-git-send-email-Aravind.Gopalakrishnan@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

For bank 4 errors, MCE is logged and reported only on node base cores.
Refer to the D18F3x44[NbMcaToMstCpuEn] field in Fam10h and later BKDGs.

This patch ensures that we inject the error on the node base core for
bank 4 errors. Otherwise, triggering #MC or APIC interrupts on a core
that is not the node base core would have no effect on the system. That
is, we would not see any relevant output in the kernel logs for the
error we just injected.

Signed-off-by: Aravind Gopalakrishnan
---
 drivers/edac/mce_amd_inj.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index b7e108c..45aac4f 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -17,9 +17,12 @@
 #include
 #include
 #include
+#include
 #include
+#include
 
 #include "mce_amd.h"
+#include "amd64_edac.h"
 
 /*
  * Collect all the MCi_XXX settings
@@ -200,6 +203,44 @@ static void trigger_thr_int(void *info)
 	asm volatile("int %0" :: "i" (THRESHOLD_APIC_VECTOR));
 }
 
+static u32 amd_get_nbc_for_node(int node_id)
+{
+	struct cpuinfo_x86 *c = &boot_cpu_data;
+	u32 cores_per_node;
+
+	cores_per_node = c->x86_max_cores / amd_get_nodes_cnt();
+
+	return cores_per_node * node_id;
+}
+
+static void toggle_nb_mca_mst_cpu(u16 nid)
+{
+	struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
+	u32 val;
+	int err;
+
+	if (!F3)
+		return;
+
+	err = pci_read_config_dword(F3, NBCFG, &val);
+	if (err) {
+		pr_err("%s: Error reading F%dx%03x.\n", __func__,
+		       PCI_FUNC(F3->devfn),
+		       NBCFG);
+		return;
+	}
+
+	if (!(val & BIT(27))) {
+		pr_err("%s: BIOS not setting D18F3x44[NbMcaToMstCpuEn]. Doing that here\n", __func__);
+		val |= BIT(27);
+		err = pci_write_config_dword(F3, NBCFG, val);
+		if (err)
+			pr_err("%s: Error writing F%dx%03x.\n", __func__,
+			       PCI_FUNC(F3->devfn),
+			       NBCFG);
+	}
+}
+
 static void do_inject(void)
 {
 	u64 mcg_status = 0;
@@ -235,6 +276,20 @@ static void do_inject(void)
 	if (!(i_mce.status & MCI_STATUS_PCC))
 		mcg_status |= MCG_STATUS_RIPV;
 
+	/*
+	 * For multi node cpus, logging and reporting of bank == 4 errors
+	 * happen only on the node base core. Refer D18F3x44[NbMcaToMstCpuEn]
+	 * for Fam10h and later BKDGs
+	 */
+	if (static_cpu_has(X86_FEATURE_AMD_DCM) && b == 4) {
+		/*
+		 * BIOS sets D18F3x44[NbMcaToMstCpuEn] by default.
+		 * But make sure of it here just in case..
+		 */
+		toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
+		cpu = amd_get_nbc_for_node(amd_get_nb_id(cpu));
+	}
+
 	toggle_hw_mce_inject(cpu, true);
 
 	wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
-- 
2.4.0
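
For readers who want to check the redirection arithmetic outside the kernel, here is a minimal user-space sketch of the logic in the patch. It is an illustration only, not the driver code. The helper names (nbc_for_node, ensure_nb_mca_mst_cpu, inject_target) are hypothetical, and the sketch assumes that cores are numbered contiguously and split evenly across nodes, so the node ID can be derived from the CPU number. The patch itself obtains the node ID from amd_get_nb_id() and the register value from PCI config space.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 27 of D18F3x44 is NbMcaToMstCpuEn; the patch tests it with BIT(27). */
#define NB_MCA_TO_MST_CPU_EN (UINT32_C(1) << 27)

/*
 * Hypothetical helper mirroring amd_get_nbc_for_node(): the node base
 * core is the first core of the node. Assumes an even split of cores
 * across nodes and contiguous core numbering.
 */
static uint32_t nbc_for_node(uint32_t max_cores, uint32_t nodes,
			     uint32_t node_id)
{
	uint32_t cores_per_node;

	assert(nodes > 0 && max_cores >= nodes);
	cores_per_node = max_cores / nodes;

	return cores_per_node * node_id;
}

/*
 * Hypothetical helper: return the register value with NbMcaToMstCpuEn
 * set. The patch performs the same read-modify-write through
 * pci_read_config_dword() and pci_write_config_dword().
 */
static uint32_t ensure_nb_mca_mst_cpu(uint32_t nbcfg)
{
	return nbcfg | NB_MCA_TO_MST_CPU_EN;
}

/*
 * Hypothetical helper: pick the CPU to inject on. Bank 4 errors on a
 * multi-node part are redirected to the node base core; everything
 * else stays on the requested CPU. Deriving the node ID as
 * cpu / cores_per_node is an assumption made for this sketch.
 */
static uint32_t inject_target(uint32_t cpu, unsigned int bank,
			      int multi_node, uint32_t max_cores,
			      uint32_t nodes)
{
	if (multi_node && bank == 4) {
		uint32_t node_id = cpu / (max_cores / nodes);

		return nbc_for_node(max_cores, nodes, node_id);
	}

	return cpu;
}
```

With 16 cores split across 2 nodes, a bank 4 injection requested on CPU 11 would be redirected to CPU 8 under these assumptions, while an injection on any other bank would stay on CPU 11.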