From: Aravind Gopalakrishnan
Date: Tue, 5 May 2015 13:39:03 -0500
To: Borislav Petkov
CC: Robert Richter
Subject: Re: [PATCH 2/4] x86/mce/amd: Introduce deferred error interrupt handler
Message-ID: <55490E47.7040107@amd.com>
In-Reply-To: <20150504184643.GH3829@pd.tnic>
On 5/4/2015 1:46 PM, Borislav Petkov wrote:
> So you can use mce_read_aux(), yeah, you can move it to mce-internal.h

Re-using mce_read_aux() was not as trivial as I initially thought.
The MISC address value we read in amd_threshold_interrupt() could also be
the value in MSR0xc0000408 or MSR0xc0000409 (for a bank == 4 case).
But in mce_read_aux(), we will only look at MSR_IA32_MCx_MISC(i) (which
is 0x413 for bank = 4).

So, instead of mucking around with mce_read_aux(), I am reusing the
'misc' value from amd_threshold_interrupt() and just adding
rdmsrl(MSR_IA32_MCx_ADDR(bank), m.addr).

> So you can pass a parameter to __log_error(..., threshold=true, misc)
> and do
>
> 	if (threshold)
> 		m.misc = misc;
>

Here's how I have it currently:

static void __log_error(unsigned int bank, bool is_thr, u64 misc)
{
	struct mce m;

	mce_setup(&m);
	rdmsrl(MSR_IA32_MCx_STATUS(bank), m.status);
	if (!(m.status & MCI_STATUS_VAL))
		return;

	if (is_thr)
		m.misc = misc;

	m.bank = bank;
	rdmsrl(MSR_IA32_MCx_ADDR(bank), m.addr);
	mce_log(&m);
	wrmsrl(MSR_IA32_MCx_STATUS(bank), 0);
}

and it works fine.

Before patch:
[76916.275587] [Hardware Error]: Corrected error, no action required.
[76916.279576] [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS[-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
[76916.279576] [Hardware Error]: MC0 Error Address: 0x0000000000000000

Corrected error output:
[  102.623490] [Hardware Error]: Corrected error, no action required.
[  102.623668] [Hardware Error]: CPU:0 (15:60:0) MC0_STATUS[-|CE|-|-|AddrV|-|-|CECC]: 0x840041000028017b
[  102.623930] [Hardware Error]: MC0 Error Address: 0x00001f808f0ff040

Thanks,
-Aravind.
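
The MSR arithmetic in this message can be checked with a small standalone
userspace sketch. It is not kernel code and not part of the patch. The
macro names and values restate the architectural MCA bank layout (four
MSRs per bank starting at 0x400) and the STATUS bit positions as the x86
headers define them; the helper functions mcx_status_msr(), mcx_addr_msr(),
mcx_misc_msr(), status_has_addr() and status_is_uncorrected() are
hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MCA layout: CTL, STATUS, ADDR, MISC per bank, from 0x400. */
#define MSR_IA32_MC0_STATUS	0x401u
#define MSR_IA32_MC0_ADDR	0x402u
#define MSR_IA32_MC0_MISC	0x403u

/* MCi_STATUS bits relevant to the log output quoted in this thread. */
#define MCI_STATUS_VAL		(1ULL << 63)	/* register holds a valid error */
#define MCI_STATUS_UC		(1ULL << 61)	/* error was not corrected */
#define MCI_STATUS_ADDRV	(1ULL << 58)	/* MCi_ADDR holds a valid address */

/* Hypothetical helpers: per-bank MSR numbers, four registers per bank. */
static inline uint32_t mcx_status_msr(unsigned int bank)
{
	return MSR_IA32_MC0_STATUS + 4 * bank;
}

static inline uint32_t mcx_addr_msr(unsigned int bank)
{
	return MSR_IA32_MC0_ADDR + 4 * bank;
}

static inline uint32_t mcx_misc_msr(unsigned int bank)
{
	return MSR_IA32_MC0_MISC + 4 * bank;
}

/* Hypothetical helper: true when the error is valid and carries an address. */
static inline bool status_has_addr(uint64_t status)
{
	return (status & MCI_STATUS_VAL) && (status & MCI_STATUS_ADDRV);
}

/* Hypothetical helper: true when the UC bit marks the error as uncorrected. */
static inline bool status_is_uncorrected(uint64_t status)
{
	return (status & MCI_STATUS_UC) != 0;
}
```

With these definitions, mcx_misc_msr(4) evaluates to 0x413, matching the
value given for bank 4, and the logged status 0x840041000028017b has both
VAL and ADDRV set with UC clear. That is consistent with the "CE" and
"AddrV" flags in the decoded output, so an all-zero error address in the
"before" log means MCx_ADDR was simply never read.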