Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752205AbbEDPaG (ORCPT ); Mon, 4 May 2015 11:30:06 -0400 Received: from mail-by2on0110.outbound.protection.outlook.com ([207.46.100.110]:12888 "EHLO na01-by2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1750899AbbEDPaA (ORCPT ); Mon, 4 May 2015 11:30:00 -0400 Authentication-Results: spf=none (sender IP is 165.204.84.222) smtp.mailfrom=amd.com; alien8.de; dkim=none (message not signed) header.d=none; X-WSS-ID: 0NNU0DR-08-X7W-02 X-M-MSG: Message-ID: <5547906E.3060701@amd.com> Date: Mon, 4 May 2015 10:29:50 -0500 From: Aravind Gopalakrishnan User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Borislav Petkov CC: , , , , , , , , , , , , , , , , , , , , , , Subject: Re: [PATCH 2/4] x86/mce/amd: Introduce deferred error interrupt handler References: <1430405365-4473-1-git-send-email-Aravind.Gopalakrishnan@amd.com> <1430405365-4473-3-git-send-email-Aravind.Gopalakrishnan@amd.com> <20150503092212.GC18048@pd.tnic> In-Reply-To: <20150503092212.GC18048@pd.tnic> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [10.180.168.240] X-EOPAttributedMessage: 0 X-Forefront-Antispam-Report: CIP:165.204.84.222;CTRY:US;IPV:NLI;EFV:NLI;SFV:NSPM;SFS:(10019020)(6009001)(428002)(377454003)(24454002)(189002)(164054003)(51704005)(479174004)(199003)(87936001)(76176999)(62966003)(80316001)(86362001)(83506001)(23676002)(64126003)(110136002)(105586002)(46102003)(36756003)(101416001)(99136001)(50466002)(77096005)(2950100001)(59896002)(47776003)(120886001)(77156002)(4001350100001)(54356999)(92566002)(65956001)(33656002)(50986999)(4580500001);DIR:OUT;SFP:1102;SCL:1;SRVR:BN3PR02MB1112;H:atltwp02.amd.com;FPR:;SPF:None;MLV:sfv;MX:1;A:1;LANG:en; X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:BN3PR02MB1112;UriScan:;BCL:0;PCL:0;RULEID:;SRVR:BN3PR02MB1176; X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(5005006)(3002001);SRVR:BN3PR02MB1112;BCL:0;PCL:0;RULEID:;SRVR:BN3PR02MB1112; X-Forefront-PRVS: 05669A7924 X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 May 2015 15:29:54.6099 (UTC) X-MS-Exchange-CrossTenant-Id: fde4dada-be84-483f-92cc-e026cbee8e96 X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=fde4dada-be84-483f-92cc-e026cbee8e96;Ip=[165.204.84.222];Helo=[atltwp02.amd.com] X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN3PR02MB1112 X-OriginatorOrg: amd4.onmicrosoft.com Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4292 Lines: 122 On 5/3/2015 4:22 AM, Borislav Petkov wrote: > On Thu, Apr 30, 2015 at 09:49:23AM -0500, Aravind Gopalakrishnan wrote: >> Changes introduced in the patch- >> - Assign vector number 0xf4 for Deferred errors >> - Declare deferred_interrupt, allocate gate and bind it >> to DEFERRED_APIC_VECTOR. >> - Declare smp_deferred_interrupt to be used as the >> entry point for the interrupt in mce_amd.c >> - Define trace_deferred_interrupt for tracing >> - Enable deferred error interrupt selectively upon detection >> of 'succor' bitfield >> - Setup amd_deferred_error_interrupt() to handle the interrupt >> and assign it to def_int_vector if feature is present in HW. >> Else, let default handler deal with it. >> - Provide Deferred error interrupt stats on >> /proc/interrupts by incrementing irq_deferred_count > This commit message should explain the feature in more high-level way, > what is it good for and so on, not what you're adding. > > That I can see. :-) Okay, I'll include a short description of deferred errors here for V2. >> +#endif >> #endif >> diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h >> index 0f5fb6b..448451c 100644 >> --- a/arch/x86/include/asm/hardirq.h >> +++ b/arch/x86/include/asm/hardirq.h >> @@ -33,6 +33,9 @@ typedef struct { >> #ifdef CONFIG_X86_MCE_THRESHOLD >> unsigned int irq_threshold_count; >> #endif >> +#ifdef CONFIG_X86_MCE_AMD >> + unsigned int irq_deferred_count; > Right > unsigned int irq_deferred_error_count; Ack. >> +static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c) >> +{ >> + u32 low = 0, high = 0; >> + int def_offset = -1, def_new; >> + >> + if (rdmsr_safe(MSR_CU_DEF_ERR, &low, &high)) >> + return; >> + >> + def_new = (low & MASK_DEF_LVTOFF) >> 4; >> + if (c->x86 == 0x15 && c->x86_model == 0x60 && >> + !(low & MASK_DEF_LVTOFF)) { > What's the family check for? for BIOSes which don't set the LVT offset > to 2, as they should? > > If so, we probably should say > > pr_err(FW_BUG "Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly.\n"); > > or similar... Yeah. I meant to provide a comment at least for this. Forgot to do that. I'll print out a error message as you suggested (considering we do this in other places like threshold setup or IBS setup..) >> +/* Apic interrupt handler for deferred errors */ >> +static void amd_deferred_error_interrupt(void) >> +{ >> + u64 status; >> + unsigned int bank; >> + struct mce m; >> + >> + for (bank = 0; bank < mca_cfg.banks; ++bank) { >> + rdmsrl(MSR_IA32_MCx_STATUS(bank), status); >> + >> + if (!(status & MCI_STATUS_VAL) || >> + !(status & MCI_STATUS_DEFERRED)) >> + continue; >> + >> + mce_setup(&m); >> + m.bank = bank; >> + m.status = status; >> + mce_log(&m); >> + wrmsrl(MSR_IA32_MCx_STATUS(bank), 0); >> + break; >> + } > That's very similar to what we do in the end of > amd_threshold_interrupt(). You could add a generic __log_error() static > helper in a pre-patch and then call it here. > Right. I think a __log_error() is a good idea. Except, in amd_threshold_interrupt(), we have- m.misc = ((u64)high << 32) | low; which, is actually useless as we don't use m.misc anywhere in amd_decode_mce() or anywhere else in the decoding pipeline AFAICT. We only print out if 'misc' is valid and we only need status bits for that- ((m->status & MCI_STATUS_MISCV) ? "MiscV" : "-"), But, more importantly, we don't setup 'm.addr' here (in amd_threshold_interrupt() or in amd_deferred_error_interrupt()) Which means anytime we pass an error to be decoded from the interrupt handlers, we don't get any info about the error address. So, we can do one of these- 1. Remove m.misc setup in amd_threshold_interrupt() and rdmsrl(MSR_IA32_MCx_ADDR(bank), m.addr) before we call mce_log() 2. Since we have mce_read_aux() that reads misc and addr registers, we can move the mce_[rd|wr]msrl wrappers and mce_read_aux() into mce.h and use it here in mce_amd.c Thoughts? Thanks, -Aravind. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/