Subject: Re: [PATCH] x86/mce: add support SRAO reported via CMC check
To: Borislav Petkov, "Luck, Tony"
References: <1510638911-88703-1-git-send-email-xiexiuqi@huawei.com>
 <20171114185139.oo6dr6opk7nup3or@agluck-desk>
 <7180dff1-2e55-5577-85d3-eda288f2be81@huawei.com>
 <3908561D78D1C84285E8C5FCA982C28F779F1D15@ORSMSX110.amr.corp.intel.com>
 <3908561D78D1C84285E8C5FCA982C28F779F1E84@ORSMSX110.amr.corp.intel.com>
 <20171115103335.3s6bekzapodgkseh@pd.tnic>
CC: "tglx@linutronix.de", "mingo@redhat.com", "hpa@zytor.com",
 "x86@kernel.org", "linux-edac@vger.kernel.org",
 "linux-kernel@vger.kernel.org", chen wei
From: Xie XiuQi
Date: Thu, 16 Nov 2017 11:00:40 +0800
In-Reply-To: <20171115103335.3s6bekzapodgkseh@pd.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Borislav, Tony,

On 2017/11/15 18:33, Borislav Petkov wrote:
> On Wed, Nov 15, 2017 at 02:44:07AM +0000, Luck, Tony wrote:
>> This code is subtle :-(
>
> I'm glad that we agree on this! :-)
>
> Anyone wanting to rewrite it yet?
>

According to Intel SDM Volume 3B (253669-063US, July 2017), SRAO can be
reported either via MCE or via CMC:

  In cases when SRAO is signaled via CMCI the error signature is
  indicated via UC=1, PCC=0, S=0.

  Type(*1)  UC  EN     PCC  S      AR  Signaling
  ---------------------------------------------------------------
  UC        1   1      1    x      x   MCE
  SRAR      1   1      0    1      1   MCE
  SRAO      1   x(*2)  0    x(*2)  0   MCE/CMC
  UCNA      1   x      0    0      0   CMC
  CE        0   x      x    x      x   CMC

  NOTES:
  1. SRAR, SRAO and UCNA errors are supported by the processor only
     when IA32_MCG_CAP[24] (MCG_SER_P) is set.
  2. EN=1, S=1 when signaled via MCE. EN=x, S=0 when signaled via CMC.

Section 15.6.2, "UCR Error Reporting and Logging", describes bit S as
follows:

  S (Signaling) flag, bit 56 - Indicates (when set) that a machine
  check exception was generated for the UCR error reported in this MC
  bank... When the S flag in the IA32_MCi_STATUS register is clear,
  this UCR error was not signaled via a machine check exception and
  instead was reported as a corrected machine check (CMC).

Going by this description, I think the S flag only tells us whether the
error was signaled via MCE or via CMC. So we could merge these two cases
into one and simply stop checking bit S for SRAO.
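
To make the reading of bit S above concrete, here is a small user-space
sketch (not kernel code). It assumes the IA32_MCi_STATUS layout with UC at
bit 61, PCC at bit 57 and S at bit 56; the enum and function names are
made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed IA32_MCi_STATUS bit positions; S = bit 56 as quoted from the SDM. */
#define STATUS_UC  (1ULL << 61)
#define STATUS_PCC (1ULL << 57)
#define STATUS_S   (1ULL << 56)

/* Hypothetical names, used only in this sketch. */
enum ucr_signal { NOT_UCR, VIA_CMC, VIA_MCE };

/* Report how an uncorrected recoverable (UCR) error was signaled. */
enum ucr_signal ucr_signaling(uint64_t status)
{
	/* UCR errors are the ones with UC=1 and PCC=0. */
	if (!(status & STATUS_UC) || (status & STATUS_PCC))
		return NOT_UCR;

	/* S=1: a machine check exception was raised; S=0: reported as CMC. */
	return (status & STATUS_S) ? VIA_MCE : VIA_CMC;
}

/* Small self-check of the two SRAO signatures from the table. */
void ucr_signaling_selfcheck(void)
{
	assert(ucr_signaling(STATUS_UC | STATUS_S) == VIA_MCE);
	assert(ucr_signaling(STATUS_UC) == VIA_CMC);
}
```

Note that with S=0 and AR=0 the flags alone do not separate SRAO from
UCNA; telling them apart still needs the MCACOD value.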

How about this patch?

From a06b2a781a86e3b1fe241591b53f7a6d33d63331 Mon Sep 17 00:00:00 2001
From: Xie XiuQi
Date: Tue, 14 Nov 2017 10:13:22 +0800
Subject: [PATCH] x86/mce: add support SRAO reported via CMC check

According to Intel SDM Volume 3B (253669-063US, July 2017), SRAO can be
reported either via MCE or via CMC:

  In cases when SRAO is signaled via CMCI the error signature is
  indicated via UC=1, PCC=0, S=0.

  Type(*1)  UC  EN     PCC  S      AR  Signaling
  ---------------------------------------------------------------
  UC        1   1      1    x      x   MCE
  SRAR      1   1      0    1      1   MCE
  SRAO      1   x(*2)  0    x(*2)  0   MCE/CMC
  UCNA      1   x      0    0      0   CMC
  CE        0   x      x    x      x   CMC

  NOTES:
  1. SRAR, SRAO and UCNA errors are supported by the processor only
     when IA32_MCG_CAP[24] (MCG_SER_P) is set.
  2. EN=1, S=1 when signaled via MCE. EN=x, S=0 when signaled via CMC.

Section 15.6.2, "UCR Error Reporting and Logging", describes bit S as
follows:

  S (Signaling) flag, bit 56 - Indicates (when set) that a machine
  check exception was generated for the UCR error reported in this MC
  bank... When the S flag in the IA32_MCi_STATUS register is clear,
  this UCR error was not signaled via a machine check exception and
  instead was reported as a corrected machine check (CMC).

So we can merge these two cases and drop the check of bit S for SRAO in
mce_severity().

Signed-off-by: Xie XiuQi
Tested-by: Chen Wei
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 4ca632a..5bbd06f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -59,6 +59,7 @@
 #define MCGMASK(x, y)	.mcgmask = x, .mcgres = y
 #define MASK(x, y)	.mask = x, .result = y
 #define MCI_UC_S (MCI_STATUS_UC|MCI_STATUS_S)
+#define MCI_UC_AR (MCI_STATUS_UC|MCI_STATUS_AR)
 #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
 #define MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV)
 
@@ -101,6 +102,22 @@
 		NOSER, BITCLR(MCI_STATUS_UC)
 		),
 
+	/*
+	 * known AO MCACODs reported via MCE or CMC:
+	 *
+	 * SRAO could be signaled either via a machine check exception or
+	 * CMCI with the corresponding bit S 1 or 0. So we don't need to
+	 * check bit S for SRAO.
+	 */
+	MCESEV(
+		AO, "Action optional: memory scrubbing error",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD_SCRUBMSK, MCI_STATUS_UC|MCACOD_SCRUB)
+		),
+	MCESEV(
+		AO, "Action optional: last level cache writeback error",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD, MCI_STATUS_UC|MCACOD_L3WB)
+		),
+
 	/* ignore OVER for UCNA */
 	MCESEV(
 		UCNA, "Uncorrected no action required",
@@ -149,15 +166,6 @@
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR, MCI_UC_SAR)
 		),
 
-	/* known AO MCACODs: */
-	MCESEV(
-		AO, "Action optional: memory scrubbing error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD_SCRUBMSK, MCI_UC_S|MCACOD_SCRUB)
-		),
-	MCESEV(
-		AO, "Action optional: last level cache writeback error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|MCACOD_L3WB)
-		),
 	MCESEV(
 		SOME, "Action optional: unknown MCACOD",
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR, MCI_UC_S)
-- 
1.8.3.1
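
For reviewers who want to check the mask change outside the kernel, the
following stand-alone sketch compares the old and new memory-scrubbing
rules. It assumes that a rule matches when (status & mask) == result, and
that the constants below mirror the kernel's definitions (bit positions
and the MCACOD_SCRUB/MCACOD_SCRUBMSK values are written from memory and
should be verified against asm/mce.h). struct rule and rule_matches() are
hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's definitions; verify before relying on them. */
#define MCI_STATUS_OVER	(1ULL << 62)
#define MCI_STATUS_UC	(1ULL << 61)
#define MCI_STATUS_S	(1ULL << 56)
#define MCI_STATUS_AR	(1ULL << 55)
#define MCI_UC_S	(MCI_STATUS_UC|MCI_STATUS_S)
#define MCI_UC_AR	(MCI_STATUS_UC|MCI_STATUS_AR)
#define MCI_UC_SAR	(MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
#define MCACOD_SCRUB	0x00C0ULL
#define MCACOD_SCRUBMSK	0xeff0ULL

/* Hypothetical stand-in for one severity table entry. */
struct rule {
	uint64_t mask;
	uint64_t result;
};

/* A rule matches when the masked status equals the expected result. */
bool rule_matches(const struct rule *r, uint64_t status)
{
	return (status & r->mask) == r->result;
}

/* Scrubbing rule before the patch: bit S is in the mask and must be 1. */
const struct rule old_scrub = {
	MCI_STATUS_OVER|MCI_UC_SAR|MCACOD_SCRUBMSK, MCI_UC_S|MCACOD_SCRUB
};

/* Scrubbing rule after the patch: bit S is no longer looked at. */
const struct rule new_scrub = {
	MCI_STATUS_OVER|MCI_UC_AR|MCACOD_SCRUBMSK, MCI_STATUS_UC|MCACOD_SCRUB
};

/* SRAO scrub error signaled via MCE (S=1) and via CMC (S=0). */
const uint64_t srao_mce = MCI_STATUS_UC | MCI_STATUS_S | MCACOD_SCRUB;
const uint64_t srao_cmc = MCI_STATUS_UC | MCACOD_SCRUB;

void scrub_rule_selfcheck(void)
{
	assert(rule_matches(&old_scrub, srao_mce));
	assert(!rule_matches(&old_scrub, srao_cmc));	/* missed before */
	assert(rule_matches(&new_scrub, srao_mce));
	assert(rule_matches(&new_scrub, srao_cmc));	/* caught now */
}
```

With S dropped from the mask, the new rule accepts both signatures and
still rejects a status with AR=1 or OVER=1. Since the table is scanned in
order, this is presumably also why the AO entries are moved ahead of the
UCNA entry.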