Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753915AbcKRNA3 (ORCPT ); Fri, 18 Nov 2016 08:00:29 -0500 Received: from mx2.suse.de ([195.135.220.15]:44125 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752411AbcKRNA1 (ORCPT ); Fri, 18 Nov 2016 08:00:27 -0500 Date: Fri, 18 Nov 2016 14:00:22 +0100 From: Borislav Petkov To: "Luck, Tony" Cc: linux-kernel@vger.kernel.org, Andi Kleen Subject: Re: [PATCH 1/2] x86/mce: Include the PPIN in machine check records when it is available Message-ID: <20161118130022.vcnodxrslg6khycc@pd.tnic> References: <1479429348-1664-1-git-send-email-tony.luck@intel.com> <1479429348-1664-2-git-send-email-tony.luck@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1479429348-1664-2-git-send-email-tony.luck@intel.com> User-Agent: NeoMutt/20161014 (1.7.1) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2329 Lines: 86 On Thu, Nov 17, 2016 at 04:35:48PM -0800, Luck, Tony wrote: > From: Tony Luck > > Intel Xeons from Ivy Bridge onwards support a processor identification > number. On systems that have it, include it in the machine check record. > I'm told that this would be helpful for users that run large data centers > with multi-socket servers to keep track of which CPUs are seeing errors. > > Signed-off-by: Tony Luck > --- > arch/x86/include/asm/msr-index.h | 4 ++++ > arch/x86/include/uapi/asm/mce.h | 1 + > arch/x86/kernel/cpu/mcheck/mce.c | 35 +++++++++++++++++++++++++++++++++++ > 3 files changed, 40 insertions(+) ... > @@ -2134,8 +2140,37 @@ static int __init mcheck_enable(char *str) > } > __setup("mce", mcheck_enable); > > +static void mcheck_intel_ppin_init(void) So this functionality could all be moved to arch/x86/kernel/cpu/intel.c where you could set an artificial X86_FEATURE_PPIN and get rid of the have_ppin var. > +{ > + unsigned long long msr_ppin_ctl; > + > + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) > + return; Then, that check can go. > + switch (boot_cpu_data.x86_model) { > + case INTEL_FAM6_IVYBRIDGE_X: > + case INTEL_FAM6_HASWELL_X: > + case INTEL_FAM6_BROADWELL_XEON_D: > + case INTEL_FAM6_BROADWELL_X: > + case INTEL_FAM6_SKYLAKE_X: > + if (rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl)) > + return; I don't think you need to check models - if the RDMSR fails, you're done. > + if (msr_ppin_ctl == 1) { & BIT_ULL(0) for future robustness in case those other reserved bits get used. > + pr_info("PPIN available but disabled\n"); We don't care, do we? > + return; > + } > + /* if PPIN is disabled, but not locked, try to enable */ > + if (msr_ppin_ctl == 0) { Also, properly masked off. There are [63:2] reserved bits which might be assigned someday. > + wrmsrl_safe(MSR_PPIN_CTL, 2); > + rdmsrl_safe(MSR_PPIN_CTL, &msr_ppin_ctl); Why aren't we programming a number here? Or are users supposed to do that? If so, please design a proper sysfs interface and not make them use msr-tools. > + } > + if (msr_ppin_ctl == 2) > + have_ppin = 1; set_cpu_cap(c, X86_FEATURE_PPIN); -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --