Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754193AbbKLTol (ORCPT ); Thu, 12 Nov 2015 14:44:41 -0500
Received: from mga11.intel.com ([192.55.52.93]:36154 "EHLO mga11.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752375AbbKLToj (ORCPT ); Thu, 12 Nov 2015 14:44:39 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.20,283,1444719600"; d="scan'208";a="684106470"
Date: Thu, 12 Nov 2015 11:44:23 -0800
From: "Luck, Tony"
To: Andy Lutomirski
Cc: Borislav Petkov, linux-kernel@vger.kernel.org,
        linux-edac@vger.kernel.org, x86@kernel.org,
        Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH 1/3] x86, ras: Add new infrastructure for machine check fixup tables
Message-ID: <20151112194422.GA31228@agluck-desk.sc.intel.com>
References: <5bf6f812a7dd2b619487c57987e29b3884c6c4ec.1447093568.git.tony.luck@intel.com>
        <56441240.6000607@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <56441240.6000607@kernel.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1532
Lines: 36

On Wed, Nov 11, 2015 at 08:14:56PM -0800, Andy Lutomirski wrote:
> On 11/06/2015 12:57 PM, Tony Luck wrote:
> >Copy the existing page fault fixup mechanisms to create a new table
> >to be used when fixing machine checks. Note:
> >1) At this time we only provide a macro to annotate assembly code
> >2) We assume all fixups will in code builtin to the kernel.
>
> Shouldn't the first step be to fixup failures during user memory access?

We already have code to recover from machine checks encountered
while the processor is executing ring3 code.  This series gently
extends that to ring0 code in a few places that look high-profile
enough to warrant the attention (and for which we have some plan
for a recovery action).

The initial user will be filesystem code using NVDIMM as storage, i.e.
lots of memory accessed by a small amount of code. If we get a machine
check while reading the NVDIMM, we turn it into -EIO.

> This does something really weird to rax.  (Also, what happens on 32-bit
> kernels?  There's no bit 63.)

32-bit kernels are out of luck for this, but I don't feel bad about
it: you simply cannot run a 32-bit kernel on machines that have this
level of recovery (they have too much memory to boot 32-bit kernels).

> Please at least document it clearly.

Will do.

-Tony
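
For readers following the thread, here is a rough user-space sketch of
the general idea discussed above: a table pairs the address of an
instruction that may take a machine check with an address to resume at,
and the handler looks up the faulting address in that table. This is not
the code from the patch. The names mc_fixup_entry, mc_fixup_search and
mc_try_fixup are hypothetical, the table is assumed to be sorted by
instruction address, and the rax encoding questioned in the thread is
not modelled at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: the address of an instruction that may take
 * a machine check, paired with the address where execution resumes. */
struct mc_fixup_entry {
	uintptr_t insn;
	uintptr_t fixup;
};

/* Binary search over a table assumed to be sorted by insn address.
 * Returns the matching entry, or NULL if ip has no fixup. */
const struct mc_fixup_entry *
mc_fixup_search(const struct mc_fixup_entry *tbl, size_t n, uintptr_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == ip)
			return &tbl[mid];
		if (tbl[mid].insn < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Model of the handler's decision. If the faulting address has an entry,
 * redirect *ip to the fixup address and return 1; otherwise leave *ip
 * untouched and return 0 so the caller can treat the error as fatal. */
int mc_try_fixup(const struct mc_fixup_entry *tbl, size_t n, uintptr_t *ip)
{
	const struct mc_fixup_entry *e = mc_fixup_search(tbl, n, *ip);

	if (!e)
		return 0;
	*ip = e->fixup;
	return 1;
}
```

In this model, the code at the fixup address would be what hands an
error such as -EIO back to the caller, which matches the NVDIMM read
case described above.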