Date: Tue, 10 Nov 2015 13:55:46 -0800
From: "Luck, Tony"
To: Borislav Petkov
Cc: linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 0/3] Machine check recovery when kernel accesses poison
Message-ID: <20151110215546.GA28172@agluck-desk.sc.intel.com>
In-Reply-To: <20151110112101.GB19187@pd.tnic>

On Tue, Nov 10, 2015 at 12:21:01PM +0100, Borislav Petkov wrote:
> Just a general, why-do-we-do-this, question: on big systems, the memory
> occupied by the kernel is a very small percentage compared to whole RAM,
> right? And yet we want to recover from there too? Not, say, kexec...

I need to add more to the motivation part of this. The people who want
this are playing with NVDIMMs as storage. So think of many GBytes of
non-volatile memory on the source end of the memcpy(). People are used
to disk errors just giving them a -EIO error. They'll be unhappy if an
NVDIMM error crashes the machine.

> > Note that I also fudge the return value. I'd like in the future
> > to be able to write a "mcsafe_copy_from_user()" function that
> > would be annotated both for page faults, to return a count of
> > bytes uncopied, or an indication that there was a machine check.
> > Hence the BIT(63) bit.
> > Internal feedback suggested we'd need
> > some IS_ERR() like macros to help users decode what happened
> > to take the right action. But this is "RFC" to see if people
> > have better ideas on how to handle this.
>
> Hmm, shouldn't this be using MF_ACTION_REQUIRED or even maybe a new MF_
> flag which is converted into a BUS_MCEERR_AR si_code and thus current
> gets a signal?
>
> Only setting bit 63 looks a bit flaky to me...

It will be up to the caller to figure out what action to take.
In the NVDIMM filesystem scenario outlined above the result may
be -EIO for a data block ... something more drastic if we were
reading metadata.

When I get around to writing mcsafe_copy_from_user() the code
might end up like:

some_syscall_e_g_write(void __user *buf, size_t cnt)
{
	u64 ret;

	ret = mcsafe_copy_from_user(kbuf, buf, cnt);

	if (ret & BIT(63)) {
		/*
		 * Do some machine check thing ... e.g. send a SIGBUS
		 * to this process and return -EINTR.
		 * This is where we use the address (after converting
		 * back to a user virtual address).
		 */
	} else if (ret) {
		/* user gave us a bad buffer: return -EFAULT */
	} else {
		/* success!!! */
	}
}

Which all looks quite ugly in long-hand ... I'm hoping that with
some pretty macros we can make it pretty.

-Tony
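
For illustration only, the IS_ERR()-like decode helpers mentioned above
could look roughly like the sketch below. It is plain userspace C rather
than kernel code, and every name in it (MCSAFE_MC_FLAG, mcsafe_encode_mc(),
mcsafe_is_mc(), mcsafe_mc_addr(), mcsafe_classify()) is hypothetical and
not taken from the patch series. It also assumes that a machine-check
return carries an address in the low 63 bits, which the thread implies
but does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these exist in the kernel. */
#define MCSAFE_MC_FLAG (UINT64_C(1) << 63)	/* stands in for BIT(63) */

/* What the caller learns from the return value. */
enum mcsafe_result {
	MCSAFE_OK,	/* everything was copied */
	MCSAFE_FAULT,	/* page fault: value is the count of bytes not copied */
	MCSAFE_MCHECK,	/* machine check: low 63 bits assumed to hold an address */
};

/* Pack an address into a "machine check happened" return value. */
static inline uint64_t mcsafe_encode_mc(uint64_t addr)
{
	assert((addr & MCSAFE_MC_FLAG) == 0);	/* address must fit in 63 bits */
	return addr | MCSAFE_MC_FLAG;
}

/* True if the return value reports a machine check. */
static inline bool mcsafe_is_mc(uint64_t ret)
{
	return (ret & MCSAFE_MC_FLAG) != 0;
}

/* Strip the flag bit to recover the address. */
static inline uint64_t mcsafe_mc_addr(uint64_t ret)
{
	return ret & ~MCSAFE_MC_FLAG;
}

/* Collapse the three cases from the long-hand if/else chain. */
static inline enum mcsafe_result mcsafe_classify(uint64_t ret)
{
	if (mcsafe_is_mc(ret))
		return MCSAFE_MCHECK;
	return ret ? MCSAFE_FAULT : MCSAFE_OK;
}
```

A caller could then switch on mcsafe_classify(ret) and pick its own
policy per case (for example -EIO for a data block, -EFAULT for a bad
user buffer), which keeps the choice of action with the caller as
described above.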