Date: Mon, 15 Jun 2009 14:25:28 +0200
From: Nick Piggin
To: Wu Fengguang
Cc: Andi Kleen, Balbir Singh, Andrew Morton, LKML, Ingo Molnar, Mel Gorman, Thomas Gleixner, "H. Peter Anvin", Peter Zijlstra, Hugh Dickins, "riel@redhat.com", "chris.mason@oracle.com", "linux-mm@kvack.org"
Subject: Re: [PATCH 00/22] HWPOISON: Intro (v5)
Message-ID: <20090615122528.GA13256@wotan.suse.de>
References: <20090615024520.786814520@intel.com> <4A35BD7A.9070208@linux.vnet.ibm.com> <20090615042753.GA20788@localhost> <20090615064447.GA18390@wotan.suse.de> <20090615070914.GC31969@one.firstfloor.org> <20090615071907.GA8665@wotan.suse.de> <20090615121001.GA10944@localhost>
In-Reply-To: <20090615121001.GA10944@localhost>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 15, 2009 at 08:10:01PM +0800, Wu Fengguang wrote:
> On Mon, Jun 15, 2009 at 03:19:07PM +0800, Nick Piggin wrote:
> > > For KVM you need early kill, for the others it remains to be seen.
> >
> > Right. It's almost like you need to do a per-process thing, and
> > those that can handle things (such as the new SIGBUS or the new
> > EIO) could get those, and others could be killed.
>
> To send early SIGBUS kills to processes who has called
> sigaction(SIGBUS, ...)? KVM will sure do that. For other apps we
> don't mind they can understand that signal at all.

For apps that hook into SIGBUS for some other means and do not
understand the new type of SIGBUS signal? What about those?

> > Early-kill for KVM does seem like reasonable justification on the
> > surface, but when I think more about it, I wonder does the guest
> > actually stand any better chance to correct the error if it is
> > reported at time T rather than T+delta? (who knows what the page
> > will be used for at any given time).
>
> Early kill makes a lot difference for KVM. Think about the vast
> amount of clean page cache pages. With early kill the page can be
> trivially isolated. With late kill the whole virtual machine dies
> hard.

Why? In both cases it will enter the exception handler and
attempt to do something about it... in both cases I would
have thought there is some chance that the page error is not
recoverable and some chance it is recoverable. Or am I
missing something?

Anyway, I would like to see a basic analysis of those probabilities
to justify early kill. Not saying there is no justification, but
it would be helpful to see why.

Thanks,
Nick
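
The concern about applications that already hook SIGBUS can be made concrete
with a small userspace sketch. It assumes the memory-error signal is
distinguishable by si_code, using the BUS_MCEERR_AO ("action optional", early
report) and BUS_MCEERR_AR ("action required", error actually consumed) values
that Linux defines; this thread does not say whether the v5 patches used
exactly these. The names classify_sigbus(), on_sigbus() and
install_sigbus_handler() are hypothetical helpers for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Fallbacks for headers that do not expose these; the values are the
 * ones Linux uses. Treat them as an assumption on other systems. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4   /* corrupted page was accessed: action required */
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5   /* corruption found early: action optional */
#endif

enum sigbus_kind {
    SIGBUS_ORDINARY,      /* alignment fault, truncated mmap, and so on */
    SIGBUS_POISON_EARLY,  /* page reported bad before the process used it */
    SIGBUS_POISON_LATE    /* process touched the bad page */
};

/* Hypothetical helper: decide what kind of SIGBUS this is from si_code. */
enum sigbus_kind classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AO:
        return SIGBUS_POISON_EARLY;
    case BUS_MCEERR_AR:
        return SIGBUS_POISON_LATE;
    default:
        return SIGBUS_ORDINARY;
    }
}

/* Records the kind of the most recent SIGBUS for the main program. */
static volatile sig_atomic_t last_kind = SIGBUS_ORDINARY;

/* A handler written before the new signal type existed would lack this
 * classification and treat every SIGBUS the same way. */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_kind = classify_sigbus(info->si_code);
    /* A real handler would now discard or reload the page at
     * info->si_addr, or terminate if it cannot recover. */
}

/* Installs the handler with SA_SIGINFO so si_code and si_addr are
 * delivered. Returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

An application whose existing handler never inspects si_code falls into the
default branch here for every signal, which is the case the question above is
asking about.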