Date: Mon, 9 May 2011 08:39:02 -0400
From: Vivek Goyal
To: "K.Prasad"
Cc: Linux Kernel Mailing List, Andi Kleen, "Luck, Tony", kexec@lists.infradead.org
Subject: Re: [Bug] Kdump does not work when panic triggered due to MCE
Message-ID: <20110509123902.GA5975@redhat.com>
In-Reply-To: <20110506165412.GB2719@in.ibm.com>

On Fri, May 06, 2011 at 10:24:12PM +0530, K.Prasad wrote:
> Hi All,
> I wanted to test the behaviour of kdump when panic is triggered
> due to MCE on x86 and found that kdump is not captured.
>
> While the kdump service is configured and running and non-MCE panics
> (such as those triggered through /proc/sysrq-trigger) successfully
> capture a kdump, any fatal MCE error injected through the mce-inject
> tool causes a reboot of the machine.
>
> The code has been traced (using early_serial_putc()) to enter the kexec
> path i.e. panic()->crash_kexec()->machine_kexec()->relocate_kernel()
> but is untraceable further.
>
> Kdump works fine when a similar test is carried out inside a
> KVM guest.
>
> Has anybody tested this before? Or have found kdump working when fatal
> MCEs have actually occurred?

Prasad,

I have never tried taking a dump in an MCE situation. Does kdump work on
this machine with a normal panic()?

Use the --debug and --serial options in kexec-tools to print some debug
messages and look for "I am in purgatory".
This will tell you whether you hung in the first kernel or the second kernel.
Then put "outb()" messages in the kernel to trace what happened.

Thanks
Vivek
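
The triage step suggested above (checking the serial console output for the purgatory message) could be sketched roughly as follows. This is only an illustration: the function name `hang_location` is hypothetical, the log is assumed to have been captured from the serial console into a string, and the pattern also accepts an "I'm in purgatory" spelling as an assumption in case the exact wording differs between kexec-tools versions.

```python
import re

# Banner to look for, as quoted in the mail. The "I'm" variant is an
# assumption added here to tolerate a different spelling.
PURGATORY_RE = re.compile(r"I(?:'m| am) in purgatory", re.IGNORECASE)


def hang_location(console_log: str) -> str:
    """Classify where the machine stopped, based on a captured serial log.

    If the purgatory banner never appears, control did not visibly leave
    the first (crashed) kernel. If it does appear, the hang is in purgatory
    or later, i.e. on the second-kernel side.
    """
    if PURGATORY_RE.search(console_log) is None:
        return "first kernel"
    return "purgatory or second kernel"


if __name__ == "__main__":
    # Hypothetical log excerpt, not real output from the machine in question.
    sample = "Kernel panic - not syncing: Fatal machine check\n"
    print(hang_location(sample))
```

If the result is "first kernel", the outb() tracing would go into the first kernel's kexec path; otherwise it would go into the early boot code of the capture kernel.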