Date: Fri, 19 Oct 2012 11:17:53 -0400
From: Vivek Goyal <vgoyal@redhat.com>
To: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, x86@kernel.org,
        mingo@elte.hu, tglx@linutronix.de, hpa@zytor.com, len.brown@intel.com,
        fenghua.yu@intel.com, ebiederm@xmission.com, grant.likely@secretlab.ca,
        rob.herring@calxeda.com, Michael Holzheu <holzheu@linux.vnet.ibm.com>
Subject: Re: [PATCH v1 0/2] x86, apic: Disable BSP if boot cpu is AP
Message-ID: <20121019151753.GF27052@redhat.com>
References: <20121017141234.GD31663@redhat.com>
 <20121018.120805.464238435.d.hatayama@jp.fujitsu.com>
 <20121018141449.GB18147@redhat.com>
 <20121019.122054.476812873.d.hatayama@jp.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20121019.122054.476812873.d.hatayama@jp.fujitsu.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2543
Lines: 57

On Fri, Oct 19, 2012 at 12:20:54PM +0900, HATAYAMA Daisuke wrote:

[..]
> > Instead of capturing the dump of whole memory, isn't it more efficient
> > to capture the crash dump of VM in question and then if need be just
> > take a filtered crash dump of the host kernel?
> > 
> > I think that trying to take unfiltered crash dumps of terabytes of memory
> > is not practical or worth it for most of the use cases.
> > 
> 
> If there's a lag between the VM dump and the host dump, the situation on
> the host can change, and taking the VM dump itself changes the situation.
> Then we can no longer tell what kind of bug is present, so we want to do
> as few things as possible between detecting that the bug has reproduced
> and taking the host dump. That is what I meant by ``capturing the
> situation''.

I would rather first detect the problem on the guest and figure out
what's happening. Once it has been determined that something is wrong
on the host side, then debug what's wrong with the host using regular
kernel debugging techniques.

Even if you are interested in capturing a crash dump, once you have
decided that it is a host problem, you can write some scripts which
trigger a host crash dump when the relevant event happens.
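
A minimal sketch of such a script, assuming a kdump crash kernel is
already loaded on the host, sysrq is enabled, and the script runs as
root. Writing "c" to /proc/sysrq-trigger is the standard way to force a
crash; the function names and the log pattern are hypothetical and only
for illustration.

```python
import re
from typing import Callable, Iterable, Optional

SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def trigger_crash_dump(path: str = SYSRQ_TRIGGER) -> None:
    """Force a kernel crash via sysrq 'c'; kdump (if loaded) then captures the dump."""
    with open(path, "w") as f:
        f.write("c")


def watch_and_trigger(
    lines: Iterable[str],
    pattern: str,
    trigger: Callable[[], None] = trigger_crash_dump,
) -> Optional[str]:
    """Scan log lines and fire the trigger on the first line matching pattern.

    Returns the matching line, or None if the input ends without a match.
    The trigger is injectable so the logic can be tried without crashing
    the machine.
    """
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            trigger()
            return line
    return None
```

For example, the lines could come from the output of "dmesg --follow"
piped into the script, with the pattern set to whatever message marks
the event of interest. Doing as little as possible between the match and
the trigger keeps the captured host state close to the moment the event
was seen.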

Seriously, this argument could be extended to regular processes also.
Something is wrong with my application, so let's dump the whole system,
provide a facility to extract each process's core dump from that huge
dump and then examine whether it was an application issue or a kernel
issue.

I am skeptical that this approach is going to fly in practice. Dumping
huge images, then processing and transferring them, is not very
practical. So I would rather narrow down the problem on a running system
and take a filtered dump of the component where I suspect the problem is.

[..]
> > capability was the primary reason that s390 also wants to support
> > kdump; otherwise their firmware dumping mechanism was working just
> > fine.
> > 
> 
> I don't know the s390 firmware dumping mechanism at all, but is it
> possible for s390 to filter the crash dump even with the firmware
> dumping mechanism?

AFAIK, the s390 dump mechanism could not filter the dump, and that's the
reason they wanted to support kdump and /proc/vmcore so that makedumpfile
could filter it. I am CCing Michael Holzheu, who did the s390 kdump work.
He can explain it better.

Thanks
Vivek