From: Xunlei Pang
Reply-To: xlpang@redhat.com
To: Michael Holzheu, xlpang@redhat.com
Cc: Baoquan He, Atsushi Kumagai, Petr Tesarik, linux-kernel@vger.kernel.org,
    "Eric W. Biederman", hbathini@linux.vnet.ibm.com, akpm@linux-foundation.org,
    Dave Young, kexec@lists.infradead.org
Subject: Re: [PATCH v3 1/3] kexec: Move vmcoreinfo out of the kernel's .bss section
Date: Fri, 24 Mar 2017 19:03:06 +0800
Message-ID: <58D4FCEA.6010708@redhat.com>
In-Reply-To: <20170323184631.1cd671ba@TP-holzheu>
References: <1489989033-1179-1-git-send-email-xlpang@redhat.com>
    <87pohbz4lo.fsf@xmission.com>
    <20170322025536.GA4424@dhcp-128-65.nay.redhat.com>
    <87pohaxam1.fsf@xmission.com>
    <20170322043004.GB4424@dhcp-128-65.nay.redhat.com>
    <20170322214819.0b64d49c@TP-holzheu>
    <58D39429.9020200@redhat.com>
    <20170323184631.1cd671ba@TP-holzheu>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/24/2017 at 01:46 AM, Michael Holzheu wrote:
> On Thu, 23 Mar 2017 17:23:53 +0800
> Xunlei Pang wrote:
>
>> On 03/23/2017 at 04:48 AM, Michael Holzheu wrote:
>>> On Wed, 22 Mar 2017 12:30:04 +0800
>>> Dave Young wrote:
>>>
>>>> On 03/21/17 at 10:18pm, Eric W. Biederman wrote:
>>>>> Dave Young writes:
>>>>>
>>> [snip]
>>>
>>>>>> I think makedumpfile is using it, but I also vote to remove the
>>>>>> CRASHTIME. It is better not to do this while crashing and a makedumpfile
>>>>>> userspace patch is needed to drop the use of it.
>>>>>>
>>>>>>> As we are looking at reliability concerns removing CRASHTIME should make
>>>>>>> everything in vmcoreinfo a boot time constant. Which should simplify
>>>>>>> everything considerably.
>>>>>> It is a nice improvement..
>>>>> We also need to take a close look at what s390 is doing with vmcoreinfo.
>>>>> As apparently it is reading it in a different kind of crashdump process.
>>>> Yes, need careful review from s390 and maybe ppc64 especially about
>>>> patch 2/3, better to have comments from IBM about s390 dump tool and ppc
>>>> fadump. Added more cc.
>>> On s390 we have at least an issue with patch 1/3.
>>> For stand-alone dump and also because we create the ELF header for
>>> kdump in the new kernel we save the pointer to the vmcoreinfo note in
>>> the old kernel on a defined memory address in our absolute zero lowcore.
>>>
>>> This is done in arch/s390/kernel/setup.c:
>>>
>>> static void __init setup_vmcoreinfo(void)
>>> {
>>>         mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note());
>>> }
>>>
>>> Since with patch 1/3 paddr_vmcoreinfo_note() returns NULL at this point in
>>> time we have a problem here.
>>>
>>> To solve this - I think - we could move the initialization to
>>> arch/s390/kernel/machine_kexec.c:
>>>
>>> void arch_crash_save_vmcoreinfo(void)
>>> {
>>>         VMCOREINFO_SYMBOL(lowcore_ptr);
>>>         VMCOREINFO_SYMBOL(high_memory);
>>>         VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS);
>>>         mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note());
>>> }
>>>
>>> Probably related to this is my observation that patch 3/3 leads to
>>> an empty VMCOREINFO note for kdump on s390. The note is there ...
>>>
>>> # readelf -n /var/crash/127.0.0.1-2017-03-22-21:14:39/vmcore | grep VMCORE
>>> VMCOREINFO 0x0000068e Unknown note type: (0x00000000)
>>>
>>> But it contains only zeros.
>> Yes, this is a good catch, I will do more tests.
> Hello Xunlei,
>
> After spending some time on this, I now understood the problem:
>
> In patch 3/3 you copy vmcoreinfo into the control page before
> machine_kexec_prepare() is called.
> For s390 we give back all the crashkernel memory to the hypervisor
> before the new crashkernel is loaded:
>
> /*
>  * Give back memory to hypervisor before new kdump is loaded
>  */
> static int machine_kexec_prepare_kdump(void)
> {
> #ifdef CONFIG_CRASH_DUMP
>         if (MACHINE_IS_VM)
>                 diag10_range(PFN_DOWN(crashk_res.start),
>                              PFN_DOWN(crashk_res.end - crashk_res.start + 1));
>         return 0;
> #else
>         return -EINVAL;
> #endif
> }
>
> So after machine_kexec_prepare_kdump() the contents of your control page
> are gone and therefore the vmcoreinfo ELF note contains only zeros.
>
> If you call kimage_crash_copy_vmcoreinfo() after
> machine_kexec_prepare_kdump() the problem should be solved for s390.

Will update, thanks for finding the root cause.

Regards,
Xunlei
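
For illustration only, the ordering problem described above can be modelled in a few lines of user-space C. This is a rough sketch, not kernel code: crash_region, vmcoreinfo_data, release_crash_region(), copy_vmcoreinfo() and note_survives() are all hypothetical stand-ins, and the release step is assumed to leave the pages reading back as zeros, as reported for the s390 note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CRASH_REGION_SIZE 64

/* Hypothetical stand-in for the reserved crashkernel memory. */
static char crash_region[CRASH_REGION_SIZE];

/* Hypothetical stand-in for vmcoreinfo data kept outside that region. */
static const char vmcoreinfo_data[] = "OSRELEASE=example";

/* Models the release step: assumed to leave the pages reading as zeros. */
static void release_crash_region(void)
{
	memset(crash_region, 0, sizeof(crash_region));
}

/* Models the copy step: place the note contents into the crash region. */
static void copy_vmcoreinfo(void)
{
	assert(sizeof(vmcoreinfo_data) <= sizeof(crash_region));
	memcpy(crash_region, vmcoreinfo_data, sizeof(vmcoreinfo_data));
}

/* Returns true if the note area in the crash region holds only zeros. */
static bool note_is_empty(void)
{
	for (size_t i = 0; i < sizeof(vmcoreinfo_data); i++)
		if (crash_region[i] != 0)
			return false;
	return true;
}

/* Runs both steps in the chosen order and reports whether the note survived. */
bool note_survives(bool copy_before_release)
{
	memset(crash_region, 0, sizeof(crash_region));
	if (copy_before_release) {
		copy_vmcoreinfo();
		release_crash_region();
	} else {
		release_crash_region();
		copy_vmcoreinfo();
	}
	return !note_is_empty();
}
```

In this model note_survives(true) corresponds to the current order in patch 3/3 (copy, then release), and note_survives(false) to the proposed order (release, then copy).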