Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754897AbcDFM0M (ORCPT ); Wed, 6 Apr 2016 08:26:12 -0400
Received: from mx1.redhat.com ([209.132.183.28]:54051 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754082AbcDFM0J (ORCPT ); Wed, 6 Apr 2016 08:26:09 -0400
Reply-To: xlpang@redhat.com
Subject: Re: [PATCH] s390/kexec: Consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres()
References: <1459338441-21919-1-git-send-email-xlpang@redhat.com> <20160330123026.GB2607@x1.redhat.com> <20160331024309.GA16292@dhcp-128-25.nay.redhat.com> <20160331025255.GB2555@x1.redhat.com> <56FC9EE7.7020506@redhat.com>
To: Baoquan He, Minfei Huang, akpm@linux-foundation.org
Cc: kexec@lists.infradead.org, linux-kernel@vger.kernel.org, ebiederm@xmission.com, Michael Holzheu, Vivek Goyal
From: Xunlei Pang
Message-ID: <5705005A.4000600@redhat.com>
Date: Wed, 6 Apr 2016 20:26:02 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <56FC9EE7.7020506@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 13766
Lines: 374

On 2016/03/31 at 11:52, Xunlei Pang wrote:
> Hi Bao,
>
> On 2016/03/31 at 10:52, Baoquan He wrote:
>> On 03/31/16 at 10:43am, Minfei Huang wrote:
>>> On 03/30/16 at 08:30pm, Baoquan He wrote:
>>>> Hi Xunlei,
>>>>
>>>> I have two questions.
>>>>
>>>> One is do we still need Minfei's patch if this patch is applied since
>>>> you have completely delete crash_map/unmap_reserved_pages in
>>>> kernel/kexec.c ?
>>> I think it is necessary to apply my bug-fixing patch firstly before
>>> apply this, since other maintainers can backport my bug-fixing patch to
>>> fix issue for stable linux kernel.
>> This is why previously I said you two need to get together to discuss how
>> to fix this issue and post. Two questions: 1st is Xunlei is doing a
>> cleanup but leave the map/unmap there though they are doing the same
>> thing in different way; 2nd is your bug fix patch with his clean up. It
>> looks like a total mess, to reviewers and maintainers. So now I will leave
>> these to other people interested to review because I personally don't
>> like it, but I don't object it strongly since I don't like always arguing
>> by type writing.
>>
> Thanks for your comments, and I'm fine with your concern.
>
> There is a "historical" reason, we didn't expect these patches back then,
> they were coming out gradually due to some discussion in the mailing list.
>
> It would be clear if these patches were reordered as follows:
> Minfei's patchset:
> [Patch01] kexec: make a pair of map/unmap reserved pages in error path
> [Patch02] kexec: do a cleanup for function kexec_load
>
> Then my patchset:
> [Patch01] kexec: introduce a protection mechanism for the crashkernel reserved memory
> [Patch02] s390/kexec: Consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres()
> [Patch03(x86_64)] kexec: provide arch_kexec_protect(unprotect)_crashkres()
>
> I don't know if it is possible to reorder that since they are already in "linux-next", ask Andrew for help :-)

Ping Andrew :-)

>
> Regards,
> Xunlei
>
>>> Thanks
>>> Minfei
>>>
>>>> On 03/30/16 at 07:47pm, Xunlei Pang wrote:
>>>>> Commit 3f625002581b ("kexec: introduce a protection mechanism
>>>>> for the crashkernel reserved memory") is a similar mechanism
>>>>> for protecting the crash kernel reserved memory to previous
>>>>> crash_map/unmap_reserved_pages() implementation, the new one
>>>>> is more generic in name and cleaner in code (besides, some
>>>>> arch may not be allowed to unmap the pgtable).
>>>>>
>>>>> Therefore, this patch consolidates them, and uses the new
>>>>> arch_kexec_protect(unprotect)_crashkres() to replace former
>>>>> crash_map/unmap_reserved_pages() which by now has been only
>>>>> used by S390.
>>>>>
>>>>> The consolidation work needs the crash memory to be mapped
>>>>> initially, so get rid of S390 crash kernel memblock removal
>>>>> in reserve_crashkernel(). Once kdump kernel is loaded, the
>>>>> new arch_kexec_protect_crashkres() implemented for S390 will
>>>>> actually unmap the pgtable like before.
>>>>>
>>>>> The patch also fixed a S390 crash_shrink_memory() bad page warning
>>>>> in passing due to not using memblock_reserve():
>>>>> BUG: Bad page state in process bash pfn:7e400
>>>>> page:000003d101f90000 count:0 mapcount:1 mapping: (null) index:0x0
>>>>> flags: 0x0()
>>>>> page dumped because: nonzero mapcount
>>>>> Modules linked in: ghash_s390 prng aes_s390 des_s390 des_generic
>>>>> CPU: 0 PID: 1558 Comm: bash Not tainted 4.6.0-rc1-next-20160327 #1
>>>>> 0000000073007a58 0000000073007ae8 0000000000000002 0000000000000000
>>>>> 0000000073007b88 0000000073007b00 0000000073007b00 000000000022cf4e
>>>>> 0000000000a579b8 00000000007b0dd6 0000000000791a8c 000000000000000b
>>>>> 0000000073007b48 0000000073007ae8 0000000000000000 0000000000000000
>>>>> 070003d100000001 0000000000112f20 0000000073007ae8 0000000073007b48
>>>>> Call Trace:
>>>>> ([<0000000000112e0c>] show_trace+0x5c/0x78)
>>>>> ([<0000000000112ed4>] show_stack+0x6c/0xe8)
>>>>> ([<00000000003f28dc>] dump_stack+0x84/0xb8)
>>>>> ([<0000000000235454>] bad_page+0xec/0x158)
>>>>> ([<00000000002357a4>] free_pages_prepare+0x2e4/0x308)
>>>>> ([<00000000002383a2>] free_hot_cold_page+0x42/0x198)
>>>>> ([<00000000001c45e0>] crash_free_reserved_phys_range+0x60/0x88)
>>>>> ([<00000000001c49b0>] crash_shrink_memory+0xb8/0x1a0)
>>>>> ([<000000000015bcae>] kexec_crash_size_store+0x46/0x60)
>>>>> ([<000000000033d326>] kernfs_fop_write+0x136/0x180)
>>>>> ([<00000000002b253c>] __vfs_write+0x3c/0x100)
>>>>> ([<00000000002b35ce>] vfs_write+0x8e/0x190)
>>>>> ([<00000000002b4ca0>] SyS_write+0x60/0xd0)
>>>>> ([<000000000063067c>] system_call+0x244/0x264)
>>>>>
>>>>> Cc: Michael Holzheu
>>>>> Signed-off-by: Xunlei Pang
>>>>> ---
>>>>> Tested kexec/kdump on S390x
>>>>>
>>>>>  arch/s390/kernel/machine_kexec.c | 86 ++++++++++++++++++++++------------------
>>>>>  arch/s390/kernel/setup.c         |  7 ++--
>>>>>  include/linux/kexec.h            |  2 -
>>>>>  kernel/kexec.c                   | 12 ------
>>>>>  kernel/kexec_core.c              | 11 +----
>>>>>  5 files changed, 54 insertions(+), 64 deletions(-)
>>>>>
>>>>> diff --git a/arch/s390/kernel/machine_kexec.c b/arch/s390/kernel/machine_kexec.c
>>>>> index 2f1b721..1ec6cfc 100644
>>>>> --- a/arch/s390/kernel/machine_kexec.c
>>>>> +++ b/arch/s390/kernel/machine_kexec.c
>>>>> @@ -35,6 +35,52 @@ extern const unsigned long long relocate_kernel_len;
>>>>>  #ifdef CONFIG_CRASH_DUMP
>>>>>
>>>>>  /*
>>>>> + * Map or unmap crashkernel memory
>>>>> + */
>>>>> +static void crash_map_pages(int enable)
>>>>> +{
>>>>> +	unsigned long size = resource_size(&crashk_res);
>>>>> +
>>>>> +	BUG_ON(crashk_res.start % KEXEC_CRASH_MEM_ALIGN ||
>>>>> +	       size % KEXEC_CRASH_MEM_ALIGN);
>>>>> +	if (enable)
>>>>> +		vmem_add_mapping(crashk_res.start, size);
>>>>> +	else {
>>>>> +		vmem_remove_mapping(crashk_res.start, size);
>>>>> +		if (size)
>>>>> +			os_info_crashkernel_add(crashk_res.start, size);
>>>>> +		else
>>>>> +			os_info_crashkernel_add(0, 0);
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Map crashkernel memory
>>>>> + */
>>>>> +static void crash_map_reserved_pages(void)
>>>>> +{
>>>>> +	crash_map_pages(1);
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Unmap crashkernel memory
>>>>> + */
>>>>> +static void crash_unmap_reserved_pages(void)
>>>>> +{
>>>>> +	crash_map_pages(0);
>>>>> +}
>>>>> +
>>>>> +void arch_kexec_protect_crashkres(void)
>>>> The second is in kernel I saw res is abbreviation of resource. So here
>>>> what is the full name of crashkres?
>>>>
>>>>
>>>>> +{
>>>>> +	crash_unmap_reserved_pages();
>>>>> +}
>>>>> +
>>>>> +void arch_kexec_unprotect_crashkres(void)
>>>>> +{
>>>>> +	crash_map_reserved_pages();
>>>>> +}
>>>>> +
>>>>> +/*
>>>>>   * PM notifier callback for kdump
>>>>>   */
>>>>>  static int machine_kdump_pm_cb(struct notifier_block *nb, unsigned long action,
>>>>> @@ -43,12 +89,12 @@ static int machine_kdump_pm_cb(struct notifier_block *nb, unsigned long action,
>>>>>  	switch (action) {
>>>>>  	case PM_SUSPEND_PREPARE:
>>>>>  	case PM_HIBERNATION_PREPARE:
>>>>> -		if (crashk_res.start)
>>>>> +		if (kexec_crash_image)
>>>>>  			crash_map_reserved_pages();
>>>>>  		break;
>>>>>  	case PM_POST_SUSPEND:
>>>>>  	case PM_POST_HIBERNATION:
>>>>> -		if (crashk_res.start)
>>>>> +		if (kexec_crash_image)
>>>>>  			crash_unmap_reserved_pages();
>>>>>  		break;
>>>>>  	default:
>>>>> @@ -147,42 +193,6 @@ static int kdump_csum_valid(struct kimage *image)
>>>>>  }
>>>>>
>>>>>  /*
>>>>> - * Map or unmap crashkernel memory
>>>>> - */
>>>>> -static void crash_map_pages(int enable)
>>>>> -{
>>>>> -	unsigned long size = resource_size(&crashk_res);
>>>>> -
>>>>> -	BUG_ON(crashk_res.start % KEXEC_CRASH_MEM_ALIGN ||
>>>>> -	       size % KEXEC_CRASH_MEM_ALIGN);
>>>>> -	if (enable)
>>>>> -		vmem_add_mapping(crashk_res.start, size);
>>>>> -	else {
>>>>> -		vmem_remove_mapping(crashk_res.start, size);
>>>>> -		if (size)
>>>>> -			os_info_crashkernel_add(crashk_res.start, size);
>>>>> -		else
>>>>> -			os_info_crashkernel_add(0, 0);
>>>>> -	}
>>>>> -}
>>>>> -
>>>>> -/*
>>>>> - * Map crashkernel memory
>>>>> - */
>>>>> -void crash_map_reserved_pages(void)
>>>>> -{
>>>>> -	crash_map_pages(1);
>>>>> -}
>>>>> -
>>>>> -/*
>>>>> - * Unmap crashkernel memory
>>>>> - */
>>>>> -void crash_unmap_reserved_pages(void)
>>>>> -{
>>>>> -	crash_map_pages(0);
>>>>> -}
>>>>> -
>>>>> -/*
>>>>>   * Give back memory to hypervisor before new kdump is loaded
>>>>>   */
>>>>>  static int machine_kexec_prepare_kdump(void)
>>>>> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
>>>>> index d3f9688..5f00437 100644
>>>>> --- a/arch/s390/kernel/setup.c
>>>>> +++ b/arch/s390/kernel/setup.c
>>>>> @@ -603,7 +603,7 @@ static void __init reserve_crashkernel(void)
>>>>>  	crashk_res.start = crash_base;
>>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>>> -	memblock_remove(crash_base, crash_size);
>>>>> +	memblock_reserve(crash_base, crash_size);
>>>>>  	pr_info("Reserving %lluMB of memory at %lluMB "
>>>>>  		"for crashkernel (System RAM: %luMB)\n",
>>>>>  		crash_size >> 20, crash_base >> 20,
>>>>> @@ -871,7 +871,6 @@ void __init setup_arch(char **cmdline_p)
>>>>>  	setup_memory();
>>>>>
>>>>>  	check_initrd();
>>>>> -	reserve_crashkernel();
>>>>>  #ifdef CONFIG_CRASH_DUMP
>>>>>  	/*
>>>>>  	 * Be aware that smp_save_dump_cpus() triggers a system reset.
>>>>> @@ -890,7 +889,9 @@ void __init setup_arch(char **cmdline_p)
>>>>>  	/*
>>>>>  	 * Create kernel page tables and switch to virtual addressing.
>>>>>  	 */
>>>>> -	paging_init();
>>>>> +	paging_init();
>>>>> +
>>>>> +	reserve_crashkernel();
>>>>>
>>>>>  	/* Setup default console */
>>>>>  	conmode_default();
>>>>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>>>>> index f82d6a2..c76641c 100644
>>>>> --- a/include/linux/kexec.h
>>>>> +++ b/include/linux/kexec.h
>>>>> @@ -230,8 +230,6 @@ extern void crash_kexec(struct pt_regs *);
>>>>>  int kexec_should_crash(struct task_struct *);
>>>>>  void crash_save_cpu(struct pt_regs *regs, int cpu);
>>>>>  void crash_save_vmcoreinfo(void);
>>>>> -void crash_map_reserved_pages(void);
>>>>> -void crash_unmap_reserved_pages(void);
>>>>>  void arch_crash_save_vmcoreinfo(void);
>>>>>  __printf(1, 2)
>>>>>  void vmcoreinfo_append_str(const char *fmt, ...);
>>>>> diff --git a/kernel/kexec.c b/kernel/kexec.c
>>>>> index b73dc21..4384672 100644
>>>>> --- a/kernel/kexec.c
>>>>> +++ b/kernel/kexec.c
>>>>> @@ -136,9 +136,6 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
>>>>>  	if (ret)
>>>>>  		return ret;
>>>>>
>>>>> -	if (flags & KEXEC_ON_CRASH)
>>>>> -		crash_map_reserved_pages();
>>>>> -
>>>>>  	if (flags & KEXEC_PRESERVE_CONTEXT)
>>>>>  		image->preserve_context = 1;
>>>>>
>>>>> @@ -161,12 +158,6 @@ out:
>>>>>  	if ((flags & KEXEC_ON_CRASH) && kexec_crash_image)
>>>>>  		arch_kexec_protect_crashkres();
>>>>>
>>>>> -	/*
>>>>> -	 * Once the reserved memory is mapped, we should unmap this memory
>>>>> -	 * before returning
>>>>> -	 */
>>>>> -	if (flags & KEXEC_ON_CRASH)
>>>>> -		crash_unmap_reserved_pages();
>>>>>  	kimage_free(image);
>>>>>  	return ret;
>>>>>  }
>>>>> @@ -232,9 +223,6 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
>>>>>
>>>>>  	result = do_kexec_load(entry, nr_segments, segments, flags);
>>>>>
>>>>> -	if ((flags & KEXEC_ON_CRASH) && kexec_crash_image)
>>>>> -		arch_kexec_protect_crashkres();
>>>>> -
>>>>>  	mutex_unlock(&kexec_mutex);
>>>>>
>>>>>  	return result;
>>>>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>>>>> index f826e11..58cd872 100644
>>>>> --- a/kernel/kexec_core.c
>>>>> +++ b/kernel/kexec_core.c
>>>>> @@ -953,7 +953,6 @@ int crash_shrink_memory(unsigned long new_size)
>>>>>  	start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
>>>>>  	end = roundup(start + new_size, KEXEC_CRASH_MEM_ALIGN);
>>>>>
>>>>> -	crash_map_reserved_pages();
>>>>>  	crash_free_reserved_phys_range(end, crashk_res.end);
>>>>>
>>>>>  	if ((start == end) && (crashk_res.parent != NULL))
>>>>> @@ -967,7 +966,6 @@ int crash_shrink_memory(unsigned long new_size)
>>>>>  	crashk_res.end = end - 1;
>>>>>
>>>>>  	insert_resource(&iomem_resource, ram_res);
>>>>> -	crash_unmap_reserved_pages();
>>>>>
>>>>>  unlock:
>>>>>  	mutex_unlock(&kexec_mutex);
>>>>> @@ -1549,17 +1547,12 @@ int kernel_kexec(void)
>>>>>  }
>>>>>
>>>>>  /*
>>>>> - * Add and remove page tables for crashkernel memory
>>>>> + * Protection mechanism for crashkernel reserved memory after
>>>>> + * the kdump kernel is loaded.
>>>>>   *
>>>>>   * Provide an empty default implementation here -- architecture
>>>>>   * code may override this
>>>>>   */
>>>>> -void __weak crash_map_reserved_pages(void)
>>>>> -{}
>>>>> -
>>>>> -void __weak crash_unmap_reserved_pages(void)
>>>>> -{}
>>>>> -
>>>>>  void __weak arch_kexec_protect_crashkres(void)
>>>>>  {}
>>>>>
>>>>> --
>>>>> 1.8.3.1
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> kexec mailing list
>>>>> kexec@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/kexec