Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753010AbaANBpM (ORCPT ); Mon, 13 Jan 2014 20:45:12 -0500
Received: from mx1.redhat.com ([209.132.183.28]:36336 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751924AbaANBpH (ORCPT ); Mon, 13 Jan 2014 20:45:07 -0500
Date: Tue, 14 Jan 2014 09:45:16 +0800
From: Dave Young
To: Prarit Bhargava
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
	"H. Peter Anvin", x86@kernel.org, Len Brown, "Rafael J. Wysocki",
	Linn Crosetto, Pekka Enberg, Yinghai Lu, Andrew Morton, Toshi Kani,
	Tang Chen, Wen Congyang, Vivek Goyal, kosaki.motohiro@gmail.com,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] x86, acpi memory hotplug, add parameter to disable memory hotplug
Message-ID: <20140114014516.GC4327@dhcp-16-126.nay.redhat.com>
References: <1389650161-13292-1-git-send-email-prarit@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1389650161-13292-1-git-send-email-prarit@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/13/14 at 04:56pm, Prarit Bhargava wrote:
> When booting a kexec/kdump kernel on a system that has specific memory hotplug
> regions the boot will fail with warnings like:
>
> [ 2.939467] swapper/0: page allocation failure: order:9, mode:0x84d0
> [ 2.946564] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 3.10.0-65.el7.x86_64 #1
> [ 2.954532] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS
> QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
> [ 2.964926] 0000000000000000 ffff8800341bd8c8 ffffffff815bcc67
> ffff8800341bd950
> [ 2.973224] ffffffff8113b1a0 ffff880036339b00 0000000000000009
> 00000000000084d0
> [ 2.981523] ffff8800341bd950 ffffffff815b87ee 0000000000000000
> 0000000000000200
> [ 2.989821] Call Trace:
> [ 2.992560] [] dump_stack+0x19/0x1b
> [ 2.998300] [] warn_alloc_failed+0xf0/0x160
> [ 3.004817] [] ?
> __alloc_pages_direct_compact+0xac/0x196
> [ 3.012594] [] __alloc_pages_nodemask+0x7ff/0xa00
> [ 3.019692] [] vmemmap_alloc_block+0x62/0xba
> [ 3.026303] [] vmemmap_alloc_block_buf+0x15/0x3b
> [ 3.033302] [] vmemmap_populate+0xb4/0x21b
> [ 3.039718] [] sparse_mem_map_populate+0x27/0x35
> [ 3.046717] [] sparse_add_one_section+0x7a/0x185
> [ 3.053720] [] __add_pages+0xaf/0x240
> [ 3.059656] [] arch_add_memory+0x59/0xd0
> [ 3.065877] [] add_memory+0xb9/0x1b0
> [ 3.071713] [] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [] acpi_scan_init+0x63/0x160
> [ 3.126153] [] acpi_init+0x25d/0x2a6
> [ 3.131987] [] ? acpi_sleep_proc_init+0x2a/0x2a
> [ 3.138889] [] do_one_initcall+0xe2/0x190
> [ 3.145210] [] kernel_init_freeable+0x17c/0x207
> [ 3.152111] [] ? do_early_param+0x88/0x88
> [ 3.158430] [] ? rest_init+0x80/0x80
> [ 3.164264] [] kernel_init+0xe/0x180
> [ 3.170097] [] ret_from_fork+0x7c/0xb0
> [ 3.176123] [] ? rest_init+0x80/0x80
> [ 3.181956] Mem-Info:
> [ 3.184490] Node 0 DMA per-cpu:
> [ 3.188007] CPU 0: hi: 0, btch: 1 usd: 0
> [ 3.193353] Node 0 DMA32 per-cpu:
> [ 3.197060] CPU 0: hi: 42, btch: 7 usd: 0
> [ 3.202410] active_anon:0 inactive_anon:0 isolated_anon:0
> [ 3.202410] active_file:0 inactive_file:0 isolated_file:0
> [ 3.202410] unevictable:0 dirty:0 writeback:0 unstable:0
> [ 3.202410] free:872 slab_reclaimable:13 slab_unreclaimable:1880
> [ 3.202410] mapped:0 shmem:0 pagetables:0 bounce:0
> [ 3.202410] free_cma:0
>
> because the system has run out of memory at boot time. This occurs
> because of the following sequence in the boot:
>
> Main kernel boots and sets E820 map.
> The second kernel is booted with a
> map generated by the kdump service using memmap= and memmap=exactmap.
> These parameters are added to the kernel parameters of the kexec/kdump
> kernel. The kexec/kdump kernel has limited memory resources so as not
> to severely impact the main kernel.
>
> The system then panics and the kdump/kexec kernel boots (which is a
> completely new kernel boot). During this boot ACPI is initialized and the
> kernel (as can be seen above) traverses the ACPI namespace and finds an
> entry for a memory device to be hotadded.
>
> i.e.:
>
> [ 3.053720] [] __add_pages+0xaf/0x240
> [ 3.059656] [] arch_add_memory+0x59/0xd0
> [ 3.065877] [] add_memory+0xb9/0x1b0
> [ 3.071713] [] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [] acpi_scan_init+0x63/0x160
> [ 3.126153] [] acpi_init+0x25d/0x2a6
>
> At this point the kernel adds page table information and the kexec/kdump
> kernel runs out of memory.
>
> This can also be reproduced with a "regular" kernel by using the
> memmap=exactmap and mem=X parameters on the main kernel and booting.
>
> This patchset resolves the problem by adding a kernel parameter,
> acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
> should also be disabled by default when a user specifies a memory mapping with
> "memmap=exactmap" or "mem=X".
>
> Signed-off-by: Prarit Bhargava
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: "H. Peter Anvin"
> Cc: x86@kernel.org
> Cc: Len Brown
> Cc: "Rafael J. Wysocki"
> Cc: Linn Crosetto
> Cc: Pekka Enberg
> Cc: Yinghai Lu
> Cc: Andrew Morton
> Cc: Toshi Kani
> Cc: Tang Chen
> Cc: Wen Congyang
> Cc: Vivek Goyal
> Cc: kosaki.motohiro@gmail.com
> Cc: dyoung@redhat.com
> Cc: Toshi Kani
> Cc: linux-acpi@vger.kernel.org
> Cc: linux-mm@kvack.org
> ---
>  Documentation/kernel-parameters.txt |    3 +++
>  arch/x86/kernel/e820.c              |    4 ++++
>  drivers/acpi/acpi_memhotplug.c      |   18 ++++++++++++++++++
>  include/linux/memory_hotplug.h      |    9 +++++++++
>  4 files changed, 34 insertions(+)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index b9e9bd8..ea93f75 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -343,6 +343,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			no: ACPI OperationRegions are not marked as reserved,
>  			no further checks are performed.
>
> +	acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kexec
> +			and kdump kernels.
> +
>  	add_efi_memmap	[EFI; X86] Include EFI memory map in
>  			kernel's map of available physical RAM.
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 174da5f..3c431fe 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -20,6 +20,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -834,6 +835,8 @@ static int __init parse_memopt(char *p)
>  		return -EINVAL;
>  	e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
>
> +	set_acpi_no_memhotplug();
> +
>  	return 0;
>  }
>  early_param("mem", parse_memopt);
> @@ -857,6 +860,7 @@ static int __init parse_memmap_one(char *p)
>  #endif
>  		e820.nr_map = 0;
>  		userdef = 1;
> +		set_acpi_no_memhotplug();
>  		return 0;
>  	}
>
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 551dad7..d104a7d 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -361,7 +361,25 @@ static void acpi_memory_device_remove(struct acpi_device *device)
>  	acpi_memory_device_free(mem_device);
>  }
>
> +static bool acpi_no_memhotplug;
> +
> +void set_acpi_no_memhotplug(void)
> +{
> +	acpi_no_memhotplug = true;
> +	pr_info_once("ACPI: Memory Hotplug Disabled\n");
> +}
> +
>  void __init acpi_memory_hotplug_init(void)
>  {
> +	if (acpi_no_memhotplug)
> +		return;
> +
>  	acpi_scan_add_handler_with_hotplug(&memory_device_handler, "memory");
>  }
> +
> +static int __init disable_acpi_memory_hotplug(char *str)
> +{
> +	set_acpi_no_memhotplug();
> +	return 1;
> +}
> +__setup("acpi_no_memhotplug", disable_acpi_memory_hotplug);
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 4ca3d95..3cdb6e0 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -12,6 +12,15 @@ struct pglist_data;
>  struct mem_section;
>  struct memory_block;
>
> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +/* set flag to disable ACPI memory hotplug */
> +extern void set_acpi_no_memhotplug(void);
> +#else
> +static inline void set_acpi_no_memhotplug(void)
> +{
> +}
> +#endif
> +
>  #ifdef CONFIG_MEMORY_HOTPLUG
>
>  /*
> --
> 1.7.9.3
>

Acked-by: Dave Young
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
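
For readers following the thread, the gating logic in the quoted patch can
be modelled in userspace roughly as below. This is only a sketch, not kernel
code: model_parse_cmdline(), model_acpi_memory_hotplug_init() and
model_boot() are hypothetical names, and the memory_handler_registered flag
stands in for the real call to acpi_scan_add_handler_with_hotplug(). It
assumes a space-separated command line and simplifies the patch by treating
any "mem=" token as disabling hotplug.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same flag name as in the patch; everything else is a userspace stand-in. */
static bool acpi_no_memhotplug;

/* Hypothetical: records whether the memory hotplug handler "got registered". */
static bool memory_handler_registered;

static void set_acpi_no_memhotplug(void)
{
	acpi_no_memhotplug = true;
}

/*
 * Hypothetical model of the three places the patch sets the flag:
 * the acpi_no_memhotplug parameter, mem=X, and memmap=exactmap.
 */
static void model_parse_cmdline(const char *cmdline)
{
	char buf[256];
	char *tok;

	assert(cmdline != NULL);
	strncpy(buf, cmdline, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';

	for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
		if (strcmp(tok, "acpi_no_memhotplug") == 0 ||
		    strncmp(tok, "mem=", 4) == 0 ||
		    strcmp(tok, "memmap=exactmap") == 0)
			set_acpi_no_memhotplug();
	}
}

/* Mirrors the early return added to acpi_memory_hotplug_init(). */
static void model_acpi_memory_hotplug_init(void)
{
	if (acpi_no_memhotplug)
		return;

	memory_handler_registered = true;
}

/* Returns true if the handler would be registered for this command line. */
static bool model_boot(const char *cmdline)
{
	acpi_no_memhotplug = false;
	memory_handler_registered = false;

	model_parse_cmdline(cmdline);
	model_acpi_memory_hotplug_init();

	return memory_handler_registered;
}
```

In this model, model_boot("mem=512M") and
model_boot("memmap=exactmap memmap=640K@0") both leave the handler
unregistered, while a command line with none of the three options registers
it as before.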