Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261255AbVBRAIw (ORCPT ); Thu, 17 Feb 2005 19:08:52 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261260AbVBRAIv (ORCPT ); Thu, 17 Feb 2005 19:08:51 -0500
Received: from e6.ny.us.ibm.com ([32.97.182.146]:51615 "EHLO e6.ny.us.ibm.com")
	by vger.kernel.org with ESMTP id S261255AbVBRAEE (ORCPT );
	Thu, 17 Feb 2005 19:04:04 -0500
Subject: [RFC][PATCH] Sparse Memory Handling (hot-add foundation)
From: Dave Hansen
To: Linux Kernel Mailing List
Cc: lhms , linux-mm , "David C. Hansen [imap]" , Andy Whitcroft
Content-Type: multipart/mixed; boundary="=-A3i1EPyLLA3BSUWJ5xIF"
Date: Thu, 17 Feb 2005 16:03:53 -0800
Message-Id: <1108685033.6482.38.camel@localhost>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.3
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 83214
Lines: 2762

--=-A3i1EPyLLA3BSUWJ5xIF
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

The attached patch, largely written by Andy Whitcroft, implements a
feature which is similar to DISCONTIGMEM, but has some added features.
Instead of splitting up the mem_map for each NUMA node, this splits it
up into areas that represent fixed blocks of memory. This allows
individual pieces of that memory to be easily added and removed.

Because it is so similar to DISCONTIGMEM, it can actually be used in
place of it on NUMA systems such as the NUMAQ, or Summit architectures.
This patch includes an i386 and ppc64 implementation, but there are
x86_64 and ia64 implementations as well.

There are a number of individual patches (with descriptions) which are
rolled up in the attached patch: all of the files up to and including
"G2-no-memory-at-high_memory-ppc64.patch" from this directory:

	http://www.sr71.net/patches/2.6.11/2.6.11-rc3-mhp1/broken-out/

I can post individual patches if anyone would like to comment on them.
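
For readers who want a concrete picture of "fixed blocks of memory", here
is a rough userspace sketch of the idea. It is not code from the patch:
every sketch_* name, the section size, and the table size are made up for
illustration, and the return value of sketch_memory_present() does not
mirror the patch's memory_present(). It only shows a per-block mem_map
that can be attached and detached independently of its neighbours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only to keep the example small. */
#define SKETCH_PAGES_PER_SECTION 1024UL	/* pages in one fixed block */
#define SKETCH_NR_SECTIONS       64UL	/* blocks the table can describe */

struct sketch_page {
	unsigned long flags;
};

/* One entry per fixed block; mem_map stays NULL while the block is absent. */
struct sketch_section {
	struct sketch_page *mem_map;
};

static struct sketch_section sketch_sections[SKETCH_NR_SECTIONS];

static unsigned long sketch_pfn_to_section(unsigned long pfn)
{
	return pfn / SKETCH_PAGES_PER_SECTION;
}

/*
 * Give every block overlapping [start_pfn, end_pfn) its own mem_map.
 * Returns the number of blocks newly marked present, or -1 if an
 * allocation fails.
 */
static long sketch_memory_present(unsigned long start_pfn,
				  unsigned long end_pfn)
{
	unsigned long s, first, last;
	long added = 0;

	if (start_pfn >= end_pfn)
		return 0;

	first = sketch_pfn_to_section(start_pfn);
	last = sketch_pfn_to_section(end_pfn - 1);
	assert(last < SKETCH_NR_SECTIONS);

	for (s = first; s <= last; s++) {
		if (sketch_sections[s].mem_map)
			continue;	/* block already present */
		sketch_sections[s].mem_map =
			calloc(SKETCH_PAGES_PER_SECTION,
			       sizeof(struct sketch_page));
		if (!sketch_sections[s].mem_map)
			return -1;
		added++;
	}
	return added;
}

/* Drop one block's mem_map without touching any other block. */
static void sketch_memory_remove(unsigned long section)
{
	assert(section < SKETCH_NR_SECTIONS);
	free(sketch_sections[section].mem_map);
	sketch_sections[section].mem_map = NULL;
}

/* Two-step lookup: find the block, then index into that block's mem_map. */
static struct sketch_page *sketch_pfn_to_page(unsigned long pfn)
{
	unsigned long s = sketch_pfn_to_section(pfn);

	if (s >= SKETCH_NR_SECTIONS || !sketch_sections[s].mem_map)
		return NULL;
	return &sketch_sections[s].mem_map[pfn % SKETCH_PAGES_PER_SECTION];
}
```

In this sketch, a lookup for a pfn in an absent block simply returns
NULL, and removing one block leaves lookups in the neighbouring blocks
working. That independence is the property that makes adding and
removing pieces of memory straightforward.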

-- Dave

--=-A3i1EPyLLA3BSUWJ5xIF
Content-Disposition: attachment; filename=sparse-2.6.11-rc3.patch
Content-Type: text/x-patch; name=sparse-2.6.11-rc3.patch; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 7bit

--- sparse/arch/arm/mm/init.c~A6-no_arch_mem_map_init	2005-02-17 15:47:42.000000000 -0800
+++ /arch/arm/mm/init.c	2005-02-17 15:47:42.000000000 -0800
@@ -501,10 +501,6 @@
 				bdata->node_boot_start >> PAGE_SHIFT, zhole_size);
 	}
 
-#ifndef CONFIG_DISCONTIGMEM
-	mem_map = contig_page_data.node_mem_map;
-#endif
-
 	/*
 	 * finish off the bad pages once
 	 * the mem_map is initialised
--- sparse/arch/arm26/mm/init.c~A6-no_arch_mem_map_init	2005-02-17 15:47:42.000000000 -0800
+++ /arch/arm26/mm/init.c	2005-02-17 15:47:42.000000000 -0800
@@ -309,8 +309,6 @@
 	free_area_init_node(0, pgdat, zone_size,
 			bdata->node_boot_start >> PAGE_SHIFT, zhole_size);
 
-	mem_map = NODE_DATA(0)->node_mem_map;
-
 	/*
 	 * finish off the bad pages once
 	 * the mem_map is initialised
--- sparse/arch/cris/arch-v10/mm/init.c~A6-no_arch_mem_map_init	2005-02-17 15:47:42.000000000 -0800
+++ /arch/cris/arch-v10/mm/init.c	2005-02-17 15:47:42.000000000 -0800
@@ -184,7 +184,6 @@
 	 */
 
 	free_area_init_node(0, &contig_page_data, zones_size, PAGE_OFFSET >> PAGE_SHIFT, 0);
-	mem_map = contig_page_data.node_mem_map;
 }
 
 /* Initialize remaps of some I/O-ports.
It is important that this
--- sparse/arch/i386/Kconfig~B-sparse-080-alloc_remap-i386	2005-02-17 15:47:43.000000000 -0800
+++ /arch/i386/Kconfig	2005-02-17 15:47:47.000000000 -0800
@@ -68,7 +68,7 @@
 
 config X86_NUMAQ
 	bool "NUMAQ (IBM/Sequent)"
-	select DISCONTIGMEM
+	#select DISCONTIGMEM
 	select NUMA
 	help
 	  This option is used for getting Linux to run on a (IBM/Sequent) NUMA
@@ -759,16 +759,22 @@
 comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
 	depends on X86_SUMMIT && (!HIGHMEM64G || !ACPI)
 
-config DISCONTIGMEM
+config HAVE_ARCH_BOOTMEM_NODE
 	bool
 	depends on NUMA
 	default y
 
-config HAVE_ARCH_BOOTMEM_NODE
+config HAVE_ARCH_ALLOC_REMAP
 	bool
 	depends on NUMA
 	default y
 
+config ARCH_SPARSEMEM_DEFAULT
+	bool
+	depends on (X86_NUMAQ || X86_SUMMIT)
+
+source "mm/Kconfig"
+
 config HIGHPTE
 	bool "Allocate 3rd-level pagetables from highmem"
 	depends on HIGHMEM4G || HIGHMEM64G
--- sparse/arch/i386/kernel/numaq.c~B-sparse-140-abstract-discontig	2005-02-17 15:47:45.000000000 -0800
+++ /arch/i386/kernel/numaq.c	2005-02-17 15:47:45.000000000 -0800
@@ -32,7 +32,7 @@
 #include
 
 /* These are needed before the pgdat's are created */
-extern long node_start_pfn[], node_end_pfn[];
+extern long node_start_pfn[], node_end_pfn[], node_remap_size[];
 
 #define	MB_TO_PAGES(addr) ((addr) << (20 - PAGE_SHIFT))
 
@@ -59,6 +59,8 @@
 					eq->hi_shrd_mem_start - eq->priv_mem_size);
 				node_end_pfn[node] = MB_TO_PAGES(
 					eq->hi_shrd_mem_start + eq->hi_shrd_mem_size);
+				node_remap_size[node] += memory_present(node,
+					node_start_pfn[node], node_end_pfn[node]);
 			}
 		}
 	}
--- sparse/arch/i386/kernel/setup.c~FROM-MM-refactor-i386-memory-setup	2005-02-17 15:47:38.000000000 -0800
+++ /arch/i386/kernel/setup.c	2005-02-17 15:48:55.000000000 -0800
@@ -40,6 +40,8 @@
 #include
 #include
 #include
+#include
+#include
 #include