From: Yinghai Lu
Reply-To: yhlu.kernel@gmail.com
To: Andrew Morton, Ingo Molnar
Cc: Christoph Lameter, kernel list
Subject: [PATCH 05/12] mm: make mem_map allocation continuous.
Date: Wed, 19 Mar 2008 14:04:02 -0700
Message-Id: <200803191404.02389.yhlu.kernel@gmail.com>
In-Reply-To: <200803181237.33861.yhlu.kernel@gmail.com>

[PATCH] mm: make mem_map allocation continuous.

The vmemmap allocation currently produces:

  [ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
  [ffffe20000200000-ffffe200003fffff] PMD ->ffff810001800000 on node 0
  [ffffe20000400000-ffffe200005fffff] PMD ->ffff810001c00000 on node 0
  [ffffe20000600000-ffffe200007fffff] PMD ->ffff810002000000 on node 0
  [ffffe20000800000-ffffe200009fffff] PMD ->ffff810002400000 on node 0
  ...

There is a 2M hole between consecutive mem_map blocks.

The root cause is that a usemap (24 bytes) is allocated after every 2M
mem_map, which pushes the next 2M vmemmap block up to the next 2M-aligned
address.

Solution: allocate the mem_maps back to back, so that they are contiguous.

After the patch, we get:

  [ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
  [ffffe20000200000-ffffe200003fffff] PMD ->ffff810001600000 on node 0
  [ffffe20000400000-ffffe200005fffff] PMD ->ffff810001800000 on node 0
  [ffffe20000600000-ffffe200007fffff] PMD ->ffff810001a00000 on node 0
  [ffffe20000800000-ffffe200009fffff] PMD ->ffff810001c00000 on node 0
  ...

The usemaps now share a page, because they are also allocated back to back:

  sparse_early_usemap_alloc: usemap = ffff810024e00000 size = 24
  sparse_early_usemap_alloc: usemap = ffff810024e00080 size = 24
  sparse_early_usemap_alloc: usemap = ffff810024e00100 size = 24
  sparse_early_usemap_alloc: usemap = ffff810024e00180 size = 24
  ...

So the bootmem allocation becomes more compact and the usemaps use less
memory.

Signed-off-by: Yinghai Lu

Index: linux-2.6/mm/sparse.c
===================================================================
--- linux-2.6.orig/mm/sparse.c
+++ linux-2.6/mm/sparse.c
@@ -285,6 +286,8 @@ struct page __init *sparse_early_mem_map
 	return NULL;
 }
 
+/* section_map pointer array is 64k */
+static __initdata struct page *section_map[NR_MEM_SECTIONS];
 /*
  * Allocate the accumulated non-linear sections, allocate a mem_map
  * for each and record the physical to section mapping.
@@ -295,14 +298,29 @@ void __init sparse_init(void)
 	struct page *map;
 	unsigned long *usemap;
 
+	/*
+	 * map is using big page (aka 2M in x86 64 bit)
+	 * usemap is less than one page (aka 24 bytes)
+	 * so alloc 2M (with 2M align) and 24 bytes in turn will
+	 * make next 2M slip to one more 2M later.
+	 * then in big system, the memory will have a lot of holes...
+	 * here try to allocate 2M pages continuously.
+	 */
 	for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
 		if (!present_section_nr(pnum))
 			continue;
+		section_map[pnum] = sparse_early_mem_map_alloc(pnum);
+	}
 
-		map = sparse_early_mem_map_alloc(pnum);
-		if (!map)
+
+	for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
+		if (!present_section_nr(pnum))
 			continue;
 
+		map = section_map[pnum];
+		if (!map)
+			continue;
+
 		usemap = sparse_early_usemap_alloc(pnum);
 		if (!usemap)
 			continue;
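
The effect of allocation order described in the patch can be reproduced in user space. What follows is a minimal sketch, not kernel code: it assumes a simple bump allocator standing in for bootmem, the names bump_alloc, span_interleaved and span_batched are hypothetical, and the 128-byte usemap alignment is an assumption chosen to match the 0x80 stride in the sparse_early_usemap_alloc log.

```c
#include <assert.h>

#define MAP_SIZE     (2UL << 20) /* one 2M mem_map block per section */
#define USEMAP_SIZE  24UL        /* usemap size quoted in the patch description */
#define USEMAP_ALIGN 128UL       /* assumption: matches the 0x80 stride in the log */

/* Hypothetical bump allocator: round the cursor up to align, then advance it. */
unsigned long bump_alloc(unsigned long *cursor, unsigned long size,
			 unsigned long align)
{
	unsigned long start;

	assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
	start = (*cursor + align - 1) & ~(align - 1);
	*cursor = start + size;
	return start;
}

/* Old order: mem_map, usemap, mem_map, usemap, ... Returns bytes spanned. */
unsigned long span_interleaved(int nsections)
{
	unsigned long cursor = 0;
	int i;

	for (i = 0; i < nsections; i++) {
		bump_alloc(&cursor, MAP_SIZE, MAP_SIZE);
		bump_alloc(&cursor, USEMAP_SIZE, USEMAP_ALIGN);
	}
	return cursor;
}

/* New order: all mem_maps first, then all usemaps. Returns bytes spanned. */
unsigned long span_batched(int nsections)
{
	unsigned long cursor = 0;
	int i;

	for (i = 0; i < nsections; i++)
		bump_alloc(&cursor, MAP_SIZE, MAP_SIZE);
	for (i = 0; i < nsections; i++)
		bump_alloc(&cursor, USEMAP_SIZE, USEMAP_ALIGN);
	return cursor;
}
```

For five sections, span_interleaved(5) spans 18M plus 24 bytes, with mem_maps landing 4M apart as in the "before" log, while span_batched(5) spans 10M plus 536 bytes, with mem_maps 2M apart and all usemaps packed into the same page.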