From: David Howells
To: Andrew Morton
Cc: torvalds@osdl.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Permit inode & dentry hash tables to be allocated > MAX_ORDER size
Date: Fri, 11 Jun 2004 12:12:30 +0100
In-Reply-To: <20040611034809.41dc9205.akpm@osdl.org>
Message-ID: <1056.1086952350@redhat.com>

> > Here's a patch to allocate memory for big system hash tables with the bootmem
> > allocator rather than with main page allocator.
>
> umm, why?

Three reasons:

 (1) So that the size can be bigger than MAX_ORDER. IBM have done some testing
     on their big PPC64 systems (64GB of RAM) with linux-2.4 and found that
     they get better performance if the sizes of the inode cache hash, dentry
     cache hash, buffer head hash and page cache hash are increased beyond
     MAX_ORDER (order 11).

     Now the main allocator can't allocate anything larger than MAX_ORDER, but
     the bootmem allocator can.

     In 2.6 it appears that only the inode and dentry hashes remain of those
     four, but there are other hash tables that could use this service.

 (2) Changing MAX_ORDER appears to have a number of effects beyond just
     limiting the maximum size that can be allocated in one go.

 (3) Should someone want a hash table in which each bucket isn't a power of two
     in size, memory will be wasted as the chunk of memory allocated will be a
     power of two in size (to hold a power of two number of buckets).

     On the other hand, using the bootmem allocator means the allocation will
     only take up sufficient pages to hold it, rather than the next power of
     two up.

     Admittedly, this point doesn't apply to the dentry and inode hashes, but
     it might to another hash table that might want to use this service.

David
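
To put rough numbers on point (3), here is a small standalone sketch of the
arithmetic. It is not taken from the patch: the 4096-byte page size, the
helper names and the 24-byte bucket in the note below are all assumptions
chosen for illustration.

```c
#include <stddef.h>

/* Assumed page size, for illustration only; real systems vary. */
#define EXAMPLE_PAGE_SIZE 4096UL

/* Pages needed to hold `bytes`, rounded up to whole pages.  This models
 * an allocator that hands out just enough pages for the request. */
static unsigned long pages_needed(unsigned long bytes)
{
    return (bytes + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages covers `pages`. */
static unsigned int order_for_pages(unsigned long pages)
{
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/* Pages consumed when the request is rounded up to a power-of-two
 * number of pages, as an order-based allocation would be. */
static unsigned long pow2_pages(unsigned long bytes)
{
    return 1UL << order_for_pages(pages_needed(bytes));
}
```

With a hypothetical table of 2^20 buckets of 24 bytes each (24MB in total),
pages_needed() gives 6144 pages, whereas pow2_pages() gives 8192 pages
(order 13), so 2048 pages (8MB at the assumed page size) would go unused in
the power-of-two case.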