Date: Thu, 31 Jul 2008 11:31:38 +0100
From: Mel Gorman
To: Andrew Morton
Cc: ebmunson@us.ibm.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@ozlabs.org, libhugetlbfs-devel@lists.sourceforge.net,
	abh@cray.com
Subject: Re: [RFC] [PATCH 0/5 V2] Huge page backed user-space stacks
Message-ID: <20080731103137.GD1704@csn.ul.ie>
In-Reply-To: <20080730130709.eb541475.akpm@linux-foundation.org>

On (30/07/08 13:07), Andrew Morton didst pronounce:
> On Wed, 30 Jul 2008 20:30:10 +0100
> Mel Gorman wrote:
>
> > With Eric's patch and libhugetlbfs, we can automatically back text/data[1],
> > malloc[2] and stacks without source modification. Fairly soon, libhugetlbfs
> > will also be able to override shmget() to add SHM_HUGETLB. That should cover
> > a lot of the memory-intensive apps without source modification.
>
> The weak link in all of this still might be the need to reserve
> hugepages and the unreliability of dynamically allocating them.
>
> The dynamic allocation should be better nowadays, but I've lost track
> of how reliable it really is.  What's our status there?
>

We are a lot more reliable than we were, although exact quantification is
difficult because it is workload-dependent. For a long time, I have been
able to test bits and pieces with hugepages by allocating the pool at the
time I needed it, even after days of uptime. Previously, that required a
reboot.

I've also been able to use dynamic hugepage pool resizing effectively, and
we track how often it succeeds and fails in /proc/vmstat (see the htlb
fields) to watch for problems. Between that and /proc/pagetypeinfo, I
expect to be able to identify availability problems.

As an administrator can now set both a minimum pool size and a maximum
size for the pool (nr_hugepages and nr_overcommit_hugepages respectively),
the configuration difficulties should be eased. A rough sketch of
exercising both knobs and checking the counters is below.

If it turns out that anti-fragmentation breaks down and pool resizing
starts failing after X amount of time on Y workloads, there is still the
option of booting with movablecore=BiggestPoolSizeIWillEverNeed and
writing 1 to /proc/sys/vm/hugepages_treat_as_movable so that the hugepage
pool can grow and shrink reliably within that region.
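For anyone who wants to poke at this on their own machine, the following
is a rough illustration (mine, not part of Eric's series) of driving the
two sysctls and then checking /proc/meminfo and the htlb counters in
/proc/vmstat to see how the allocations fared. The pool sizes are
arbitrary and it needs root:

/*
 * Rough sketch, not from the patch series: raise the static hugepage
 * pool via /proc/sys/vm/nr_hugepages, allow some on-demand growth via
 * nr_overcommit_hugepages, then report the HugePages_ lines from
 * /proc/meminfo and the htlb_ counters from /proc/vmstat.
 */
#include <stdio.h>
#include <string.h>

static int write_ulong(const char *path, unsigned long val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	fprintf(f, "%lu\n", val);
	fclose(f);
	return 0;
}

static void grep_file(const char *path, const char *prefix)
{
	char line[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return;
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, prefix, strlen(prefix)))
			fputs(line, stdout);
	fclose(f);
}

int main(void)
{
	/* Minimum pool: pages that are guaranteed to be available */
	write_ulong("/proc/sys/vm/nr_hugepages", 64);
	/* Maximum pool: up to 64 more pages allocated on demand */
	write_ulong("/proc/sys/vm/nr_overcommit_hugepages", 64);

	/* How big did the pool actually get? */
	grep_file("/proc/meminfo", "HugePages_");

	/* Success/failure counts for dynamic hugepage allocations */
	grep_file("/proc/vmstat", "htlb_");
	return 0;
}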
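As an aside, for the shmget() override mentioned above, what the library
ends up doing amounts to something like this (an illustrative sketch only,
not libhugetlbfs code): request the segment with SHM_HUGETLB and fall back
to base pages if the pool cannot cover it. The size must be a multiple of
the hugepage size for the SHM_HUGETLB case to succeed:

/*
 * Illustrative only: try a hugepage-backed SysV shm segment first,
 * fall back to base pages if the hugepage pool cannot satisfy it.
 */
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000	/* from linux/shm.h */
#endif

#define LENGTH (16UL * 1024 * 1024)	/* multiple of the hugepage size */

int main(void)
{
	int shmid = shmget(IPC_PRIVATE, LENGTH,
			   IPC_CREAT | SHM_HUGETLB | SHM_R | SHM_W);

	if (shmid < 0) {
		perror("shmget(SHM_HUGETLB)");
		/* hugepage pool exhausted or unconfigured: use base pages */
		shmid = shmget(IPC_PRIVATE, LENGTH,
			       IPC_CREAT | SHM_R | SHM_W);
	}
	if (shmid < 0) {
		perror("shmget");
		return 1;
	}
	printf("got segment %d\n", shmid);
	shmctl(shmid, IPC_RMID, NULL);
	return 0;
}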
Overall, it's in pretty good shape. To be fair, one snag is that swap is
almost a requirement for pool resizing to work, as I never pushed to
complete memory compaction (http://lwn.net/Articles/238837/). Hence, we
depend on the workload having plenty of filesystem-backed data for
lumpy reclaim to do its job, on pool resizing taking place between batch
jobs, or on swap being configured, even if only for the duration of a
pool resize.

-- 
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab