Date: Mon, 16 Mar 2009 13:25:05 +0100
From: Nick Piggin
To: Mel Gorman
Cc: Linux Memory Management List, Pekka Enberg, Rik van Riel, KOSAKI Motohiro, Christoph Lameter, Johannes Weiner, Linux Kernel Mailing List, Lin Ming, Zhang Yanmin, Peter Zijlstra
Subject: Re: [PATCH 00/35] Cleanup and optimise the page allocator V3
Message-ID: <20090316122505.GD30802@wotan.suse.de>
In-Reply-To: <20090316120216.GB6382@csn.ul.ie>

On Mon, Mar 16, 2009 at 12:02:17PM +0000, Mel Gorman wrote:
> On Mon, Mar 16, 2009 at 12:33:58PM +0100, Nick Piggin wrote:
> > Whereas if you defer this until the point you need a higher order
> > page, the only thing you have to work with are the pages that are
> > free *right now*.
> >
>
> Well, buddy always uses the smallest available page first. Even with
> deferred coalescing, it will merge up to order-5 at least. Let's say they
> could have merged up to order-10 in ordinary circumstances, they are
> still avoided for as long as possible.
> Granted, it might mean that an
> order-5 is split that could have been merged but it's hard to tell how
> much of a difference that makes.

But the kinds of pages *you* are interested in are order-10, right?

> > Your anti-frag tests probably don't stress this long term fragmentation
> > problem.
> >
>
> Probably not, but we have little data on long-term fragmentation other than
> anecdotal evidence that it's ok these days.

Well, I think before anti-frag there was lots of anecdotal evidence that
it's "ok", except for loads heavily using large higher order allocations.
I don't know if we'd have many systems running with hundreds of days of
uptime on such workloads post-anti-frag?

Google might? But I don't know how long their uptimes are. I expect we'd
have a better idea in a couple more years after the next enterprise distro
release cycles with anti-frag.

> > Still, it's significant enough that I think it should be made
> > optional (and arguably default to on) even if it does harm higher
> > order allocations a bit.
> >
>
> I could make PAGE_ORDER_MERGE_ORDER a proc tunable? If it's placed as a
> read-mostly variable beside the gfp_zone table, it might even fit in the
> same cache line.

Hmm, possibly. OTOH I don't like tunables. If you don't think it will be
a problem for hugepage allocations, then I would prefer just to leave it
on and 5 by default (or even less? COSTLY_ORDER?)
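
For readers following the thread, the trade-off being argued can be sketched with a toy model. This is not code from the patch series. It is a minimal Python illustration, assuming a free path that merges buddies eagerly only up to a cap (standing in for the PAGE_ORDER_MERGE_ORDER value mentioned above) and a separate pass that coalesces whatever happens to be free at that moment. The names ToyBuddy, merge_order, free_block, alloc and coalesce_all are all hypothetical.

```python
MAX_ORDER = 10  # highest block order in this toy model (an assumption)


class ToyBuddy:
    """Toy buddy allocator with capped eager merging (illustrative only)."""

    def __init__(self, merge_order):
        # merge_order: highest order that the free path will merge up to.
        self.merge_order = merge_order
        # One set of free block start addresses (page frame numbers) per order.
        self.free = {order: set() for order in range(MAX_ORDER + 1)}

    def free_block(self, pfn, order):
        # Merge with the buddy while it is free, but stop at merge_order.
        while order < self.merge_order:
            buddy = pfn ^ (1 << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            pfn = min(pfn, buddy)
            order += 1
        self.free[order].add(pfn)

    def alloc(self, order):
        # Use the smallest available block that fits, splitting if needed.
        for o in range(order, MAX_ORDER + 1):
            if self.free[o]:
                pfn = min(self.free[o])
                self.free[o].remove(pfn)
                while o > order:
                    o -= 1
                    # Return the upper half of each split to the free lists.
                    self.free[o].add(pfn + (1 << o))
                return pfn
        return None

    def coalesce_all(self):
        # Deferred pass: it can only merge blocks that are free right now.
        for o in range(MAX_ORDER):
            for pfn in sorted(self.free[o]):
                if pfn not in self.free[o]:
                    continue  # already merged as someone's buddy
                buddy = pfn ^ (1 << o)
                if buddy in self.free[o]:
                    self.free[o] -= {pfn, buddy}
                    self.free[o + 1].add(min(pfn, buddy))
```

With merge_order=5, freeing 1024 contiguous order-0 pages leaves 32 order-5 blocks, so an order-10 request fails until coalesce_all() runs. With merge_order=MAX_ORDER the same frees produce a single order-10 block straight away. The model ignores zones, migrate types and per-cpu lists, so it says nothing about real-world fragmentation over long uptimes.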