Date: Tue, 14 Jul 2009 10:14:38 +0100
From: Mel Gorman
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 5/13] Choose pages from the per cpu list based on migration type
Message-ID: <20090714091438.GA28569@csn.ul.ie>
In-Reply-To: <20090713121628.bde62c65.akpm@linux-foundation.org>
References: <20070910112011.3097.8438.sendpatchset@skynet.skynet.ie>
 <20070910112151.3097.54726.sendpatchset@skynet.skynet.ie>
 <20090713121628.bde62c65.akpm@linux-foundation.org>

On Mon, Jul 13, 2009 at 12:16:28PM -0700, Andrew Morton wrote:
> On Mon, 10 Sep 2007 12:21:51 +0100 (IST)
> Mel Gorman wrote:
>
> A somewhat belated review comment.
>
> > The freelists for each migrate type can slowly become polluted due to the
> > per-cpu list. Consider what happens in the following scenario:
> >
> > 1. A 2^pageblock_order list is reserved for __GFP_MOVABLE pages
> > 2. An order-0 page is allocated from the newly reserved block
> > 3. The page is freed and placed on the per-cpu list
> > 4. alloc_page() is called with GFP_KERNEL as the gfp_mask
> > 5. The per-cpu list is used to satisfy the allocation
> >
> > This results in a kernel page sitting in the middle of a migratable
> > region. This patch prevents the leak from occurring by storing the
> > MIGRATE_ type of the page in page->private. On allocation, only a page
> > of the desired type is returned; otherwise, more pages are allocated.
> > This may temporarily allow a per-cpu list to exceed the pcp->high
> > limit, but it is corrected on the next free. Care is taken to preserve
> > the hotness of recently freed pages.
> >
> > The additional code is not measurably slower for the workloads we've
> > tested.
>
> It sure looks slower.
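The additions to the fast paths are a set_page_private() tag on free and,
in the worst case, a walk of the pcp list on allocation. To make the
mechanism concrete, here is a userspace toy model of the two paths (a
sketch only; toy_page, toy_free() and toy_alloc() are invented stand-ins
for struct page, free_hot_cold_page() and buffered_rmqueue()):

	/* Toy model of migratetype tagging on the per-cpu free list.
	 * Illustration only; the real code operates on struct page,
	 * uses page->private for the tag and runs with IRQs disabled. */
	#include <stdio.h>

	enum migratetype { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE };

	struct toy_page {
		enum migratetype type;		/* stands in for page->private */
		struct toy_page *next;
	};

	static struct toy_page *pcp_list;	/* stands in for pcp->list */

	/* Free path: tag the page with its pageblock's type and push it
	 * on the per-cpu list (the set_page_private() hunk below). */
	static void toy_free(struct toy_page *page, enum migratetype type)
	{
		page->type = type;
		page->next = pcp_list;
		pcp_list = page;
	}

	/* Allocation path: return only a page whose recorded type matches
	 * the request (the list_for_each_entry() search below), so a
	 * kernel allocation never takes a page from a movable block. */
	static struct toy_page *toy_alloc(enum migratetype type)
	{
		struct toy_page **pp;

		for (pp = &pcp_list; *pp; pp = &(*pp)->next) {
			if ((*pp)->type == type) {
				struct toy_page *page = *pp;
				*pp = page->next;	/* unlink and return */
				return page;
			}
		}
		return NULL;	/* real code refills via rmqueue_bulk() instead */
	}

	int main(void)
	{
		struct toy_page unmovable, movable;

		toy_free(&unmovable, MIGRATE_UNMOVABLE);
		toy_free(&movable, MIGRATE_MOVABLE);	/* now head of the list */

		/* A GFP_KERNEL-style request skips the movable page at
		 * the head instead of leaking into a movable block. */
		printf("%s\n", toy_alloc(MIGRATE_UNMOVABLE) == &unmovable
				? "kernel page taken, movable page skipped"
				: "bug");
		return 0;
	}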
> > Signed-off-by: Mel Gorman
> > Signed-off-by: Andrew Morton
> > ---
> >
> >  mm/page_alloc.c |   18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc5-004-split-the-free-lists-for-movable-and-unmovable-allocations/mm/page_alloc.c linux-2.6.23-rc5-005-choose-pages-from-the-per-cpu-list-based-on-migration-type/mm/page_alloc.c
> > --- linux-2.6.23-rc5-004-split-the-free-lists-for-movable-and-unmovable-allocations/mm/page_alloc.c	2007-09-02 16:19:34.000000000 +0100
> > +++ linux-2.6.23-rc5-005-choose-pages-from-the-per-cpu-list-based-on-migration-type/mm/page_alloc.c	2007-09-02 16:20:09.000000000 +0100
> > @@ -757,7 +757,8 @@ static int rmqueue_bulk(struct zone *zon
> >  		struct page *page = __rmqueue(zone, order, migratetype);
> >  		if (unlikely(page == NULL))
> >  			break;
> > -		list_add_tail(&page->lru, list);
> > +		list_add(&page->lru, list);
> > +		set_page_private(page, migratetype);
> >  	}
> >  	spin_unlock(&zone->lock);
> >  	return i;
> > @@ -884,6 +885,7 @@ static void fastcall free_hot_cold_page(
> >  	local_irq_save(flags);
> >  	__count_vm_event(PGFREE);
> >  	list_add(&page->lru, &pcp->list);
> > +	set_page_private(page, get_pageblock_migratetype(page));
> >  	pcp->count++;
> >  	if (pcp->count >= pcp->high) {
> >  		free_pages_bulk(zone, pcp->batch, &pcp->list, 0);
> > @@ -948,7 +950,19 @@ again:
> >  			if (unlikely(!pcp->count))
> >  				goto failed;
> >  		}
> > -		page = list_entry(pcp->list.next, struct page, lru);
> > +
> > +		/* Find a page of the appropriate migrate type */
> > +		list_for_each_entry(page, &pcp->list, lru)
> > +			if (page_private(page) == migratetype)
> > +				break;
>
> We're doing a linear search through the per-cpu magazines right there
> in the page allocator hot path.  Even if the search matches the first
> element, the setup costs will matter.
>
> Surely we can make this search go away with a better choice of data
> structures?
>

I have a patch that expands the per-cpu structure and eliminates the
search, and I made various attempts at reducing the setup cost (e.g.
checking whether the first element suited before starting the search).
However, I was not able to show definitively that it made anything
faster, and it did increase the size of the per-cpu structure.

> > +		/* Allocate more to the pcp list if necessary */
> > +		if (unlikely(&page->lru == &pcp->list)) {
> > +			pcp->count += rmqueue_bulk(zone, 0,
> > +					pcp->batch, &pcp->list, migratetype);
> > +			page = list_entry(pcp->list.next, struct page, lru);
> > +		}
> > +
> >  		list_del(&page->lru);
> >  		pcp->count--;
> >  	} else {

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
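P.S. The shape of the "expands the per-cpu structure" idea, roughly (a
sketch of the concept only, not the actual patch; per_cpu_pages_expanded
and pcp_take_page() are illustrative names): one free list per
migratetype, so allocation indexes by type instead of searching.

	/*
	 * Sketch only: one pcp list per migratetype trades per-cpu
	 * memory for an O(1) lookup on allocation.
	 */
	struct per_cpu_pages_expanded {
		int count;		/* pages on all lists */
		int high;		/* high watermark */
		int batch;		/* chunk size for rmqueue_bulk() */
		struct list_head lists[MIGRATE_TYPES]; /* one per type */
	};

	/* Replaces the list_for_each_entry() search in buffered_rmqueue() */
	static struct page *pcp_take_page(struct per_cpu_pages_expanded *pcp,
					  int migratetype)
	{
		struct list_head *list = &pcp->lists[migratetype];
		struct page *page;

		if (list_empty(list))
			return NULL;	/* caller refills via rmqueue_bulk() */

		page = list_entry(list->next, struct page, lru);
		list_del(&page->lru);
		pcp->count--;
		return page;
	}

The free path becomes a list_add() onto
lists[get_pageblock_migratetype(page)], so both fast paths are O(1); the
cost is MIGRATE_TYPES list heads per zone per CPU, which is the size
increase referred to above.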