Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761992AbXEXIB7 (ORCPT ); Thu, 24 May 2007 04:01:59 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1756735AbXEXIBZ (ORCPT ); Thu, 24 May 2007 04:01:25 -0400 Received: from ausmtp05.au.ibm.com ([202.81.18.154]:35413 "EHLO ausmtp05.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755649AbXEXIBW (ORCPT ); Thu, 24 May 2007 04:01:22 -0400 Message-ID: <465545FB.2080801@linux.vnet.ibm.com> Date: Thu, 24 May 2007 13:29:55 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com Organization: IBM User-Agent: Thunderbird 1.5.0.10 (X11/20070306) MIME-Version: 1.0 To: Pavel Emelianov CC: Andrew Morton , Paul Menage , Srivatsa Vaddagiri , Balbir Singh , devel@openvz.org, Linux Kernel Mailing List , Kirill Korotaev , Chandra Seetharaman , Cedric Le Goater , "Eric W. Biederman" , Rohit Seth , Linux Containers , Linux Memory Management List Subject: Re: [PATCH 8/8] Per-container pages reclamation References: <461A3010.90403@sw.ru> <461A397A.8080609@sw.ru> <464C3D0E.3010603@linux.vnet.ibm.com> <4651B794.4040302@sw.ru> In-Reply-To: <4651B794.4040302@sw.ru> Content-Type: multipart/mixed; boundary="------------040409070904020008050306" Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4309 Lines: 126 This is a multi-part message in MIME format. --------------040409070904020008050306 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Pavel Emelianov wrote: >> Index: linux-2.6.20/mm/rss_container.c >> =================================================================== >> --- linux-2.6.20.orig/mm/rss_container.c 2007-05-15 05:13:46.000000000 -0700 >> +++ linux-2.6.20/mm/rss_container.c 2007-05-16 20:45:45.000000000 -0700 >> @@ -212,6 +212,7 @@ void container_rss_del(struct page_conta >> >> css_put(&rss->css); >> kfree(pc); >> + init_page_container(page); > > This hunk is bad. > See, when the page drops its mapcount to 0 it may be reused right > after this if it belongs to a file map - another CPU can touch it. > Thus you're risking to reset the wrong container. > > The main idea if the accounting is that you cannot trust the > page_container(page) value after the page's mapcount became 0. > Good catch, I'll move the initialization to free_hot_cold_page(). I'm attaching a new patch. I've also gotten rid of the unused variable page in container_rss_del(). I've compile and boot tested the fix -- Thanks, Balbir Singh Linux Technology Center IBM, ISTL --------------040409070904020008050306 Content-Type: text/x-patch; name="rss-fix-lru-race.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="rss-fix-lru-race.patch" Signed-off-by: Balbir Singh --- Index: linux-2.6.20/mm/page_alloc.c =================================================================== --- linux-2.6.20.orig/mm/page_alloc.c 2007-05-16 10:30:10.000000000 -0700 +++ linux-2.6.20/mm/page_alloc.c 2007-05-24 00:41:00.000000000 -0700 @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -791,6 +792,7 @@ static void fastcall free_hot_cold_page( if (!PageHighMem(page)) debug_check_no_locks_freed(page_address(page), PAGE_SIZE); + init_page_container(page); arch_free_page(page, 0); kernel_map_pages(page, 1, 0); @@ -1977,6 +1979,7 @@ void __meminit memmap_init_zone(unsigned set_page_links(page, zone, nid, pfn); init_page_count(page); reset_page_mapcount(page); + init_page_container(page); SetPageReserved(page); INIT_LIST_HEAD(&page->lru); #ifdef WANT_PAGE_VIRTUAL Index: linux-2.6.20/include/linux/rss_container.h =================================================================== --- linux-2.6.20.orig/include/linux/rss_container.h 2007-05-16 10:31:04.000000000 -0700 +++ linux-2.6.20/include/linux/rss_container.h 2007-05-16 10:32:14.000000000 -0700 @@ -28,6 +28,11 @@ void container_rss_move_lists(struct pag unsigned long isolate_pages_in_container(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, struct zone *zone, struct rss_container *, int active); +static inline void init_page_container(struct page *page) +{ + page_container(page) = NULL; +} + #else static inline int container_rss_prepare(struct page *pg, struct vm_area_struct *vma, struct page_container **pc) @@ -56,6 +61,10 @@ static inline void mm_free_container(str { } +static inline void init_page_container(struct page *page) +{ +} + #define isolate_container_pages(nr, dst, scanned, rss, act, zone) ({ BUG(); 0;}) #define container_rss_move_lists(pg, active) do { } while (0) #endif Index: linux-2.6.20/mm/rss_container.c =================================================================== --- linux-2.6.20.orig/mm/rss_container.c 2007-05-15 05:13:46.000000000 -0700 +++ linux-2.6.20/mm/rss_container.c 2007-05-24 00:58:43.000000000 -0700 @@ -199,12 +199,9 @@ void container_rss_add(struct page_conta void container_rss_del(struct page_container *pc) { - struct page *page; struct rss_container *rss; - page = pc->page; rss = pc->cnt; - spin_lock_irq(&rss->res.lock); list_del(&pc->list); res_counter_uncharge_locked(&rss->res, 1); --------------040409070904020008050306-- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/