From: Nick Piggin
To: Andrew Morton
Cc: miaox@cn.fujitsu.com, menage@google.com, cl@linux-foundation.org, penberg@cs.helsinki.fi, mpm@selenic.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] cpuset,mm: fix allocating page cache/slab object on the unallowed node when memory spread is set
Date: Wed, 31 Dec 2008 14:13:44 +1100
Message-Id: <200812311413.45127.nickpiggin@yahoo.com.au>
In-Reply-To: <20081230142805.3c6f78e3.akpm@linux-foundation.org>
References: <49547B93.5090905@cn.fujitsu.com> <20081230142805.3c6f78e3.akpm@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wednesday 31 December 2008 09:28:05 Andrew Morton wrote:
> On Fri, 26 Dec 2008 14:37:07 +0800
> Miao Xie wrote:
> > The task still allocated the page caches on old node after modifying its
> > cpuset's mems when 'memory_spread_page' was set, it is caused by the old
> > mem_allowed_list of the task. Slab has the same problem.
>
> ok...
>
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index f3e5f89..d978983 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -517,6 +517,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
> >  #ifdef CONFIG_NUMA
> >  struct page *__page_cache_alloc(gfp_t gfp)
> >  {
> > +	if ((gfp & __GFP_WAIT) && !in_interrupt())
> > +		cpuset_update_task_memory_state();
> > +
> >  	if (cpuset_do_page_mem_spread()) {
> >  		int n = cpuset_mem_spread_node();
> >  		return alloc_pages_node(n, gfp, 0);
> > diff --git a/mm/slab.c b/mm/slab.c
> > index 0918751..3b6e3d7 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -3460,6 +3460,9 @@ __cache_alloc(struct kmem_cache *cachep, gfp_t flags, void *caller)
> >  	if (should_failslab(cachep, flags))
> >  		return NULL;
> >
> > +	if ((flags & __GFP_WAIT) && !in_interrupt())
> > +		cpuset_update_task_memory_state();
> > +

These paths are pretty performance critical. Why doesn't the cpusets code
do this work in the slowpath where the cpuset's mems_allowed gets changed,
rather than putting these calls all over the place with apparently no real
rhyme or reason :(

(this is not against your patch, but just this part of the cpusets design)

> >  	cache_alloc_debugcheck_before(cachep, flags);
> >  	local_irq_save(save_flags);
> >  	objp = __do_cache_alloc(cachep, flags);
>
> Problems.
>
> a) There's no need to test in_interrupt().  Any caller who passed us
>    __GFP_WAIT from interrupt context is horridly buggy and needs to be
>    fixed.

Right.
There are existing sites that do the same check, which is probably
where it is copied from.

> b) Even if the caller _did_ set __GFP_WAIT, there's no guarantee
>    that we're deadlock safe here.  Does anyone ever do a __GFP_WAIT
>    allocation while holding callback_mutex?  If so, it'll deadlock.

It's static to cpuset.c, so I'd hope not.

> c) These are two of the kernel's hottest code paths.  We really
>    really really really don't want to be polling for some dopey
>    userspace admin change on each call to __cache_alloc()!

Yeah, right. Let's try to fix cpuset.c instead...

> d) How does slub handle this problem?

SLUB seems to do a "sloppy" kind of memory policy allocation, where it
just relies on the page allocator to hand us the correct page and AFAIKS
does not exactly obey this stuff all the time.
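
To illustrate the "do the work in the slowpath" suggestion above, here is a
minimal userspace sketch in C. It is not the kernel's cpuset code: every name
in it (sketch_task, sketch_cpuset, sketch_attach, sketch_set_mems,
sketch_spread_node) is hypothetical, and it ignores the locking a real
implementation would need when one context updates another task's mask. The
idea it shows is only this: when the allowed-nodes mask changes, the rare
update path pushes the new mask to every attached task, so the allocation
hot path reads the task's own copy and never polls for changes.

```c
/* Userspace sketch only: hypothetical names, not the kernel's cpuset code. */
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8

typedef unsigned long nodemask_t;	/* one bit per NUMA node */

struct sketch_task {
	nodemask_t mems_allowed;	/* per-task copy read by the hot path */
	int spread_rotor;		/* last node handed out when spreading */
};

struct sketch_cpuset {
	nodemask_t mems_allowed;
	struct sketch_task *tasks[MAX_TASKS];
	size_t nr_tasks;
};

/* Attach a task and hand it the cpuset's current mask. */
static int sketch_attach(struct sketch_cpuset *cs, struct sketch_task *t)
{
	if (cs->nr_tasks == MAX_TASKS)
		return -1;
	cs->tasks[cs->nr_tasks++] = t;
	t->mems_allowed = cs->mems_allowed;
	t->spread_rotor = -1;
	return 0;
}

/* Slow path: runs only when the admin changes the cpuset's mems. */
static void sketch_set_mems(struct sketch_cpuset *cs, nodemask_t new_mems)
{
	cs->mems_allowed = new_mems;
	for (size_t i = 0; i < cs->nr_tasks; i++)
		cs->tasks[i]->mems_allowed = new_mems;
}

/* Hot path: pick the next allowed node round-robin, with no polling. */
static int sketch_spread_node(struct sketch_task *t)
{
	const int nbits = (int)(sizeof(nodemask_t) * 8);

	if (t->mems_allowed == 0)
		return -1;
	for (int step = 1; step <= nbits; step++) {
		int n = (t->spread_rotor + step) % nbits;

		if (t->mems_allowed & (1UL << n)) {
			t->spread_rotor = n;
			return n;
		}
	}
	return -1;
}
```

In this sketch a task attached to a cpuset that allows nodes 0 and 1
alternates between those two nodes. After sketch_set_mems() switches the mask
to nodes 2 and 3, the very next sketch_spread_node() call returns node 2,
although the hot path itself contains no update check. The cost moves to the
rare mask change, which now has to walk every attached task.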