Subject: Re: [PATCH v3] mm: make expand_downwards symmetrical to expand_upwards
From: James Bottomley
To: Christoph Lameter
Cc: Pekka Enberg, Matthew Wilcox, KOSAKI Motohiro, Michal Hocko,
    Andrew Morton, Hugh Dickins, linux-mm@kvack.org, LKML,
    linux-parisc@vger.kernel.org, David Rientjes, Ingo Molnar,
    x86 maintainers, linux-arch@vger.kernel.org, Mel Gorman
References: <20110420102314.4604.A69D9226@jp.fujitsu.com>
    <20110420161615.462D.A69D9226@jp.fujitsu.com>
    <20110420112020.GA31296@parisc-linux.org>
    <1303308938.2587.8.camel@mulgrave.site>
Date: Wed, 20 Apr 2011 10:02:59 -0500
Message-ID: <1303311779.2587.19.camel@mulgrave.site>

On Wed, 2011-04-20 at 09:50 -0500, Christoph Lameter wrote:
> On Wed, 20 Apr 2011, James Bottomley wrote:
>
> >      1. We can look at what imposing NUMA on the DISCONTIGMEM archs
> >         would do ... the embedded ones are going to be hardest hit, but
> >         if it's not too much extra code, it might be palatable.
> >      2. The other is that we can audit mm to look at all the node
> >         assumptions in the non-numa case. My suspicion is that
> >         accidentally or otherwise, it mostly works for the normal case,
> >         so there might not be much needed to pull it back to working
> >         properly for DISCONTIGMEM.
>
> The older code may work. SLAB f.e. does not call page_to_nid() in the
> !NUMA case but keeps special metadata structures around in each slab page
> that records the node used for allocation. The problem is with new code
> added/revised in the last 5 years or so that uses page_to_nid() and
> allocates only a single structure for !NUMA. There are also VM_BUG_ONs in
> the page allocator that should trigger if page_to_nid() returns strange
> values. I wonder why that never occurred.

Actually, I think slab got changed when discontigmem was added ...
that's why it all works OK.

> >      3. Finally we could look at deprecating DISCONTIGMEM in favour of
> >         SPARSEMEM, but we'd still need to fix -stable for that case.
> >         Especially as it will take time to convert all the architectures
>
> The fix needed is to mark DISCONTIGMEM without NUMA as broken for now. We
> need an audit of the core VM before removing that or making it contingent
> on the configurations of various VM subsystems.

Don't be stupid ... that would cause six architectures to get marked
broken.

> > I'm certainly with Matthew: DISCONTIGMEM is supposed to be a lightweight
> > framework which allows machines with split physical memory ranges to
> > work. That's a very common case nowadays. Numa is supposed to be a
> > heavyweight framework to preserve node locality for non-uniform memory
> > access boxes (which none of the DISCONTIGMEM && !NUMA systems are).
>
> Well yes but we have SPARSE for that today. DISCONTIG with multiple per
> pgdat structures in a !NUMA case is just weird and unexpected for many who
> have done VM coding in the last years.

Look, I'm not really interested in who understands what. The fact is we
have six architectures with the possibility for DISCONTIGMEM && !NUMA,
so that's the case we need to fix in -stable. They oops with SLUB, as
far as I can tell, there are still no oops reports with SLAB. The
simplest -stable fix seems to be to mark SLUB broken on DISCONTIGMEM &&
!NUMA.

James
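
For readers following the thread, here is a minimal userspace sketch of the failure mode being discussed. It assumes, as the thread suggests, that under DISCONTIGMEM && !NUMA a page can report a node id greater than zero while an allocator built for !NUMA keeps only a single per-node structure. Every toy_* name is hypothetical and none of this is kernel code; the NULL return merely stands in for the out-of-range access that would oops.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel symbols. */

/* Assumed: an allocator built for !NUMA keeps exactly one per-node structure. */
#define TOY_NR_ALLOC_NODES 1

struct toy_page {
    int nid;    /* index of the memory range (pgdat) the page lives in */
};

struct toy_node_state {
    unsigned long nr_partial;
};

static struct toy_node_state toy_nodes[TOY_NR_ALLOC_NODES];

/* Stand-in for page_to_nid(): reports the range the page belongs to. */
static int toy_page_to_nid(const struct toy_page *page)
{
    return page->nid;
}

/*
 * Look up per-node state by the page's node id. A real allocator that
 * indexed a one-element array with nid == 1 would read past the end;
 * this sketch returns NULL instead so the mismatch is observable.
 */
static struct toy_node_state *toy_get_node(const struct toy_page *page)
{
    int nid = toy_page_to_nid(page);

    if (nid < 0 || nid >= TOY_NR_ALLOC_NODES)
        return NULL;
    return &toy_nodes[nid];
}

/* Usage: a page in the first range is fine, one in the second is not. */
static void toy_demo(void)
{
    struct toy_page in_first_range = { 0 };
    struct toy_page in_second_range = { 1 };

    assert(toy_get_node(&in_first_range) == &toy_nodes[0]);
    assert(toy_get_node(&in_second_range) == NULL);
}
```

The sketch only shows why code that trusts the node id and allocates a single structure can misbehave once a second memory range exists. It says nothing about how SLAB or SLUB are actually implemented.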