Date: Tue, 22 Jan 2008 12:11:14 -0800 (PST)
From: Christoph Lameter
To: Mel Gorman
cc: Olaf Hering, Pekka Enberg, linux-kernel@vger.kernel.org,
    linuxppc-dev@ozlabs.org, "Aneesh Kumar K.V", hanth Aravamudan,
    KAMEZAWA Hiroyuki, lee.schermerhorn@hp.com, Linux MM,
    akpm@linux-foundation.org
Subject: Re: crash in kmem_cache_init
In-Reply-To: <20080122195448.GA15567@csn.ul.ie>

On Tue, 22 Jan 2008, Mel Gorman wrote:

> Christoph/Pekka, this patch is papering over the problem and something
> more fundamental may be going wrong. The crash occurs because l3 is NULL
> and the cache is kmem_cache so this is early in the boot process. It is
> selecting l3 based on node 2 which is correct in terms of available memory
> but it initialises the lists on node 0 because that is the node the CPUs are
> located. Hence later it uses an uninitialised nodelists and BLAM. Relevant
> parts of the log for seeing the memoryless nodes in relation to CPUs is;

Would it be possible to run the bootstrap on a cpu that has a node with
memory associated to it? I believe we had the same situation last year
when GFP_THISNODE was introduced?

After you reverted the slab memoryless node patch there should be per
node structures created for node 0 unless the node is marked offline. Is
it? If so then you are booting a cpu that is associated with an offline
node.

> Can you see a better solution than this?

Well this means that bootstrap will work by introducing foreign objects
into the per cpu queue (should only hold per cpu objects). They will
later be consumed and then the queues will contain the right objects so
the effect of the patch is minimal.

I thought we fixed the similar situation last year by dropping
GFP_THISNODE for some allocations?
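
The mismatch described in the quoted text (lists set up on the boot CPU's node, lookups done against the node that has memory) can be illustrated with a small toy model. This is only a sketch and not the kernel's slab code: every name in it (toy_cache, toy_topology, first_node_with_memory, and so on) is hypothetical, and the "nearest node" choice is simplified to a wrap-around search rather than a real NUMA distance lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 4

/* Toy stand-in for the per-node lists (the "l3" in the thread). */
struct node_lists {
    int free_objects;
};

/* Toy cache: one per-node pointer, NULL until that node is initialised. */
struct toy_cache {
    struct node_lists *nodelists[MAX_NODES];
};

/* Hypothetical topology: which nodes have memory, where the boot CPU sits. */
struct toy_topology {
    bool has_memory[MAX_NODES];
    int boot_cpu_node;
};

/* Static storage used during bootstrap, before any allocator works. */
static struct node_lists boot_lists;

/* First node with memory, searching from `start` and wrapping; -1 if none.
 * Simplification: a real system would consult NUMA distances instead. */
int first_node_with_memory(const struct toy_topology *t, int start)
{
    assert(start >= 0 && start < MAX_NODES);
    for (int i = 0; i < MAX_NODES; i++) {
        int node = (start + i) % MAX_NODES;
        if (t->has_memory[node])
            return node;
    }
    return -1;
}

/* Failing pattern: initialise the lists on the boot CPU's own node,
 * even if that node has no memory. */
void bootstrap_on_cpu_node(struct toy_cache *c, const struct toy_topology *t)
{
    c->nodelists[t->boot_cpu_node] = &boot_lists;
}

/* Alternative: initialise the lists on the node allocations will use.
 * Returns the node chosen, or -1 if no node has memory. */
int bootstrap_on_memory_node(struct toy_cache *c, const struct toy_topology *t)
{
    int node = first_node_with_memory(t, t->boot_cpu_node);
    if (node >= 0)
        c->nodelists[node] = &boot_lists;
    return node;
}

/* Lookup as the allocation path would do it: pick the node with memory,
 * then fetch that node's lists. Returns NULL if they were never set up. */
struct node_lists *lists_for_allocation(const struct toy_cache *c,
                                        const struct toy_topology *t)
{
    int node = first_node_with_memory(t, t->boot_cpu_node);
    return node < 0 ? NULL : c->nodelists[node];
}
```

With the boot CPU on node 0 and memory only on node 2, bootstrap_on_cpu_node() leaves lists_for_allocation() returning NULL, which corresponds to the NULL l3 in the report. bootstrap_on_memory_node() makes the initialisation and the lookup agree on node 2.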