Date: Wed, 31 Dec 2008 12:07:33 -0800 (PST)
From: david@lang.hm
To: Cyrill Gorcunov
cc: Andi Kleen, linux-kernel
Subject: Re: early exception error

On Wed, 31 Dec 2008, Cyrill Gorcunov wrote:

> [david@lang.hm - Wed, Dec 31, 2008 at 11:12:12AM -0800]
>> On Wed, 31 Dec 2008, Cyrill Gorcunov wrote:
>>
>>> [david@lang.hm - Tue, Dec 30, 2008 at 05:39:29PM -0800]
>>>> On Wed, 31 Dec 2008, Andi Kleen wrote:
>>>>
>>>>> david@lang.hm writes:
>>>>>>
>>>>>> doing a grep through System.map for the address that appears in the
>>>>>> error returns nothing
>>>>>
>>>>> This might be obvious, but you can't grep directly for these addresses
>>>>> because System.map contains the starting addresses of functions only
>>>>> and normally the reported address is somewhere in the middle of a
>>>>> function. So you instead have to look for the highest number lower
>>>>> than or equal to the address from the exception.
>>>>
>>>> thanks, this was not obvious to me
>>>>
>>>> the -2 error maps to
>>>>
>>>> ffffffff8099e4c1 T free_bootmem_node
>>>> ffffffff8099e4e5 t alloc_bootmem_core
>>>> ffffffff8099e774 t ___alloc_bootmem_nopanic
>>>>
>>>>
>>>> the first error maps to
>>>>
>>>> ffffffff809c2de4 T free_bootmem_node
>>>> ffffffff809c2e08 t alloc_bootmem_core
>>>> ffffffff809c3097 t ___alloc_bootmem_nopanic
>>>>
>>>>
>>>> so it looks like this is in alloc_bootmem_core in both cases.
>>>>
>>>> David Lang
>>>>
>>>
>>> Along with Andi's proposed earlyprintk=vga I think
>>> bootmem_debug option could be useful here too.
>>
>> adding bootmem_debug creates so much additional output that the oops
>> scrolls off the screen (except the last 'paragraph' of it)
>>
>> it looks like it's individual items being allocated (trying to scan it as
>> it scrolled by)
>
> on the picture you sent me i noticed the message
> "Your memory is not aligned you need to rebuild your
> kernel with bigger NODEMAP SIZE shift=20" and then
> srat code complains about "No NUMA code hash function found"
> which looks a bit scary. Btw, could you post this picture
> on some public resource so NUMA people could check it?

http://linux.lang.hm/linux/IMG00030.jpg

I'll try rebuilding with a bigger nodemap size and let you know

David Lang
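
For reference, the System.map lookup Andi describes in the quoted text (find the highest symbol start address that is lower than or equal to the faulting address) can be sketched in Python. This is only an illustration, not a kernel tool: the helper names parse_system_map and resolve are made up for this example, and the address 0xffffffff8099e500 is a hypothetical stand-in, since the actual exception address is not given in this message. Only the three symbol lines are taken from the thread.

```python
import bisect


def parse_system_map(text):
    """Parse System.map-style lines ('<hex addr> <type> <name>').

    Returns a list of (address, name) tuples sorted by address.
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the symbol with the highest start
    address that is <= addr, or None if addr precedes every symbol."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


# The three symbol lines quoted in the thread for the "-2" error.
SAMPLE = """\
ffffffff8099e4c1 T free_bootmem_node
ffffffff8099e4e5 t alloc_bootmem_core
ffffffff8099e774 t ___alloc_bootmem_nopanic
"""

symbols = parse_system_map(SAMPLE)

# Hypothetical faulting address, chosen only to fall inside a function.
hit = resolve(symbols, 0xFFFFFFFF8099E500)
if hit is not None:
    name, offset = hit
    print(f"{name}+{offset:#x}")
```

Against a real kernel, the same functions could be fed the full contents of the System.map that matches the running build. A plain grep fails for the reason Andi gives: the reported address is normally an offset into a function, so it almost never appears in the file verbatim.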