Subject: Re: 2.6.23 boot failures on x86-64.
From: Zou Nan hai
To: Martin Ebourne
Cc: Dave Jones, Andi Kleen, Linux Kernel, Suresh Siddha, stable@kernel.org, Andrew Morton, Linus Torvalds
In-Reply-To: <1193810651.817.14.camel@linux-znh>
References: <20071029175014.GH7793@redhat.com> <200710291918.43869.ak@suse.de> <20071029184747.GB1650@redhat.com> <200710292003.09317.ak@suse.de> <20071029194311.GE1650@redhat.com> <1193692862.3133.14.camel@avenin.ebourne.me.uk> <1193810651.817.14.camel@linux-znh>
Message-Id: <1193811559.817.36.camel@linux-znh>
Date: 31 Oct 2007 14:19:19 +0800

On Wed, 2007-10-31 at 14:04, Zou Nan hai wrote:
> On Tue, 2007-10-30 at 05:21, Martin Ebourne wrote:
> > On Mon, 2007-10-29 at 15:43 -0400, Dave Jones wrote:
> > > On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> > > > > > But if allocating bootmem >4G doesn't work on these systems
> > > > > > most likely they have more problems anyways. It might be better
> > > > > > to find out what goes wrong exactly.
> > > > > Any ideas on what to instrument ?
> > > >
> > > > See what address the bootmem_alloc_high returns; check if it overlaps
> > > > with something etc.
> > > >
> > > > Fill the memory on the system and see if it can access all of its memory.
> > >
> > > Martin, as you have one of the affected systems, do you feel up to this?
> >
> > Faking a node at 0000000000000000-000000001fff0000
> > Bootmem setup node 0 0000000000000000-000000001fff0000
> > sparse_early_mem_map_alloc: returned address ffff81000070b000
> >
> > My box has 512MB of RAM.
> >
> > Cheers,
> >
> > Martin.
>
> Oops, sorry,
> seem to be a mistake of me.
> I forget to exclude the DMA range.
>
> Does the following patch fix the issue?
>
> Thanks
> Zou Nan hai
>
> --- a/arch/x86/mm/init_64.c 2007-10-31 11:24:11.000000000 +0800
> +++ b/arch/x86/mm/init_64.c 2007-10-31 12:31:02.000000000 +0800
> @@ -731,7 +731,7 @@ int in_gate_area_no_task(unsigned long a
>  void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
>  {
>  	return __alloc_bootmem_core(pgdat->bdata, size,
> -			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
> +			SMP_CACHE_BYTES, (4UL*1024*1024*1024), __pa(MAX_DMA_ADDRESS));
>  }
>
>  const char *arch_vma_name(struct vm_area_struct *vma)
>

Please ignore the patch; it is wrong.

However, I think the root cause is that when __alloc_bootmem_core fails to
allocate memory above 4G, it falls back to allocating from the lowest
page, so the allocation sometimes ends up in the DMA region.

Since this code path is dead, I am OK with reverting the patch.
Suresh and I will check the CONFIG_SPARSE_VMEMMAP path.

Thanks
Zou Nan hai
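
For illustration only, here is a minimal user-space sketch of the fallback described above. It is not the kernel's __alloc_bootmem_core; the names sim_bootmem, sim_search and sim_alloc are hypothetical, and the model is a plain first-fit page bitmap. It assumes 4K pages. On a 512MB box the 4G goal lies beyond the end of memory, so the search at or above the goal finds nothing and the retry starts from page 0. Assuming the direct mapping in this kernel starts at 0xffff810000000000, the reported address ffff81000070b000 would correspond to physical 0x70b000 (about 7MB), which is below the 16MB ISA DMA limit and so fits this picture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_PAGE_SHIFT 12               /* assumption: 4K pages */
#define SIM_NO_PAGE    ((size_t)-1)     /* returned when nothing fits */

/* Hypothetical stand-in for a bootmem bitmap: used[i] is true if page i is taken. */
struct sim_bootmem {
	bool *used;
	size_t npages;
};

/* First-fit search for `count` contiguous free pages at or above page `start`. */
static size_t sim_search(const struct sim_bootmem *bm, size_t start, size_t count)
{
	size_t run = 0;

	for (size_t i = start; i < bm->npages; i++) {
		run = bm->used[i] ? 0 : run + 1;
		if (run == count)
			return i + 1 - count;
	}
	return SIM_NO_PAGE;     /* also reached when start >= npages */
}

/*
 * Try to allocate at or above `goal_page`; if that fails, silently retry
 * from page 0. The retry is the behaviour under discussion: the caller
 * asked for "high" memory but may receive the lowest free pages.
 */
static size_t sim_alloc(struct sim_bootmem *bm, size_t count, size_t goal_page)
{
	size_t first;

	assert(count > 0);

	first = sim_search(bm, goal_page, count);
	if (first == SIM_NO_PAGE)
		first = sim_search(bm, 0, count);
	if (first == SIM_NO_PAGE)
		return SIM_NO_PAGE;

	for (size_t i = first; i < first + count; i++)
		bm->used[i] = true;
	return first;
}
```

With a bitmap of 0x1fff0 pages (the node size from Martin's log) and a goal of 4G >> SIM_PAGE_SHIFT, sim_alloc returns the first free low page, which is below the 16MB >> SIM_PAGE_SHIFT boundary whenever any low page is still free.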