Date: Thu, 21 Sep 2006 17:41:34 -0700
From: Andrew Morton
To: kmannth@us.ibm.com
Cc: lkml, Christoph Lameter
Subject: Re: [BUG] i386 2.6.18 cpu_up: attempt to bring up CPU 4 failed : kernel BUG at mm/slab.c:2698!
Message-Id: <20060921174134.4e0d30f2.akpm@osdl.org>
In-Reply-To: <1158884252.5657.38.camel@keithlap>

On Thu, 21 Sep 2006 17:17:31 -0700
keith mannthey wrote:

> I wanted to just give 2.6.18 a spin and I tripped over something I
> didn't expect.
>
>
> cpu_up: attempt to bring up CPU 4 failed
> kfree_debugcheck: bad ptr c15f6000h.
> ------------[ cut here ]------------
> kernel BUG at mm/slab.c:2698!
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in:
> CPU: 0
> EIP: 0060:[] Not tainted VLI
> EFLAGS: 00010046 (2.6.18 #1)
> EIP is at kfree_debugcheck+0x7f/0x90
> eax: 00000028 ebx: 000015f6 ecx: c1025289 edx: c7653540
> esi: c15f6000 edi: c15f6000 ebp: c764af38 esp: c764af28
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, ti=c764a000 task=c7653540 task.ti=c764a000)
> Stack: c122c68d c15f6000 c1635000 00000004 c764af5c c106ef93 00000286 c76a77d0
>        00000004 00000001 c1635000 00000004 00000004 c764af6c c10557f6 c1274eac
>        c12743dc c764af84 c1207467 00000004 c12734c0 00000004 00000004 c764af98
> Call Trace:
> [] kfree+0x24/0x1d8
> [] pageset_cpuup_callback+0x40/0x58
> [] notifier_call_chain+0x20/0x31
> [] blocking_notifier_call_chain+0x1d/0x2d
> [] cpu_up+0xb5/0xcf
> [] init+0x78/0x296
> [] kernel_thread_helper+0x5/0xb

I think we have two problems here:

a) CPU4 didn't come up. To diagnose that I think we'll need to ask you
   to go into cpu_up(), add debug printks to
   blocking_notifier_call_chain(), work out which entry on that chain
   returned NOTIFY_BAD, then work out why it did so.

b) pageset_cpuup_callback()'s CPU_UP_CANCELED path possibly hasn't been
   tested before. I'd be guessing that we're not zeroing out the
   zone.pageset[] array when the `struct zone' is first allocated, but I
   don't immediately recall where that code lives.
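
Both points can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the chain layout, the callback names (failing_cb, pageset_cb), the helpers call_chain_debug() and model_cpu_up(), and the pageset[] array are all hypothetical stand-ins. It shows (a) logging each notifier's return value to find the one that returned NOTIFY_BAD, and (b) why a cancel path that frees a per-CPU slot is only safe when a slot that was never populated is known to be NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative values, modelled on the kernel's notifier return codes. */
#define NOTIFY_OK   0x0001
#define NOTIFY_BAD  0x8002
#define NR_CPUS     8

/* Hypothetical stand-ins for CPU_UP_PREPARE / CPU_UP_CANCELED. */
enum cpu_action { UP_PREPARE, UP_CANCELED };

typedef int (*notifier_fn)(enum cpu_action action, int cpu);

struct notifier {
    const char *name;
    notifier_fn call;
};

/* Static storage, so every slot starts out NULL (the "zeroed" case). */
static void *pageset[NR_CPUS];

/* Model of a per-CPU allocator callback with a cancel path. */
static int pageset_cb(enum cpu_action action, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (action == UP_PREPARE) {
        pageset[cpu] = malloc(64);
        return pageset[cpu] ? NOTIFY_OK : NOTIFY_BAD;
    }
    /* UP_CANCELED: free(NULL) is a no-op, so this is safe only because
     * a slot that never saw UP_PREPARE is guaranteed to be NULL. */
    free(pageset[cpu]);
    pageset[cpu] = NULL;
    return NOTIFY_OK;
}

/* Model of some other subsystem that refuses to bring the CPU up. */
static int failing_cb(enum cpu_action action, int cpu)
{
    (void)cpu;
    return action == UP_PREPARE ? NOTIFY_BAD : NOTIFY_OK;
}

/* Walk the chain, log every return value, and report the index of the
 * first entry that returned NOTIFY_BAD (or -1 if none did). */
static int call_chain_debug(const struct notifier *chain, int n,
                            enum cpu_action action, int cpu)
{
    for (int i = 0; i < n; i++) {
        int ret = chain[i].call(action, cpu);
        fprintf(stderr, "notifier %s: action=%d cpu=%d ret=%#x\n",
                chain[i].name, (int)action, cpu, (unsigned)ret);
        if (ret == NOTIFY_BAD)
            return i;
    }
    return -1;
}

/* Assumed in this model: on a failed prepare, the cancel event goes to
 * the whole chain, including callbacks whose prepare step never ran. */
static int model_cpu_up(const struct notifier *chain, int n, int cpu)
{
    int bad = call_chain_debug(chain, n, UP_PREPARE, cpu);
    if (bad >= 0)
        call_chain_debug(chain, n, UP_CANCELED, cpu);
    return bad;
}

/* failing_cb runs first, so pageset_cb only ever sees UP_CANCELED. */
static const struct notifier demo_chain[] = {
    { "failing_cb", failing_cb },
    { "pageset_cb", pageset_cb },
};
```

Calling model_cpu_up(demo_chain, 2, 4) from a main() logs each callback's return value and returns the index of the entry that vetoed the bring-up. If pageset[] held a non-NULL pointer that did not come from malloc(), the cancel path would pass it to free(), which is the user-space analogue of the bad kfree() in the trace.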