Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753345AbXL1JLi (ORCPT ); Fri, 28 Dec 2007 04:11:38 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751922AbXL1JLa (ORCPT ); Fri, 28 Dec 2007 04:11:30 -0500
Received: from e28smtp05.in.ibm.com ([59.145.155.5]:36883 "EHLO e28esmtp05.in.ibm.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751806AbXL1JL3 (ORCPT ); Fri, 28 Dec 2007 04:11:29 -0500
Message-ID: <4774BDBA.70904@linux.vnet.ibm.com>
Date: Fri, 28 Dec 2007 14:41:22 +0530
From: Kamalesh Babulal
User-Agent: Thunderbird 1.5.0.14pre (X11/20071023)
MIME-Version: 1.0
To: Andrew Morton
CC: linux-kernel@vger.kernel.org, Ingo Molnar, Andy Whitcroft, Balbir Singh, Srivatsa Vaddagiri, Dhaval Giani
Subject: Re: 2.6.24-rc6-mm1 Kernel panics at different functions ()
References: <20071222233056.d652743e.akpm@linux-foundation.org> <47736732.7040400@linux.vnet.ibm.com> <20071227015410.2c09db2f.akpm@linux-foundation.org>
In-Reply-To: <20071227015410.2c09db2f.akpm@linux-foundation.org>
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5210
Lines: 113

Andrew Morton wrote:
> On Thu, 27 Dec 2007 14:19:54 +0530 Kamalesh Babulal wrote:
>
>> Hi Andrew,
>>
>> The 2.6.24-rc6-mm1 kernel with hotfix x86-fix-system-gate-related-crash.patch applied
>> panics while booting on a x86_64 box
>>
>> Unable to handle kernel NULL pointer dereference at 0000000000000046 RIP:
>> [] rb_erase+0xe7/0x2a3
>> PGD 17ff65067 PUD 17f1c7067 PMD 0
>> Oops: 0000 [1] SMP
>> last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:02:04.0/host0/target0:0:6/0:0:6:0/type
>> CPU 0
>> Modules linked in:
>> Pid: 0, comm: swapper Not tainted 2.6.24-rc6-mm1-autokern1 #1
>> RIP: 0010:[] [] rb_erase+0xe7/0x2a3
>> RSP: 0000:ffffffff80650e00 EFLAGS: 00010002
>> RAX: ffff8101fe9568c8 RBX: ffff8100010062a8 RCX: ffff8101fe9568b0
>> RDX: ffff8101fe9568c8 RSI: 0000000000000046 RDI: 0000000000000000
>> RBP: ffffffff80650e10 R08: ffff8101fe9568c8 R09: 0000000000000086
>> R10: 0000000000000000 R11: 00000000000001e8 R12: ffff8100010062b8
>> R13: 0000000000000002 R14: ffff810001006260 R15: 0000000000000001
>> FS: 0000000000000000(0000) GS:ffffffff805dc000(0000) knlGS:00000000f31ffbb0
>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 0000000000000046 CR3: 000000017f0ab000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffffffff805f6000, task ffffffff805a2080)
>> Stack: ffff8100010062a8 ffff8101fe9568b0 ffffffff80650e40 ffffffff8024be16
>> ffffffff80369d65 ffffffff80369d65 ffff8101fe9568b0 ffff8100010062a8
>> ffffffff80650eb0 ffffffff8024c1d5 ffffffffb88cc28e 0000000006e73eff
>> Call Trace:
>> [] __remove_hrtimer+0x2e/0x3c
>> [] __down_read_trylock+0x16/0x42
>> [] __down_read_trylock+0x16/0x42
>> [] hrtimer_run_queues+0x130/0x191
>> [] run_timer_softirq+0x28/0x1a7
>> [] __do_softirq+0x55/0xc2
>> [] call_softirq+0x1c/0x28
>> [] do_softirq+0x32/0x9d
>> [] irq_exit+0x3f/0x41
>> [] smp_apic_timer_interrupt+0x92/0xa7
>> [] apic_timer_interrupt+0x66/0x70
>> [] default_idle+0x36/0x5e
>> [] default_idle+0x31/0x5e
>> [] default_idle+0x0/0x5e
>> [] cpu_idle+0x90/0xb2
>> [] rest_init+0x5a/0x5c
>> [] start_kernel+0x2b8/0x2c4
>> [] _sinittext+0x12b/0x132
>>
>>
>
> It does seem to be mostly hrtimer-related. But surely the hrtimer system
> is initialised by the time this happens.
>
> The usual refrain: is it possible to run a bisection search?
Hi Andrew,

While doing the git bisect, the following panic was seen:

Unable to handle kernel paging request at 000000000000401e RIP:
[] load_balance_monitor+0x15e/0x2a4
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:04.0/host0/target0:0:6/0:0:6:0/type
CPU 1
Modules linked in:
Pid: 15, comm: load_balance_mo Not tainted 2.6.24-rc6-mm1-autokern1 #1
RIP: 0010:[] [] load_balance_monitor+0x15e/0x2a4
RSP: 0000:ffff81007ffb7eb0 EFLAGS: 00010297
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 000000000000401e RSI: ffff81007ffb7ed8 RDI: 0000000000000000
RBP: ffff81007ffb7f20 R08: ffff81007ffb6000 R09: ffff81007ffb6000
R10: ffff81007ffb6000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000000800 R15: ffff8101fe997f00
FS: 0000000000000000(0000) GS:ffff8100e3b10000(0000) knlGS:00000000f73e1bb0
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000401e CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process load_balance_mo (pid: 15, threadinfo ffff81007ffb6000, task ffff81007ff94790)
Stack: 0000000000002000 0000000000000000 ffff810001009cc0 00000001e3b29d90
0000008000000000 000000000000000f ffff81007f0be780 000000000000000f
000000017ffb7f20 0000000000000000 00000000fffffffc ffffffffffffffff
Call Trace:
[] load_balance_monitor+0x0/0x2a4
[] kthread+0x3d/0x63
[] child_rip+0xa/0x12
[] kthread+0x0/0x63
[] child_rip+0x0/0x12

Code: 48 8b 04 c2 48 8b 10 48 01 55 98 e8 ce 40 12 00 83 f8 07 41
RIP [] load_balance_monitor+0x15e/0x2a4
RSP
CR2: 000000000000401e

The git-sched.patch is causing this panic, and I am searching for the patch
causing the hrtimer-related panic.

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
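
A note on reading the load_balance_monitor oops above: the faulting address in CR2 can be cross-checked against the register dump. The Python sketch below is only an illustration, not part of the original report. It assumes the bytes on the Code: line begin at the faulting RIP, which the report does not state, and it decodes only the single instruction form seen here. The names decode_mov_sib and REG64 are hypothetical helpers, not kernel or library interfaces.

```python
# Register values copied from the load_balance_monitor oops.
RAX = 0x0000000000000000
RDX = 0x000000000000401e
CR2 = 0x000000000000401e

# First four bytes of the "Code:" line.
# Assumption: the dump starts at the faulting RIP.
CODE = bytes.fromhex("488b04c2")

# 64-bit register names indexed by their 3-bit encoding (no REX extension bits).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_sib(code):
    """Decode 'REX.W 8B /r' with a SIB byte and no displacement.

    Only this one encoding is handled. Returns (dest, base, index, scale).
    """
    rex, opcode, modrm, sib = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a 64-bit 'mov r64, r/m64'")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:
        raise ValueError("expected a SIB operand without displacement")
    scale = 1 << (sib >> 6)
    index = (sib >> 3) & 7
    base = sib & 7
    if index == 4 or base == 5:
        raise ValueError("no-index and disp32 forms are not handled")
    return REG64[reg], REG64[base], REG64[index], scale


dest, base, index, scale = decode_mov_sib(CODE)
registers = {"rax": RAX, "rdx": RDX}
effective = registers[base] + registers[index] * scale

print(f"mov ({base},{index},{scale}), {dest}")
print(f"effective address = {effective:#x}, CR2 = {CR2:#x}")
```

Under that assumption the load reads from RDX + RAX*8, which equals the CR2 value, so the base pointer held in RDX (0x401e) was already bad before the array was indexed.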