Message-ID: <19f34abd0812181406n48712c81j41a560aaf6ba6cd8@mail.gmail.com>
Date: Thu, 18 Dec 2008 23:06:23 +0100
From: "Vegard Nossum"
To: "Linux Kernel Mailing List"
Subject: v2.6.28-rc7: error in panic code? (NULL pointer dereference at 0000004c)

Hi,

With such a patch:

diff --git a/init/main.c b/init/main.c
index 7e117a2..2f93119 100644
--- a/init/main.c
+++ b/init/main.c
@@ -465,6 +465,8 @@ static void noinline __init_refok rest_init(void)
 {
 	int pid;
 
+	*(char *) NULL = 0;
+
 	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
 	numa_default_policy();
 	pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);

...I would expect a page fault and that's that. So panic() is called,
but it causes a new page fault somewhere.

Here is the log:

(this part is correct and expected)

[    0.031003] BUG: unable to handle kernel NULL pointer dereference at 00000000
[    0.033997] IP: [] rest_init+0xf/0x57
[    0.035997] *pde = 00000000
[    0.037289] Oops: 0002 [#1] SMP
[    0.037994] last sysfs file:
[    0.037994] Modules linked in:
[    0.037994]
[    0.037994] Pid: 0, comm: swapper Not tainted (2.6.28-rc7 #181) 945P-A
[    0.037994] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
[    0.037994] EIP is at rest_init+0xf/0x57
[    0.037994] EAX: c16631e3 EBX: 00000040 ECX: 00000a00 EDX: 00000000
[    0.037994] ESI: 00099800 EDI: c160a000 EBP: c165dfd0 ESP: c165dfd0
[    0.037994]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    0.037994] Process swapper (pid: 0, ti=c165c000 task=c156e334 task.ti=c165c000)
[    0.037994] Stack:
[    0.037994]  c165dfe0 c16637af c1691768 00000000 c165dff8 c1663080 0175a000 00000000
[    0.037994]  c14ceaae 00020800 01b7e003 00000000
[    0.037994] Call Trace:
[    0.037994]  [] ? start_kernel+0x2a2/0x2a7
[    0.037994]  [] ? __init_begin+0x80/0x88
[    0.037994] Code: 00 8b 43 04 8d 56 04 89 50 04 89 46 04 8d 43 04 89 46 08 89 53 04 fe 03 5b 5e 5d c3 55 b9 00 0a 00 00 89 e5 31 d2 b8 e3 31 66 c1 05 00 00 00 00 00 e8 00 df c4 ff b9 00 06 00 00 31 d2 b8 af
[    0.037994] EIP: [] rest_init+0xf/0x57 SS:ESP 0068:c165dfd0
[    0.038004] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.038998] Kernel panic - not syncing: Attempted to kill the idle task!

And now the unexpected part:

[    0.039999] Rebooting in 10 seconds..<1>BUG: unable to handle kernel NULL pointer dereference at 0000004c
[    0.040993] IP: [] klist_next+0x10/0x8d
[    0.040993] *pde = 00000000
[    0.040993] Oops: 0000 [#2] SMP
[    0.040993] last sysfs file:
[    0.040993] Modules linked in:
[    0.040993]
[    0.040993] Pid: 0, comm: swapper Tainted: G D (2.6.28-rc7 #181) 945P-A
[    0.040993] EIP: 0060:[] EFLAGS: 00010286 CPU: 0
[    0.040993] EIP is at klist_next+0x10/0x8d
[    0.040993] EAX: 0000003c EBX: c165dd60 ECX: 00000000 EDX: c165dd60
[    0.040993] ESI: c165dd60 EDI: 00000000 EBP: c165dd58 ESP: c165dd48
[    0.040993]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[    0.040993] Process swapper (pid: 0, ti=c165c000 task=c156e334 task.ti=c165c000)
[    0.040993] Stack:
[    0.040993]  c1026a97 c165dd60 c165dd60 00000000 c165dd74 c11be196 0000003c 00000000
[    0.040993]  c13dbde0 00001078 00000100 c165dd84 c114ed26 c114e458 c13dbde0 c165dd9c
[    0.040993]  c1152546 ffffffff c13dbde0 00002710 0000000b c165ddac c115259a ffffffff
[    0.040993] Call Trace:
[    0.040993]  [] ? release_console_sem+0x16c/0x199
[    0.040993]  [] ? bus_find_device+0x4e/0x6e
[    0.040993]  [] ? no_pci_devices+0x17/0x2d
[    0.040993]  [] ? find_anything+0x0/0xa
[    0.040993]  [] ? pci_get_subsys+0x15/0x5b
[    0.040993]  [] ? pci_get_device+0xe/0x10
[    0.040993]  [] ? mach_reboot_fixups+0x27/0x3c
[    0.040993]  [] ? native_machine_emergency_restart+0x3e/0xd7
[    0.040993]  [] ? machine_emergency_restart+0x9/0xb
[    0.040993]  [] ? emergency_restart+0x8/0xa
[    0.040993]  [] ? panic+0xb9/0xd6
[    0.040993]  [] ? do_exit+0x5b/0x740
[    0.040993]  [] ? printk+0xf/0x11
[    0.040993]  [] ? print_oops_end_marker+0x1e/0x23
[    0.040993]  [] ? oops_end+0x7f/0x87
[    0.040993]  [] ? die+0x5b/0x63
[    0.040993]  [] ? do_page_fault+0x581/0x66f
[    0.040993]  [] ? sched_clock_cpu+0x136/0x142
[    0.040993]  [] ? sched_clock_cpu+0x136/0x142
[    0.040993]  [] ? ktime_get+0x13/0x2f
[    0.040993]  [] ? sched_clock_idle_sleep_event+0xe/0x10
[    0.040993]  [] ? __do_softirq+0x119/0x121
[    0.040993]  [] ? acpi_hw_low_level_read+0x3b/0x68
[    0.040993]  [] ? acpi_hw_register_read+0xa0/0x112
[    0.040993]  [] ? acpi_get_register_unlocked+0x2c/0x48
[    0.040993]  [] ? acpi_os_release_lock+0x8/0xa
[    0.040993]  [] ? acpi_get_register+0x2d/0x34
[    0.040993]  [] ? do_page_fault+0x0/0x66f
[    0.040993]  [] ? error_code+0x72/0x78
[    0.040993]  [] ? kernel_init+0x0/0x148
[    0.040993]  [] ? rest_init+0xf/0x57
[    0.040993]  [] ? start_kernel+0x2a2/0x2a7
[    0.040993]  [] ? __init_begin+0x80/0x88
[    0.040993] Code: 89 4a 04 74 08 8d 41 0c e8 fa 04 d9 ff 5d c3 55 31 c9 89 e5 e8 e0 ff ff ff 5d c3 55 89 e5 57 56 89 c6 53 83 ec 04 8b 00 8b 7e 04 <8b> 50 10 89 55 f0 e8 7b cf 01 00 85 ff 74 23 8b 47 04 ba ec 42
[    0.040993] EIP: [] klist_next+0x10/0x8d SS:ESP 0068:c165dd48

I know this is not much to fuss about, since it was artificially
induced with the NULL pointer dereference, but if such an error (a
real one) made it into the kernel, the second oops could scroll the
real one off the screen.

Anyway -- to reproduce, apply the patch and boot with panic=10 (1 also works).

Thanks for the attention,


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036