Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758230AbYHRUcs (ORCPT ); Mon, 18 Aug 2008 16:32:48 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752614AbYHRUci (ORCPT ); Mon, 18 Aug 2008 16:32:38 -0400 Received: from mail-gx0-f16.google.com ([209.85.217.16]:35525 "EHLO mail-gx0-f16.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752600AbYHRUch (ORCPT ); Mon, 18 Aug 2008 16:32:37 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:mime-version:content-type :content-transfer-encoding:content-disposition; b=OoQ3PtF48vL69lnqUx7waTlEugkpfCSYd0CsUG+kr/T+/q9vOLiZFVNfTsKKJGUXro U7wusT6f5clE5wbZISUQrmcfUn2R745ycQIUEgm1HEVoa/GWOA8w/SzSio5TsbYBm2Xs irc4C6E798X0pB66gN6AfJ+MZqRxxfeFo4qzQ= Message-ID: <19f34abd0808181332k3c02496auabd04e927bb7cab5@mail.gmail.com> Date: Mon, 18 Aug 2008 22:32:35 +0200 From: "Vegard Nossum" To: "Pavel Machek" , "Rafael J. Wysocki" , "Robert Richter" , "Ingo Molnar" , "Andi Kleen" , "Philippe Elie" Subject: oprofile + hibernation = badness Cc: "Linux Kernel Mailing List" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 10527 Lines: 272 Hi, I'm probably crazy to do it, but... $ opcontrol --start $ echo disk > /sys/power/state ...leads to lots of badness/strangeness. First, lots of APIC errors: APIC error on CPU0: 40(40) [repeat 8 times] APIC error on CPU1: 40(40) APIC error on CPU1: 40(40) APIC error on CPU0: 40(40) These keep on coming all through the suspend/shutdown sequence, also intermixing with other messages. I'm guessing oprofile is trying to NMI CPUs that have been brought down? Now I get some ACPI Exceptions, but I think that these are unrelated to starting oprofile, because I have seen them on regular shutdowns as well: PM: Syncing filesystems ... done. Freezing user space processes ... (elapsed 0.00 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done. PM: Shrinking memory... done (0 pages freed) PM: Freed 0 kbytes in 0.44 seconds (0.00 MB/s) Suspending console(s) (use no_console_suspend to debug) ACPI Exception (exoparg2-0444): AE_AML_PACKAGE_LIMIT, Index (000000007) is beyond end of object [20080609] ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.PCI0.IDE0.GTM_] (Node f783bfc0), AE_AML_PACKAGE_LIMIT ACPI Error (psparse-0530): Method parse/execution failed [\_SB_.PCI0.IDE0.CHN0._GTM] (Node f783bb40), AE_AML_PACKAGE_LIMIT ACPI handle has no context! serial 00:0d: disabled serial 00:06: disabled ehci_hcd 0000:00:1d.7: PCI INT A disabled uhci_hcd 0000:00:1d.3: PCI INT D disabled uhci_hcd 0000:00:1d.2: PCI INT C disabled uhci_hcd 0000:00:1d.1: PCI INT B disabled uhci_hcd 0000:00:1d.0: PCI INT A disabled ACPI: Preparing to enter system sleep state S4 After that, I see the message "WQ on CPU0, prefer CPU1" a few times. This must also be bad. Now some warnings: ------------[ cut here ]------------ WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s mp_call_function_mask+0x194/0x1a0() Pid: 3711, comm: bash Not tainted 2.6.27-rc3-00415-g122c9e0 #10 [] warn_on_slowpath+0x4f/0x80 [] ? log_sample+0xa8/0xc0 [] ? device_power_down+0x5d/0x110 [] ? oprofile_add_ext_sample+0x40/0xc0 [] ? oprofile_add_sample+0x34/0x50 [] ? p4_check_ctr+0x13c/0x160 [] smp_call_function_mask+0x194/0x1a0 [] ? nmi_cpu_stop+0x0/0x30 [] ? __atomic_notifier_call_chain+0x3c/0x50 [] ? atomic_notifier_call_chain+0x1a/0x20 [] ? notify_die+0x2d/0x30 [] ? nmi_cpu_stop+0x0/0x30 [] smp_call_function+0x30/0x60 [] ? nmi_cpu_stop+0x0/0x30 [] on_each_cpu+0x2c/0x80 [] nmi_stop+0x14/0x20 [] nmi_suspend+0x15/0x20 [] sysdev_suspend+0xa3/0x1f0 [] device_power_down+0xdc/0x110 [] hibernation_snapshot+0x15b/0x220 [] ? printk+0x1b/0x20 [] ? freeze_processes+0x52/0xa0 [] hibernate+0xe0/0x180 [] ? state_store+0x0/0xd0 [] state_store+0xbf/0xd0 [] ? state_store+0x0/0xd0 [] kobj_attr_store+0x24/0x30 [] sysfs_write_file+0xa2/0x100 [] vfs_write+0x96/0x130 [] ? sysfs_write_file+0x0/0x100 [] sys_write+0x3d/0x70 [] sysenter_do_call+0x12/0x3f [] ? _read_unlock_irqrestore+0x60/0x70 ======================= ---[ end trace df36f449a70f1c8e ]--- PM: Creating hibernation image: PM: Need to copy 54048 pages PM: Hibernation image created (54048 pages copied) x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 Intel machine check architecture supported. BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711 caller is intel_p4_mcheck_init+0xb9/0x2b0 Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10 [] debug_smp_processor_id+0xce/0xd0 [] intel_p4_mcheck_init+0xb9/0x2b0 [] mcheck_init+0x59/0x80 [] restore_processor_state+0x1f6/0x230 [] hibernation_snapshot+0x1d1/0x220 [] ? printk+0x1b/0x20 [] ? freeze_processes+0x52/0xa0 [] hibernate+0xe0/0x180 [] ? state_store+0x0/0xd0 [] state_store+0xbf/0xd0 [] ? state_store+0x0/0xd0 [] kobj_attr_store+0x24/0x30 [] sysfs_write_file+0xa2/0x100 [] vfs_write+0x96/0x130 [] ? sysfs_write_file+0x0/0x100 [] sys_write+0x3d/0x70 [] sysenter_do_call+0x12/0x3f [] ? _read_unlock_irqrestore+0x60/0x70 ======================= Intel machine check reporting enabled on CPU#0. BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711 caller is intel_p4_mcheck_init+0x100/0x2b0 Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10 [] debug_smp_processor_id+0xce/0xd0 [] intel_p4_mcheck_init+0x100/0x2b0 [] mcheck_init+0x59/0x80 [] restore_processor_state+0x1f6/0x230 [] hibernation_snapshot+0x1d1/0x220 [] ? printk+0x1b/0x20 [] ? freeze_processes+0x52/0xa0 [] hibernate+0xe0/0x180 [] ? state_store+0x0/0xd0 [] state_store+0xbf/0xd0 [] ? state_store+0x0/0xd0 [] kobj_attr_store+0x24/0x30 [] sysfs_write_file+0xa2/0x100 [] vfs_write+0x96/0x130 [] ? sysfs_write_file+0x0/0x100 [] sys_write+0x3d/0x70 [] sysenter_do_call+0x12/0x3f [] ? _read_unlock_irqrestore+0x60/0x70 ======================= CPU0: Intel P4/Xeon Extended MCE MSRs (24) available BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711 caller is intel_p4_mcheck_init+0x119/0x2b0 Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10 [] debug_smp_processor_id+0xce/0xd0 [] intel_p4_mcheck_init+0x119/0x2b0 [] mcheck_init+0x59/0x80 [] restore_processor_state+0x1f6/0x230 [] hibernation_snapshot+0x1d1/0x220 [] ? printk+0x1b/0x20 [] ? freeze_processes+0x52/0xa0 [] hibernate+0xe0/0x180 [] ? state_store+0x0/0xd0 [] state_store+0xbf/0xd0 [] ? state_store+0x0/0xd0 [] kobj_attr_store+0x24/0x30 [] sysfs_write_file+0xa2/0x100 [] vfs_write+0x96/0x130 [] ? sysfs_write_file+0x0/0x100 [] sys_write+0x3d/0x70 [] sysenter_do_call+0x12/0x3f [] ? _read_unlock_irqrestore+0x60/0x70 ======================= CPU0: Thermal monitoring enabled Force enabled HPET at resume BUG: using smp_processor_id() in preemptible [00000000] code: bash/3711 caller is retrigger_next_event+0x15/0xb0 Pid: 3711, comm: bash Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10 [] debug_smp_processor_id+0xce/0xd0 [] retrigger_next_event+0x15/0xb0 [] ? preempt_schedule+0x3f/0x50 [] ? _spin_unlock+0x3c/0x50 [] hres_timers_resume+0xa/0x10 [] timekeeping_resume+0x12c/0x180 [] __sysdev_resume+0x14/0x50 [] sysdev_resume+0x3e/0x80 [] device_power_up+0xb/0x20 [] hibernation_snapshot+0x1b0/0x220 [] ? printk+0x1b/0x20 [] ? freeze_processes+0x52/0xa0 [] hibernate+0xe0/0x180 [] ? state_store+0x0/0xd0 [] state_store+0xbf/0xd0 [] ? state_store+0x0/0xd0 [] kobj_attr_store+0x24/0x30 [] sysfs_write_file+0xa2/0x100 [] vfs_write+0x96/0x130 [] ? sysfs_write_file+0x0/0x100 [] sys_write+0x3d/0x70 [] sysenter_do_call+0x12/0x3f [] ? _read_unlock_irqrestore+0x60/0x70 ======================= Enabling non-boot CPUs ... ...and then, while writing the image, it looks something like this: APIC error on CPU1: 40(40) ^H^H^H^H 3%<7>APIC error on CPU1: 40(40) WQ on CPU0, prefer CPU1 APIC error on CPU1: 40(40) [8 times] APIC error on CPU1: 40(40) WQ on CPU0, prefer CPU1 ... The machine comes back with this spamming the logs (rate of about 2 KB/sec): WQ on CPU0, prefer CPU1 APIC error on CPU1: 40(40) [9 times, repeat] I can still log in with ssh and reboot the machine, which finally gives this warning: BUG: sleeping function called from invalid context at kernel/sched.c:4625 in_atomic():1, irqs_disabled():0 INFO: lockdep is turned off. Pid: 4027, comm: oprofiled Tainted: G W 2.6.27-rc3-00415-g122c9e0 #10 [] __might_sleep+0xee/0x140 [] wait_for_common+0x28/0x130 [] ? trace_hardirqs_on+0xb/0x10 [] ? _spin_unlock_irqrestore+0x43/0x70 [] wait_for_completion+0x12/0x20 [] synchronize_rcu+0x2d/0x40 [] ? wakeme_after_rcu+0x0/0x10 [] ? trace_hardirqs_on+0xb/0x10 [] atomic_notifier_chain_unregister+0x36/0x50 [] unregister_die_notifier+0xf/0x20 [] nmi_shutdown+0x44/0x80 [] oprofile_shutdown+0x2a/0x60 [] event_buffer_release+0x10/0x40 [] __fput+0xb6/0x180 [] fput+0x19/0x20 [] filp_close+0x47/0x70 [] put_files_struct+0x9b/0xb0 [] exit_files+0x42/0x60 [] do_exit+0x192/0x880 [] ? up_read+0x16/0x30 [] ? do_page_fault+0x2e3/0x700 [] ? do_sync_write+0x0/0x110 [] do_group_exit+0x31/0x90 [] sys_exit_group+0xf/0x20 [] sysenter_do_call+0x12/0x3f ======================= Sending all processes the KILL signal... The last time I did this, the machine also came up with a corrupt ext3 root filesystem. But it didn't happen this time, so I guess it could also be unrelated. What would be the proper way to fix all this? Should oprofile be able to profile the suspend/resume code as well? Or should "echo disk" give -EBUSY if oprofile is running? Or do we simply insert a huge kludge in the form of mutual exclusion between oprofile and suspend in the Kconfig? Thanks, Vegard PS: This is not a regression. It's probably always been like that. -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/