Message-ID: <51999591.8030401@linux.vnet.ibm.com>
Date: Mon, 20 May 2013 11:16:33 +0800
From: Michael Wang
To: Borislav Petkov
CC: "Paul E. McKenney", Jiri Kosina, Frederic Weisbecker, Tony Luck, linux-kernel@vger.kernel.org, x86@kernel.org, Thomas Gleixner
Subject: Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2
References: <20130509125040.GF27333@pd.tnic> <20130509125859.GG27333@pd.tnic> <20130515184528.GO4442@linux.vnet.ibm.com> <20130515224358.GF11783@pd.tnic> <20130515235512.GT4442@linux.vnet.ibm.com> <20130517135641.GF23035@pd.tnic>
In-Reply-To: <20130517135641.GF23035@pd.tnic>

Hi, Borislav

On 05/17/2013 09:56 PM, Borislav Petkov wrote:
[snip]
> [ 51.737378] [] native_smp_send_reschedule+0x58/0x60
> [ 51.744013] [] wake_up_nohz_cpu+0x2d/0xa0

I suppose the reason is that the CPU we pass to mod_delayed_work_on()
can go offline before we disable IRQs. What about checking that it is
still online before sending the resched IPI? Something like:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bfa7e77..d0e8f15 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -626,7 +626,7 @@ static bool wake_up_full_nohz_cpu(int cpu)
 
 void wake_up_nohz_cpu(int cpu)
 {
-	if (!wake_up_full_nohz_cpu(cpu))
+	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
 		wake_up_idle_cpu(cpu);
 }
 
Regards,
Michael Wang

> [ 51.749745] [] add_timer_on+0x8f/0x110
> [ 51.755214] [] __queue_delayed_work+0x16e/0x1a0
> [ 51.761470] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 51.767724] [] mod_delayed_work_on+0x5a/0xa0
> [ 51.773719] [] gov_queue_work+0x4d/0xc0
> [ 51.779271] [] od_dbs_timer+0xcb/0x170
> [ 51.784734] [] process_one_work+0x1fd/0x540
> [ 51.790634] [] ? process_one_work+0x192/0x540
> [ 51.796711] [] worker_thread+0x122/0x380
> [ 51.802350] [] ? rescuer_thread+0x320/0x320
> [ 51.808264] [] kthread+0xea/0xf0
> [ 51.813200] [] ? flush_kthread_worker+0x150/0x150
> [ 51.819644] [] ret_from_fork+0x7c/0xb0
> [ 51.918165] nouveau E[ DRM] GPU lockup - switching to software fbcon
> [ 51.930505] [] ? flush_kthread_worker+0x150/0x150
> [ 51.936994] ---[ end trace f419538ada83b5c5 ]---
> [ 51.942915] ------------[ cut here ]------------
> [ 51.942928] ------------[ cut here ]------------
> [ 51.942936] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
> [ 51.942974] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
> [ 51.942978] CPU: 5 PID: 740 Comm: kworker/3:2 Tainted: G W 3.10.0-rc1+ #10
> [ 51.942979] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 51.942985] Workqueue: events od_dbs_timer
> [ 51.942990] 0000000000000009 ffff88043ab0db68 ffffffff8161441c ffff88043ab0dba8
> [ 51.942994] ffffffff8103e540 000000003ab0dbf8 0000000000000003 ffff88043d708000
> [ 51.942998] 00000000ffff0d32 0000000000000003 ffff88044fccfc08 ffff88043ab0dbb8
> [ 51.942999] Call Trace:
> [ 51.943005] [] dump_stack+0x19/0x1b
> [ 51.943010] [] warn_slowpath_common+0x70/0xa0
> [ 51.943014] [] warn_slowpath_null+0x1a/0x20
> [ 51.943017] [] native_smp_send_reschedule+0x58/0x60
> [ 51.943021] [] wake_up_nohz_cpu+0x2d/0xa0
> [ 51.943027] [] add_timer_on+0x8f/0x110
> [ 51.943031] [] __queue_delayed_work+0x16e/0x1a0
> [ 51.943035] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 51.943038] [] mod_delayed_work_on+0x5a/0xa0
> [ 51.943043] [] gov_queue_work+0x4d/0xc0
> [ 51.943046] [] od_dbs_timer+0xcb/0x170
> [ 51.943050] [] process_one_work+0x1fd/0x540
> [ 51.943053] [] ? process_one_work+0x192/0x540
> [ 51.943057] [] worker_thread+0x122/0x380
> [ 51.943060] [] ? rescuer_thread+0x320/0x320
> [ 51.943063] [] kthread+0xea/0xf0
> [ 51.943068] [] ? flush_kthread_worker+0x150/0x150
> [ 51.943071] [] ret_from_fork+0x7c/0xb0
> [ 51.943074] [] ? flush_kthread_worker+0x150/0x150
> [ 51.943076] ---[ end trace f419538ada83b5c6 ]---
> [ 52.178461] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
> [ 52.188097] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
> [ 52.238477] CPU: 0 PID: 85 Comm: kworker/2:1 Tainted: G W 3.10.0-rc1+ #10
> [ 52.247669] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 52.256604] Workqueue: events od_dbs_timer
> [ 52.262219] 0000000000000009 ffff88043b62db68 ffffffff8161441c ffff88043b62dba8
> [ 52.271194] ffffffff8103e540 0000000000000033 0000000000000002 ffff88043d6dc000
> [ 52.280163] 00000000ffff0d32 0000000000000002 ffff88044fc8fc08 ffff88043b62dbb8
> [ 52.289141] Call Trace:
> [ 52.293066] [] dump_stack+0x19/0x1b
> [ 52.299704] [] warn_slowpath_common+0x70/0xa0
> [ 52.307213] [] warn_slowpath_null+0x1a/0x20
> [ 52.314540] [] native_smp_send_reschedule+0x58/0x60
> [ 52.322592] [] wake_up_nohz_cpu+0x2d/0xa0
> [ 52.329763] [] add_timer_on+0x8f/0x110
> [ 52.336660] [] __queue_delayed_work+0x16e/0x1a0
> [ 52.344349] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 52.352031] [] mod_delayed_work_on+0x5a/0xa0
> [ 52.359458] [] gov_queue_work+0x4d/0xc0
> [ 52.366438] [] od_dbs_timer+0xcb/0x170
> [ 52.373338] [] process_one_work+0x1fd/0x540
> [ 52.380670] [] ? process_one_work+0x192/0x540
> [ 52.388176] [] worker_thread+0x122/0x380
> [ 52.395247] [] ? rescuer_thread+0x320/0x320
> [ 52.402588] [] kthread+0xea/0xf0
> [ 52.408954] [] ? flush_kthread_worker+0x150/0x150
> [ 52.416830] [] ret_from_fork+0x7c/0xb0
> [ 52.423722] [] ? flush_kthread_worker+0x150/0x150
> [ 52.431588] ---[ end trace f419538ada83b5c7 ]---
> [ 52.460411] ------------[ cut here ]------------
> [ 52.467744] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
> [ 52.478684] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
> [ 52.533573] CPU: 5 PID: 740 Comm: kworker/3:2 Tainted: G W 3.10.0-rc1+ #10
> [ 52.544303] ------------[ cut here ]------------
> [ 52.544305] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
> [ 52.544317] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
> [ 52.544318] CPU: 0 PID: 71 Comm: kworker/4:1 Tainted: G W 3.10.0-rc1+ #10
> [ 52.544318] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 52.544322] Workqueue: events od_dbs_timer
> [ 52.544324] 0000000000000009 ffff88043c271b68 ffffffff8161441c ffff88043c271ba8
> [ 52.544325] ffffffff8103e540 0000000000000033 0000000000000004 ffff88043d738000
> [ 52.544326] 00000000ffff0dc8 0000000000000004 ffff88044fd0fc08 ffff88043c271bb8
> [ 52.544327] Call Trace:
> [ 52.544330] [] dump_stack+0x19/0x1b
> [ 52.544333] [] warn_slowpath_common+0x70/0xa0
> [ 52.544334] [] warn_slowpath_null+0x1a/0x20
> [ 52.544335] [] native_smp_send_reschedule+0x58/0x60
> [ 52.544337] [] wake_up_nohz_cpu+0x2d/0xa0
> [ 52.544340] [] add_timer_on+0x8f/0x110
> [ 52.544342] [] __queue_delayed_work+0x16e/0x1a0
> [ 52.544343] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 52.544344] [] mod_delayed_work_on+0x5a/0xa0
> [ 52.544346] [] gov_queue_work+0x4d/0xc0
> [ 52.544347] [] od_dbs_timer+0xcb/0x170
> [ 52.544348] [] process_one_work+0x1fd/0x540
> [ 52.544349] [] ? process_one_work+0x192/0x540
> [ 52.544350] [] worker_thread+0x122/0x380
> [ 52.544351] [] ? rescuer_thread+0x320/0x320
> [ 52.544353] [] kthread+0xea/0xf0
> [ 52.544354] [] ? flush_kthread_worker+0x150/0x150
> [ 52.544356] [] ret_from_fork+0x7c/0xb0
> [ 52.544357] [] ? flush_kthread_worker+0x150/0x150
> [ 52.544357] ---[ end trace f419538ada83b5c8 ]---
> [ 52.798038] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 52.806551] Workqueue: events od_dbs_timer
> [ 52.811736] 0000000000000009 ffff88043ab0db68 ffffffff8161441c ffff88043ab0dba8
> [ 52.820284] ffffffff8103e540 0000000000000033 0000000000000003 ffff88043d708000
> [ 52.828828] 00000000ffff0db3 0000000000000003 ffff88044fccfc08 ffff88043ab0dbb8
> [ 52.837372] Call Trace:
> [ 52.840874] [] dump_stack+0x19/0x1b
> [ 52.847090] [] warn_slowpath_common+0x70/0xa0
> [ 52.854176] [] warn_slowpath_null+0x1a/0x20
> [ 52.861086] [] native_smp_send_reschedule+0x58/0x60
> [ 52.868694] [] wake_up_nohz_cpu+0x2d/0xa0
> [ 52.875432] [] add_timer_on+0x8f/0x110
> [ 52.881902] [] __queue_delayed_work+0x16e/0x1a0
> [ 52.889160] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 52.896416] [] mod_delayed_work_on+0x5a/0xa0
> [ 52.903409] [] gov_queue_work+0x4d/0xc0
> [ 52.909966] [] od_dbs_timer+0xcb/0x170
> [ 52.916434] [] process_one_work+0x1fd/0x540
> [ 52.923342] [] ? process_one_work+0x192/0x540
> [ 52.930427] [] worker_thread+0x122/0x380
> [ 52.937074] [] ? rescuer_thread+0x320/0x320
> [ 52.943983] [] kthread+0xea/0xf0
> [ 52.949926] [] ? flush_kthread_worker+0x150/0x150
> [ 52.957370] [] ret_from_fork+0x7c/0xb0
> [ 52.963841] [] ? flush_kthread_worker+0x150/0x150
> [ 52.971275] ---[ end trace f419538ada83b5c9 ]---
> [ 52.976979] nouveau W[ PFIFO][0000:03:00.0] unknown intr 0x00400000, ch 1
> [ 53.092122] ------------[ cut here ]------------
> [ 53.099585] WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x58/0x60()
> [ 53.110571] Modules linked in: ext2 vfat fat loop snd_hda_codec_hdmi usbhid snd_hda_codec_realtek coretemp kvm_intel kvm snd_hda_intel snd_hda_codec crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hwdep snd_pcm aesni_intel sb_edac aes_x86_64 ehci_pci snd_page_alloc glue_helper snd_timer xhci_hcd snd iTCO_wdt iTCO_vendor_support ehci_hcd edac_core lpc_ich acpi_cpufreq lrw gf128mul ablk_helper cryptd mperf usbcore usb_common soundcore mfd_core dcdbas evdev pcspkr processor i2c_i801 button microcode
> [ 53.165267] CPU: 0 PID: 123 Comm: kworker/5:1 Tainted: G W 3.10.0-rc1+ #10
> [ 53.175902] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
> [ 53.186190] Workqueue: events od_dbs_timer
> [ 53.193136] 0000000000000009 ffff88043b277b68 ffffffff8161441c ffff88043b277ba8
> [ 53.203477] ffffffff8103e540 000000003b277bb8 0000000000000005 ffff88043d764000
> [ 53.213727] 00000000ffff0e52 0000000000000005 ffff88044fd4fc08 ffff88043b277bb8
> [ 53.223894] Call Trace:
> [ 53.228887] [] dump_stack+0x19/0x1b
> [ 53.236593] [] warn_slowpath_common+0x70/0xa0
> [ 53.245160] [] warn_slowpath_null+0x1a/0x20
> [ 53.253519] [] native_smp_send_reschedule+0x58/0x60
> [ 53.262582] [] wake_up_nohz_cpu+0x2d/0xa0
> [ 53.270756] [] add_timer_on+0x8f/0x110
> [ 53.278654] [] __queue_delayed_work+0x16e/0x1a0
> [ 53.287335] [] ? try_to_grab_pending+0xd1/0x1a0
> [ 53.296002] [] mod_delayed_work_on+0x5a/0xa0
> [ 53.304412] [] gov_queue_work+0x4d/0xc0
> [ 53.312388] [] od_dbs_timer+0xcb/0x170
> [ 53.320267] [] process_one_work+0x1fd/0x540
> [ 53.328584] [] ? process_one_work+0x192/0x540
> [ 53.337083] [] worker_thread+0x122/0x380
> [ 53.345142] [] ? rescuer_thread+0x320/0x320
> [ 53.353484] [] kthread+0xea/0xf0
> [ 53.360847] [] ? flush_kthread_worker+0x150/0x150
> [ 53.369709] [] ret_from_fork+0x7c/0xb0
> [ 53.377603] [] ? flush_kthread_worker+0x150/0x150
> [ 53.386474] ---[ end trace f419538ada83b5ca ]---
> [ 53.395276] Power down.
> [ 53.399033] acpi_power_off called
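
P.S. To illustrate the idea outside the kernel, below is a minimal userspace sketch of the guard. It is not kernel code: online_mask, send_reschedule(), the two wake_up_*() helpers and run_scenario() are hypothetical stand-ins, and the cpu_online() here is a local toy rather than the kernel macro. It only shows that checking the online state before the kick avoids the warning path when the target CPU is already gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

/* Hypothetical stand-ins for kernel state; none of this is kernel code. */
static unsigned long online_mask = 0x3fUL; /* CPUs 0-5 start online */
static int warnings;  /* IPIs that were aimed at an offline CPU */
static int ipis_sent; /* IPIs that reached an online CPU */

static bool cpu_online(int cpu)
{
	return cpu >= 0 && cpu < NR_CPUS && ((online_mask >> cpu) & 1UL);
}

/* Models the sanity check that fires in native_smp_send_reschedule(). */
static void send_reschedule(int cpu)
{
	if (!cpu_online(cpu)) {
		warnings++;
		fprintf(stderr, "WARNING: resched IPI to offline CPU %d\n", cpu);
		return;
	}
	ipis_sent++;
}

/* Unguarded variant: always try to kick the target CPU. */
static void wake_up_unguarded(int cpu)
{
	send_reschedule(cpu);
}

/* Guarded variant: skip the kick if the CPU is already offline. */
static void wake_up_guarded(int cpu)
{
	if (cpu_online(cpu))
		send_reschedule(cpu);
}

/* Take CPU 3 offline, then try both variants against it. */
void run_scenario(void)
{
	assert(cpu_online(3));
	online_mask &= ~(1UL << 3);

	wake_up_unguarded(3); /* trips the warning */
	wake_up_guarded(3);   /* silently skipped */
	wake_up_guarded(2);   /* CPU 2 is online, so the IPI goes out */
}
```

Calling run_scenario() from a main() should leave warnings at 1 (from the unguarded call only) and ipis_sent at 1.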