Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752277AbbKBJsL (ORCPT ); Mon, 2 Nov 2015 04:48:11 -0500 Received: from mail-wm0-f68.google.com ([74.125.82.68]:33442 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751014AbbKBJsF (ORCPT ); Mon, 2 Nov 2015 04:48:05 -0500 X-Greylist: delayed 908 seconds by postgrey-1.27 at vger.kernel.org; Mon, 02 Nov 2015 04:48:04 EST MIME-Version: 1.0 In-Reply-To: <87611kakr3.fsf@yhuang-dev.intel.com> References: <87611kakr3.fsf@yhuang-dev.intel.com> Date: Mon, 2 Nov 2015 12:32:55 +0300 Message-ID: Subject: Re: [lkp] [fs] df4c0e36f1: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] From: Andrey Ryabinin To: kernel test robot Cc: LKP , LKML , Andrew Morton , David Rientjes , Pekka Enberg , "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , Andi Kleen , Dave Hansen , Joonsoo Kim , Christoph Lameter , Sasha Levin , Konstantin Khlebnikov , Yuri Gribov , Andrey Konovalov , Dmitry Chernenkov , Konstantin Serebryany , Dmitry Vyukov , Linus Torvalds Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6396 Lines: 111 2015-11-02 11:06 GMT+03:00 kernel test robot : > FYI, we noticed the below changes on > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master > commit df4c0e36f1b1782b0611a77c52cc240e5c4752dd ("fs: dcache: manually unpoison dname after allocation to shut up kasan's reports") > > > The commit fixed a KASan bug, but introduced or revealed a soft lockup > issue as follow. > I think that this issue is caused by several factors. 1. Non-preempt kernel + heavy debug options (CONFIG_PREEMPT=n CONFIG_GCOV_PROFILE_ALL=y CONFIG_KASAN=y) 2. I suspect that your host is overloaded, thus KVM guest runs too slow. And the major factor here is number 2. In your dmesg: [ 67.891156] rbtree testing -> 570841 cycles [ 88.609636] augmented rbtree testing [ 116.546697] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] I've tried to reproduce this, and got this: [ 0.693574] rbtree testing -> 15513 cycles 570841/15513 = 36x times faster. [ 1.159450] augmented rbtree testing -> 23675 cycles [ 1.864996] It took less than a second, meanwhile in your case it didn't finish in 22 seconds. This makes me think that your host is overloaded and the problem is on your side. > > [ 116.546697] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] > [ 116.546697] irq event stamp: 3018750 > [ 116.546697] hardirqs last enabled at (3018749): [] restore_args+0x0/0x30 > [ 116.546697] hardirqs last disabled at (3018750): [] apic_timer_interrupt+0x6d/0x80 > [ 116.546697] softirqs last enabled at (3018748): [] __do_softirq+0x417/0x65b > [ 116.546697] softirqs last disabled at (3018743): [] irq_exit+0x55/0xc3 > [ 116.546697] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-05249-gdf4c0e3 #1 > [ 116.546697] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 > [ 116.546697] task: ffff88002f3f0000 ti: ffff88002f3f8000 task.ti: ffff88002f3f8000 > [ 116.546697] RIP: 0010:[] [] __asan_load4+0x12/0x91 > [ 116.546697] RSP: 0000:ffff88002f3fbdc8 EFLAGS: 00000282 > [ 116.546697] RAX: ffff7fffffffffff RBX: 0000000000000000 RCX: dffffc0000000000 > [ 116.546697] RDX: fffffbfff0763087 RSI: ffffffff83b18b10 RDI: ffffffff83b1803c > [ 116.546697] RBP: ffff88002f3fbdf8 R08: 0000000000000001 R09: fffffbfff0763074 > [ 116.546697] R10: ffffffff83b183a7 R11: fffffbfff0763074 R12: ffff880038d29003 > [ 116.546697] R13: ffffed00071a5200 R14: ffffffff818cfadb R15: ffff88002f3fbdf8 > [ 116.546697] FS: 0000000000000000(0000) GS:ffff88002f600000(0000) knlGS:0000000000000000 > [ 116.546697] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 116.546697] CR2: 0000000000000000 CR3: 0000000002793000 CR4: 00000000000006f0 > [ 116.546697] Stack: > [ 116.546697] ffffffff83b18d5c ffffffff83b18b08 ffffffff83b18878 ffffffff83b18b10 > [ 116.546697] ffffffff83b18868 0000000000000000 ffff88002f3fbe48 ffffffff8223c4d3 > [ 116.546697] 0000000000000000 c4400d6283b18d40 0000000000000202 0000000000000d98 > [ 116.546697] Call Trace: > [ 116.546697] [] insert_augmented+0x52/0xf5 > [ 116.546697] [] rbtree_test_init+0x151/0x205 > [ 116.546697] [] ? glob_init+0x11c/0x11c > [ 116.546697] [] do_one_initcall+0x105/0x197 > [ 116.546697] [] kernel_init_freeable+0x193/0x22e > [ 116.546697] [] ? rest_init+0x14d/0x14d > [ 116.546697] [] kernel_init+0xe/0xdf > [ 116.546697] [] ret_from_fork+0x7c/0xb0 > [ 116.546697] [] ? rest_init+0x14d/0x14d > [ 116.546697] Code: 38 00 74 13 48 8b 4d 08 ba 01 00 00 00 be 02 00 00 00 e8 e0 09 00 00 c9 c3 55 48 b8 ff ff ff ff ff 7f ff ff 48 89 e5 48 83 ec 30 <48> 39 c7 77 23 48 8b 45 08 48 89 7d d8 48 8d 7d d8 48 c7 45 e8 > [ 116.546697] Kernel panic - not syncing: softlockup: hung tasks > [ 116.546697] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G L 3.19.0-05249-gdf4c0e3 #1 > [ 116.546697] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 > [ 116.546697] 0000000000000000 0000000000000000 0000000000000000 ffff88002f603d48 > [ 116.546697] ffffffff8223bcf7 fffffbfff074360b ffffffff8263d4e2 ffff88002f603dc8 > [ 116.546697] ffffffff82234300 ffff88002f603dc8 ffffffff00000008 ffff88002f603dd8 > [ 116.546697] Call Trace: > [ 116.546697] [] dump_stack+0x4c/0x65 > [ 116.546697] [] panic+0xed/0x231 > [ 116.546697] [] watchdog_timer_fn+0x1d2/0x1f8 > [ 116.546697] [] ? __touch_watchdog+0x1c/0x1c > [ 116.546697] [] __run_hrtimer+0x283/0x6a2 > [ 116.546697] [] hrtimer_run_queues+0x126/0x145 > [ 116.546697] [] run_local_timers+0xe/0x1a > [ 116.546697] [] update_process_times+0x27/0x54 > [ 116.546697] [] tick_nohz_handler+0x101/0x164 > [ 116.546697] [] local_apic_timer_interrupt+0x78/0x80 > [ 116.546697] [] smp_apic_timer_interrupt+0x57/0x69 > [ 116.546697] [] apic_timer_interrupt+0x72/0x80 > [ 116.546697] [] ? __asan_load4+0x12/0x91 > [ 116.546697] [] insert_augmented+0x52/0xf5 > [ 116.546697] [] rbtree_test_init+0x151/0x205 > [ 116.546697] [] ? glob_init+0x11c/0x11c > [ 116.546697] [] do_one_initcall+0x105/0x197 > [ 116.546697] [] kernel_init_freeable+0x193/0x22e > [ 116.546697] [] ? rest_init+0x14d/0x14d > [ 116.546697] [] kernel_init+0xe/0xdf > [ 116.546697] [] ret_from_fork+0x7c/0xb0 > [ 116.546697] [] ? rest_init+0x14d/0x14d > [ 116.546697] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) > > > Thanks, > Ying Huang -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/