Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751787AbbBQQOg (ORCPT ); Tue, 17 Feb 2015 11:14:36 -0500 Received: from mail-qc0-f173.google.com ([209.85.216.173]:63323 "EHLO mail-qc0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750800AbbBQQOe (ORCPT ); Tue, 17 Feb 2015 11:14:34 -0500 From: Vince Weaver X-Google-Original-From: Vince Weaver Date: Tue, 17 Feb 2015 11:17:55 -0500 (EST) To: linux-kernel@vger.kernel.org cc: Peter Zijlstra , Paul Mackerras , Ingo Molnar , Arnaldo Carvalho de Melo , Jiri Olsa Subject: perf: fuzzer causes lockup in x86_pmu_event_init() Message-ID: User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7711 Lines: 103 This is on a Haswell machine, current git as of this past Friday. I let the perf_fuzzer run and it took 4 days to find this. Sadly it doesn't seem to be reproducible so I am not sure how it exactly got into this state. It got stuck and kept printing the following message until I rebooted the system. [328363.704011] INFO: rcu_sched self-detected stall on CPU { 2} (t=5251 jiffies g=28607531 c=28607530 q=232) [328363.715076] Task dump for CPU 2: [328363.719278] perf_fuzzer R running task 0 20935 2309 0x00000008 [328363.727653] 0000000000000905 ffff88011ea83d68 ffffffff8109a126 ffffffff8109a095 [328363.736466] 0000000000000007 0000000000000002 ffffffff81c50780 ffff88011ea83d88 [328363.745256] ffffffff8109d9fd ffff88011ea95b00 0000000000000003 ffff88011ea83db8 [328363.754037] Call Trace: [328363.757382] [] sched_show_task+0xf6/0x160 [328363.764853] [] ? sched_show_task+0x65/0x160 [328363.771769] [] dump_cpu_task+0x3d/0x50 [328363.778205] [] rcu_dump_cpu_stacks+0x91/0xd0 [328363.785181] [] rcu_check_callbacks+0x48b/0x760 [328363.792345] [] update_process_times+0x39/0x60 [328363.799431] [] tick_sched_handle.isra.18+0x25/0x60 [328363.806923] [] tick_sched_timer+0x44/0x80 [328363.813594] [] __run_hrtimer+0x94/0x1f0 [328363.820090] [] ? hrtimer_interrupt+0x83/0x220 [328363.827166] [] ? tick_sched_do_timer+0x40/0x40 [328363.834322] [] hrtimer_interrupt+0xfb/0x220 [328363.841188] [] local_apic_timer_interrupt+0x3c/0x70 [328363.848802] [] smp_apic_timer_interrupt+0x41/0x60 [328363.856197] [] apic_timer_interrupt+0x6d/0x80 [328363.863238] [] ? __raw_spin_lock_init+0x36/0x60 [328363.871175] [] ? collect_events+0x55/0xb0 [328363.877844] [] x86_pmu_event_init+0xfe/0x3c0 [328363.884817] [] perf_try_init_event+0x33/0x70 [328363.891797] [] perf_init_event+0x13f/0x170 [328363.898586] [] ? perf_init_event+0x5/0x170 [328363.905369] [] perf_event_alloc+0x398/0x450 [328363.912237] [] SYSC_perf_event_open+0x373/0xc80 [328363.919502] [] SyS_perf_event_open+0xe/0x10 [328363.926387] [] system_call_fastpath+0x16/0x1b [328363.933452] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 0, t=5308 jiffies, g=28607531, c=28607530, q=232) [328363.946389] Task dump for CPU 2: [328363.950495] perf_fuzzer R running task 0 20935 2309 0x00000008 [328363.958750] 0000000000000246 ffff88011edf8f18 0000000000000141 ffff8800947d0990 [328363.967451] ffff880036e36000 ffff88011a000640 000001ff00000020 0000000000000000 [328363.976158] ffff88011a000640 ffff880036e36000 0000000080002800 ffffffff81032ead [328363.984776] Call Trace: [328363.988059] [] ? allocate_shared_regs+0x3d/0x90 [328363.995306] [] ? cache_alloc_debugcheck_after.isra.51+0x1fb/0x280 [328364.004275] [] ? lockdep_init_map+0x5d/0x640 [328364.011286] [] ? allocate_fake_cpuc+0x32/0x90 [328364.018369] [] ? cache_alloc_debugcheck_after.isra.51+0x1fb/0x280 [328364.027342] [] ? __raw_spin_lock_init+0x36/0x60 [328364.034626] [] ? collect_events+0x55/0xb0 [328364.041342] [] ? x86_pmu_event_init+0xfe/0x3c0 [328364.048514] [] ? perf_try_init_event+0x33/0x70 [328364.055727] [] ? perf_init_event+0x13f/0x170 [328364.062731] [] ? perf_init_event+0x5/0x170 [328364.069552] [] ? perf_event_alloc+0x398/0x450 [328364.076619] [] ? SYSC_perf_event_open+0x373/0xc80 [328364.084124] [] ? SyS_perf_event_open+0xe/0x10 [328364.091184] [] ? system_call_fastpath+0x16/0x1b [328390.679373] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [perf_fuzzer:20935] [328390.688418] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_controller ghash_clmulni_intel snd_hda_codec aesni_intel aes_x86_64 snd_hwdep lrw ppdev i915 snd_pcm gf128mul iTCO_wdt evdev iTCO_vendor_support drm_kms_helper glue_helper psmouse drm ablk_helper cryptd snd_timer serio_raw mei_me parport_pc tpm_tis snd lpc_ich tpm soundcore i2c_i801 xhci_pci battery mei i2c_algo_bit parport pcspkr xhci_hcd mfd_core wmi button processor video sg sr_mod cdrom sd_mod ahci e1000e ehci_pci libahci ptp ehci_hcd libata usbcore crc32c_intel scsi_mod usb_common pps_core thermal fan thermal_sys [328390.762307] CPU: 2 PID: 20935 Comm: perf_fuzzer Tainted: G W 3.19.0+ #127 [328390.771527] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014 [328390.780237] task: ffff8800947d0990 ti: ffff8800946c4000 task.ti: ffff8800946c4000 [328390.789004] RIP: 0010:[] [] collect_events+0x4d/0xb0 [328390.798577] RSP: 0018:ffff8800946c7d38 EFLAGS: 00000202 [328390.805033] RAX: ffff8800ce837000 RBX: ffffffff810bad96 RCX: 0000000000000001 [328390.813454] RDX: ffff8800ce837010 RSI: ffff8801193cb020 RDI: ffff8800958b3000 [328390.821845] RBP: ffff8800946c7d38 R08: 0000000000000007 R09: ffffffff828fb310 [328390.830256] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81c220c0 [328390.838661] R13: ffff8801193cb000 R14: ffffffff81e7f181 R15: ffffffff81a24c53 [328390.847094] FS: 00007fb692472700(0000) GS:ffff88011ea80000(0000) knlGS:0000000000000000 [328390.856548] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [328390.863438] CR2: 0000000001e1d038 CR3: 0000000094724000 CR4: 00000000001407e0 [328390.871839] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [328390.880251] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 [328390.888646] Stack: [328390.891588] ffff8800946c7d88 ffffffff8102a6be ffff8800946c7db8 0000000000000292 [328390.900379] ffff8800947d0990 ffffffff81c220c0 ffff8800a0274000 0000000000000000 [328390.909169] 0000000000000000 0000000000000000 ffff8800946c7db8 ffffffff811559d3 [328390.917922] Call Trace: [328390.921321] [] x86_pmu_event_init+0xfe/0x3c0 [328390.928405] [] perf_try_init_event+0x33/0x70 [328390.935479] [] perf_init_event+0x13f/0x170 [328390.942426] [] ? perf_init_event+0x5/0x170 [328390.949360] [] perf_event_alloc+0x398/0x450 [328390.956398] [] SYSC_perf_event_open+0x373/0xc80 [328390.963774] [] SyS_perf_event_open+0xe/0x10 [328390.970773] [] system_call_fastpath+0x16/0x1b [328390.977968] Code: 84 d2 89 c8 74 5a 48 8b 56 20 48 83 c6 20 48 39 d6 48 8d 42 f0 75 14 eb 45 0f 1f 44 00 00 48 8b 50 10 48 39 d6 48 8d 42 f0 74 33 <48> 81 78 70 c0 20 c2 81 75 e9 8b 50 78 85 d2 78 e2 41 39 c8 7e [328390.679373] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [perf_fuzzer:20935] [328418.691171] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [perf_fuzzer:20935] ... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/