From: Vince Weaver
Date: Mon, 23 Feb 2015 22:56:10 -0500 (EST)
To: Vince Weaver
cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa
Subject: Re: perf: fuzzer causes lockup in x86_pmu_event_init()

On Tue, 17 Feb 2015, Vince Weaver wrote:

> This is on a Haswell machine, current git as of this past Friday.
>
> I let the perf_fuzzer run and it took 4 days to find this.
> Sadly it doesn't seem to be reproducible so I am not sure
> how it exactly got into this state.

I have hit this on another machine, my core2 machine (after 10 days of
fuzzing). So this seems to be a real issue, although hard to hit.

The problem seems to map to
	arch/x86/kernel/cpu/perf_event.c:824

It is stuck forever in this loop in collect_events():

	list_for_each_entry(event, &leader->sibling_list, group_entry) {
		if (!is_x86_event(event) ||
		    event->state <= PERF_EVENT_STATE_OFF)
			continue;

		if (n >= max_count)
			return -EINVAL;

		cpuc->event_list[n] = event;
		n++;
	}

[884044.228001] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:17282]
[884044.228001] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg mcs7830 usbnet evdev video pcspkr acpi_cpufreq coretemp psmouse serio_raw processor thermal_sys ohci_pci ohci_hcd i2c_nforce2 wmi button sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[884044.228001] CPU: 1 PID: 17282 Comm: perf_fuzzer Tainted: G W 3.19.0+ #201
[884044.228001] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[884044.228001] task: ffff88003dca4980 ti: ffff8801049dc000 task.ti: ffff8801049dc000
[884044.228001] RIP: 0010:[] [] x86_pmu_event_init+0x138/0x31d
[884044.228001] RSP: 0018:ffff8801049dfd98 EFLAGS: 00000286
[884044.228001] RAX: ffff880042cd2000 RBX: ffff88003d11c000 RCX: 0000000000000005
[884044.228001] RDX: 0000000000000001 RSI: ffff880042cd2010 RDI: ffffffff810135c1
[884044.228001] RBP: ffff8801049dfdc8 R08: 00000000000080d0 R09: 0000000000000000
[884044.228001] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000286
[884044.228001] R13: 0000000000008000 R14: ffff88011f000700 R15: 0000000000000000
[884044.228001] FS: 00007faf3205f700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[884044.228001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[884044.228001] CR2: 0000000005463038 CR3: 0000000106371000 CR4: 00000000000407e0
[884044.228001] DR0: 00000000020a4000 DR1: 0000000001e96000 DR2: 0000000001e96000
[884044.228001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[884044.228001] Stack:
[884044.228001] 0000000000000002 ffffffff81a1b270 ffff88003e96e000 0000000000000000
[884044.228001] ffffffff81a3b7a0 ffff88003e96e000 ffff8801049dfde8 ffffffff810cec1b
[884044.228001] ffffffff81a1b270 ffff88003e96e000 ffff8801049dfe28 ffffffff810d488d
[884044.228001] Call Trace:
[884044.228001] [] perf_try_init_event+0x25/0x47
[884044.228001] [] perf_init_event+0x93/0xca
[884044.228001] [] perf_event_alloc+0x29b/0x32d
[884044.228001] [] SYSC_perf_event_open+0x417/0x89c
[884044.228001] [] SyS_perf_event_open+0x9/0xb
[884044.228001] [] system_call_fastpath+0x16/0x1b
[884044.228001] Code: a1 81 8b 90 14 02 00 00 75 15 39 ca 0f 8d e7 01 00 00 48 63 c2 ff c2 4d 89 bc c5 20 05 00 00 49 8b 47 20 49 83 c7 20 48 83 e8 10 <48> 8d 70 10 4c 39 fe 74 2f 48 81 78 70 70 b2 a1 81 75 1b 83 78
[884044.228001] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:17282]
[884072.228001] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [perf_fuzzer:17282]
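
For readers unfamiliar with why a list_for_each_entry() walk like the one in collect_events() can spin forever: the macro only stops when the cursor gets back to the list head it started from. The report above does not identify the root cause, so what follows is only an illustrative userspace sketch of one way this could happen, under the assumption that a sibling entry ends up linked onto a second list without being unlinked from the first. The struct and helpers are simplified stand-ins for the kernel's list API, and the `limit` bound and all names here are hypothetical additions for the demo.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link `entry` just before `head`, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Walk the list with the same termination rule as list_for_each_entry():
 * stop only when the cursor returns to `head`. The `limit` argument is a
 * hypothetical safety bound for this demo; the kernel macro has no such
 * bound, which is why a broken list shows up as a soft lockup.
 * Returns the number of entries visited, or -1 if the bound was hit.
 */
static int bounded_walk(const struct list_head *head, int limit)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (n >= limit)
			return -1;
		n++;
	}
	return n;
}

/*
 * Assumed scenario (not a diagnosis of the reported bug): entry e2 sits on
 * leader_a's list and is then added to leader_b's list without first being
 * removed. Returns the result of walking leader_a afterwards.
 */
static int demo_corrupted_sibling_list(void)
{
	struct list_head leader_a, leader_b, e1, e2;

	list_init(&leader_a);
	list_init(&leader_b);
	list_add_tail(&e1, &leader_a);
	list_add_tail(&e2, &leader_a);
	assert(bounded_walk(&leader_a, 1000) == 2);	/* healthy list */

	/* e2.next now points at leader_b, so the walk never sees leader_a. */
	list_add_tail(&e2, &leader_b);

	return bounded_walk(&leader_a, 1000);
}
```

In this sketch the walk from leader_a goes e1, e2, leader_b, e2, leader_b, and so on, so the `pos != head` test never fails and only the demo's bound ends the loop. Whether anything like this is what the fuzzer triggered is an open question.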