Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756039AbaKSO7t (ORCPT ); Wed, 19 Nov 2014 09:59:49 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49665 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755047AbaKSO7q (ORCPT ); Wed, 19 Nov 2014 09:59:46 -0500 Date: Wed, 19 Nov 2014 09:59:02 -0500 From: Dave Jones To: Linus Torvalds Cc: Don Zickus , Thomas Gleixner , Linux Kernel , the arch/x86 maintainers , Peter Zijlstra Subject: Re: frequent lockups in 3.18rc4 Message-ID: <20141119145902.GA13387@redhat.com> Mail-Followup-To: Dave Jones , Linus Torvalds , Don Zickus , Thomas Gleixner , Linux Kernel , the arch/x86 maintainers , Peter Zijlstra References: <20141118020959.GA2091@redhat.com> <20141118023930.GA2871@redhat.com> <20141118145234.GA7487@redhat.com> <20141118215540.GD35311@redhat.com> <20141119021902.GA14216@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Nov 18, 2014 at 08:40:55PM -0800, Linus Torvalds wrote: > On Tue, Nov 18, 2014 at 6:19 PM, Dave Jones wrote: > > > > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 21s! [trinity-c42:31480] > > CPU: 2 PID: 31480 Comm: trinity-c42 Not tainted 3.18.0-rc5+ #91 [loadavg: 174.61 150.35 148.64 9/411 32140] > > RIP: 0010:[] [] context_tracking_user_enter+0xa4/0x190 > > Call Trace: > > [] syscall_trace_leave+0xa5/0x160 > > [] int_check_syscall_exit_work+0x34/0x3d > > Hmm, if we are getting soft-lockups here, maybe it suggest too much exit-work. > > Some TIF_NOHZ loop, perhaps? You have nohz on, don't you? > > That makes me wonder: does the problem go away if you disable NOHZ? Aparently not. NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [trinity-c75:25175] CPU: 3 PID: 25175 Comm: trinity-c75 Not tainted 3.18.0-rc5+ #92 [loadavg: 168.72 151.72 150.38 9/410 27945] task: ffff8800364e44d0 ti: ffff880192d2c000 task.ti: ffff880192d2c000 RIP: 0010:[] [] context_tracking_user_exit+0x57/0x120 RSP: 0018:ffff880192d2fee8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000100000046 RCX: 000000336ee35b47 RDX: 0000000000000001 RSI: ffffffff94ac1e84 RDI: ffffffff94a93725 RBP: ffff880192d2fef8 R08: 00007f9b74d0b740 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff940d8503 R13: ffff880192d2fe98 R14: ffffffff943884e7 R15: ffff880192d2fe48 FS: 00007f9b74d0b740(0000) GS:ffff880244600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000336f1b7740 CR3: 0000000229a95000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff880192d30000 0000000000080000 ffff880192d2ff78 ffffffff94012c25 00007f9b747a5000 00007f9b747a5068 0000000000000000 0000000000000000 0000000000000000 ffffffff9437b3be 0000000000000000 0000000000000000 Call Trace: [] syscall_trace_enter_phase1+0x125/0x1a0 [] ? trace_hardirqs_on_thunk+0x3a/0x3f [] tracesys+0x14/0x4a Code: 42 fd ff 48 c7 c7 7a 1e ac 94 e8 25 29 21 00 65 8b 04 25 34 f7 1c 00 83 f8 01 74 28 f6 c7 02 74 13 0f 1f 00 e8 bb 43 fd ff 53 9d <5b> 41 5c 5d c3 0f 1f 40 00 53 9d e8 89 42 fd ff eb ee 0f 1f 80 sending NMI to other CPUs: NMI backtrace for cpu 1 CPU: 1 PID: 25164 Comm: trinity-c64 Not tainted 3.18.0-rc5+ #92 [loadavg: 168.72 151.72 150.38 9/410 27945] task: ffff88011600dbc0 ti: ffff8801a99a4000 task.ti: ffff8801a99a4000 RIP: 0010:[] [] generic_exec_single+0xee/0x1a0 RSP: 0018:ffff8801a99a7d18 EFLAGS: 00000202 RAX: 0000000000000000 RBX: ffff8801a99a7d20 RCX: 0000000000000038 RDX: 00000000000000ff RSI: 0000000000000008 RDI: 0000000000000000 RBP: ffff8801a99a7d78 R08: ffff880242b57ce0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003 R13: 0000000000000001 R14: ffff880083c28948 R15: ffffffff94166aa0 FS: 00007f9b74d0b740(0000) GS:ffff880244200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000001 CR3: 00000001d8611000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff8801a99a7d28 0000000000000000 ffffffff94166aa0 ffff880083c28948 0000000000000003 00000000e38f9aac ffff880083c28948 00000000ffffffff 0000000000000003 ffffffff94166aa0 ffff880083c28948 0000000000000001 Call Trace: [] ? perf_swevent_add+0x120/0x120 [] ? perf_swevent_add+0x120/0x120 [] smp_call_function_single+0x6a/0xe0 [] ? preempt_count_sub+0x7b/0x100 [] perf_event_read+0xca/0xd0 [] perf_event_read_value+0x90/0xe0 [] perf_read+0x226/0x370 [] ? security_file_permission+0x87/0xa0 [] vfs_read+0x9f/0x180 [] SyS_read+0x58/0xd0 [] tracesys_phase2+0xd4/0xd9 Code: 48 89 de 48 03 14 c5 20 65 d1 94 48 89 df e8 8a 4b 28 00 84 c0 75 46 45 85 ed 74 11 f6 43 18 01 74 0b 0f 1f 00 f3 90 f6 43 18 01 <75> f8 31 c0 48 8b 4d c8 65 48 33 0c 25 28 00 00 00 0f 85 8e 00 NMI backtrace for cpu 0 INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 35.055 msecs CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc5+ #92 [loadavg: 168.72 151.72 150.38 11/410 27945] task: ffffffff94c164c0 ti: ffffffff94c00000 task.ti: ffffffff94c00000 RIP: 0010:[] [] intel_idle+0xd5/0x180 RSP: 0018:ffffffff94c03e28 EFLAGS: 00000046 RAX: 0000000000000032 RBX: 0000000000000010 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff94c03fd8 RDI: 0000000000000000 RBP: ffffffff94c03e58 R08: 000000008baf8b86 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005 R13: 0000000000000032 R14: 0000000000000004 R15: ffffffff94c00000 FS: 0000000000000000(0000) GS:ffff880244000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f162e060000 CR3: 0000000014c11000 CR4: 00000000001407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: 0000000094c03e58 5955c5b31ad5e8cf ffffe8fffee031a8 0000000000000005 ffffffff94ca9dc0 0000000000000000 ffffffff94c03ea8 ffffffff94661f05 00001cb7dcf6fd93 ffffffff94ca9f90 ffffffff94c00000 ffffffff94d18870 Call[31557.908912] NMI backtrace for cpu 2 INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 68.178 msecs CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.18.0-rc5+ #92 [loadavg: 168.72 151.72 150.38 10/410 27945] task: ffff880242b596f0 ti: ffff880242b6c000 task.ti: ffff880242b6c000 RIP: 0010:[] [] intel_idle+0xd5/0x180 RSP: 0018:ffff880242b6fdf8 EFLAGS: 00000046 RAX: 0000000000000032 RBX: 0000000000000010 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffff880242b6ffd8 RDI: 0000000000000002 RBP: ffff880242b6fe28 R08: 000000008baf8b86 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005 R13: 0000000000000032 R14: 0000000000000004 R15: ffff880242b6c000 FS: 0000000000000000(0000) GS:ffff880244400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000014c11000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: 0000000242b6fe28 da97aa9b9f42090a ffffe8ffff2031a8 0000000000000005 ffffffff94ca9dc0 0000000000000002 ffff880242b6fe78 ffffffff94661f05 00001cb7dcdd1af6 ffffffff94ca9f90 ffff880242b6c000 ffffffff94d18870 Call Trace: [] cpuidle_enter_state+0x55/0x1c0 [] cpuidle_enter+0x17/0x20 [] cpu_startup_entry+0x423/0x4d0 [] start_secondary+0x1a3/0x220 Code: 31 d2 65 48 8b 34 25 08 ba 00 00 48 8d 86 38 c0 ff ff 48 89 d1 0f 01 c8 48 8b 86 38 c0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <65> 48 8b 0c 25 08 ba 00 00 f0 80 a1 3a c0 ff ff df 0f ae f0 48 INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 95.994 msecs Also, today I learned we can reach the perf_event_read code from read(). Given I had /proc/sys/kernel/perf_event_paranoid set to 1, I'm not sure how this is even possible. The only user of perf_fops is perf_event_open syscall _after_ it's checked that sysctl. Oh, there's an ioctl path to perf too. Though trinity doesn't know anything about it, so I find it surprising if it managed to pull the right combination of entropy to make that do the right thing. Still, that ioctl path probably needs to also be checking that sysctl shouldn't it ? Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/