Date: Fri, 26 Dec 2014 17:57:44 -0500
From: Dave Jones
To: Linus Torvalds
Cc: Thomas Gleixner, Chris Mason, Mike Galbraith, Ingo Molnar,
	Peter Zijlstra, Dâniel Fraga, Sasha Levin, "Paul E. McKenney",
	Linux Kernel Mailing List, Suresh Siddha, Oleg Nesterov,
	Peter Anvin, John Stultz
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141226225744.GA30955@codemonkey.org.uk>
References: <20141221223204.GA9618@codemonkey.org.uk>
	<20141222225725.GA8140@codemonkey.org.uk>
	<20141224030125.GA8725@codemonkey.org.uk>
	<20141226163410.GA25161@codemonkey.org.uk>
	<20141226181204.GA26527@codemonkey.org.uk>

On Fri, Dec 26, 2014 at 12:57:07PM -0800, Linus Torvalds
wrote:

 > I have a newer version of the patch that gets rid of the false
 > positives with some ordering rules instead, and just for you I hacked
 > it up to say where the problem happens too, but it's likely too late.

hm.

[ 2733.047100] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 2733.047188] Tasks blocked on level-0 rcu_node (CPUs 0-7): P25811
[ 2733.047216] Tasks blocked on level-0 rcu_node (CPUs 0-7): P25811
[ 2733.047242] (detected by 0, t=6502 jiffies, g=52141, c=52140, q=0)
[ 2733.047271] trinity-c406 R running task 13416 25811 24907 0x00000000
[ 2733.047305] ffff88022208fd28 0000000000000002 ffffffffa819f627 ffff8801df2c0000
[ 2733.047341] 00000000001d31c0 0000000000000002 ffff88022208ffd8 00000000001d31c0
[ 2733.047375] ffff8800806e1780 ffff8801df2c0000 ffff88022208fd18 ffff88022208ffd8
[ 2733.047411] Call Trace:
[ 2733.047429] [] ? context_tracking_user_exit+0x67/0x280
[ 2733.047457] [] preempt_schedule_irq+0x52/0xb0
[ 2733.047482] [] retint_kernel+0x20/0x30
[ 2733.047505] [] ? check_kill_permission+0xb1/0x1e0
[ 2733.047531] [] ? check_kill_permission+0x152/0x1e0
[ 2733.047557] [] group_send_sig_info+0x65/0x150
[ 2733.047581] [] ? group_send_sig_info+0x5/0x150
[ 2733.047607] [] ? rcu_read_lock_held+0x6e/0x80
[ 2733.047632] [] kill_pid_info+0x78/0x130
[ 2733.047654] [] ? kill_pid_info+0x5/0x130
[ 2733.047677] [] SYSC_kill+0xf2/0x2f0
[ 2733.047699] [] ? SYSC_kill+0x9b/0x2f0
[ 2733.047721] [] ? trace_hardirqs_on+0xd/0x10
[ 2733.047745] [] ? syscall_trace_enter_phase1+0x125/0x1a0
[ 2733.048607] [] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 2733.049469] [] SyS_kill+0xe/0x10
[ 2733.050332] [] system_call_fastpath+0x12/0x17
[ 2733.051197] trinity-c406 R running task 13416 25811 24907 0x00000000
[ 2733.052064] ffff88022208fd28 0000000000000002 ffffffffa819f627 ffff8801df2c0000
[ 2733.052932] 00000000001d31c0 0000000000000002 ffff88022208ffd8 00000000001d31c0
[ 2733.053792] ffff880209e2c680 ffff8801df2c0000 ffff88022208fd18 ffff88022208ffd8
[ 2733.054651] Call Trace:
[ 2733.055500] [] ? context_tracking_user_exit+0x67/0x280
[ 2733.056362] [] preempt_schedule_irq+0x52/0xb0
[ 2733.057222] [] retint_kernel+0x20/0x30
[ 2733.058076] [] ? check_kill_permission+0xb1/0x1e0
[ 2733.058930] [] ? check_kill_permission+0x152/0x1e0
[ 2733.059778] [] group_send_sig_info+0x65/0x150
[ 2733.060624] [] ? group_send_sig_info+0x5/0x150
[ 2733.061472] [] ? rcu_read_lock_held+0x6e/0x80
[ 2733.062322] [] kill_pid_info+0x78/0x130
[ 2733.063168] [] ? kill_pid_info+0x5/0x130
[ 2733.064015] [] SYSC_kill+0xf2/0x2f0
[ 2733.064863] [] ? SYSC_kill+0x9b/0x2f0
[ 2733.065704] [] ? trace_hardirqs_on+0xd/0x10
[ 2733.066541] [] ? syscall_trace_enter_phase1+0x125/0x1a0
[ 2733.067384] [] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 2733.068217] [] SyS_kill+0xe/0x10
[ 2733.069045] [] system_call_fastpath+0x12/0x17
[ 3708.217920] perf interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 4583.530580] request_module: runaway loop modprobe personality-87

still running though..

	Dave