Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755443AbbHCVh3 (ORCPT ); Mon, 3 Aug 2015 17:37:29 -0400 Received: from e38.co.us.ibm.com ([32.97.110.159]:56226 "EHLO e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754875AbbHCVh2 (ORCPT ); Mon, 3 Aug 2015 17:37:28 -0400 X-Helo: d03dlp01.boulder.ibm.com X-MailFrom: paulmck@linux.vnet.ibm.com X-RcptTo: linux-kernel@vger.kernel.org Date: Mon, 3 Aug 2015 14:37:23 -0700 From: "Paul E. McKenney" To: Dave Jones , Linux Kernel , Josh Triplett Cc: sasha.levin@oracle.com, fweisbec@gmail.com Subject: Re: 4.2-rc5 rcu stalls. Message-ID: <20150803213723.GN27280@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20150803210835.GA4467@codemonkey.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150803210835.GA4467@codemonkey.org.uk> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15080321-0029-0000-0000-00000BB05D83 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 10548 Lines: 167 On Mon, Aug 03, 2015 at 05:08:35PM -0400, Dave Jones wrote: > Content analysis details: (-2.9 points, 5.0 required) > > pts rule name description > ---- ---------------------- -------------------------------------------------- > -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% > [score: 0.0000] > X-Authenticated-User: davej@codemonkey.org.uk > X-ZLA-Header: unknown; 0 > X-ZLA-DetailInfo: BA=6.00003574; NDR=6.00000001; ZLA=6.00000002; ZF=6.00000004; ZB=6.00041153; ZH=6.00102541; ZP=6.00079473; ZU=6.00000001; UDB=6.00244655; UTC=2015-08-03 21:08:45 > x-cbid: 15080321-8236-0000-0000-00000DB1C0C1 > X-IBM-ISS-SpamDetectors: Score=0.405233; BY=0; FL=0; FP=0; FZ=0; HX=0; KW=0; > PH=0; RB=0; SC=0.405233; ST=0; TS=0; UL=0; ISC= > X-IBM-ISS-DetailInfo: BY=3.00004248; HX=3.00000236; KW=3.00000007; > PH=3.00000003; SC=3.00000115; SDB=6.00568899; UDB=6.00244655; UTC=2015-08-03 > 21:08:47 > X-TM-AS-MML: disable > > [ 2120.854974] INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 2120.855128] Tasks blocked on level-0 rcu_node (CPUs 0-3): P1497 > [ 2120.855263] (detected by 2, t=65002 jiffies, g=78835, c=78834, q=0) > [ 2120.855403] trinity-watchdo R running task 14336 1497 1496 0x00080000 > [ 2120.855563] ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8804f58b51c0 > [ 2120.855728] ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000 > [ 2120.855893] 0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8 > [ 2120.856062] Call Trace: > [ 2120.856116] [] ? preempt_schedule_irq+0x40/0xa0 > [ 2120.856252] [] preempt_schedule_irq+0x46/0xa0 Sasha Levin (CCed) was having roughly similar stalls, and found that reverting b30f0e3ffedf (sched/preempt: Optimize preemption operations on __schedule() callers) made the stalls go away. Does that help in your case? Thanx, Paul > [ 2120.856387] [] retint_kernel+0x1b/0x2d > [ 2120.856507] [] ? get_parent_ip+0x11/0x50 > [ 2120.856627] [] ? get_parent_ip+0x11/0x50 > [ 2120.856752] [] ? lock_release+0x33e/0x700 > [ 2120.856878] [] ? group_send_sig_info+0x79/0x120 > [ 2120.857015] [] ? get_parent_ip+0x11/0x50 > [ 2120.857137] [] ? kill_pid_info+0x9b/0x150 > [ 2120.857259] [] kill_pid_info+0xb8/0x150 > [ 2120.857380] [] ? kill_pid_info+0x20/0x150 > [ 2120.857508] [] SYSC_kill+0xf4/0x2b0 > [ 2120.857620] [] ? SYSC_kill+0x9d/0x2b0 > [ 2120.857736] [] ? trace_hardirqs_on_caller+0x14b/0x1e0 > [ 2120.857882] [] ? trace_hardirqs_on+0xd/0x10 > [ 2120.858012] [] ? context_tracking_user_exit+0x13/0x20 > [ 2120.858161] [] ? syscall_trace_enter_phase1+0xf7/0x150 > [ 2120.858314] [] ? trace_hardirqs_on_thunk+0x17/0x19 > [ 2120.858456] [] SyS_kill+0xe/0x10 > [ 2120.858563] [] entry_SYSCALL_64_fastpath+0x12/0x6f > [ 2120.858706] trinity-watchdo R running task 14336 1497 1496 0x00080000 > [ 2120.858863] ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8804f58b51c0 > [ 2120.859028] ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000 > [ 2120.859194] 0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8 > [ 2120.859358] Call Trace: > [ 2120.859410] [] ? preempt_schedule_irq+0x40/0xa0 > [ 2120.859545] [] preempt_schedule_irq+0x46/0xa0 > [ 2120.859676] [] retint_kernel+0x1b/0x2d > [ 2120.859794] [] ? get_parent_ip+0x11/0x50 > [ 2120.859916] [] ? get_parent_ip+0x11/0x50 > [ 2120.860037] [] ? lock_release+0x33e/0x700 > [ 2120.860163] [] ? group_send_sig_info+0x79/0x120 > [ 2120.860297] [] ? get_parent_ip+0x11/0x50 > [ 2120.860422] [] ? kill_pid_info+0x9b/0x150 > [ 2120.860548] [] kill_pid_info+0xb8/0x150 > [ 2120.860670] [] ? kill_pid_info+0x20/0x150 > [ 2120.860793] [] SYSC_kill+0xf4/0x2b0 > [ 2120.860906] [] ? SYSC_kill+0x9d/0x2b0 > [ 2120.861022] [] ? trace_hardirqs_on_caller+0x14b/0x1e0 > [ 2120.861172] [] ? trace_hardirqs_on+0xd/0x10 > [ 2120.861301] [] ? context_tracking_user_exit+0x13/0x20 > [ 2120.861449] [] ? syscall_trace_enter_phase1+0xf7/0x150 > [ 2120.866641] [] ? trace_hardirqs_on_thunk+0x17/0x19 > [ 2120.871850] [] SyS_kill+0xe/0x10 > [ 2120.877038] [] entry_SYSCALL_64_fastpath+0x12/0x6f > [ 2316.020180] INFO: rcu_preempt detected stalls on CPUs/tasks: > [ 2316.025395] Tasks blocked on level-0 rcu_node (CPUs 0-3): P1497 > [ 2316.030662] (detected by 2, t=260007 jiffies, g=78835, c=78834, q=0) > [ 2316.035890] trinity-watchdo R running task 14336 1497 1496 0x00080000 > [ 2316.041153] ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8800cd2db680 > [ 2316.046552] ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000 > [ 2316.052039] 0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8 > [ 2316.057560] Call Trace: > [ 2316.063066] [] ? preempt_schedule_irq+0x40/0xa0 > [ 2316.068594] [] preempt_schedule_irq+0x46/0xa0 > [ 2316.074034] [] retint_kernel+0x1b/0x2d > [ 2316.079419] [] ? get_parent_ip+0x11/0x50 > [ 2316.084778] [] ? get_parent_ip+0x11/0x50 > [ 2316.090041] [] ? lock_release+0x33e/0x700 > [ 2316.095219] [] ? group_send_sig_info+0x79/0x120 > [ 2316.100359] [] ? get_parent_ip+0x11/0x50 > [ 2316.105489] [] ? kill_pid_info+0x9b/0x150 > [ 2316.110584] [] kill_pid_info+0xb8/0x150 > [ 2316.115594] [] ? kill_pid_info+0x20/0x150 > [ 2316.120568] [] SYSC_kill+0xf4/0x2b0 > [ 2316.125476] [] ? SYSC_kill+0x9d/0x2b0 > [ 2316.130391] [] ? trace_hardirqs_on_caller+0x14b/0x1e0 > [ 2316.135348] [] ? trace_hardirqs_on+0xd/0x10 > [ 2316.140351] [] ? context_tracking_user_exit+0x13/0x20 > [ 2316.145399] [] ? syscall_trace_enter_phase1+0xf7/0x150 > [ 2316.150413] [] ? trace_hardirqs_on_thunk+0x17/0x19 > [ 2316.155461] [] SyS_kill+0xe/0x10 > [ 2316.160499] [] entry_SYSCALL_64_fastpath+0x12/0x6f > [ 2316.165609] trinity-watchdo R running task 14336 1497 1496 0x00080000 > [ 2316.170851] ffff8804b94e3c88 ffffffffa17fa0b0 ffff8805010a1b40 ffff8800cd2db680 > [ 2316.176131] ffff8805010a1b40 ffff8804b94e3c78 0000000000000000 ffff8804b94e4000 > [ 2316.181443] 0000000000000001 0000000000000001 0000000000000002 ffff8804b94e3ca8 > [ 2316.186735] Call Trace: > [ 2316.192021] [] ? preempt_schedule_irq+0x40/0xa0 > [ 2316.197426] [] preempt_schedule_irq+0x46/0xa0 > [ 2316.202818] [] retint_kernel+0x1b/0x2d > [ 2316.208181] [] ? get_parent_ip+0x11/0x50 > [ 2316.213541] [] ? get_parent_ip+0x11/0x50 > [ 2316.218823] [] ? lock_release+0x33e/0x700 > [ 2316.224042] [] ? group_send_sig_info+0x79/0x120 > [ 2316.229215] [] ? get_parent_ip+0x11/0x50 > [ 2316.234346] [] ? kill_pid_info+0x9b/0x150 > [ 2316.239491] [] kill_pid_info+0xb8/0x150 > [ 2316.244538] [] ? kill_pid_info+0x20/0x150 > [ 2316.249525] [] SYSC_kill+0xf4/0x2b0 > [ 2316.254503] [] ? SYSC_kill+0x9d/0x2b0 > [ 2316.259492] [] ? trace_hardirqs_on_caller+0x14b/0x1e0 > [ 2316.264565] [] ? trace_hardirqs_on+0xd/0x10 > [ 2316.269594] [] ? context_tracking_user_exit+0x13/0x20 > [ 2316.274681] [] ? syscall_trace_enter_phase1+0xf7/0x150 > [ 2316.279772] [] ? trace_hardirqs_on_thunk+0x17/0x19 > [ 2316.284873] [] SyS_kill+0xe/0x10 > [ 2316.289981] [] entry_SYSCALL_64_fastpath+0x12/0x6f > > some time later.. > > [66859.981825] INFO: rcu_preempt detected stalls on CPUs/tasks: > [66859.987582] Tasks blocked on level-0 rcu_node (CPUs 0-3): P20837 > [66859.993375] (detected by 2, t=65013 jiffies, g=5129960, c=5129959, q=0) > [66859.999146] trinity-c255 R running task 13728 20837 20581 0x00080002 > [66860.004980] ffff8804d1417cb8 ffffffffa17fa0b0 ffff8804e4dab680 ffff8804ba2951c0 > [66860.010831] ffff8804e4dab680 ffff8804d1417ca8 0000000000000000 ffff8804d1418000 > [66860.016666] ffff88045f49a440 ffff8805047a0620 ffff880480fdf528 ffff8804d1417cd8 > [66860.022423] Call Trace: > [66860.028116] [] ? preempt_schedule_irq+0x40/0xa0 > [66860.033912] [] preempt_schedule_irq+0x46/0xa0 > [66860.039663] [] retint_kernel+0x1b/0x2d > [66860.045352] [] ? lock_acquire+0xd9/0x260 > [66860.051044] [] ? rcu_is_watching+0x1/0x60 > [66860.056725] [] ? dput+0x10c/0x390 > [66860.062377] [] ? dput+0x29/0x390 > [66860.068103] [] __fput+0x181/0x200 > [66860.073797] [] ? __fput+0x6/0x200 > [66860.079444] [] ____fput+0xe/0x10 > [66860.085070] [] task_work_run+0x8d/0xc0 > [66860.090699] [] do_exit+0x416/0xc60 > [66860.096304] [] ? context_tracking_user_exit+0x13/0x20 > [66860.101978] [] do_group_exit+0x54/0xe0 > [66860.107684] [] SyS_exit_group+0x14/0x20 > [66860.113360] [] entry_SYSCALL_64_fastpath+0x12/0x6f > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/