Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752587AbbHEMiE (ORCPT ); Wed, 5 Aug 2015 08:38:04 -0400 Received: from mail-wi0-f171.google.com ([209.85.212.171]:34629 "EHLO mail-wi0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751242AbbHEMiC (ORCPT ); Wed, 5 Aug 2015 08:38:02 -0400 Date: Wed, 5 Aug 2015 14:37:59 +0200 From: Frederic Weisbecker To: Dave Jones , Sasha Levin , paulmck@linux.vnet.ibm.com, Linux Kernel , Josh Triplett , Peter Zijlstra Subject: Re: 4.2-rc5 rcu stalls. Message-ID: <20150805123757.GA7051@lerouge> References: <20150803210835.GA4467@codemonkey.org.uk> <20150803213723.GN27280@linux.vnet.ibm.com> <20150803215535.GA13717@codemonkey.org.uk> <20150803220355.GO27280@linux.vnet.ibm.com> <55C0458B.6080003@oracle.com> <20150805001250.GA22259@codemonkey.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150805001250.GA22259@codemonkey.org.uk> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3088 Lines: 55 On Tue, Aug 04, 2015 at 08:12:50PM -0400, Dave Jones wrote: > On Tue, Aug 04, 2015 at 12:54:35AM -0400, Sasha Levin wrote: > > On 08/03/2015 06:03 PM, Paul E. McKenney wrote: > > >> > Ugh, that doesn't revert cleanly. Got something handy ? > > > I do not, but perhaps either Sasha or Frederic do. > > > > I've attached a revert courtesy of Peter. > > Thanks. At first I thought this was doing the trick, but then I hit this again. > > > [23643.545873] INFO: rcu_preempt detected stalls on CPUs/tasks: > [23643.546031] Tasks blocked on level-0 rcu_node (CPUs 0-3): P31722 > [23643.546173] (detected by 3, t=65002 jiffies, g=2256887, c=2256886, q=0) > [23643.546326] trinity-watchdo R running task 14336 31722 31721 0x00080000 > [23643.546488] ffff8804fcfe7cc8 000000000000ded0 0000000000000002 ffff8804f58bb680 > [23643.546661] ffff8800ce4951c0 ffff8804fcfe7cb8 ffff8804fcfe8000 ffff8804f6552608 > [23643.546830] 0000000000000009 ffff8804fcfe7e88 0000000000000009 ffff8804fcfe7ce8 > [23643.547001] Call Trace: > [23643.547058] [] preempt_schedule_common+0x22/0x40 > [23643.547201] [] preempt_schedule+0x1f/0x30 > [23643.547329] [] ___preempt_schedule+0x12/0x14 > [23643.547465] [] ? do_send_sig_info+0x5d/0x80 > [23643.547599] [] ? _raw_spin_unlock_irqrestore+0x42/0x70 > [23643.547753] [] ? _raw_spin_unlock_irqrestore+0x60/0x70 > [23643.547910] [] do_send_sig_info+0x5d/0x80 > [23643.548039] [] group_send_sig_info+0xb2/0x120 > [23643.548175] [] ? group_send_sig_info+0x5/0x120 > [23643.548314] [] ? rcu_read_lock_held+0x4f/0x60 > [23643.548451] [] kill_pid_info+0x7f/0x150 > [23643.548576] [] ? kill_pid_info+0x20/0x150 > [23643.548705] [] SYSC_kill+0xf4/0x2b0 > [23643.548821] [] ? SYSC_kill+0x9d/0x2b0 > [23643.548942] [] ? trace_hardirqs_on_caller+0x14b/0x1e0 > [23643.549097] [] ? trace_hardirqs_on+0xd/0x10 > [23643.549231] [] ? context_tracking_user_exit+0x13/0x20 > [23643.549387] [] ? syscall_trace_enter_phase1+0xf7/0x150 > [23643.549540] [] ? trace_hardirqs_on_thunk+0x17/0x19 > [23643.549687] [] SyS_kill+0xe/0x10 > [23643.549799] [] entry_SYSCALL_64_fastpath+0x12/0x6f If it still happens after Sasha's revert, which basically revert all the offending patches related to preempt lately, then the reason might be elsewhere. How hard was it to reproduce? I see 23000 secs in your dmesg logs which is around 6 hours. Also did you just launch trinity? no specific options? I'll try to reproduce that. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/