Date: Wed, 5 Aug 2015 16:38:14 +0200
From: Frederic Weisbecker
To: Dave Jones, Sasha Levin, paulmck@linux.vnet.ibm.com, Linux Kernel, Josh Triplett, Peter Zijlstra
Subject: Re: 4.2-rc5 rcu stalls.
Message-ID: <20150805143813.GF7051@lerouge>
References: <20150803210835.GA4467@codemonkey.org.uk> <20150803213723.GN27280@linux.vnet.ibm.com> <20150803215535.GA13717@codemonkey.org.uk> <20150803220355.GO27280@linux.vnet.ibm.com> <55C0458B.6080003@oracle.com> <20150805001250.GA22259@codemonkey.org.uk> <20150805123757.GA7051@lerouge> <20150805131857.GA596@codemonkey.org.uk>
In-Reply-To: <20150805131857.GA596@codemonkey.org.uk>

On Wed, Aug 05, 2015 at 09:18:57AM -0400, Dave Jones wrote:
> On Wed, Aug 05, 2015 at 02:37:59PM +0200, Frederic Weisbecker wrote:
> > On Tue, Aug 04, 2015 at 08:12:50PM -0400, Dave Jones wrote:
> > > On Tue, Aug 04, 2015 at 12:54:35AM -0400, Sasha Levin wrote:
> > > > On 08/03/2015 06:03 PM, Paul E. McKenney wrote:
> > > > >> > Ugh, that doesn't revert cleanly. Got something handy ?
> > > > > I do not, but perhaps either Sasha or Frederic do.
> > > >
> > > > I've attached a revert courtesy of Peter.
> > >
> > > Thanks. At first I thought this was doing the trick, but then I hit this again.
> > >
> > >
> > > [23643.545873] INFO: rcu_preempt detected stalls on CPUs/tasks:
> >
> > If it still happens after Sasha's revert, which basically revert all the offending
> > patches related to preempt lately, then the reason might be elsewhere.
> >
> > How hard was it to reproduce? I see 23000 secs in your dmesg logs which is around 6 hours.
>
> yeah. That's why I thought it had fixed it up until that point.
> My subsequent overnight run hit a different bug (that unpinning an unpinned lock bug in the scheduler)
> so I haven't had it happen since.
>
> > Also did you just launch trinity? no specific options?
>
> basically
>
> while [ 1 ];
> do
>   trinity -N 1000000 -q -l off -C256 -a64 -x fsync -x fdatasync -x syncfs -x sync -P INET --enable-fds=sockets
>   sudo ipcrm -a
> done
>
> (The ipcrm thing is needed for long runs or eventually you oom, because trinity lacks the cleanup smarts)

Ok, can I run that safely on my testbox without it eating some of my files or
should I use some special purposed guest?

Thanks!