Date: Thu, 3 Apr 2014 16:01:43 -0400
From: Dave Jones
To: "Paul E. McKenney", Linux Kernel
Subject: Re: rcu_prempt stalls / lockup
Message-ID: <20140403200143.GA8119@redhat.com>
In-Reply-To: <20140402224840.GA5385@redhat.com>

On Wed, Apr 02, 2014 at 06:48:40PM -0400, Dave Jones wrote:

> > > > > Waiting uninterruptibly.  Presumably blocked on mutex_lock().  But
> > > > > you have CONFIG_PROVE_LOCKING(), so any deadlocks should have been
> > > > > reported.
> > > >
> > > > Lockdep had reported something a little earlier (timestamped at 1108.xxxxxx)
> > > > but that's a known false-positive in xfs.
> > >
> > > Yep, I would be very surprised if that was related to the grace-period hang.
> >
> > Ah, but it could be suppressing later lockdep splats.  So if this can be
> > reproduced without xfs, we might get additional information from lockdep.
>
> Hrmph.
>
> $ git bisect bad
> The merge base 5cb480f6b488128140c940abff3c36f524a334a8 is bad.
> This means the bug has been fixed between 5cb480f6b488128140c940abff3c36f524a334a8 and [455c6fdbd219161bd09b1165f11699d6d73de11c 62c206bd514600d4d73751ade00dca8e488390a3 e086481baf9d0436bdd6e9b739bfa4a83fb89ef5].
>
> Not sure where to go from here..
>
> The 'good' news is I can reproduce it pretty reliably now.
> I start my fuzz tester, and immediately do a git diff in my working tree,
> and then boom..

Even better, now I realise I don't even need my fuzzer in the mix.
Just doing a fair amount of disk I/O (like a git diff on a dirty tree) will trigger it.

I've tried adding a show_state() call when the stall happens, but another stall
seems to occur before it gets a chance to even dump everything over the
usb-serial console.

And of course nothing ever makes it to disk. Even though I can sysrq-sync,
on the next reboot systemd has stuffed a bunch of ^@ in the log where the
interesting stuff should be.

Any other ideas?

	Dave