Date: Sat, 5 Dec 2015 11:01:24 -0800
From: "Paul E. McKenney"
To: tglx@linutronix.de, peterz@infradead.org, preeti@linux.vnet.ibm.com,
    viresh.kumar@linaro.org, mtosatti@redhat.com, fweisbec@gmail.com
Cc: linux-kernel@vger.kernel.org, sasha.levin@oracle.com
Subject: Re: Possible issue with commit 4961b6e11825?
Message-ID: <20151205190124.GA1990@linux.vnet.ibm.com>
In-Reply-To: <20151204232022.GA15891@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com

On Fri, Dec 04, 2015 at 03:20:22PM -0800, Paul E. McKenney wrote:
> Hello!
>
> Are there any known issues with commit 4961b6e11825 (sched: core: Use
> hrtimer_start[_expires]())?
>
> The reason that I ask is that I am about 90% sure that an rcutorture
> failure bisects to that commit.  I will be running more tests on
> 3497d206c4d9 (perf: core: Use hrtimer_start()), which is the predecessor
> of 4961b6e11825, and which, unlike 4961b6e11825, passes a 12-hour
> rcutorture test with scenario TREE03.
> In contrast, 4961b6e11825 gets
> 131 RCU CPU stall warnings, 132 reports of one of RCU's grace-period
> kthreads being starved, and 525 reports of one of rcutorture's kthreads
> being starved.  Most of the test runs hang on shutdown, which is no
> surprise if an RCU CPU stall is happening at about that time.
>
> But perhaps 3497d206c4d9 was just getting lucky, hence additional testing
> over the weekend.

And it was getting lucky.  A set of 24 two-hour runs (triple parallel)
on an earlier commit (not 3497d206c4d9, no clue what I was thinking)
got me two failed runs, for a total of 49 reports of one of RCU's
grace-period kthreads being starved, no reports of rcutorture's kthreads
being starved, and no hangs on shutdown.  So a much lower failure rate,
but still failures.

At this point, I am a bit disgusted with bisection, so my next test
cycle (36 two-hour runs on a system capable of doing three concurrently)
is on the most recent -rcu, but with CPU hotplug disabled.  If that
shows failures, then I hammer 3497d206c4d9 hard.

Anyway, if you have any ideas as to what might be happening, please
don't keep them a secret!

							Thanx, Paul