Date: Fri, 3 Jun 2011 08:33:44 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Vivek Goyal
Cc: Paul Bolle, Jens Axboe, linux kernel mailing list
Subject: Re: Mysterious CFQ crash and RCU
Message-ID: <20110603153344.GB2333@linux.vnet.ibm.com>
In-Reply-To: <20110603134514.GA31057@redhat.com>

On Fri, Jun 03, 2011 at 09:45:14AM -0400, Vivek Goyal wrote:
> On Thu, Jun 02, 2011 at 10:07:24PM -0700, Paul E. McKenney wrote:
>
> [..]
> > > Thu May 26 10:47:20 CEST 2011
> > > /sys/kernel/debug/rcu/rcugp:
> > > rcu_sched: completed=682249 gpnum=682250
> >
> > 15 more seconds, a few thousand more grace periods.  About 500 grace
> > periods per second, which is quite reasonable on a single-CPU system.
>
> PaulB mentioned that the crash happened at May 26 10:47:07. I am wondering
> how we were able to sample the data after the crash. I am assuming that the
> above data gives information only from before the crash and does not tell
> us anything about what happened just before the crash. What am I missing?
>
> PaulM, in one of the mails you had mentioned that one could print the
> context switch id to make sure we did not block in an rcu section. Would
> you have a quick pointer to where the context switch id is stored? Maybe
> I can write a small patch for PaulB.

From what I can see, the task_struct nvcsw and nivcsw fields should do it,
though I am not seeing where these are incremented.  So if these don't do
what you need, the following (untested but trivial) patch will provide an
rcu_switch_count in the task structure.

							Thanx, Paul

------------------------------------------------------------------------

rcu: add diagnostic per-task context-switch count

Note that this is also incremented by softirqs.

Signed-off-by: Paul E. McKenney

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2a8621c..5ef22e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1261,6 +1261,7 @@ struct task_struct {
 #ifdef CONFIG_RCU_BOOST
 	struct rt_mutex *rcu_boost_mutex;
 #endif /* #ifdef CONFIG_RCU_BOOST */
+	unsigned long rcu_switch_count;
 
 #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
 	struct sched_info sched_info;
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 89419ff..080c6eb 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -154,6 +154,7 @@ void rcu_bh_qs(int cpu)
  */
 void rcu_note_context_switch(int cpu)
 {
+	current->rcu_switch_count++;
 	rcu_sched_qs(cpu);
 	rcu_preempt_note_context_switch(cpu);
 }