Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261781AbVEJU63 (ORCPT ); Tue, 10 May 2005 16:58:29 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261793AbVEJU62 (ORCPT ); Tue, 10 May 2005 16:58:28 -0400 Received: from amsfep13-int.chello.nl ([213.46.243.23]:35100 "EHLO amsfep13-int.chello.nl") by vger.kernel.org with ESMTP id S261790AbVEJU4Z (ORCPT ); Tue, 10 May 2005 16:56:25 -0400 Subject: Re: [RFC] RCU and CONFIG_PREEMPT_RT progress From: Peter Zijlstra To: paulmck@us.ibm.com Cc: dipankar@in.ibm.com, Ingo Molnar , LKML In-Reply-To: <20050510202915.GH1566@us.ibm.com> References: <20050510012444.GA3011@us.ibm.com> <1115755692.26548.15.camel@twins> <20050510202915.GH1566@us.ibm.com> Content-Type: text/plain Date: Tue, 10 May 2005 22:56:24 +0200 Message-Id: <1115758584.26548.33.camel@twins> Mime-Version: 1.0 X-Mailer: Evolution 2.2.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3837 Lines: 106 On Tue, 2005-05-10 at 13:29 -0700, Paul E. McKenney wrote: > On Tue, May 10, 2005 at 10:08:11PM +0200, Peter Zijlstra wrote: > > On Mon, 2005-05-09 at 18:24 -0700, Paul E. McKenney wrote: > > > > > > > > Counter-Based Approach > > > > > > The current implementation in Ingo's CONFIG_PREEMPT_RT patch uses a > > > counter-based approach, which seems to work, but which can result in > > > indefinite-duration grace periods. The following are very hazy thoughts > > > on how to get the benefits of this approach, but with short grace periods. > > > > > > 1. The basic trick is to maintain a pair of counters per CPU. > > > There would also be a global boolean variable that would select > > > one or the other of each pair. The rcu_read_lock() primitive > > > would then increment the counter indicated by the boolean > > > corresponding to the CPU that it is currently running on. > > > It would also keep a pointer to that particular counter in > > > the task structure. The rcu_read_unlock() primitive would > > > decrement this counter. (And, yes, you would also have a > > > counter in the task structure so that only the outermost of > > > a set of nested rcu_read_lock()/rcu_read_unlock() pairs would > > > actually increment/decrement the per-CPU counter pairs.) > > > > > > To force a grace period, one would invert the value of the > > > global boolean variable. Once all the counters indicated > > > by the old value of the global boolean variable hit zero, > > > the corresponding set of RCU callbacks can be safely invoked. > > > > > > The big problem with this approach is that a pair of inversions > > > of the global boolean variable could be spaced arbitrarily > > > closely, especially when you consider that the read side code > > > can be preempted. This could cause RCU callbacks to be invoked > > > prematurely, which could greatly reduce the life expectancy > > > of your kernel. > > > > > Thoughts? > > > > How about having another boolean indicating the ability to flip the > > selector boolean. This boolean would be set false on an actual flip and > > cleared during a grace period. That way the flips cannot ever interfere > > with one another such that the callbacks would be cleared prematurely. > > But the flip is an integral part of detecting a grace period. So, if I > understand your proposal correctly, I would have to flip to figure out > when it was safe to flip. > > What am I missing here? int can_flip = 1; int selector = 0; int counter[2] = {0, 0}; void up() { ++counter[current->selection = selector]; } void down() { if (!--counter[current->selection]) do_grace(); } void do_grace() { // free stuff can_flip = 1; } void force_grace() { if (can_flip) { can_flip = 0; selector ^= 1; } } The way I understood your proposal was that in order to force a grace period you flip the selector and once the old one reaches zero again it does a cleanup. Now your problem was that when you force another flip before the old counter reached zero the shit will hit the proverbial fan. So what I proposed (as hopefully illustrated by the code) was another boolean; my can_flip; that controls the flippability :-) One can for example call force_grace every few seconds or when a watermark on the rcu-callback queue has been reached. If can_flip blocks the actual flip nothing is lost since a cleanup is allready pending. I hope to have been clearer in explaining my idea; or if I'm missing the issue tell me to read the thread in the morning ;) -- Peter Zijlstra - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/