Date: Thu, 17 Nov 2016 07:03:50 -0800
From: "Paul E. McKenney"
To: Boqun Feng
Cc: Lai Jiangshan, LKML, Ingo Molnar, dipankar@in.ibm.com,
	akpm@linux-foundation.org, Mathieu Desnoyers, Josh Triplett,
	Thomas Gleixner, Peter Zijlstra, Steven Rostedt, David Howells,
	edumazet@google.com, dvhart@linux.intel.com, Frédéric Weisbecker,
	oleg@redhat.com, bobby.prani@gmail.com, ldr709@gmail.com
Subject: Re: [PATCH RFC tip/core/rcu] SRCU rewrite
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20161117150350.GY3612@linux.vnet.ibm.com>
In-Reply-To: <20161117143012.GB5227@tardis.cn.ibm.com>
References: <20161114183636.GA28589@linux.vnet.ibm.com>
	<20161115014445.GC12110@tardis.cn.ibm.com>
	<20161115143700.GZ4127@linux.vnet.ibm.com>
	<20161117143012.GB5227@tardis.cn.ibm.com>

On Thu, Nov 17, 2016 at 10:31:00PM +0800, Boqun Feng wrote:
> On Thu, Nov 17, 2016 at 08:18:51PM +0800, Lai Jiangshan wrote:
> > On Tue, Nov 15, 2016 at 10:37 PM, Paul E. McKenney
> > wrote:
> > > On Tue, Nov 15, 2016 at 09:44:45AM +0800, Boqun Feng wrote:
> >
> > >>
> > >> __srcu_read_lock() used to be called with preemption disabled. I guess
> > >> the reason was because we have two percpu variables to increase. So with
> > >> only one percpu right, could we remove the preempt_{dis,en}able() in
> > >> srcu_read_lock() and use this_cpu_inc() here?
> > >
> > > Quite possibly...
> > >
> >
>
> Hello, Lai ;-)
>
> > it will be nicer if it is removed.
> >
> > The reason for the preemption-disabled was also because we
> > have to disallow any preemption between the fetching of the idx
> > and the increasement. so that we have at most NR_CPUS worth
> > of readers using the old index that haven't incremented the counters.
> >
>
> After reading the comment for a while, I actually got a question, maybe
> I miss something ;-)
>
> Why "at most NR_CPUS worth of readers using the old index haven't
> incremented the counters" could save us from overflow the counter?
>
> Please consider the following case in current implementation:
>
>
> {sp->completed = 0} so idx = 1 in srcu_advance_batches(...)
>
> one thread A is currently in __srcu_read_lock() and using idx = 1 and
> about to increase the percpu c[idx], and ULONG_MAX __srcu_read_lock()s
> have been called and returned with idx = 1, please note I think this is
> possible because I assume we may have some code like this:
>
> 	unsigned long i = 0;
> 	for (; i < ULONG_MAX; i++)
> 		srcu_read_lock(); // return the same idx 1;

First, please don't do this.  For any number of reasons!  ;-)

Second, the theory is that if the updater fails to see the update from
one of the srcu_read_lock() calls in the loop, then the reader must see
the new index on the next pass through the loop.  Which would be one
of the problems with the above loop -- it cannot be guaranteed that they
all will return the same index.

> And none of the corresponding srcu_read_unlock() has been called;
>
> In this case, at the time thread A increases the percpu c[idx], that
> will result in an overflow, right? So even one reader using old idx will
> result in overflow.

It is quite possible that the NR_CPUS bound is too tight, but the memory
barriers do prevent readers from seeing the old index beyond a certain
point.

> I think we won't be hit by overflow is not because we have few readers
> using old idx, it's because there are unlikely ULONG_MAX + 1
> __srcu_read_lock() called for the same idx, right? And the reason of
> this is much complex: because we won't have a fair mount of threads in
> the system, because no thread will nest srcu many levels, because there
> won't be a lot readers using old idx.
>
> And this will still be true if we use new mechanism and shrink the
> preemption disabled section, right?

Well, the analysis needs to be revisited, for sure.  ;-)

							Thanx, Paul

> Regards,
> Boqun
>
> > if we remove the preempt_{dis,en}able(). we must change the
> > "NR_CPUS" in the comment into ULONG_MAX/4. (I assume
> > one on-going reader needs at least need 4bytes at the stack). it is still safe.
> >
> > but we still need to think more if we want to remove the preempt_{dis,en}able().
> >
> > Thanks
> > Lai
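
The overflow scenario discussed in this thread can be sketched in
user-space C.  This is only a toy model under stated assumptions, not the
kernel's SRCU code: toy_counter, toy_read_lock(), toy_readers_active() and
toy_overflow_fools_updater() are hypothetical names, a single unsigned long
stands in for the summed per-CPU c[idx] counters, and the ULONG_MAX earlier
srcu_read_lock() calls are modeled by initializing the counter instead of
looping.  It shows only the arithmetic: one more increment wraps the count
to zero, which an updater would read as "no readers on this idx".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the summed per-CPU c[idx] counters. */
struct toy_counter {
	unsigned long c;
};

/* Model a reader entering: the count is bumped with no overflow check. */
static void toy_read_lock(struct toy_counter *t)
{
	t->c++;	/* unsigned arithmetic wraps modulo ULONG_MAX + 1 */
}

/* Updater-side check: a zero count is read as "no readers on this idx". */
static bool toy_readers_active(const struct toy_counter *t)
{
	return t->c != 0;
}

/* Returns true if the updater is fooled after ULONG_MAX + 1 locks. */
static bool toy_overflow_fools_updater(void)
{
	/* Assumption: ULONG_MAX locks already returned the same idx. */
	struct toy_counter t = { .c = ULONG_MAX };

	assert(toy_readers_active(&t));	/* readers are still visible here */
	toy_read_lock(&t);		/* thread A's delayed increment */
	return !toy_readers_active(&t);	/* count wrapped to zero */
}
```

In this model a single delayed reader is enough to wrap the count once
ULONG_MAX locks are outstanding on the same idx, which is the point of
Boqun's question.  The sketch says nothing about whether the memory
barriers allow that many readers to see the old index in the first place.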