Date: Thu, 15 Dec 2022 17:09:14 -0800
From: "Paul E. McKenney"
To: Joel Fernandes
Cc: Frederic Weisbecker, boqun.feng@gmail.com, neeraj.iitr10@gmail.com,
    urezki@gmail.com, rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] srcu: Yet more detail for srcu_readers_active_idx_check() comments
Message-ID: <20221216010914.GX4001@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20221215201356.GM4001@paulmck-ThinkPad-P17-Gen-1>

On Thu, Dec 15, 2022 at 05:13:47PM -0500, Joel Fernandes wrote:
> > On Dec 15, 2022, at 3:13 PM, Paul E. McKenney wrote:
> > On Thu, Dec 15, 2022 at 05:58:14PM +0000, Joel Fernandes wrote:
> >>> On Thu, Dec 15, 2022 at 5:48 PM Joel Fernandes wrote:
> >>>
> >>>> On Thu, Dec 15, 2022 at 5:08 PM Paul E. McKenney wrote:
> >>>
> >>>>> Scenario for the reader to increment the old idx once:
> >>>>>
> >>>>> _ Assume ssp->srcu_idx is initially 0.
> >>>>> _ The READER reads idx that is 0
> >>>>> _ The updater runs and flips the idx that is now 1
> >>>>> _ The reader resumes with 0 as an index but on the next srcu_read_lock()
> >>>>> it will see the new idx which is 1
> >>>>>
> >>>>> What could be the scenario for it to increment the old idx twice?
> >>>>
> >>>> Unless I am missing something, the reader must reference the
> >>>> srcu_unlock_count[old_idx] and then do smp_mb() before it will be
> >>>> absolutely guaranteed of seeing the new value of ->srcu_idx.
> >>>
> >>> I think both of you are right depending on how the flip raced with the
> >>> first reader's unlock in that specific task.
> >>>
> >>> If the first read section's srcu_read_unlock() and its corresponding
> >>> smp_mb() happened before the flip, then the increment of old idx
> >>> would happen only once. The next srcu_read_lock() will read the new
> >>> index. If the srcu_read_unlock() and its corresponding smp_mb()
> >>> happened after the flip, the old_idx will be sampled again and can be
> >>> incremented twice. So it depends on how the flip races with
> >>> srcu_read_unlock().
> >>
> >> I am sorry this is inverted, but my statement's gist stands I believe:
> >>
> >> 1. Flip+smp_mb() happened before unlock's smp_mb() -- reader will not
> >> increment old_idx the second time.
> >
> > By "increment old_idx" you mean "increment ->srcu_lock_count[old_idx]",
> > correct?
>
> Yes, sorry for the confusion, I indeed meant the lock count increment
> corresponding to the old index.

I guessed correctly!!!  Don't worry, it won't happen again.  ;-)

> > Again, the important ordering isn't the smp_mb(), but the accesses,
> > in this case, the accesses to ->srcu_unlock_count[idx].
>
> I was talking about ordering of the flip of index (write) with respect
> to both the reading of the old index in the rcu_read_lock() and its
> subsequent lock count increment corresponding to that index. I believe
> we are talking here about how this race can affect the wrap around issues
> when scanning for readers in the pre flip index, and we concluded that
> there can be at most 2 of these on the SAME task.

Agreed.

> The third time, reader
> will always see the new flipped index because of the memory barriers on
> both sides.
> IOW, the same task cannot overflow the lock counter on the
> preflipped index and cause issues. However there can be Nt different
> tasks so perhaps you can have 2*Nt number of preempted readers that had
> sampled the old index and now will do a lock and unlock on that old index,
> potentially causing a lock==unlock match when there should not be a match.

So each task can do one old-index ->srcu_lock_count[] increment, and Nc
of them can do a second one, where Nc is the number of CPUs.  This is
because a given task's smp_mb() applies to all later code executed by
that task and also to code executed by other tasks running later on that
same CPU.

> >> 2. unlock()'s smp_mb() happened before Flip+smp_mb() , now the reader
> >> has no new smp_mb() that happens AFTER the flip happened. So it can
> >> totally sample the old idx again -- that particular reader will
> >> increment twice, but the next time, it will see the flipped one.
> >
> > I will let you transliterate both.  ;-)
>
> I think I see what you mean now :)
>
> I believe the access I am referring to is the read of idx on one side
> and the write to idx on the other. However that is incomplete and I need
> to pair that with some other access on both sides.
>
> So perhaps this:
>
> Writer does flip + smp_mb + read unlock counts [1]
>
> Reader does:
> read idx + smp_mb() + increment lock counts [2]
>
> And subsequently reader does
> Smp_mb() + increment unlock count. [3]
>
> So [1] races with either [2] or [2]+[3].
>
> Is that fair?

That does look much better, thank you!

> >> Did I get that right? Thanks.
> >
> > So why am I unhappy with orderings of smp_mb()?
> >
> > To see this, let's take the usual store-buffering litmus test:
> >
> >     CPU 0                   CPU 1
> >     WRITE_ONCE(x, 1);       WRITE_ONCE(y, 1);
> >     smp_mb();               smp_mb();
> >     r0 = READ_ONCE(y);      r1 = READ_ONCE(x);
> >
> > Suppose CPU 0's smp_mb() happens before that of CPU 1:
> >
> >     CPU 0                   CPU 1
> >     WRITE_ONCE(x, 1);       WRITE_ONCE(y, 1);
> >     smp_mb();
> >                             smp_mb();
> >     r0 = READ_ONCE(y);      r1 = READ_ONCE(x);
> >
> > We get r0 == r1 == 1.
> >
> > Compare this to CPU 1's smp_mb() happening before that of CPU 0:
> >
> >     CPU 0                   CPU 1
> >     WRITE_ONCE(x, 1);       WRITE_ONCE(y, 1);
> >                             smp_mb();
> >     smp_mb();
> >     r0 = READ_ONCE(y);      r1 = READ_ONCE(x);
> >
> > We still get r0 == r1 == 1.  Reversing the order of the two smp_mb()
> > calls changed nothing.
> >
> > But, if we order CPU 1's write to follow CPU 0's read, then we have
> > this:
> >
> >     CPU 0                   CPU 1
> >     WRITE_ONCE(x, 1);
> >     smp_mb();
> >     r0 = READ_ONCE(y);
> >                             WRITE_ONCE(y, 1);
> >                             smp_mb();
> >                             r1 = READ_ONCE(x);
> >
> > Here, given that r0 had the final value of zero, we know that
> > r1 must have a final value of 1.
> >
> > And suppose we reverse this:
> >
> >     CPU 0                   CPU 1
> >                             WRITE_ONCE(y, 1);
> >                             smp_mb();
> >                             r1 = READ_ONCE(x);
> >     WRITE_ONCE(x, 1);
> >     smp_mb();
> >     r0 = READ_ONCE(y);
> >
> > Now there is a software-visible difference in behavior.  The value of
> > r0 is now 1 instead of zero and the value of r1 is now 0 instead of 1.
> >
> > Does this make sense?
>
> Yes I see what you mean. In first case, smp_mb() ordering didn't matter.
> But in the second case it does.

Yes, there have to be accesses for the software to even see the effects
of a given smp_mb().

                                                        Thanx, Paul
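
The store-buffering cases quoted in this message can be replayed in
userspace with a small C model.  This is only a sketch, and it rests on
an assumption that is not stated in the thread: because each CPU has a
full smp_mb() between its write and its read, the test is modeled as a
plain interleaving of the four accesses against shared memory, with no
real threads or barriers involved.  The names sb_run(),
sb_count_both_zero(), and struct sb_result are hypothetical, chosen for
illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Final register values of one run of the store-buffering test. */
struct sb_result {
	int r0;
	int r1;
};

/*
 * Replay one interleaving (a model, not real concurrency).
 * "order" is a string of four characters, each '0' or '1', naming the
 * CPU that performs its next access at that step.
 *   CPU 0: WRITE_ONCE(x, 1); smp_mb(); r0 = READ_ONCE(y);
 *   CPU 1: WRITE_ONCE(y, 1); smp_mb(); r1 = READ_ONCE(x);
 * The smp_mb() calls are modeled only by keeping each CPU's two
 * accesses in program order.
 */
static struct sb_result sb_run(const char *order)
{
	int x = 0, y = 0;
	int step[2] = { 0, 0 };
	struct sb_result res = { -1, -1 };

	for (int i = 0; i < 4; i++) {
		int cpu = order[i] - '0';

		assert(cpu == 0 || cpu == 1);
		if (cpu == 0) {
			if (step[0]++ == 0)
				x = 1;		/* CPU 0's write */
			else
				res.r0 = y;	/* CPU 0's read */
		} else {
			if (step[1]++ == 0)
				y = 1;		/* CPU 1's write */
			else
				res.r1 = x;	/* CPU 1's read */
		}
	}
	return res;
}

/* Count the interleavings, out of all six, that end with r0 == r1 == 0. */
static int sb_count_both_zero(void)
{
	static const char *orders[] = {
		"0011", "0101", "0110", "1001", "1010", "1100",
	};
	int n = 0;

	for (int i = 0; i < 6; i++) {
		struct sb_result res = sb_run(orders[i]);

		printf("%s: r0=%d r1=%d\n", orders[i], res.r0, res.r1);
		if (res.r0 == 0 && res.r1 == 0)
			n++;
	}
	return n;
}
```

In this model, sb_run("0011") places all of CPU 0's accesses before
CPU 1's and gives r0 == 0, r1 == 1, while sb_run("1100") reverses them
and gives r0 == 1, r1 == 0.  The orders in which both writes come first
give r0 == r1 == 1.  What distinguishes the outcomes is the order of the
accesses, which is the only thing the model tracks.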