Date: Thu, 15 Dec 2022 17:13:22 -0800
From: "Paul E. McKenney"
To: Joel Fernandes
Cc: Frederic Weisbecker, boqun.feng@gmail.com, neeraj.iitr10@gmail.com,
    urezki@gmail.com, rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] srcu: Yet more detail for srcu_readers_active_idx_check() comments
Message-ID: <20221216011322.GY4001@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <07A65F0F-89CE-481B-BD6C-6D4946E70482@joelfernandes.org>
In-Reply-To: <07A65F0F-89CE-481B-BD6C-6D4946E70482@joelfernandes.org>

On Thu, Dec 15, 2022 at 05:22:21PM -0500, Joel Fernandes wrote:
> 
> On Dec 15, 2022, at 5:13 PM, Joel Fernandes wrote:
> >> On Dec 15, 2022, at 3:13 PM, Paul E. McKenney wrote:
> >> 
> >> On Thu, Dec 15, 2022 at 05:58:14PM +0000, Joel Fernandes wrote:
> >>>> On Thu, Dec 15, 2022 at 5:48 PM Joel Fernandes wrote:
> >>>> 
> >>>>>> On Thu, Dec 15, 2022 at 5:08 PM Paul E. McKenney wrote:
> >>>>> 
> >>>>>>> Scenario for the reader to increment the old idx once:
> >>>>>>> 
> >>>>>>> _ Assume ssp->srcu_idx is initially 0.
> >>>>>>> _ The READER reads idx that is 0
> >>>>>>> _ The updater runs and flips the idx that is now 1
> >>>>>>> _ The reader resumes with 0 as an index but on the next srcu_read_lock()
> >>>>>>>   it will see the new idx which is 1
> >>>>>>> 
> >>>>>>> What could be the scenario for it to increment the old idx twice?
> >>>>>> 
> >>>>>> Unless I am missing something, the reader must reference the
> >>>>>> srcu_unlock_count[old_idx] and then do smp_mb() before it will be
> >>>>>> absolutely guaranteed of seeing the new value of ->srcu_idx.
> >>>>> 
> >>>>> I think both of you are right depending on how the flip raced with the
> >>>>> first reader's unlock in that specific task.
> >>>>> 
> >>>>> If the first read section's srcu_read_unlock() and its corresponding
> >>>>> smp_mb() happened before the flip, then the increment of the old idx
> >>>>> would happen only once. The next srcu_read_lock() will read the new
> >>>>> index. If the srcu_read_unlock() and its corresponding smp_mb()
> >>>>> happened after the flip, the old_idx will be sampled again and can be
> >>>>> incremented twice. So it depends on how the flip races with
> >>>>> srcu_read_unlock().
> >>> 
> >>> I am sorry this is inverted, but my statement's gist stands I believe:
> >>> 
> >>> 1. Flip+smp_mb() happened before unlock's smp_mb() -- reader will not
> >>> increment old_idx the second time.
> >> 
> >> By "increment old_idx" you mean "increment ->srcu_lock_count[old_idx]",
> >> correct?
> > 
> > Yes, sorry for confusing; I indeed meant the lock count increment corresponding to the old index.
> >> 
> >> Again, the important ordering isn't the smp_mb(), but the accesses,
> >> in this case, the accesses to ->srcu_unlock_count[idx].
> > 
> > I was talking about the ordering of the flip of the index (a write) with respect to both the reading of the old index in srcu_read_lock() and its subsequent lock count increment corresponding to that index. I believe we are talking here about how this race can affect the wrap-around issues when scanning for readers in the pre-flip index, and we concluded that there can be at most 2 of these on the SAME task. The third time, the reader will always see the new flipped index because of the memory barriers on both sides. IOW, the same task cannot overflow the lock counter on the pre-flipped index and cause issues. However, there can be Nt different tasks, so perhaps you can have 2*Nt number of preempted
> 
> Sorry, to be more precise: I mean you have Nt preempted readers, and owing to memory barriers, if you have at least Nt CPUs and each reader ran on its own CPU, then you can have 2*Nt increments on the lock count at the old index.
> 
> Or something.

Agreed!  And if there are more tasks than CPUs, the maximum number of
increments is Nt+Nc.

							Thanx, Paul

> Thanks.
> 
> > readers that had sampled the old index and now will do a lock and unlock on that old index, potentially causing a lock==unlock match when there should not be a match.
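To make the accesses concrete, here is a minimal user-space C11 sketch of the pattern under discussion. It is illustrative only: the srcu_model struct and the model_* function names are made up, a single global counter pair stands in for the kernel's per-CPU ->srcu_lock_count[] and ->srcu_unlock_count[], and seq_cst fences stand in for smp_mb(). The bracketed numbers refer to the [1]/[2]/[3] enumeration further down in the thread.

#include <stdatomic.h>

struct srcu_model {                      /* hypothetical, not the kernel's srcu_struct */
        atomic_int  srcu_idx;            /* which counter pair readers should use */
        atomic_long lock_count[2];       /* models per-CPU ->srcu_lock_count[] */
        atomic_long unlock_count[2];     /* models per-CPU ->srcu_unlock_count[] */
};

/* Reader entry: sample the index, bump that index's lock count, then a
 * full fence (access [2] below). */
static int model_read_lock(struct srcu_model *ssp)
{
        int idx = atomic_load_explicit(&ssp->srcu_idx, memory_order_relaxed) & 0x1;

        atomic_fetch_add_explicit(&ssp->lock_count[idx], 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);      /* stands in for smp_mb() */
        return idx;
}

/* Reader exit: full fence, then bump the unlock count (access [3] below). */
static void model_read_unlock(struct srcu_model *ssp, int idx)
{
        atomic_thread_fence(memory_order_seq_cst);      /* stands in for smp_mb() */
        atomic_fetch_add_explicit(&ssp->unlock_count[idx], 1, memory_order_relaxed);
}

/* Updater: flip the index, full fence, then scan the old index's counters
 * (access [1] below), unlock counts before lock counts.  A reader that
 * sampled the old index just before the flip can still increment
 * lock_count[old] afterward, so equality here can be transient. */
static int model_flip_and_check(struct srcu_model *ssp)
{
        int old = atomic_load_explicit(&ssp->srcu_idx, memory_order_relaxed) & 0x1;
        long unlocks;

        atomic_store_explicit(&ssp->srcu_idx, old ^ 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);      /* stands in for smp_mb() */
        unlocks = atomic_load_explicit(&ssp->unlock_count[old], memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        return atomic_load_explicit(&ssp->lock_count[old], memory_order_relaxed) == unlocks;
}

The window at issue sits between a reader's relaxed sample of srcu_idx and its fence: the updater's flip-plus-fence does not stop a reader that has already sampled the old index from still incrementing lock_count[old] (at most twice per task, as concluded above), which is where the 2*Nt figure comes from.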
> > 
> >> 
> >>> 2. unlock()'s smp_mb() happened before Flip+smp_mb(), now the reader
> >>> has no new smp_mb() that happens AFTER the flip happened. So it can
> >>> totally sample the old idx again -- that particular reader will
> >>> increment twice, but the next time, it will see the flipped one.
> >> 
> >> I will let you transliterate both.  ;-)
> > 
> > I think I see what you mean now :)
> > 
> > I believe the access I am referring to is the read of idx on one side and the write to idx on the other. However, that is incomplete and I need to pair it with some other access on both sides.
> > 
> > So perhaps this:
> > 
> > Writer does flip + smp_mb + read unlock counts [1]
> > 
> > Reader does:
> > read idx + smp_mb() + increment lock counts [2]
> > 
> > And subsequently reader does
> > smp_mb() + increment unlock count. [3]
> > 
> > So [1] races with either [2] or [2]+[3].
> > 
> > Is that fair?
> 
> >>> Did I get that right? Thanks.
> >> 
> >> So why am I unhappy with orderings of smp_mb()?
> >> 
> >> To see this, let's take the usual store-buffering litmus test:
> >> 
> >>     CPU 0                          CPU 1
> >>     WRITE_ONCE(x, 1);              WRITE_ONCE(y, 1);
> >>     smp_mb();                      smp_mb();
> >>     r0 = READ_ONCE(y);             r1 = READ_ONCE(x);
> >> 
> >> Suppose CPU 0's smp_mb() happens before that of CPU 1:
> >> 
> >>     CPU 0                          CPU 1
> >>     WRITE_ONCE(x, 1);              WRITE_ONCE(y, 1);
> >>     smp_mb();
> >>                                    smp_mb();
> >>     r0 = READ_ONCE(y);             r1 = READ_ONCE(x);
> >> 
> >> We get r0 == r1 == 1.
> >> 
> >> Compare this to CPU 1's smp_mb() happening before that of CPU 0:
> >> 
> >>     CPU 0                          CPU 1
> >>     WRITE_ONCE(x, 1);              WRITE_ONCE(y, 1);
> >>                                    smp_mb();
> >>     smp_mb();
> >>     r0 = READ_ONCE(y);             r1 = READ_ONCE(x);
> >> 
> >> We still get r0 == r1 == 1. Reversing the order of the two smp_mb()
> >> calls changed nothing.
> >> 
> >> But, if we order CPU 1's write to follow CPU 0's read, then we have
> >> this:
> >> 
> >>     CPU 0                          CPU 1
> >>     WRITE_ONCE(x, 1);
> >>     smp_mb();
> >>     r0 = READ_ONCE(y);
> >>                                    WRITE_ONCE(y, 1);
> >>                                    smp_mb();
> >>                                    r1 = READ_ONCE(x);
> >> 
> >> Here, given that r0 had the final value of zero, we know that
> >> r1 must have a final value of 1.
> >> 
> >> And suppose we reverse this:
> >> 
> >>     CPU 0                          CPU 1
> >>                                    WRITE_ONCE(y, 1);
> >>                                    smp_mb();
> >>                                    r1 = READ_ONCE(x);
> >>     WRITE_ONCE(x, 1);
> >>     smp_mb();
> >>     r0 = READ_ONCE(y);
> >> 
> >> Now there is a software-visible difference in behavior. The value of
> >> r0 is now 1 instead of zero and the value of r1 is now 0 instead of 1.
> >> 
> >> Does this make sense?
> > 
> > Yes, I see what you mean. In the first case, the smp_mb() ordering didn't matter. But in the second case it does.
> > 
> > Thanks,
> > 
> > - Joel
> > 
> >> 
> >> Thanx, Paul
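The store-buffering test above can also be exercised in user space, with C11 threads and seq_cst fences standing in for smp_mb(); the harness below is an illustrative sketch, not kernel code, and assumes a libc that provides <threads.h> (pthreads would do equally well). With both fences present, the r0 == 0 && r1 == 0 outcome never appears, regardless of which fence happens to execute first.

#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

static atomic_int x, y;         /* shared variables of the litmus test */
static int r0, r1;              /* each written by exactly one thread */

static int cpu0(void *arg)
{
        (void)arg;
        atomic_store_explicit(&x, 1, memory_order_relaxed);     /* WRITE_ONCE(x, 1) */
        atomic_thread_fence(memory_order_seq_cst);              /* smp_mb() */
        r0 = atomic_load_explicit(&y, memory_order_relaxed);    /* READ_ONCE(y) */
        return 0;
}

static int cpu1(void *arg)
{
        (void)arg;
        atomic_store_explicit(&y, 1, memory_order_relaxed);     /* WRITE_ONCE(y, 1) */
        atomic_thread_fence(memory_order_seq_cst);              /* smp_mb() */
        r1 = atomic_load_explicit(&x, memory_order_relaxed);    /* READ_ONCE(x) */
        return 0;
}

int main(void)
{
        for (int i = 0; i < 1000000; i++) {
                thrd_t t0, t1;

                atomic_store(&x, 0);
                atomic_store(&y, 0);
                thrd_create(&t0, cpu0, NULL);
                thrd_create(&t1, cpu1, NULL);
                thrd_join(t0, NULL);
                thrd_join(t1, NULL);
                if (r0 == 0 && r1 == 0)         /* forbidden with both fences present */
                        printf("forbidden outcome at iteration %d\n", i);
        }
        return 0;
}

Running the two thread bodies back to back instead of concurrently reproduces the last two diagrams in the message, where the outcome does change depending on whose accesses go first; that is the software-visible difference the thread is pointing at, and it comes from ordering the accesses, not from the relative order of the two barriers themselves.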