Date: Fri, 2 Nov 2018 22:12:26 -0700
From: Joel Fernandes
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC] doc: rcu: remove note on smp_mb during synchronize_rcu
Message-ID: <20181103051226.GA18718@google.com>
References: <20181028043046.198403-1-joel@joelfernandes.org>
 <20181030222649.GA105735@joelaf.mtv.corp.google.com>
 <20181030234336.GW4170@linux.ibm.com>
 <20181031011119.GF224709@google.com>
 <20181031181748.GG4170@linux.ibm.com>
 <20181101050019.GA45865@google.com>
 <20181101161307.GO4170@linux.ibm.com>
In-Reply-To: <20181101161307.GO4170@linux.ibm.com>

On Thu, Nov 01, 2018 at 09:13:07AM -0700, Paul E. McKenney wrote:
> On Wed, Oct 31, 2018 at 10:00:19PM -0700, Joel Fernandes wrote:
> > On Wed, Oct 31, 2018 at 11:17:48AM -0700, Paul E. McKenney wrote:
> > > On Tue, Oct 30, 2018 at 06:11:19PM -0700, Joel Fernandes wrote:
> > > > Hi Paul,
> > > >
> > > > On Tue, Oct 30, 2018 at 04:43:36PM -0700, Paul E. McKenney wrote:
> > > > > On Tue, Oct 30, 2018 at 03:26:49PM -0700, Joel Fernandes wrote:
> > > > > > Hi Paul,
> > > > > >
> > > > > > On Sat, Oct 27, 2018 at 09:30:46PM -0700, Joel Fernandes (Google) wrote:
> > > > > > > As per this thread [1], it seems this smp_mb isn't needed anymore:
> > > > > > > "So the smp_mb() that I was trying to add doesn't need to be there."
> > > > > > >
> > > > > > > So let us remove this part from the memory ordering documentation.
> > > > > > >
> > > > > > > [1] https://lkml.org/lkml/2017/10/6/707
> > > > > > >
> > > > > > > Signed-off-by: Joel Fernandes (Google)
> > > > > >
> > > > > > I was just checking about this patch. Do you feel it is correct to
> > > > > > remove this part from the docs? Are you satisfied that a barrier
> > > > > > isn't needed there now? Or did I miss something?
> > > > >
> > > > > Apologies, it got lost in the shuffle. I have now applied it with a
> > > > > bit of rework to the commit log, thank you!
> > > >
> > > > No worries, thanks for taking it!
> > > >
> > > > Just wanted to update you on my progress reading/correcting the docs.
> > > > The 'Memory Ordering' document is taking a bit of time, so I paused it
> > > > and am focusing on finishing all the other low-hanging fruit. This
> > > > activity is mostly during night hours after the baby is asleep, but
> > > > sometimes I also manage to sneak it into the day job ;-)
> > >
> > > If there is anything I can do to make this a more sustainable task for
> > > you, please do not keep it a secret!!!
> >
> > Thanks a lot, that means a lot to me! Will do!
> >
> > > > BTW I do want to discuss this smp_mb patch above with you at LPC if
> > > > you have time, even though we are removing it from the documentation.
> > > > I thought about it a few times, and I was not able to fully appreciate
> > > > the need for the barrier (that is, even assuming that complete() etc.
> > > > did not do the right thing). Specifically, I was wondering the same
> > > > thing Peter said in the above thread: if that rcu_read_unlock()
> > > > triggered all the spin locking up the tree of nodes, then why is that
> > > > locking not sufficient to prevent reads from the read-side section
> > > > from bleeding out? That would prevent the reader that just unlocked
> > > > from seeing anything that happens _after_ the synchronize_rcu().
> > >
> > > Actually, I recall an smp_mb() being added, but am not seeing it
> > > anywhere relevant to wait_for_completion(). So I might need to add the
> > > smp_mb() to synchronize_rcu() and remove the patch (retaining the typo
> > > fix). :-/
> >
> > No problem, I'm glad the patch at least resurfaced the topic of the
> > potential issue :-)
>
> And an smp_mb() is needed in Tree RCU's __wait_rcu_gp(). This is
> because wait_for_completion() might get a "fly-by" wakeup, which would
> mean no ordering for code naively thinking that it was ordered after a
> grace period.
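
Just to check that I understand the fly-by problem: is the fix you have
in mind shaped something like the sketch below? (This is only my sketch --
the placement after wait_for_completion() and the helper's shape are my
guesses, not the actual code.)

	/*
	 * Sketch: make the completion-based wait path of synchronize_rcu()
	 * provide full ordering even when wait_for_completion() returns via
	 * a fly-by wakeup that supplied only ordinary lock release/acquire
	 * ordering.
	 */
	static void __wait_rcu_gp_sketch(struct completion *done)
	{
		wait_for_completion(done);
		/*
		 * Without this full barrier, the caller is ordered against
		 * the grace-period machinery only by the completion lock's
		 * release/acquire, which is weaker than RCU's any-to-any
		 * guarantee.
		 */
		smp_mb();
	}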

> > > The short form answer is that anything before a grace period on any
> > > CPU must be seen by any CPU as being before anything on any CPU after
> > > that same grace period. This guarantee requires a rather big hammer.
> > >
> > > But yes, let's talk at LPC!
> >
> > Sounds great, looking forward to discussing this.
>
> Would it make sense to have an RCU-implementation BoF?
>
> > > > Also about GP memory ordering and RCU-tree-locking: I think you
> > > > mentioned to me that the RCU reader-sections are virtually extended
> > > > both forward and backward, and wherever they end, those paths do
> > > > heavy-weight synchronization that should be sufficient to prevent
> > > > memory ordering issues (such as those you mentioned in the
> > > > Requirements document). That is exactly why we don't need explicit
> > > > barriers during rcu_read_unlock(). If I recall, I asked you why those
> > > > are not needed. That answer made sense, but now, going through the
> > > > 'Memory Ordering' document, I see that you mention there is reliance
> > > > on the locking. Is that reliance on locking necessary to maintain
> > > > ordering, then?
> > >
> > > There is a "network" of locking augmented by smp_mb__after_unlock_lock()
> > > that implements the all-to-all memory ordering mentioned above. But it
> > > also needs to handle all the possible complete()/wait_for_completion()
> > > races, even those assisted by hypervisor vCPU preemption.
> >
> > I see, so it sounds like the lock network is just a partial solution.
> > For some reason I thought that, before complete() was even called on the
> > CPU executing the callback, all the CPUs would have acquired and
> > released a lock in the "lock network" at least once, thus ensuring the
> > ordering (due to the fact that the quiescent-state reporting has to
> > travel up the tree starting from the leaves), but I think that's not
> > necessarily true, so I see your point now.
>
> There is indeed a lock that is unconditionally acquired and released by
> wait_for_completion(), but it lacks the smp_mb__after_unlock_lock() that
> is required to get full-up any-to-any ordering. And unfortunate timing
> (as well as spurious wakeups) allow the interaction to have only normal
> lock-release/acquire ordering, which does not suffice in all cases.
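
(For my own notes, the contrast as I understand it -- quoting from memory,
so the details may be off: the rcu_node "lock network" acquisitions go
through a helper in the RCU tree code that looks roughly like the below,
whereas, as far as I can tell, wait_for_completion() takes its internal
wait.lock with a plain spin_lock_irq() and so never gets the upgrade to
full ordering.)

	/*
	 * Roughly how the RCU tree takes an rcu_node lock: the
	 * smp_mb__after_unlock_lock() upgrades the preceding UNLOCK+LOCK
	 * pair to a full memory barrier, giving the any-to-any ordering.
	 */
	#define raw_spin_lock_rcu_node(p)				\
	do {								\
		raw_spin_lock(&ACCESS_PRIVATE(p, lock));		\
		smp_mb__after_unlock_lock();				\
	} while (0)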

Sorry to be so persistent, but I did spend some time on this and I still
don't get why every CPU would _not_ have executed
smp_mb__after_unlock_lock() at least once before wait_for_completion()
returns, because every CPU should have called rcu_report_qs_rdp() ->
rcu_report_qs_rnp() at least once to report its QS up the tree, right?
Before that procedure, the complete() cannot happen, because the
complete() itself is in an RCU callback which is executed only once all
the QSes have been reported. So I still couldn't see how synchronize_rcu()
can return without rcu_report_qs_rnp() having been called at least once on
the CPU reporting its QS during a grace period.

Would it be possible to provide a small example showing this in the least
number of steps? I appreciate your time, and it would be really helpful.
If you feel it's too complicated, then feel free to keep this for the LPC
discussion :)

Thanks,

 - Joel
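
P.S. For concreteness, below is the (hand-simplified, quite possibly wrong
in its details) picture I have in my head of that QS-reporting walk; the
real rcu_report_qs_rnp() is of course more careful about gp_seq checking,
irq state, and wakeups:

	/* Sketch only: report a QS for @mask up the rcu_node tree. */
	static void report_qs_up_the_tree(struct rcu_node *rnp,
					  unsigned long mask)
	{
		for (; rnp != NULL; rnp = rnp->parent) {
			/* Lock plus smp_mb__after_unlock_lock(). */
			raw_spin_lock_rcu_node(rnp);
			rnp->qsmask &= ~mask;
			if (rnp->qsmask) {
				/* Others still owe a QS at this level. */
				raw_spin_unlock_rcu_node(rnp);
				return;
			}
			/* Our group's bit at the next level up. */
			mask = rnp->grpmask;
			raw_spin_unlock_rcu_node(rnp);
		}
		/*
		 * All QSes seen at the root: the grace period can end, and
		 * the callback that invokes complete() becomes runnable.
		 */
	}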