Date: Wed, 21 Jul 2021 14:25:15 -0700
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Linus Torvalds
Cc: rcu@vger.kernel.org, Linux Kernel Mailing List, Kernel Team,
	Ingo Molnar, Lai Jiangshan, Andrew Morton, Mathieu Desnoyers,
	Josh Triplett, Thomas Gleixner, Peter Zijlstra, Steven Rostedt,
	David Howells, Eric Dumazet, Frédéric Weisbecker, Oleg Nesterov,
	Joel Fernandes
Subject: Re: [PATCH rcu 04/18] rcu: Weaken ->dynticks accesses and updates
Message-ID: <20210721212515.GV4397@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20210721202042.GA1472052@paulmck-ThinkPad-P17-Gen-1>
	<20210721202127.2129660-4-paulmck@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 21, 2021 at 01:41:46PM -0700, Linus Torvalds wrote:
> Hmm.
>
> This actually seems to make some of the ordering worse.
>
> I'm not seeing a lot of weakening or optimization, but it depends a
> bit on what is common and what is not.

Agreed, and I expect that I will be reworking this patch rather
thoroughly.  Something about smp_mb() often being a locked atomic
operation on a stack location.  :-/

But you did ask for this to be sped up some years back (before the
memory model was formalized), so I figured I should at least show what
can be done.  Plus I expect that you know much more about what Intel
is planning than I do.
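As a rough illustration of that cost difference, here is a minimal
userspace sketch using C11 atomics as stand-ins for the kernel
primitives; the function names and the main() harness are purely
illustrative, not kernel code:

/*
 * Sketch: the old atomic_inc_return()-style update versus the
 * READ_ONCE() + smp_store_release() + smp_mb() sequence from the
 * patch quoted below.  Build with something like "gcc -O2 -std=c11 -S"
 * on x86-64 and compare: inc_old() typically compiles to a single
 * lock xadd, while inc_new() becomes a plain load and store followed
 * by a standalone full fence.
 */
#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned long dynticks;

/* Old style: one seq_cst read-modify-write, which is also a full barrier. */
static unsigned long inc_old(int incby)
{
	return atomic_fetch_add(&dynticks, incby) + incby;
}

/* New style: relaxed load, release store, then a full fence (smp_mb()). */
static unsigned long inc_new(int incby)
{
	unsigned long seq;

	seq = atomic_load_explicit(&dynticks, memory_order_relaxed) + incby;
	atomic_store_explicit(&dynticks, seq, memory_order_release);
	atomic_thread_fence(memory_order_seq_cst);
	return seq;
}

int main(void)
{
	printf("%lu %lu\n", inc_old(1), inc_new(1));
	return 0;
}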
> On Wed, Jul 21, 2021 at 1:21 PM Paul E. McKenney wrote:
> >
> > +/*
> > + * Increment the current CPU's rcu_data structure's ->dynticks field
> > + * with ordering.  Return the new value.
> > + */
> > +static noinstr unsigned long rcu_dynticks_inc(int incby)
> > +{
> > +	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > +	int seq;
> > +
> > +	seq = READ_ONCE(rdp->dynticks) + incby;
> > +	smp_store_release(&rdp->dynticks, seq);
> > +	smp_mb(); // Fundamental RCU ordering guarantee.
> > +	return seq;
> > +}
>
> So this is actually likely *more* expensive than the old code was, at
> least on x86.
>
> The READ_ONCE/smp_store_release are cheap, but then the smp_mb() is
> expensive.
>
> The old code did just arch_atomic_inc_return(), which included the
> memory barrier.
>
> There *might* be some cache ordering advantage to letting the
> READ_ONCE() float upwards, but from a pure barrier standpoint this is
> more expensive than what we used to have.

No argument here.

> > -	if (atomic_read(&rdp->dynticks) & 0x1)
> > +	if (READ_ONCE(rdp->dynticks) & 0x1)
> > 		return;
> > -	atomic_inc(&rdp->dynticks);
> > +	rcu_dynticks_inc(1);
>
> And this one seems to not take advantage of the new rule, so we end up
> having two reads, and then that potentially more expensive sequence.

This one only executes when a CPU comes online, so I am not worried
about its overhead.

> >  static int rcu_dynticks_snap(struct rcu_data *rdp)
> >  {
> > -	return atomic_add_return(0, &rdp->dynticks);
> > +	smp_mb(); // Fundamental RCU ordering guarantee.
> > +	return smp_load_acquire(&rdp->dynticks);
> >  }
>
> This is likely cheaper - not because of barriers, but simply because
> it avoids dirtying the cacheline.
>
> So which operation do we _care_ about, and do we have numbers for why
> this improves anything? Because looking at the patch, it's not obvious
> that this is an improvement.

It sounds like I should keep this hunk and revert the rest back to
atomic operations, but still in the new rcu_dynticks_inc() function.

Either way, thank you for looking this over!

							Thanx, Paul
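Along the same lines, a companion sketch of the rcu_dynticks_snap()
hunk, the one deemed worth keeping: the old snapshot is a locked
read-modify-write that must take the cacheline exclusive even though it
adds zero, whereas the fence-plus-acquire-load version only reads it.
Again, C11 stand-ins and illustrative names, not the kernel code:

/*
 * Sketch: the old atomic_add_return(0, ...) snapshot versus the
 * smp_mb() + smp_load_acquire() version.  snap_old() dirties the
 * cacheline on every call; snap_new() leaves it in shared state,
 * which is the "avoids dirtying the cacheline" point above.
 */
#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned long dynticks;

/* Old style: an RMW that adds 0 just to get the value plus full ordering. */
static unsigned long snap_old(void)
{
	return atomic_fetch_add(&dynticks, 0);
}

/* New style: full fence first, then a read-only acquire load. */
static unsigned long snap_new(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&dynticks, memory_order_acquire);
}

int main(void)
{
	printf("%lu %lu\n", snap_old(), snap_new());
	return 0;
}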