Date: Wed, 21 Dec 2022 01:07:15 +0100
From: Frederic Weisbecker
To: Mathieu Desnoyers
Cc: Joel Fernandes , linux-kernel@vger.kernel.org, Josh Triplett , Lai Jiangshan , "Paul E. McKenney" , rcu@vger.kernel.org, Steven Rostedt
Subject: Re: [RFC 0/2] srcu: Remove pre-flip memory barrier
Message-ID: <20221221000715.GA27352@lothringen>

On Tue, Dec 20, 2022 at 12:00:58PM -0500, Mathieu Desnoyers wrote:
> On 2022-12-19 20:04, Joel Fernandes wrote:
> The main benefit I expect is improved performance of the grace period
> implementation in common cases where there are few or no readers present,
> especially on machines with many cpus.
>
> It allows scanning both periods (0/1) for each cpu within the same pass,
> therefore loading both period's unlock counters sitting in the same cache
> line at once (improved locality), and then loading both period's lock
> counters, also sitting in the same cache line.
>
> It also allows skipping the period flip entirely if there are no readers
> present, which is an -arguably- tiny performance improvement as well.
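
To make the trade-off concrete, here is a minimal single-threaded sketch of the fast-path/slow-path split described in the quoted text. It is an illustrative model only, not the kernel's SRCU code: every name in it (model_srcu, model_cpu_counters, model_scan_both, model_try_fast_path, MODEL_NR_CPUS) is hypothetical, and the memory barriers and concurrency that this thread is actually about are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical per-CPU counters: one lock/unlock pair per period (0/1). */
struct model_cpu_counters {
	unsigned long unlock[2];
	unsigned long lock[2];
};

struct model_srcu {
	struct model_cpu_counters cpu[MODEL_NR_CPUS];
	unsigned int idx; /* period that new readers would use */
};

/*
 * Scan both periods in a single pass over the CPUs: first sum the unlock
 * counters of both periods, then the lock counters of both periods.
 * A period has readers if its lock and unlock sums differ.
 */
static void model_scan_both(const struct model_srcu *s, bool active[2])
{
	unsigned long unlocks[2] = { 0, 0 };
	unsigned long locks[2] = { 0, 0 };
	size_t cpu;

	assert(s != NULL);

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		unlocks[0] += s->cpu[cpu].unlock[0];
		unlocks[1] += s->cpu[cpu].unlock[1];
	}

	/* A real implementation needs ordering here; omitted in this model. */

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		locks[0] += s->cpu[cpu].lock[0];
		locks[1] += s->cpu[cpu].lock[1];
	}

	active[0] = locks[0] != unlocks[0];
	active[1] = locks[1] != unlocks[1];
}

/*
 * Returns true if the fast path was taken (no readers in either period,
 * so no flip). Otherwise flips the period and returns false; the caller
 * would still have to wait for the old period's readers (slow path).
 */
static bool model_try_fast_path(struct model_srcu *s)
{
	bool active[2];

	model_scan_both(s, active);
	if (!active[0] && !active[1])
		return true;

	s->idx ^= 1;
	return false;
}
```

Note that in this model the slow path has already paid for the scan of both periods before it flips, which is the extra cost that matters whenever readers are present.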

I would indeed expect a performance improvement if there are no readers in
the active period/idx, but if there are, it's a performance penalty due to
the extra scans.

So my main questions are:

* Is the no-present-readers case the most likely one? I guess it depends on
  the ssp.

* Does the SRCU update side deserve to be optimized with added code? (We are
  not debating removing the flip, but rather adding a fast-path and keeping
  the flip as a slow-path.)

* The SRCU machinery is already quite complicated. Look how we little things
  lock ourselves in for days doing our exegesis of the SRCU state machine.
  And halfway through it we are still debating some ordering. Is it worth
  adding a new path there?

Thanks.