From: Mathieu Desnoyers
Date: Tue, 20 Dec 2022 22:47:04 -0500
Subject: Re: [RFC 0/2] srcu: Remove pre-flip memory barrier
To: Frederic Weisbecker
Cc: Joel Fernandes, linux-kernel@vger.kernel.org, Josh Triplett, Lai Jiangshan, "Paul E. McKenney", rcu@vger.kernel.org, Steven Rostedt
In-Reply-To: <20221221000715.GA27352@lothringen>

On 2022-12-20 19:07, Frederic Weisbecker wrote:
> On Tue, Dec 20, 2022 at 12:00:58PM -0500, Mathieu Desnoyers wrote:
>> On 2022-12-19 20:04, Joel Fernandes wrote:
>> The main benefit I expect is improved performance of the grace period
>> implementation in common cases where there are few or no readers present,
>> especially on machines with many cpus.
>>
>> It allows scanning both periods (0/1) for each cpu within the same pass,
>> therefore loading both period's unlock counters sitting in the same cache
>> line at once (improved locality), and then loading both period's lock
>> counters, also sitting in the same cache line.
>>
>> It also allows skipping the period flip entirely if there are no readers
>> present, which is an -arguably- tiny performance improvement as well.
>
> I would indeed expect performance improvement if there are no readers in the
> active period/idx but if there are, it's a performance penalty due to the extra
> scans.
>
> So my main questions are:
>
> * Is the no-present-readers the most likely case? I guess it depends on the ssp.
>
> * Does the SRCU update side deserve to be optimized with added code (because
>   we are not debating about removing the flip, rather about adding a fast-path
>   and keeping the flip as a slow-path)
>
> * The SRCU machinery is already quite complicated. Look how we little things lock
>   ourselves in for days doing our exegesis of the SRCU state machine. And halfway
>   through it we are still debating some ordering. Is it worth adding a new path there?

I'm not arguing for making things more complex unless there are good reasons to
do so. However, I think we badly need to improve the documentation of the memory
barriers in SRCU, because the claimed barrier pairing is odd.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
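
For reference, the single-pass scan described in the quoted text can be
sketched roughly as follows. This is only an illustration of the counter
layout and scan order, not the kernel's SRCU code: the names
(percpu_counts, scan_both_periods, can_skip_flip) are hypothetical, the
"skip the flip when both periods are idle" rule is my reading of the
proposal, and all memory barriers and concurrency handling, which are the
hard part being debated in this thread, are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-CPU layout: both periods' unlock counters are adjacent,
 * and both periods' lock counters are adjacent, so one load pass can pick
 * up periods 0 and 1 together.
 */
struct percpu_counts {
	unsigned long unlock_count[2];
	unsigned long lock_count[2];
};

struct scan_result {
	bool idle[2];	/* idle[p]: no reader observed in period p */
};

/*
 * Scan both periods in one pass: sum every CPU's unlock counters for
 * periods 0 and 1, then sum every CPU's lock counters. Sums are compared
 * globally because a reader may lock on one CPU and unlock on another.
 * NOTE: a real implementation needs memory barriers between and around
 * these two loops; they are omitted here.
 */
static struct scan_result scan_both_periods(const struct percpu_counts *c,
					    size_t ncpus)
{
	unsigned long unlocks[2] = { 0, 0 };
	unsigned long locks[2] = { 0, 0 };
	struct scan_result r;
	size_t cpu;

	assert(c != NULL);

	for (cpu = 0; cpu < ncpus; cpu++) {
		unlocks[0] += c[cpu].unlock_count[0];
		unlocks[1] += c[cpu].unlock_count[1];
	}
	for (cpu = 0; cpu < ncpus; cpu++) {
		locks[0] += c[cpu].lock_count[0];
		locks[1] += c[cpu].lock_count[1];
	}

	r.idle[0] = (locks[0] == unlocks[0]);
	r.idle[1] = (locks[1] == unlocks[1]);
	return r;
}

/* Assumed fast path: with no readers in either period, no flip is needed. */
static bool can_skip_flip(const struct scan_result *r)
{
	return r->idle[0] && r->idle[1];
}
```

When a reader is present in either period, can_skip_flip() returns false
and the updater would fall back to the existing flip-and-wait slow path,
which is where the extra scan cost mentioned in the quoted reply comes from.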