From: Seth Forshee
To: "David S. Miller", Jakub Kicinski, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: "Paul E. McKenney", netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2] net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
Date: Tue, 26 Oct 2021 08:06:59 -0500
Message-Id: <20211026130700.121189-1-seth@forshee.me>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Seth Forshee

Currently rcu_barrier() is used to ensure that no readers of the
inactive mini_Qdisc buffer remain before it is reused. This waits for
any pending RCU callbacks to complete, when all that is actually
required is to wait for one RCU grace period to elapse after the buffer
was made inactive. This means that using rcu_barrier() may result in
unnecessary waits.

To improve this, store the current RCU state when a buffer is made
inactive and use poll_state_synchronize_rcu() to check whether a full
grace period has elapsed before reusing it.
If a full grace period has not elapsed, wait for a grace period to
elapse, and in the non-RT case use synchronize_rcu_expedited() to hasten
it.

Since this approach eliminates the RCU callback it is no longer
necessary to synchronize_rcu() in the tp_head==NULL case. However, the
RCU state should still be saved for the previously active buffer.

Before this change I would typically see mini_qdisc_pair_swap() take
tens of milliseconds to complete. After this change it typically
finishes in less than 1 ms, and often it takes just a few microseconds.

Thanks to Paul for walking me through the options for improving this.

Cc: "Paul E. McKenney"
Signed-off-by: Seth Forshee
---
v2:
 - Rebase to net-next

 include/net/sch_generic.h |  2 +-
 net/sched/sch_generic.c   | 38 +++++++++++++++++++-------------------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index ada02c4a4f51..22179b2fda72 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -1302,7 +1302,7 @@ struct mini_Qdisc {
 	struct tcf_block *block;
 	struct gnet_stats_basic_sync __percpu *cpu_bstats;
 	struct gnet_stats_queue __percpu *cpu_qstats;
-	struct rcu_head rcu;
+	unsigned long rcu_state;
 };
 
 static inline void mini_qdisc_bstats_cpu_update(struct mini_Qdisc *miniq,
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index b0ff0dff2773..24899efc51be 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1487,10 +1487,6 @@ void psched_ppscfg_precompute(struct psched_pktrate *r, u64 pktrate64)
 }
 EXPORT_SYMBOL(psched_ppscfg_precompute);
 
-static void mini_qdisc_rcu_func(struct rcu_head *head)
-{
-}
-
 void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 			  struct tcf_proto *tp_head)
 {
@@ -1503,28 +1499,30 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 
 	if (!tp_head) {
 		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
-		/* Wait for flying RCU callback before it is freed. */
-		rcu_barrier();
-		return;
-	}
+	} else {
+		miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
+			&miniqp->miniq1 : &miniqp->miniq2;
 
-	miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
-		&miniqp->miniq1 : &miniqp->miniq2;
+		/* We need to make sure that readers won't see the miniq
+		 * we are about to modify. So ensure that at least one RCU
+		 * grace period has elapsed since the miniq was made
+		 * inactive.
+		 */
+		if (IS_ENABLED(CONFIG_PREEMPT_RT))
+			cond_synchronize_rcu(miniq->rcu_state);
+		else if (!poll_state_synchronize_rcu(miniq->rcu_state))
+			synchronize_rcu_expedited();
 
-	/* We need to make sure that readers won't see the miniq
-	 * we are about to modify. So wait until previous call_rcu callback
-	 * is done.
-	 */
-	rcu_barrier();
-	miniq->filter_list = tp_head;
-	rcu_assign_pointer(*miniqp->p_miniq, miniq);
+		miniq->filter_list = tp_head;
+		rcu_assign_pointer(*miniqp->p_miniq, miniq);
+	}
 
 	if (miniq_old)
-		/* This is counterpart of the rcu barriers above. We need to
+		/* This is counterpart of the rcu sync above. We need to
 		 * block potential new user of miniq_old until all readers
 		 * are not seeing it.
 		 */
-		call_rcu(&miniq_old->rcu, mini_qdisc_rcu_func);
+		miniq_old->rcu_state = start_poll_synchronize_rcu();
 }
 EXPORT_SYMBOL(mini_qdisc_pair_swap);
 
@@ -1543,6 +1541,8 @@ void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc,
 	miniqp->miniq1.cpu_qstats = qdisc->cpu_qstats;
 	miniqp->miniq2.cpu_bstats = qdisc->cpu_bstats;
 	miniqp->miniq2.cpu_qstats = qdisc->cpu_qstats;
+	miniqp->miniq1.rcu_state = get_state_synchronize_rcu();
+	miniqp->miniq2.rcu_state = miniqp->miniq1.rcu_state;
 	miniqp->p_miniq = p_miniq;
 }
 EXPORT_SYMBOL(mini_qdisc_pair_init);
-- 
2.30.2
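
For anyone who wants to poke at the control flow outside the kernel, here is a rough userspace model of the snapshot-and-poll pattern the patch uses. It is only a sketch under loud assumptions: every sim_* name is hypothetical, a "grace period" is modelled as a plain counter, there are no real readers, and the tp_head==NULL and PREEMPT_RT branches are left out. It is not the kernel RCU implementation and does not show how the real cookies are encoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. The sim_* helpers are hypothetical stand-ins for
 * start_poll_synchronize_rcu()/get_state_synchronize_rcu(),
 * poll_state_synchronize_rcu() and synchronize_rcu_expedited().
 */
static unsigned long sim_gp_completed;   /* grace periods fully elapsed so far */
static unsigned long sim_blocking_waits; /* how often the slow path was taken */

/* Return a cookie that is satisfied once one more grace period completes. */
static unsigned long sim_start_poll(void)
{
	return sim_gp_completed + 1;
}

/* True if a full grace period has elapsed since the cookie was taken. */
static bool sim_poll_state(unsigned long cookie)
{
	return sim_gp_completed >= cookie;
}

/* Model a blocking wait that returns after one full grace period. */
static void sim_synchronize(void)
{
	sim_gp_completed++;
	sim_blocking_waits++;
}

struct sim_buf {
	int payload;
	unsigned long rcu_state; /* cookie saved when the buffer went inactive */
};

struct sim_pair {
	struct sim_buf buf1, buf2;
	struct sim_buf *active;  /* the buffer readers would dereference */
};

static void sim_pair_init(struct sim_pair *p)
{
	p->active = NULL;
	p->buf1.payload = 0;
	p->buf2.payload = 0;
	p->buf1.rcu_state = sim_start_poll();
	p->buf2.rcu_state = p->buf1.rcu_state;
}

/* Publish a new payload through whichever buffer is currently inactive. */
static void sim_pair_swap(struct sim_pair *p, int payload)
{
	struct sim_buf *old = p->active;
	struct sim_buf *next = (!old || old == &p->buf2) ? &p->buf1 : &p->buf2;

	/* The buffer about to be modified must never be the visible one. */
	assert(next != old);

	/* Block only if no grace period has passed since next went inactive. */
	if (!sim_poll_state(next->rcu_state))
		sim_synchronize();

	next->payload = payload;
	p->active = next;

	/* Snapshot the state so a later swap can tell when old is reusable. */
	if (old)
		old->rcu_state = sim_start_poll();
}
```

In this model, two swaps in quick succession make the second one take the blocking path, whereas bumping sim_gp_completed between swaps (standing in for a grace period that elapsed on its own) lets the next swap proceed without waiting. That is the effect the patch is after when it replaces rcu_barrier() with a polled check.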