From: Joel Fernandes
Date: Tue, 27 Feb 2024 17:50:58 -0500
Subject: Re: [PATCH v5 2/4] rcu: Reduce synchronize_rcu() latency
To: "Uladzislau Rezki (Sony)", "Paul E. McKenney"
Cc: RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton, LKML, Oleksiy Avramchenko, Frederic Weisbecker
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240220183115.74124-1-urezki@gmail.com> <20240220183115.74124-3-urezki@gmail.com> <4b932245-2825-4e53-87a4-44d2892e7c13@joelfernandes.org>
In-Reply-To: <4b932245-2825-4e53-87a4-44d2892e7c13@joelfernandes.org>

On Tue, Feb 27, 2024 at 5:39 PM Joel Fernandes wrote:
>
>
>
> On 2/20/2024 1:31 PM, Uladzislau Rezki (Sony) wrote:
> > A call to a synchronize_rcu() can be optimized from a latency
> > point of view. Workloads which depend on this can benefit of it.
> >
> > The delay of wakeme_after_rcu() callback, which unblocks a waiter,
> > depends on several factors:
> >
> > - how fast a process of offloading is started. Combination of:
> >     - !CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU;
> >     - !CONFIG_RCU_LAZY/CONFIG_RCU_LAZY;
> >     - other.
> > - when started, invoking path is interrupted due to:
> >     - time limit;
> >     - need_resched();
> >     - if limit is reached.
> > - where in a nocb list it is located;
> > - how fast previous callbacks completed;
> >
> > Example:
> >
> > 1. On our embedded devices I can easily trigger the scenario when
> > it is a last in the list out of ~3600 callbacks:
> >
> > <snip>
> >   <...>-29 [001] d..1. 21950.145313: rcu_batch_start: rcu_preempt CBs=3613 bl=28
> > ...
> >   <...>-29 [001] ..... 21950.152578: rcu_invoke_callback: rcu_preempt rhp=00000000b2d6dee8 func=__free_vm_area_struct.cfi_jt
> >   <...>-29 [001] ..... 21950.152579: rcu_invoke_callback: rcu_preempt rhp=00000000a446f607 func=__free_vm_area_struct.cfi_jt
> >   <...>-29 [001] ..... 21950.152580: rcu_invoke_callback: rcu_preempt rhp=00000000a5cab03b func=__free_vm_area_struct.cfi_jt
> >   <...>-29 [001] ..... 21950.152581: rcu_invoke_callback: rcu_preempt rhp=0000000013b7e5ee func=__free_vm_area_struct.cfi_jt
> >   <...>-29 [001] ..... 21950.152582: rcu_invoke_callback: rcu_preempt rhp=000000000a8ca6f9 func=__free_vm_area_struct.cfi_jt
> >   <...>-29 [001] ..... 21950.152583: rcu_invoke_callback: rcu_preempt rhp=000000008f162ca8 func=wakeme_after_rcu.cfi_jt
> >   <...>-29 [001] d..1. 21950.152625: rcu_batch_end: rcu_preempt CBs-invoked=3612 idle=....
> > <snip>
> >
> > 2. We use cpuset/cgroup to classify tasks and assign them into
> > different cgroups. For example "background" group which binds tasks
> > only to little CPUs or "foreground" which makes use of all CPUs.
> > Tasks can be migrated between groups by a request if an acceleration
> > is needed.
> >
[...]
> >   * Initialize a new grace period.  Return false if no grace period required.
> >   */
> > @@ -1432,6 +1711,7 @@ static noinline_for_stack bool rcu_gp_init(void)
> >  	unsigned long mask;
> >  	struct rcu_data *rdp;
> >  	struct rcu_node *rnp = rcu_get_root();
> > +	bool start_new_poll;
> >
> >  	WRITE_ONCE(rcu_state.gp_activity, jiffies);
> >  	raw_spin_lock_irq_rcu_node(rnp);
> > @@ -1456,10 +1736,24 @@ static noinline_for_stack bool rcu_gp_init(void)
> >  	/* Record GP times before starting GP, hence rcu_seq_start(). */
> >  	rcu_seq_start(&rcu_state.gp_seq);
> >  	ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
> > +	start_new_poll = rcu_sr_normal_gp_init();
> >  	trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
> >  	rcu_poll_gp_seq_start(&rcu_state.gp_seq_polled_snap);
> >  	raw_spin_unlock_irq_rcu_node(rnp);
> >
> > +	/*
> > +	 * The "start_new_poll" is set to true, only when this GP is not able
> > +	 * to handle anything and there are outstanding users. It happens when
> > +	 * the rcu_sr_normal_gp_init() function was not able to insert a dummy
> > +	 * separator to the llist, because there were no left any dummy-nodes.
> > +	 *
> > +	 * Number of dummy-nodes is fixed, it could be that we are run out of
> > +	 * them, if so we start a new pool request to repeat a try. It is rare
> > +	 * and it means that a system is doing a slow processing of callbacks.
> > +	 */
> > +	if (start_new_poll)
> > +		(void) start_poll_synchronize_rcu();
> > +
>
> Optionally, does it make sense to print a warning if too many retries occurred?

For future work, I was wondering about a slight modification that would
avoid this "running out of nodes" issue altogether: why not add a wait
node to task_struct and have rcu_gp_init() use that? Then there is no
limit on the number of callers or on the length of the list, and by
definition you cannot have more than one caller per task_struct. Would
that not work?

So in rcu_gp_init(), use the wait node of the first task_struct at the
top of the list and mark it as a "special node", perhaps with a flag
that says it is also the dummy node.

But yeah, the new design of the patch is really cool... If you are
leaving it as is without going for the above suggestion, I can add it
to my backlog for future work.

Thanks.
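
To make the per-task wait node idea a bit more concrete, here is a
rough user-space sketch. It is not kernel code and not what the patch
does: it is single-threaded, uses a plain pointer list instead of
llist, and has no atomics or locking. Every name in it (sr_wait_node,
struct task as a stand-in for task_struct, sr_list, sr_add_waiter(),
sr_gp_init(), sr_gp_cleanup()) is hypothetical and exists only for
illustration. The point it tries to show is that when each waiter
brings its own node, grace-period init can flag the newest waiter's
node as the separator instead of taking one from a fixed pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait node; in the proposal it would be embedded in task_struct. */
struct sr_wait_node {
	struct sr_wait_node *next;
	bool is_separator;	/* set at GP init: this node is also the batch boundary */
	bool done;		/* set once the waiter has been released */
};

/* Stand-in for task_struct: one embedded node, so at most one wait per task. */
struct task {
	int pid;
	struct sr_wait_node sr_node;
};

/* Newest waiter first, mimicking how entries are pushed onto an llist. */
struct sr_list {
	struct sr_wait_node *head;
};

static void sr_add_waiter(struct sr_list *l, struct task *t)
{
	t->sr_node.is_separator = false;
	t->sr_node.done = false;
	t->sr_node.next = l->head;
	l->head = &t->sr_node;
}

/*
 * GP init: flag the current head as the boundary of this grace period.
 * Nothing is allocated and no pool can run dry. Returns the boundary,
 * or NULL when there are no waiters.
 */
static struct sr_wait_node *sr_gp_init(struct sr_list *l)
{
	struct sr_wait_node *head = l->head;

	if (head)
		head->is_separator = true;
	return head;
}

/*
 * GP cleanup: release the boundary node and every older waiter behind it.
 * Waiters queued after GP init stay on the list for the next grace period.
 * Returns the number of waiters released.
 */
static int sr_gp_cleanup(struct sr_list *l, struct sr_wait_node *boundary)
{
	struct sr_wait_node **pp = &l->head;
	struct sr_wait_node *n;
	int released = 0;

	if (!boundary)
		return 0;

	/* Skip waiters that arrived after GP init. */
	while (*pp && *pp != boundary)
		pp = &(*pp)->next;
	if (!*pp)
		return 0;

	n = *pp;
	*pp = NULL;	/* detach the boundary and everything older */

	while (n) {
		struct sr_wait_node *next = n->next;

		n->next = NULL;
		n->is_separator = false;
		n->done = true;	/* a real version would wake the task here */
		released++;
		n = next;
	}
	return released;
}
```

With two waiters queued before sr_gp_init() and a third queued after
it, sr_gp_cleanup() releases the first two and leaves the third on the
list for the next grace period. A real version would still have to
deal with concurrent list additions and with the lifetime of the
flagged node while its task is being woken, which this sketch ignores.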