X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240104162510.72773-1-urezki@gmail.com> <20240104162510.72773-2-urezki@gmail.com>
From: Kalesh Singh
Date: Thu, 11 Jan
 2024 08:37:22 -0800
Subject: Re: [PATCH v4 1/4] rcu: Reduce synchronize_rcu() latency
To: Uladzislau Rezki
Cc: "Paul E . McKenney", RCU, Neeraj upadhyay, Boqun Feng, Hillf Danton,
 Joel Fernandes, LKML, Oleksiy Avramchenko, Frederic Weisbecker

On Wed, Jan 10, 2024 at 1:22 AM Uladzislau Rezki wrote:
>
> Hello, Kalesh!
>
> >
> > Hi Uladzislau,
> >
> > I've tried your patches (v3) on Android with 6.1.43 kernel.
> >
> > The test cycles 10 apps (including camera) sequentially for 100
> > iterations.
> >
> > I've set rcu_normal to override the rcu_expedited in the boot
> > parameters:
> >
> > adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
> >
> > rcupdate.rcu_normal=1
> > rcupdate.rcu_expedited=1
> > rcu_nocbs=0-7
> >
> >
> > The configurations are:
> >
> > A - echo 0 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> > B - echo 1 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> >
> > Results:
> >
> > = APP LAUNCH TIME =
> >                                delta (B-A)    ratio(%)
> > overall_app_launch_time(ms)    -11399.00      -6.65
> >
> >
> > == camera_launch_time
> > type    delta(B-A %)    A_count    B_count
> > HOT     -7.05           99         99
> > COLD    -6.33           1          1
> >
> >

Hi Uladzislau,

> If I interpret it correctly, you also see that this series reduces
> launch time by 6-7% on your app set. Is that correct?

Yes, your understanding is correct.

>
> > === Function Latencies ===
> >
> > Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit                     Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit
> >
> >         nsec         : count distribution                                         nsec         : count distribution
> >        0 -> 1        : 0     |                                        |          0 -> 1        : 0     |                                        |
> >        2 -> 3        : 0     |                                        |          2 -> 3        : 0     |                                        |
> >        4 -> 7        : 0     |                                        |          4 -> 7        : 0     |                                        |
> >        8 -> 15       : 0     |                                        |          8 -> 15       : 0     |                                        |
> >       16 -> 31       : 0     |                                        |         16 -> 31       : 0     |                                        |
> >       32 -> 63       : 0     |                                        |         32 -> 63       : 0     |                                        |
> >       64 -> 127      : 0     |                                        |         64 -> 127      : 0     |                                        |
> >      128 -> 255      : 0     |                                        |        128 -> 255      : 0     |                                        |
> >      256 -> 511      : 0     |                                        |        256 -> 511      : 0     |                                        |
> >      512 -> 1023     : 0     |                                        |        512 -> 1023     : 0     |                                        |
> >     1024 -> 2047     : 0     |                                        |       1024 -> 2047     : 0     |                                        |
> >     2048 -> 4095     : 0     |                                        |       2048 -> 4095     : 0     |                                        |
> >     4096 -> 8191     : 0     |                                        |       4096 -> 8191     : 0     |                                        |
> >     8192 -> 16383    : 0     |                                        |       8192 -> 16383    : 0     |                                        |
> >    16384 -> 32767    : 0     |                                        |      16384 -> 32767    : 0     |                                        |
> >    32768 -> 65535    : 0     |                                        |      32768 -> 65535    : 0     |                                        |
> >    65536 -> 131071   : 0     |                                        |      65536 -> 131071   : 0     |                                        |
> >   131072 -> 262143   : 0     |                                        |     131072 -> 262143   : 0     |                                        |
> >   262144 -> 524287   : 0     |                                        |     262144 -> 524287   : 0     |                                        |
> >   524288 -> 1048575  : 0     |                                        |     524288 -> 1048575  : 0     |                                        |
> >  1048576 -> 2097151  : 0     |                                        |    1048576 -> 2097151  : 0     |                                        |
> >  2097152 -> 4194303  : 0     |                                        |    2097152 -> 4194303  : 0     |                                        |
> >  4194304 -> 8388607  : 871   |**                                      |    4194304 -> 8388607  : 1180  |****                                    |
> >  8388608 -> 16777215 : 3204  |********                                |    8388608 -> 16777215 : 7020  |*************************               |
> > 16777216 -> 33554431 : 15013 |****************************************|   16777216 -> 33554431 : 10952 |****************************************|
> > Exiting trace of synchronize_rcu_expedited                                Exiting trace of synchronize_rcu_expedited
> >
> >
> > Tracing synchronize_rcu. Hit Ctrl-C to exit                               Tracing synchronize_rcu. Hit Ctrl-C to exit
> >
> >         nsec         : count distribution                                         nsec         : count distribution
> >        0 -> 1        : 0     |                                        |          0 -> 1        : 0     |                                        |
> >        2 -> 3        : 0     |                                        |          2 -> 3        : 0     |                                        |
> >        4 -> 7        : 0     |                                        |          4 -> 7        : 0     |                                        |
> >        8 -> 15       : 0     |                                        |          8 -> 15       : 0     |                                        |
> >       16 -> 31       : 0     |                                        |         16 -> 31       : 0     |                                        |
> >       32 -> 63       : 0     |                                        |         32 -> 63       : 0     |                                        |
> >       64 -> 127      : 0     |                                        |         64 -> 127      : 0     |                                        |
> >      128 -> 255      : 0     |                                        |        128 -> 255      : 0     |                                        |
> >      256 -> 511      : 0     |                                        |        256 -> 511      : 0     |                                        |
> >      512 -> 1023     : 0     |                                        |        512 -> 1023     : 0     |                                        |
> >     1024 -> 2047     : 0     |                                        |       1024 -> 2047     : 0     |                                        |
> >     2048 -> 4095     : 0     |                                        |       2048 -> 4095     : 0     |                                        |
> >     4096 -> 8191     : 0     |                                        |       4096 -> 8191     : 0     |                                        |
> >     8192 -> 16383    : 0     |                                        |       8192 -> 16383    : 0     |                                        |
> >    16384 -> 32767    : 0     |                                        |      16384 -> 32767    : 0     |                                        |
> >    32768 -> 65535    : 0     |                                        |      32768 -> 65535    : 0     |                                        |
> >    65536 -> 131071   : 0     |                                        |      65536 -> 131071   : 0     |                                        |
> >   131072 -> 262143   : 0     |                                        |     131072 -> 262143   : 0     |                                        |
> >   262144 -> 524287   : 0     |                                        |     262144 -> 524287   : 0     |                                        |
> >   524288 -> 1048575  : 0     |                                        |     524288 -> 1048575  : 0     |                                        |
> >  1048576 -> 2097151  : 0     |                                        |    1048576 -> 2097151  : 0     |                                        |
> >  2097152 -> 4194303  : 0     |                                        |    2097152 -> 4194303  : 0     |                                        |
> >  4194304 -> 8388607  : 861   |**                                      |    4194304 -> 8388607  : 1136  |****                                    |
> >  8388608 -> 16777215 : 3111  |********                                |    8388608 -> 16777215 : 6320  |************************                |
> > 16777216 -> 33554431 : 13901 |****************************************|   16777216 -> 33554431 : 10484 |****************************************|
> > Exiting trace of synchronize_rcu                                          Exiting trace of synchronize_rcu
> >
> Who is B and who is A?
Left is A (rcu_normal_wake_from_gp=0) and right is B (rcu_normal_wake_from_gp=1)

>
> >
> > Interestingly I tried the same experiment without rcu_normal=1 (leaving rcu_expedited=1):
> >
> > adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
> > rcupdate.rcu_expedited=1
> > rcu_nocbs=0-7
> >
> > In this case I also saw the -6 to -7% decrease in the app launch times
> > but I don't have a good explanation why that would be? (The function
> > latency histograms in this case didn't show any significant difference).
> > Do you have any insight why this may happen?
> >
> When rcu_expedited=1 is set and rcu_normal is disabled (rcu_normal=0), the
> synchronize_rcu() call is converted into synchronize_rcu_expedited():
>
>
> void synchronize_rcu(void)
> {
>         unsigned long flags;
>         struct rcu_node *rnp;
>
>         RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) ||
>                          lock_is_held(&rcu_lock_map) ||
>                          lock_is_held(&rcu_sched_lock_map),
>                          "Illegal synchronize_rcu() in RCU read-side critical section");
>         if (!rcu_blocking_is_gp()) {
>                 if (rcu_gp_is_expedited())
>                         synchronize_rcu_expedited();
>                 else
>                         synchronize_rcu_normal();
>                 return;
>         }
>         ...
>
>
> rcu_gp_is_expedited() is true, so the "expedited" version is invoked.
>
> I see some concerns in preferring an expedited version as a global
> replacement. First of all it is related to latency sensitive workloads
> because in order to expedite a grace period it sends out IPIs to all
> online CPUs to force them to report a quiescent-state asap. I have not
> investigated yet how it affects such workloads.
>
> Therefore, in your case, you also see a performance boost of your app sets.

IIUC the patch shouldn't affect the case? The only difference in A vs
B is rcu_normal_wake_from_gp (both have rcu_expedited=1).

Thanks,
Kalesh

>
> Thank you for looking at it!
>
> --
> Uladzislau Rezki
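
For reference, the path selection discussed in this thread can be written
down as a small user-space model. This is only a sketch under the
assumptions stated in the thread: rcu_normal overrides rcu_expedited, and
rcu_normal_wake_from_gp is consulted only on the normal path. The struct
and the model_*() helpers are hypothetical names for illustration, not
kernel API.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the discussion above; not kernel code. */
enum gp_kind { GP_NORMAL, GP_EXPEDITED };

struct rcu_params {
	bool rcu_normal;              /* models rcupdate.rcu_normal */
	bool rcu_expedited;           /* models rcupdate.rcu_expedited */
	bool rcu_normal_wake_from_gp; /* models the knob toggled in A vs B */
};

/* Which grace-period flavor a synchronize_rcu() caller ends up waiting on. */
static enum gp_kind model_synchronize_rcu(struct rcu_params p)
{
	if (p.rcu_normal)
		return GP_NORMAL;     /* assumed: rcu_normal overrides rcu_expedited */
	if (p.rcu_expedited)
		return GP_EXPEDITED;  /* the rcu_gp_is_expedited() branch quoted above */
	return GP_NORMAL;
}

/*
 * Whether flipping rcu_normal_wake_from_gp is expected to change the path
 * taken. Assumption: the knob is only looked at on the normal path.
 */
static bool model_wake_from_gp_matters(struct rcu_params p)
{
	return model_synchronize_rcu(p) == GP_NORMAL;
}
```

With rcu_normal=1 and rcu_expedited=1 (the first experiment) the model
picks the normal path, so A vs B is expected to differ. With only
rcu_expedited=1 (the second experiment) it picks the expedited path and
the knob should not matter, which is why the second result is surprising.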