From: Sandeep Dhavale
Date: Thu, 5 Jan 2023 23:08:55 -0800
Subject: Re: [PATCH v3] erofs: replace erofs_unzipd workqueue with per-cpu threads
To: Gao Xiang
Cc: Chao Yu, Yue Hu, Jeffle Xu, kernel-team@android.com, linux-erofs@lists.ozlabs.org, linux-kernel@vger.kernel.org, Gao Xiang
References: <20230106054105.3998588-1-dhavale@google.com>
On Thu, Jan 5, 2023 at 10:02 PM Gao Xiang wrote:
>
> Hi Sandeep,
>
> On 2023/1/6 13:41, Sandeep Dhavale wrote:
> > Using a per-cpu thread pool, we can reduce the scheduling latency compared
> > to the workqueue implementation. With this patch, scheduling latency and
> > variation are reduced, as the per-cpu threads are high-priority kthread_workers.
> >
> > The results were evaluated on arm64 Android devices running the 5.10 kernel.
> >
> > The table below shows the resulting improvements in total scheduling latency
> > for the same app launch benchmark runs with 50 iterations. Scheduling
> > latency is the latency between when the task (workqueue kworker vs.
> > kthread_worker) became eligible to run and when it actually started
> > running.
> > +-------------------------+-----------+----------------+---------+
> > |                         | workqueue | kthread_worker | diff    |
> > +-------------------------+-----------+----------------+---------+
> > | Average (us)            |     15253 |           2914 | -80.89% |
> > | Median (us)             |     14001 |           2912 | -79.20% |
> > | Minimum (us)            |      3117 |           1027 | -67.05% |
> > | Maximum (us)            |     30170 |           3805 | -87.39% |
> > | Standard deviation (us) |      7166 |            359 |         |
> > +-------------------------+-----------+----------------+---------+
> >
> > Signed-off-by: Sandeep Dhavale
> > ---
> > V2 -> V3
> > Fix a warning Reported-by: kernel test robot
> >
> > V1 -> V2
> > * Changed name of kthread_workers from z_erofs to erofs_worker
> > * Added kernel configuration to run kthread_workers at normal or
> >   high priority
> > * Added cpu hotplug support
> > * Added wrapped kthread_workers under worker_pool
> > * Added one unbound thread in a pool to handle a context where
> >   we already stopped per-cpu kthread worker
> > * Updated commit message
> >
> > Background: Boot times and cold app launch benchmarks are very
> > important to the Android ecosystem, as they directly translate to
> > responsiveness from the user's point of view. While erofs provides a
> > lot of important features like space savings, we saw some
> > performance penalty in cold app launch benchmarks in a few scenarios.
> > Analysis showed that the significant variance was coming from the
> > scheduling cost, while the decompression cost was more or less the same.
> >
> > With a per-cpu thread pool, we can see from the above table that this
> > variation is reduced by ~80% on average. Link to the LPC 2022 slides and
> > talk at [1].
> >
> > [1] https://lpc.events/event/16/contributions/1338/
>
> Let's move these words into the commit message as well for others to refer to.

Sure, I will do that and send V4.

> I think I will look into (and try) this version later this week.

Thank you.
-Sandeep.

>
> Thanks,
> Gao Xiang
>
> > ---
> >  fs/erofs/Kconfig |  11 +++
> >  fs/erofs/zdata.c | 201 +++++++++++++++++++++++++++++++++++++++++------
> >  fs/erofs/zdata.h |   4 +-
> >  3 files changed, 192 insertions(+), 24 deletions(-)
> >
> > diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
> > index 85490370e0ca..879f493c6641 100644
> > --- a/fs/erofs/Kconfig
> > +++ b/fs/erofs/Kconfig
> > @@ -108,3 +108,14 @@ config EROFS_FS_ONDEMAND
> >  	  read support.
> >
> >  	  If unsure, say N.
> > +
> > +config EROFS_FS_KTHREAD_HIPRI
> > +	bool "EROFS high priority percpu kthread workers"
> > +	depends on EROFS_FS
> > +	default n
> > +	help
> > +	  EROFS uses per cpu kthread workers pool to carry out async work.
> > +	  This permits EROFS to configure per cpu kthread workers to run
> > +	  at higher priority.
> > +
> > +	  If unsure, say N.
> > diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> > index ccf7c55d477f..b4bf2da72d1a 100644
> > --- a/fs/erofs/zdata.c
> > +++ b/fs/erofs/zdata.c
> > @@ -8,6 +8,8 @@
> >  #include "compress.h"
> >  #include
> >  #include
> > +#include
> > +#include
> >
> >  #include
> >
> > @@ -184,26 +186,152 @@ typedef tagptr1_t compressed_page_t;
> >  #define tag_compressed_page_justfound(page) \
> >  	tagptr_fold(compressed_page_t, page, 1)
> >
> > -static struct workqueue_struct *z_erofs_workqueue __read_mostly;
> > +struct erofs_kthread_worker_pool {
> > +	struct kthread_worker __rcu **workers;
> > +	struct kthread_worker *unbound_worker;
> > +};
> >
> > -void z_erofs_exit_zip_subsystem(void)
> > +static struct erofs_kthread_worker_pool worker_pool;
> > +DEFINE_SPINLOCK(worker_pool_lock);
> > +
> > +static void erofs_destroy_worker_pool(void)
> >  {
> > -	destroy_workqueue(z_erofs_workqueue);
> > -	z_erofs_destroy_pcluster_pool();
> > +	unsigned int cpu;
> > +	struct kthread_worker *worker;
> > +
> > +	for_each_possible_cpu(cpu) {
> > +		worker = rcu_dereference_protected(
> > +				worker_pool.workers[cpu],
> > +				1);
> > +		rcu_assign_pointer(worker_pool.workers[cpu], NULL);
> > +
> > +		if (worker)
> > +			kthread_destroy_worker(worker);
> > +	}
> > +
> > +	if (worker_pool.unbound_worker)
> > +		kthread_destroy_worker(worker_pool.unbound_worker);
> > +
> > +	kfree(worker_pool.workers);
> >  }
> >
> > -static inline int z_erofs_init_workqueue(void)
> > +static inline void erofs_set_worker_priority(struct kthread_worker *worker)
> >  {
> > -	const unsigned int onlinecpus = num_possible_cpus();
> > +#ifdef CONFIG_EROFS_FS_KTHREAD_HIPRI
> > +	sched_set_fifo_low(worker->task);
> > +#else
> > +	sched_set_normal(worker->task, 0);
> > +#endif
> > +}
> >
> > -	/*
> > -	 * no need to spawn too many threads, limiting threads could minimum
> > -	 * scheduling overhead, perhaps per-CPU threads should be better?
> > -	 */
> > -	z_erofs_workqueue = alloc_workqueue("erofs_unzipd",
> > -					    WQ_UNBOUND | WQ_HIGHPRI,
> > -					    onlinecpus + onlinecpus / 4);
> > -	return z_erofs_workqueue ? 0 : -ENOMEM;
> > +static int erofs_create_kthread_workers(void)
> > +{
> > +	unsigned int cpu;
> > +	struct kthread_worker *worker;
> > +
> > +	for_each_online_cpu(cpu) {
> > +		worker = kthread_create_worker_on_cpu(cpu, 0, "erofs_worker/%u", cpu);
> > +		if (IS_ERR(worker)) {
> > +			erofs_destroy_worker_pool();
> > +			return -ENOMEM;
> > +		}
> > +		erofs_set_worker_priority(worker);
> > +		rcu_assign_pointer(worker_pool.workers[cpu], worker);
> > +	}
> > +
> > +	worker = kthread_create_worker(0, "erofs_unbound_worker");
> > +	if (IS_ERR(worker)) {
> > +		erofs_destroy_worker_pool();
> > +		return PTR_ERR(worker);
> > +	}
> > +	erofs_set_worker_priority(worker);
> > +	worker_pool.unbound_worker = worker;
> > +
> > +	return 0;
> > +}
> > +
> > +static int erofs_init_worker_pool(void)
> > +{
> > +	int err;
> > +
> > +	worker_pool.workers = kcalloc(num_possible_cpus(),
> > +				      sizeof(struct kthread_worker *), GFP_ATOMIC);
> > +	if (!worker_pool.workers)
> > +		return -ENOMEM;
> > +	err = erofs_create_kthread_workers();
> > +
> > +	return err;
> > +}
> > +
> > +#ifdef CONFIG_HOTPLUG_CPU
> > +static enum cpuhp_state erofs_cpuhp_state;
> > +static int erofs_cpu_online(unsigned int cpu)
> > +{
> > +	struct kthread_worker *worker;
> > +
> > +	worker = kthread_create_worker_on_cpu(cpu, 0, "erofs_worker/%u", cpu);
> > +	if (IS_ERR(worker))
> > +		return -ENOMEM;
> > +
> > +	erofs_set_worker_priority(worker);
> > +
> > +	spin_lock(&worker_pool_lock);
> > +	rcu_assign_pointer(worker_pool.workers[cpu], worker);
> > +	spin_unlock(&worker_pool_lock);
> > +
> > +	synchronize_rcu();
> > +
> > +	return 0;
> > +}
> > +
> > +static int erofs_cpu_offline(unsigned int cpu)
> > +{
> > +	struct kthread_worker *worker;
> > +
> > +	spin_lock(&worker_pool_lock);
> > +	worker = rcu_dereference_protected(worker_pool.workers[cpu],
> > +					   lockdep_is_held(&worker_pool_lock));
> > +	rcu_assign_pointer(worker_pool.workers[cpu], NULL);
> > +	spin_unlock(&worker_pool_lock);
> > +
> > +	synchronize_rcu();
> > +
> > +	if (worker)
> > +		kthread_destroy_worker(worker);
> > +
> > +	return 0;
> > +}
> > +
> > +static int erofs_cpu_hotplug_init(void)
> > +{
> > +	int state;
> > +
> > +	state = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> > +					  "fs/erofs:online",
> > +					  erofs_cpu_online, erofs_cpu_offline);
> > +	if (state < 0)
> > +		return state;
> > +
> > +	erofs_cpuhp_state = state;
> > +
> > +	return 0;
> > +}
> > +
> > +static void erofs_cpu_hotplug_destroy(void)
> > +{
> > +	if (erofs_cpuhp_state)
> > +		cpuhp_remove_state_nocalls(erofs_cpuhp_state);
> > +}
> > +#else /* !CONFIG_HOTPLUG_CPU */
> > +static inline int erofs_cpu_hotplug_init(void) { return 0; }
> > +static inline void erofs_cpu_hotplug_destroy(void) {}
> > +#endif
> > +
> > +void z_erofs_exit_zip_subsystem(void)
> > +{
> > +	erofs_cpu_hotplug_destroy();
> > +	erofs_destroy_worker_pool();
> > +	z_erofs_destroy_pcluster_pool();
> >  }
> >
> >  int __init z_erofs_init_zip_subsystem(void)
> > @@ -211,10 +339,23 @@ int __init z_erofs_init_zip_subsystem(void)
> >  	int err = z_erofs_create_pcluster_pool();
> >
> >  	if (err)
> > -		return err;
> > -	err = z_erofs_init_workqueue();
> > +		goto out_error_pcluster_pool;
> > +
> > +	err = erofs_init_worker_pool();
> >  	if (err)
> > -		z_erofs_destroy_pcluster_pool();
> > +		goto out_error_worker_pool;
> > +
> > +	err = erofs_cpu_hotplug_init();
> > +	if (err < 0)
> > +		goto out_error_cpuhp_init;
> > +
> > +	return err;
> > +
> > +out_error_cpuhp_init:
> > +	erofs_destroy_worker_pool();
> > +out_error_worker_pool:
> > +	z_erofs_destroy_pcluster_pool();
> > +out_error_pcluster_pool:
> >  	return err;
> >  }
> >
> > @@ -1143,7 +1284,7 @@ static void z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io,
> >  	}
> >  }
> >
> > -static void z_erofs_decompressqueue_work(struct work_struct *work)
> > +static void z_erofs_decompressqueue_kthread_work(struct kthread_work *work)
> >  {
> >  	struct z_erofs_decompressqueue *bgq =
> >  		container_of(work, struct z_erofs_decompressqueue, u.work);
> > @@ -1156,6 +1297,20 @@ static void z_erofs_decompressqueue_work(struct work_struct *work)
> >  	kvfree(bgq);
> >  }
> >
> > +static void erofs_schedule_kthread_work(struct kthread_work *work)
> > +{
> > +	struct kthread_worker *worker;
> > +	unsigned int cpu = raw_smp_processor_id();
> > +
> > +	rcu_read_lock();
> > +	worker = rcu_dereference(worker_pool.workers[cpu]);
> > +	if (!worker)
> > +		worker = worker_pool.unbound_worker;
> > +
> > +	kthread_queue_work(worker, work);
> > +	rcu_read_unlock();
> > +}
> > +
> >  static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
> >  				       bool sync, int bios)
> >  {
> > @@ -1170,15 +1325,15 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
> >
> >  	if (atomic_add_return(bios, &io->pending_bios))
> >  		return;
> > -	/* Use workqueue and sync decompression for atomic contexts only */
> > +	/* Use kthread_workers and sync decompression for atomic contexts only */
> >  	if (in_atomic() || irqs_disabled()) {
> > -		queue_work(z_erofs_workqueue, &io->u.work);
> > +		erofs_schedule_kthread_work(&io->u.work);
> >  		/* enable sync decompression for readahead */
> >  		if (sbi->opt.sync_decompress == EROFS_SYNC_DECOMPRESS_AUTO)
> >  			sbi->opt.sync_decompress = EROFS_SYNC_DECOMPRESS_FORCE_ON;
> >  		return;
> >  	}
> > -	z_erofs_decompressqueue_work(&io->u.work);
> > +	z_erofs_decompressqueue_kthread_work(&io->u.work);
> >  }
> >
> >  static struct page *pickup_page_for_submission(struct z_erofs_pcluster *pcl,
> > @@ -1306,7 +1461,7 @@ jobqueue_init(struct super_block *sb,
> >  			*fg = true;
> >  			goto fg_out;
> >  		}
> > -		INIT_WORK(&q->u.work, z_erofs_decompressqueue_work);
> > +		kthread_init_work(&q->u.work, z_erofs_decompressqueue_kthread_work);
> >  	} else {
> >  fg_out:
> >  		q = fgq;
> > @@ -1500,7 +1655,7 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
> >
> >  	/*
> >  	 * although background is preferred, no one is pending for submission.
> > -	 * don't issue workqueue for decompression but drop it directly instead.
> > +	 * don't issue kthread_work for decompression but drop it directly instead.
> >  	 */
> >  	if (!*force_fg && !nr_bios) {
> >  		kvfree(q[JQ_SUBMIT]);
> > diff --git a/fs/erofs/zdata.h b/fs/erofs/zdata.h
> > index d98c95212985..808bbbf71b7b 100644
> > --- a/fs/erofs/zdata.h
> > +++ b/fs/erofs/zdata.h
> > @@ -6,6 +6,8 @@
> >  #ifndef __EROFS_FS_ZDATA_H
> >  #define __EROFS_FS_ZDATA_H
> >
> > +#include
> > +
> >  #include "internal.h"
> >  #include "tagptr.h"
> >
> > @@ -107,7 +109,7 @@ struct z_erofs_decompressqueue {
> >
> >  	union {
> >  		struct completion done;
> > -		struct work_struct work;
> > +		struct kthread_work work;
> >  	} u;
> >
> >  	bool eio;

--
Thanks,
SandeepD.
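
For readers who have not used the kthread_worker interface this patch builds on, below is a minimal, self-contained sketch of the create/queue/destroy lifecycle for a single CPU-bound worker. All demo_* identifiers are illustrative assumptions and are not part of the patch; the real per-cpu pool, RCU protection, unbound fallback worker, and CPU-hotplug handling are in the quoted diff above.

/* Illustrative sketch only -- demo_* names are not from the patch. */
#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/sched.h>

static struct kthread_worker *demo_worker;
static struct kthread_work demo_work;

static void demo_work_fn(struct kthread_work *work)
{
	/* deferred work (e.g. decompression) runs in the worker thread */
}

static int demo_worker_start(unsigned int cpu)
{
	/* spawn a dedicated kthread bound to @cpu and turn it into a worker */
	demo_worker = kthread_create_worker_on_cpu(cpu, 0, "demo_worker/%u", cpu);
	if (IS_ERR(demo_worker))
		return PTR_ERR(demo_worker);

	/* low RT priority, as CONFIG_EROFS_FS_KTHREAD_HIPRI would select */
	sched_set_fifo_low(demo_worker->task);

	kthread_init_work(&demo_work, demo_work_fn);
	return 0;
}

static void demo_worker_submit(void)
{
	/* hand the work item to the worker; it runs demo_work_fn() */
	kthread_queue_work(demo_worker, &demo_work);
}

static void demo_worker_stop(void)
{
	kthread_flush_worker(demo_worker);
	kthread_destroy_worker(demo_worker);
}

Unlike a shared workqueue kworker, the work item here always runs in a dedicated, elevated-priority thread, which is where the scheduling-latency reduction shown in the table above comes from.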