Date: Mon, 10 Apr 2023 16:21:46 -0700
From: Jaegeuk Kim
To: Chao Yu
Cc: linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] f2fs: fix to trigger a checkpoint in the end of foreground garbage collection
References: <20230324071028.336982-1-chao@kernel.org>

On 04/10, Chao Yu wrote:
> On 2023/4/5 23:55, Jaegeuk Kim wrote:
> > On 04/05, Chao Yu wrote:
> > > On 2023/4/5 5:39, Jaegeuk Kim wrote:
> > > > Can we do like this?
> > > >
> > > > From 9a58f0e59364241aa31b555cfe793d278e39b0dc Mon Sep 17 00:00:00 2001
> > > > From: Jaegeuk Kim
> > > > Date: Tue, 4 Apr 2023 14:36:00 -0700
> > > > Subject: [PATCH] f2fs: do checkpoint when there's not enough free sections
> > > >
> > > > We didn't do checkpoint in FG_GC case, which may cause losing to reclaim prefree
> > > > sections in time.
> > > >
> > > > Fixes: 6f8d4455060d ("f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc")
> > > > Signed-off-by: Chao Yu
> > > > Signed-off-by: Jaegeuk Kim
> > > > ---
> > > >  fs/f2fs/gc.c | 24 +++++++++++-------------
> > > >  1 file changed, 11 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > > > index 56c53dbe05c9..f1d0dd9c5a6c 100644
> > > > --- a/fs/f2fs/gc.c
> > > > +++ b/fs/f2fs/gc.c
> > > > @@ -1806,6 +1806,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> > > >  	};
> > > >  	unsigned int skipped_round = 0, round = 0;
> > > >  	unsigned int upper_secs;
> > > > +	bool stop_gc = false;
> > > >  	trace_f2fs_gc_begin(sbi->sb, gc_type, gc_control->no_bg_gc,
> > > >  				gc_control->nr_free_secs,
> > > > @@ -1876,19 +1877,15 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> > > >  				(gc_type == FG_GC) ? sec_freed : 0, 0)) {
> > > >  		if (gc_type == FG_GC && sec_freed < gc_control->nr_free_secs)
> > > >  			goto go_gc_more;
> > > > -		goto stop;
> > > > -	}
> > > > -
> > > > -	/* FG_GC stops GC by skip_count */
> > > > -	if (gc_type == FG_GC) {
> > > > +		stop_gc = true;
> > > >
> > > I guess the condition below is for emergency recycling of prefree segments during
> > > foreground GC, in order to avoid exhausting free sections due to too many
> > > metadata allocations during CP.
> > >
> > > 	if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS &&
> > > 				prefree_segments(sbi)) {
> > >
> > > But for the common case, free_sections() is close to reserved_segments(), and
> > > upper_secs + NR_GC_CHECKPOINT_SECS value may be far smaller than free_sections(),
> > > so checkpoint may not be triggered as expected, IIUC.
> > >
> > > So it's fine to just trigger CP in the end of foreground garbage collection?
> >
> > My major concern is to avoid unnecessary checkpointing given multiple FG_GC
> > requests were pending in parallel.
> > And, I don't want to add so many combination
> > which gives so many corner cases, and feel f2fs_gc() needs to call checkpoint
> > automatically in the worst case scenario only.
>
> Alright.
>
> >
> > By the way, do we just need to call checkpoint here including FG_GC as well?
>
> I didn't get it, do you mean?
>
> - f2fs_balance_fs()
>  - f2fs_gc() creates prefree segments but not call checkpoint to reclaim
>
> - f2fs_balance_fs()
>  - f2fs_gc()
>   - detect prefree segments created by last f2fs_balance_fs, then call
>     f2fs_write_checkpoint to reclaim
>
> Or could you please provide a draft patch? :-P

Testing this.

From ec5f37bbe33110257c04e0ec97a80b0111465b52 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim
Date: Mon, 10 Apr 2023 14:48:50 -0700
Subject: [PATCH] f2fs: refactor f2fs_gc to call checkpoint in urgent condition

The major change is to call checkpoint, if there's not enough space while having
some prefree segments in FG_GC case.

Signed-off-by: Jaegeuk Kim
---
 fs/f2fs/gc.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index c748cdfb0501..0a823d2e8b9d 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1829,7 +1829,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 		goto stop;
 	}
 
-	if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) {
+	/* Let's run FG_GC, if we don't have enough space. */
+	if (has_not_enough_free_secs(sbi, 0, 0)) {
+		gc_type = FG_GC;
+
 		/*
 		 * For example, if there are many prefree_segments below given
 		 * threshold, we can make them free by checkpoint. Then, we
@@ -1840,8 +1843,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 			if (ret)
 				goto stop;
 		}
-		if (has_not_enough_free_secs(sbi, 0, 0))
-			gc_type = FG_GC;
 	}
 
 	/* f2fs_balance_fs doesn't need to do BG_GC in critical path. */
@@ -1868,19 +1869,14 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 	if (seg_freed == f2fs_usable_segs_in_sec(sbi, segno))
 		sec_freed++;
 
-	if (gc_type == FG_GC)
+	if (gc_type == FG_GC) {
 		sbi->cur_victim_sec = NULL_SEGNO;
 
-	if (gc_control->init_gc_type == FG_GC ||
-	    !has_not_enough_free_secs(sbi,
-				(gc_type == FG_GC) ? sec_freed : 0, 0)) {
-		if (gc_type == FG_GC && sec_freed < gc_control->nr_free_secs)
-			goto go_gc_more;
-		goto stop;
-	}
-
-	/* FG_GC stops GC by skip_count */
-	if (gc_type == FG_GC) {
+		if (!has_not_enough_free_secs(sbi, sec_freed, 0)) {
+			if (sec_freed < gc_control->nr_free_secs)
+				goto go_gc_more;
+			goto stop;
+		}
 		if (sbi->skipped_gc_rwsem)
 			skipped_round++;
 		round++;
@@ -1889,6 +1885,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 			ret = f2fs_write_checkpoint(sbi, &cpc);
 			goto stop;
 		}
+	} else if (!has_not_enough_free_secs(sbi, 0, 0)) {
+		goto stop;
 	}
 
 	__get_secs_required(sbi, NULL, &upper_secs, NULL);
-- 
2.40.0.577.gac1e443424-goog

>
> Thanks,
>
> >
> >
> > 	if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) {
> > 		/*
> > 		 * For example, if there are many prefree_segments below given
> > 		 * threshold, we can make them free by checkpoint. Then, we
> > 		 * secure free segments which doesn't need fggc any more.
> > 		 */
> > 		if (prefree_segments(sbi)) {
> > 			ret = f2fs_write_checkpoint(sbi, &cpc);
> > 			if (ret)
> > 				goto stop;
> > 		}
> > 		if (has_not_enough_free_secs(sbi, 0, 0))
> > 			gc_type = FG_GC;
> > 	}
> >
> > >
> > > One other concern is for those path as below:
> > > - disable_checkpoint
> > > - ioc_gc
> > > - ioc_gc_range
> > > - ioc_resize
> > > ...
> >
> > I think the upper caller should decide to call checkpoint, if they want to
> > reclaim the prefree likewise f2fs_disable_checkpoint.
> >
> > >
> > > We've passed gc_type as FG_GC, but the demand here is to migrate blocks in time,
> > > rather than dirtying blocks, and callers don't expect a checkpoint in f2fs_gc();
> > > instead the callers will do the checkpoint as needed.
> > >
> > > That means it's better to decouple FG_GC and write_checkpoint behavior, so I
> > > added another parameter .reclaim_space to just let f2fs_balance_fs() trigger a
> > > checkpoint in the end of f2fs_gc().
> > >
> >
> > > Thanks,
> > >
> > > > +	} else if (gc_type == FG_GC) {
> > > > +		/* FG_GC stops GC by skip_count */
> > > >  		if (sbi->skipped_gc_rwsem)
> > > >  			skipped_round++;
> > > >  		round++;
> > > >  		if (skipped_round > MAX_SKIP_GC_COUNT &&
> > > > -				skipped_round * 2 >= round) {
> > > > -			ret = f2fs_write_checkpoint(sbi, &cpc);
> > > > -			goto stop;
> > > > -		}
> > > > +				skipped_round * 2 >= round)
> > > > +			stop_gc = true;
> > > >  	}
> > > >  	__get_secs_required(sbi, NULL, &upper_secs, NULL);
> > > > @@ -1901,12 +1898,13 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
> > > >  				prefree_segments(sbi)) {
> > > >  		ret = f2fs_write_checkpoint(sbi, &cpc);
> > > >  		if (ret)
> > > > -			goto stop;
> > > > +			stop_gc = true;
> > > >  	}
> > > >  go_gc_more:
> > > > -	segno = NULL_SEGNO;
> > > > -	goto gc_more;
> > > > -
> > > > +	if (!stop_gc) {
> > > > +		segno = NULL_SEGNO;
> > > > +		goto gc_more;
> > > > +	}
> > > >  stop:
> > > >  	SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0;
> > > >  	SIT_I(sbi)->last_victim[FLUSH_DEVICE] = gc_control->victim_segno;
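
For readers following the thread, the two conditions under discussion can be
modelled in userspace. This is only a rough sketch, not kernel code: the
model_* function names and the MODEL_* constants are hypothetical, and the
values 3 and 16 are assumptions that should be checked against the f2fs
headers of the tree in question. The first helper mirrors the emergency
checkpoint test quoted by Chao, the second mirrors the skip_count rule that
stops FG_GC.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; verify against fs/f2fs headers. */
#define MODEL_NR_GC_CHECKPOINT_SECS 3
#define MODEL_MAX_SKIP_GC_COUNT 16

/*
 * Model of the quoted test:
 *   free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS &&
 *   prefree_segments(sbi)
 * Plain integers stand in for the values the kernel derives from sbi.
 */
static bool model_need_emergency_cp(unsigned int free_secs,
				    unsigned int upper_secs,
				    unsigned int prefree_segs)
{
	return free_secs <= upper_secs + MODEL_NR_GC_CHECKPOINT_SECS &&
	       prefree_segs > 0;
}

/*
 * Model of the skip_count rule: FG_GC gives up once more than
 * MODEL_MAX_SKIP_GC_COUNT rounds were skipped and skipped rounds
 * make up at least half of all rounds.
 */
static bool model_fg_gc_should_stop(unsigned int skipped_round,
				    unsigned int round)
{
	/* A round can only be skipped if it was attempted. */
	assert(skipped_round <= round);
	return skipped_round > MODEL_MAX_SKIP_GC_COUNT &&
	       skipped_round * 2 >= round;
}
```

With free_secs = 40, upper_secs = 2 and prefree_segs = 8, the first helper
returns false even though prefree segments exist, which is the case Chao
describes where free_sections() sits well above the threshold and no
checkpoint is triggered.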