Subject: [PATCH v2] f2fs: New victim selection for GC
Reply-To: yonggil.song@samsung.com
From: Yonggil Song
To: jaegeuk@kernel.org, chao@kernel.org, linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
CC: Yonggil Song, Seokhwan Kim, Daejun Park, Siwoo Jung
Date: Fri, 08 Dec 2023 18:04:44 +0900
Message-ID: <20231208090444epcms2p33884216391931d04c1771dfb51a08a44@epcms2p3>

Overview
========

This patch introduces a new way to prefer data sections when selecting GC
victims. Migration of data blocks causes invalidation of node blocks.
Therefore, in situations where GC is frequent, selecting data blocks as
victims can reduce unnecessary block migration, because migrating them
invalidates node blocks. In the exceptional case where free sections are
insufficient, node blocks are selected as victims instead of data blocks to
obtain extra free sections.

Problem
=======

If the total amount of nodes is larger than the size of one section, nodes
occupy multiple sections, and node victims are often selected because data
block migration during GC lowers their GC cost. Since moving data sections
causes frequent node victim selection, victim thrashing occurs in the node
sections. This results in an increase in the write amplification factor
(WAF).

Experiment
==========

Test environment is as follows.

	System info
	  - 3.6GHz, 16 core CPU
	  - 36GiB Memory
	Device info
	  - a conventional null_blk with 228MiB
	  - a sequential null_blk with 4068 zones of 8MiB
	Format
	  - mkfs.f2fs -c -m -Z 8 -o 3.89
	Mount
	  - mount
	Fio script
	  - fio --rw=randwrite --bs=4k --ba=4k --filesize=31187m --norandommap --overwrite=1 --name=job1 --filename=./mnt/sustain --io_size=128g
	WAF calculation
	  - (IOs on conv. null_blk + IOs on seq. null_blk) / random write IOs

Conclusion
==========

This experiment showed that the WAF was reduced by 29% (18.75 -> 13.3) when
data sections were selected first during GC victim selection. This was
achieved by reducing the migration of node blocks by 69.4%
(253,131,743 blks -> 77,463,278 blks). This GC victim selection method can
achieve a low WAF in environments where the section size is relatively
small.

Signed-off-by: Yonggil Song
---
 fs/f2fs/f2fs.h |  1 +
 fs/f2fs/gc.c   | 98 ++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 77 insertions(+), 22 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 9043cedfa12b..578d57f6022f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1649,6 +1649,7 @@ struct f2fs_sb_info {
 	struct f2fs_mount_info mount_opt;	/* mount options */
 
 	/* for cleaning operations */
+	bool need_node_clean;			/* need to clean dirty nodes */
 	struct f2fs_rwsem gc_lock;		/*
 						 * semaphore for GC, avoid
 						 * race between GC and GC or CP
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index f550cdeaa663..682dcf0de59e 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -368,6 +368,14 @@ static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
 	if (p->alloc_mode == SSR)
 		return get_seg_entry(sbi, segno)->ckpt_valid_blocks;
 
+	/*
+	 * If we don't need to clean dirty nodes,
+	 * we just skip node victims.
+	 */
+	if (IS_NODESEG(get_seg_entry(sbi, segno)->type) &&
+		!sbi->need_node_clean)
+		return get_max_cost(sbi, p);
+
 	/* alloc_mode == LFS */
 	if (p->gc_mode == GC_GREEDY)
 		return get_valid_blocks(sbi, segno, true);
@@ -557,6 +565,14 @@ static void atgc_lookup_victim(struct f2fs_sb_info *sbi,
 		if (ve->mtime >= max_mtime || ve->mtime < min_mtime)
 			goto skip;
 
+		/*
+		 * If we don't need to clean dirty nodes,
+		 * we just skip node victims.
+		 */
+		if (IS_NODESEG(get_seg_entry(sbi, ve->segno)->type) &&
+			!sbi->need_node_clean)
+			goto skip;
+
 		/* age = 10000 * x% * 60 */
 		age = div64_u64(accu * (max_mtime - ve->mtime), total_time) *
 								age_weight;
@@ -913,7 +929,21 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
 		goto retry;
 	}
 
+
 	if (p.min_segno != NULL_SEGNO) {
+		if (sbi->need_node_clean &&
+			IS_DATASEG(get_seg_entry(sbi, p.min_segno)->type)) {
+			/*
+			 * we need to clean node sections.
+			 * but, data victim cost is the lowest.
+			 * if free sections are enough, stop cleaning node victim.
+			 * if not, it goes on by GCing data victims.
+			 */
+			if (has_enough_free_secs(sbi, prefree_segments(sbi), 0)) {
+				p.min_segno = NULL_SEGNO;
+				goto out;
+			}
+		}
 got_it:
 		*result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
 got_result:
@@ -1830,8 +1860,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 		goto stop;
 	}
 
+	__get_secs_required(sbi, NULL, &upper_secs, NULL);
+
+	/*
+	 * Write checkpoint to reclaim prefree segments.
+	 * We need more three extra sections for writer's data/node/dentry.
+	 */
+	if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS) {
+		sbi->need_node_clean = true;
+
+		if (prefree_segments(sbi)) {
+			stat_inc_cp_call_count(sbi, TOTAL_CALL);
+			ret = f2fs_write_checkpoint(sbi, &cpc);
+			if (ret)
+				goto stop;
+			/* Reset due to checkpoint */
+			sec_freed = 0;
+		}
+	}
+
 	/* Let's run FG_GC, if we don't have enough space. */
-	if (has_not_enough_free_secs(sbi, 0, 0)) {
+	if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) {
 		gc_type = FG_GC;
 
 		/*
@@ -1858,10 +1907,22 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 	ret = __get_victim(sbi, &segno, gc_type);
 	if (ret) {
 		/* allow to search victim from sections has pinned data */
-		if (ret == -ENODATA && gc_type == FG_GC &&
-				f2fs_pinned_section_exists(DIRTY_I(sbi))) {
-			f2fs_unpin_all_sections(sbi, false);
-			goto retry;
+		if (ret == -ENODATA && gc_type == FG_GC) {
+			if (f2fs_pinned_section_exists(DIRTY_I(sbi))) {
+				f2fs_unpin_all_sections(sbi, false);
+				goto retry;
+			}
+			/*
+			 * If we have no more data victims, let's start to
+			 * clean dirty nodes.
+			 */
+			if (!sbi->need_node_clean) {
+				sbi->need_node_clean = true;
+				goto retry;
+			}
+			/* node cleaning is over */
+			else if (sbi->need_node_clean)
+				goto stop;
 		}
 		goto stop;
 	}
@@ -1882,7 +1943,13 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 			if (!gc_control->no_bg_gc &&
 			    total_sec_freed < gc_control->nr_free_secs)
 				goto go_gc_more;
-			goto stop;
+			/*
+			 * If the need_node_clean flag is set
+			 * even though there are enough free
+			 * sections, node cleaning will continue.
+			 */
+			if (!sbi->need_node_clean)
+				goto stop;
 		}
 		if (sbi->skipped_gc_rwsem)
 			skipped_round++;
@@ -1897,21 +1964,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 		goto stop;
 	}
 
-	__get_secs_required(sbi, NULL, &upper_secs, NULL);
-
-	/*
-	 * Write checkpoint to reclaim prefree segments.
-	 * We need more three extra sections for writer's data/node/dentry.
-	 */
-	if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS &&
-				prefree_segments(sbi)) {
-		stat_inc_cp_call_count(sbi, TOTAL_CALL);
-		ret = f2fs_write_checkpoint(sbi, &cpc);
-		if (ret)
-			goto stop;
-		/* Reset due to checkpoint */
-		sec_freed = 0;
-	}
 go_gc_more:
 	segno = NULL_SEGNO;
 	goto gc_more;
@@ -1920,8 +1972,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control)
 	SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0;
 	SIT_I(sbi)->last_victim[FLUSH_DEVICE] = gc_control->victim_segno;
 
-	if (gc_type == FG_GC)
+	if (gc_type == FG_GC) {
 		f2fs_unpin_all_sections(sbi, true);
+		sbi->need_node_clean = false;
+	}
 
 	trace_f2fs_gc_end(sbi->sb, ret, total_freed, total_sec_freed,
 				get_pages(sbi, F2FS_DIRTY_NODES),
-- 
2.34.1