Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp2711124rdb; Tue, 12 Sep 2023 09:45:57 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGAPh/iZsHJkgKh8R9x7Gk/4WOp1Cm76OfxQlxTYsh9rAuMFZW/gj6jgOBSXToodsEEOoeZ X-Received: by 2002:a05:6a20:8f02:b0:140:730b:4b3f with SMTP id b2-20020a056a208f0200b00140730b4b3fmr14260421pzk.1.1694537156988; Tue, 12 Sep 2023 09:45:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694537156; cv=none; d=google.com; s=arc-20160816; b=niQBKysOIDC07h/FALnJmb8+I/PWfQagRxE6KYysQ18TttL4YLMUvtyvICtJSfLxQN 9vef9OfwTgHB++NTrhS82yFQ5pM7KTY3eH5m8LDGkUvmWAjcwvmTdr9eOmQfBzsC4oi6 w8tM9bujZJGnIv8ntpA5RII2YURr3JTr8a1mvipmN92ZVw1QKbAHjtmqaR3/X2WUWuQE wRlLOKKB+xPcNFOfblG7OWhy/HXK2003qFjqv4zFJP8vM+0AeytGxLjc4X7E1Lcva5kJ XxpQRz32zcwixXzmKv6d8oCN4Mj56ad1+SPNiSawBWbSBiTiCRbBdkOFPnTJ1FWg93N8 Y0dQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :mime-version:user-agent:date:message-id:references:to:from:subject; bh=4w4MJJKHZbujp0qZu4Lw0OBiDITNGTyyd8umvxsjoyI=; fh=JaHQ2wEV+57F2ViXFx3r3KoROlRM9Ph/xpqQAZp/KQo=; b=w03ARnpGF6h5RRICBP/OB3ZFLmmJJCuwHDGX/0pWL9K3w2o/a8gmtwx6yXyub/JUIQ LpBdpisWsjl+LNq3SR0aZ9UXzv6AKMS/gHZUW8+N3pqAKYUg12xYcMeo1aaadlWQeZoX oDPjTZTDsPp5rJg2cLnang46DE4v4fHMqv3s9huep76Ys8MSuFMfcc2UmH+xr6hQIhjb T0ZHCripwv4h80bGXHp8D8IIFfthB45TFbIMDfG0LeSuE6s2gqUNbhrpwz0SHxzxj6Q+ 89uAprwEN/78ITlThWOMsiROxmN1g58JlyWikrwbqBwm7lk9pER16a0c9VdZt3ggPBXg ElxA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from howler.vger.email (howler.vger.email. [23.128.96.34]) by mx.google.com with ESMTPS id fk7-20020a056a003a8700b00682850547a9si4691223pfb.201.2023.09.12.09.45.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Sep 2023 09:45:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) client-ip=23.128.96.34; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.34 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id AF12A8339640; Tue, 12 Sep 2023 00:02:25 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231148AbjILHCU (ORCPT + 99 others); Tue, 12 Sep 2023 03:02:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231171AbjILHCR (ORCPT ); Tue, 12 Sep 2023 03:02:17 -0400 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A50ADE79; Tue, 12 Sep 2023 00:02:13 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.143]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4RlDzQ0KpXz4f3k6J; Tue, 12 Sep 2023 15:02:10 +0800 (CST) Received: from [10.174.178.129] (unknown [10.174.178.129]) by APP4 (Coremail) with SMTP id gCh0CgD3htbxDABlKlAQAQ--.33012S2; Tue, 12 Sep 2023 15:02:10 +0800 (CST) Subject: Re: [PATCH v6 05/11] ext4: Separate block bitmap and buddy bitmap freeing in ext4_mb_clear_bb() From: Kemeng Shi To: Ritesh Harjani , tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org References: <878r9q8cf9.fsf@doe.com> <656f0cf0-1092-84eb-e934-534102c2fa54@huaweicloud.com> Message-ID: <5e2a9358-16ca-2698-f6d1-08352f72d6e9@huaweicloud.com> Date: Tue, 12 Sep 2023 15:02:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.0 MIME-Version: 1.0 In-Reply-To: <656f0cf0-1092-84eb-e934-534102c2fa54@huaweicloud.com> Content-Type: text/plain; charset=gbk Content-Transfer-Encoding: 7bit X-CM-TRANSID: gCh0CgD3htbxDABlKlAQAQ--.33012S2 X-Coremail-Antispam: 1UD129KBjvJXoW3GrW5Aw13uF4DGFWxXFyxXwb_yoW7Aw4Dpr 9Fk3WUCrn8Grs09r1I9w1jq3WxK3y8GF4UGrW3u34rCF9FqF93KF1xKFy5CFyqyFs7JF1v qw4Y9rZ7Cr4Iva7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUyEb4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28lY4IEw2IIxxk0rwA2F7IY1VAKz4 vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7Cj xVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x 0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG 6I80ewAv7VC0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFV Cjc4AY6r1j6r4UM4x0Y48IcVAKI48JMxk0xIA0c2IEe2xFo4CEbIxvr21l42xK82IYc2Ij 64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x 8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r126r1DMIIYrxkI7VAKI48JMIIF0xvE 2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcVC0I7IYx2IY6xkF7I0E14v26r1j6r4UMIIF0xvE42 xK8VAvwI8IcIk0rVWrZr1j6s0DMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIE c7CjxVAFwI0_Jr0_GrUvcSsGvfC2KfnxnUUI43ZEXa7IU1CPfJUUUUU== X-CM-SenderInfo: 5vklyvpphqwq5kxd4v5lfo033gof0z/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Tue, 12 Sep 2023 00:02:25 -0700 (PDT) on 9/4/2023 11:00 AM, Kemeng Shi wrote: > > > on 9/1/2023 5:34 PM, Ritesh Harjani wrote: >> Kemeng Shi writes: >> >>> This patch separates block bitmap and buddy bitmap freeing in order to >>> udpate block bitmap with ext4_mb_mark_context in following patch. >> ^^^ update. >> >>> >>> Separated freeing is safe with concurrent allocation as long as: >>> 1. Firstly allocate block in buddy bitmap, and then in block bitmap. >>> 2. Firstly free block in block bitmap, and then buddy bitmap. >>> Then freed block will only be available to allocation when both buddy >>> bitmap and block bitmap are updated by freeing. >>> Allocation obeys rule 1 already, just do sperated freeing with rule 2. >> >> So we also don't need ext4_load_buddy_gfp() before freeing on-disk >> bitmap right. Continue below... >> >>> >>> Separated freeing has no race with generate_buddy as: >>> Once ext4_mb_load_buddy_gfp is executed successfully, the update-to-date >>> buddy page can be found in sbi->s_buddy_cache and no more buddy >>> initialization of the buddy page will be executed concurrently until >>> buddy page is unloaded. As we always do free in "load buddy, free, >>> unload buddy" sequence, separated freeing has no race with generate_buddy. >>> >>> Signed-off-by: Kemeng Shi >>> --- >>> fs/ext4/mballoc.c | 18 ++++++++---------- >>> 1 file changed, 8 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c >>> index e650eac22237..01ad36a1cc96 100644 >>> --- a/fs/ext4/mballoc.c >>> +++ b/fs/ext4/mballoc.c >>> @@ -6519,6 +6519,14 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode, >>> if (err) >>> goto error_return; >> >> >> Let me add the a piece of code before the added changes and continue... >> >> err = ext4_mb_load_buddy_gfp(sb, block_group, &e4b, >> GFP_NOFS|__GFP_NOFAIL); >> if (err) >> goto error_return; >>> >>> + ext4_lock_group(sb, block_group); >>> + mb_clear_bits(bitmap_bh->b_data, bit, count_clusters); >>> + ret = ext4_free_group_clusters(sb, gdp) + count_clusters; >>> + ext4_free_group_clusters_set(sb, gdp, ret); >>> + ext4_block_bitmap_csum_set(sb, gdp, bitmap_bh); >>> + ext4_group_desc_csum_set(sb, block_group, gdp); >>> + ext4_unlock_group(sb, block_group); >>> + >> >> ...Is it required for ext4_mb_load_buddy_gfp() to be called before >> freeing on-disk bitmap blocks? Can you explain why please? >> At least it is not very clear to me that why do we need >> ext4_mb_load_buddy_gfp() before clearing of bitmap blocks. If it's not >> needed then I think we should separate out bitmap freeing logic and >> buddy freeing logic even further. > Yes, ext4_mb_load_buddy_gfp is no needed for clearing of bitmap, put > it before bit clearing for catching error and aborting mark early > to reduce functional change. Hi Ritesh, sorry for bother. I'm going to push a new version and I perfer to call ext4_mb_load_buddy_gfp early to catch potential error in advance. Just wonder is this good to you. Like to hear any advice. Thanks! >> >> Thoughts? >> >>> /* >>> * We need to make sure we don't reuse the freed block until after the >>> * transaction is committed. We make an exception if the inode is to be >>> @@ -6541,13 +6549,8 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode, >>> new_entry->efd_tid = handle->h_transaction->t_tid; >>> >>> ext4_lock_group(sb, block_group); >>> - mb_clear_bits(bitmap_bh->b_data, bit, count_clusters); >>> ext4_mb_free_metadata(handle, &e4b, new_entry); >>> } else { >>> - /* need to update group_info->bb_free and bitmap >>> - * with group lock held. generate_buddy look at >>> - * them with group lock_held >>> - */ >>> if (test_opt(sb, DISCARD)) { >>> err = ext4_issue_discard(sb, block_group, bit, >>> count_clusters, NULL); >>> @@ -6560,14 +6563,9 @@ static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode, >>> EXT4_MB_GRP_CLEAR_TRIMMED(e4b.bd_info); >>> >>> ext4_lock_group(sb, block_group); >>> - mb_clear_bits(bitmap_bh->b_data, bit, count_clusters); >>> mb_free_blocks(inode, &e4b, bit, count_clusters); >>> } >>> >>> - ret = ext4_free_group_clusters(sb, gdp) + count_clusters; >>> - ext4_free_group_clusters_set(sb, gdp, ret); >>> - ext4_block_bitmap_csum_set(sb, gdp, bitmap_bh); >>> - ext4_group_desc_csum_set(sb, block_group, gdp); >>> ext4_unlock_group(sb, block_group); >>> >>> if (sbi->s_log_groups_per_flex) { >> >> >> Adding piece of code here... >> >> if (sbi->s_log_groups_per_flex) { >> ext4_group_t flex_group = ext4_flex_group(sbi, block_group); >> atomic64_add(count_clusters, >> &sbi_array_rcu_deref(sbi, s_flex_groups, >> flex_group)->free_clusters); >> } >> >> <...> >> >> /* We dirtied the bitmap block */ >> BUFFER_TRACE(bitmap_bh, "dirtied bitmap block"); >> err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh); >> /* And the group descriptor block */ >> BUFFER_TRACE(gd_bh, "dirtied group descriptor block"); >> ret = ext4_handle_dirty_metadata(handle, NULL, gd_bh); >> if (!err) >> err = ret; >> >> I was thinking even this can go around bitmap freeing logic above. So >> the next patch becomes very clear that all of the bitmap freeing logic >> is just simply moved into ext4_mb_mark_context() function. >> >> -ritesh >> > >