From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Ritesh Harjani, Jan Kara, Theodore Tso, Sasha Levin
Subject: [PATCH 5.15 712/913] ext4: fix ext4_mb_mark_bb() with flex_bg with fast_commit
Date: Tue, 5 Apr 2022 09:29:34 +0200
Message-Id: <20220405070401.177212566@linuxfoundation.org>
In-Reply-To: <20220405070339.801210740@linuxfoundation.org>
References: <20220405070339.801210740@linuxfoundation.org>

From: Ritesh Harjani

[ Upstream commit bfdc502a4a4c058bf4cbb1df0c297761d528f54d ]

With the flex_bg feature (which is enabled by default), extents for any
given inode might span blocks from two different block groups.
ext4_mb_mark_bb() reads the buffer_head of the block bitmap only once,
for the starting block group, and fails to read it again when the extent
crosses into another block group. The loop below then accesses memory
beyond the block group bitmap buffer_head, which results in a data abort.

	for (i = 0; i < clen; i++)
		if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
			already++;

This patch adds a block group boundary check to ext4_mb_mark_bb() and
updates the buffer_head (bitmap_bh) for every different block group.

Without this patch, I was easily able to hit a data access abort on the
Power platform.

<...>
[ 74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
[ 74.533214] EXT4-fs (loop3): shut down requested (2)
[ 74.536705] Aborting journal on device loop3-8.
[ 74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
[ 74.703727] Faulting instruction address: 0xc0000000007bffb8
cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
    pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
    lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
    sp: c000000015db7300
   msr: 800000000280b033
   dar: c00000005e980000
 dsisr: 40000000
  current = 0xc000000027af6880
  paca = 0xc00000003ffd5200 irqmask: 0x03 irq_happened: 0x01
  pid = 5167, comm = mount
<...>
enter ? for help
[c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
[c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
[c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
[c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
[c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
[c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
[c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
[c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
[c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
[c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
[c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
[c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
[c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250

Signed-off-by: Ritesh Harjani
Reviewed-by: Jan Kara
Link: https://lore.kernel.org/r/2609bc8f66fc15870616ee416a18a3d392a209c4.1644992609.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o
Signed-off-by: Sasha Levin
---
 fs/ext4/mballoc.c | 131 +++++++++++++++++++++++++++-------------------
 1 file changed, 76 insertions(+), 55 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 9a749327336f..9b3fad3235b8 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3901,72 +3901,93 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 	ext4_grpblk_t blkoff;
 	int i, err;
 	int already;
-	unsigned int clen, clen_changed;
+	unsigned int clen, clen_changed, thisgrp_len;
 
-	clen = EXT4_NUM_B2C(sbi, len);
-
-	ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
-	bitmap_bh = ext4_read_block_bitmap(sb, group);
-	if (IS_ERR(bitmap_bh)) {
-		err = PTR_ERR(bitmap_bh);
-		bitmap_bh = NULL;
-		goto out_err;
-	}
-
-	err = -EIO;
-	gdp = ext4_get_group_desc(sb, group, &gdp_bh);
-	if (!gdp)
-		goto out_err;
+	while (len > 0) {
+		ext4_get_group_no_and_offset(sb, block, &group, &blkoff);
 
-	ext4_lock_group(sb, group);
-	already = 0;
-	for (i = 0; i < clen; i++)
-		if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
-			already++;
-
-	clen_changed = clen - already;
-	if (state)
-		ext4_set_bits(bitmap_bh->b_data, blkoff, clen);
-	else
-		mb_test_and_clear_bits(bitmap_bh->b_data, blkoff, clen);
-	if (ext4_has_group_desc_csum(sb) &&
-	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
-		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
-		ext4_free_group_clusters_set(sb, gdp,
-					     ext4_free_clusters_after_init(sb,
-						group, gdp));
-	}
-	if (state)
-		clen = ext4_free_group_clusters(sb, gdp) - clen_changed;
-	else
-		clen = ext4_free_group_clusters(sb, gdp) + clen_changed;
+		/*
+		 * Check to see if we are freeing blocks across a group
+		 * boundary.
+		 * In case of flex_bg, this can happen that (block, len) may
+		 * span across more than one group. In that case we need to
+		 * get the corresponding group metadata to work with.
+		 * For this we have goto again loop.
+		 */
+		thisgrp_len = min_t(unsigned int, (unsigned int)len,
+			EXT4_BLOCKS_PER_GROUP(sb) - EXT4_C2B(sbi, blkoff));
+		clen = EXT4_NUM_B2C(sbi, thisgrp_len);
 
-	ext4_free_group_clusters_set(sb, gdp, clen);
-	ext4_block_bitmap_csum_set(sb, group, gdp, bitmap_bh);
-	ext4_group_desc_csum_set(sb, group, gdp);
+		bitmap_bh = ext4_read_block_bitmap(sb, group);
+		if (IS_ERR(bitmap_bh)) {
+			err = PTR_ERR(bitmap_bh);
+			bitmap_bh = NULL;
+			break;
+		}
 
-	ext4_unlock_group(sb, group);
+		err = -EIO;
+		gdp = ext4_get_group_desc(sb, group, &gdp_bh);
+		if (!gdp)
+			break;
 
-	if (sbi->s_log_groups_per_flex) {
-		ext4_group_t flex_group = ext4_flex_group(sbi, group);
-		struct flex_groups *fg = sbi_array_rcu_deref(sbi,
-					   s_flex_groups, flex_group);
+		ext4_lock_group(sb, group);
+		already = 0;
+		for (i = 0; i < clen; i++)
+			if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) ==
+					 !state)
+				already++;
 
+		clen_changed = clen - already;
 		if (state)
-			atomic64_sub(clen_changed, &fg->free_clusters);
+			ext4_set_bits(bitmap_bh->b_data, blkoff, clen);
 		else
-			atomic64_add(clen_changed, &fg->free_clusters);
+			mb_test_and_clear_bits(bitmap_bh->b_data, blkoff, clen);
+		if (ext4_has_group_desc_csum(sb) &&
+		    (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))) {
+			gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
+			ext4_free_group_clusters_set(sb, gdp,
+			     ext4_free_clusters_after_init(sb, group, gdp));
+		}
+		if (state)
+			clen = ext4_free_group_clusters(sb, gdp) - clen_changed;
+		else
+			clen = ext4_free_group_clusters(sb, gdp) + clen_changed;
+
+		ext4_free_group_clusters_set(sb, gdp, clen);
+		ext4_block_bitmap_csum_set(sb, group, gdp, bitmap_bh);
+		ext4_group_desc_csum_set(sb, group, gdp);
+
+		ext4_unlock_group(sb, group);
+
+		if (sbi->s_log_groups_per_flex) {
+			ext4_group_t flex_group = ext4_flex_group(sbi, group);
+			struct flex_groups *fg = sbi_array_rcu_deref(sbi,
+						   s_flex_groups, flex_group);
+
+			if (state)
+				atomic64_sub(clen_changed, &fg->free_clusters);
+			else
+				atomic64_add(clen_changed, &fg->free_clusters);
+
+		}
+
+		err = ext4_handle_dirty_metadata(NULL, NULL, bitmap_bh);
+		if (err)
+			break;
+		sync_dirty_buffer(bitmap_bh);
+		err = ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
+		sync_dirty_buffer(gdp_bh);
+		if (err)
+			break;
+
+		block += thisgrp_len;
+		len -= thisgrp_len;
+		brelse(bitmap_bh);
+		BUG_ON(len < 0);
 	}
 
-	err = ext4_handle_dirty_metadata(NULL, NULL, bitmap_bh);
 	if (err)
-		goto out_err;
-	sync_dirty_buffer(bitmap_bh);
-	err = ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
-	sync_dirty_buffer(gdp_bh);
-
-out_err:
-	brelse(bitmap_bh);
+		brelse(bitmap_bh);
 }
 
 /*
-- 
2.34.1
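
For readers following the fix, the per-group splitting that the new while
loop performs can be sketched in userspace C. This is only an illustration
under simplifying assumptions: groups are assumed to start at block 0 (no
first data block offset) and one block is assumed per cluster (no bigalloc).
The names struct group_chunk and split_by_group() are hypothetical and are
not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type for illustration; not part of ext4. */
struct group_chunk {
	unsigned long long group;	/* block group index */
	unsigned int offset;		/* first block within that group */
	unsigned int len;		/* blocks covered in that group */
};

/*
 * Split the range [block, block + len) into per-group chunks, in the
 * same spirit as the thisgrp_len computation in the patch.
 *
 * Assumptions (simpler than real ext4): groups start at block 0 and
 * one block equals one cluster.
 *
 * Returns the number of chunks written to out (at most max_chunks).
 */
size_t split_by_group(unsigned long long block, unsigned int len,
		      unsigned int blocks_per_group,
		      struct group_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	assert(blocks_per_group > 0);

	while (len > 0 && n < max_chunks) {
		unsigned int offset = (unsigned int)(block % blocks_per_group);
		/* Blocks left in this group, starting at offset. */
		unsigned int room = blocks_per_group - offset;
		/* Never touch more bits than this group's bitmap holds. */
		unsigned int this_len = len < room ? len : room;

		out[n].group = block / blocks_per_group;
		out[n].offset = offset;
		out[n].len = this_len;
		n++;

		/* Advance to the first block of the next group, if any. */
		block += this_len;
		len -= this_len;
	}
	return n;
}
```

With 32768 blocks per group, the range (block 32760, len 16) yields two
chunks: 8 blocks at offset 32760 in group 0, then 8 blocks at offset 0 in
group 1. The pre-patch code would instead have tested all 16 bits against
the bitmap of group 0, running past the end of that bitmap buffer.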