From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Suraj Jitindar Singh,
 Theodore Ts'o, stable@kernel.org
Subject: [PATCH 4.19 67/97] ext4: fix potential race between s_flex_groups online resizing and access
Date: Thu, 27 Feb 2020 14:37:15 +0100
Message-Id: <20200227132225.438095218@linuxfoundation.org>
In-Reply-To: <20200227132214.553656188@linuxfoundation.org>
References: <20200227132214.553656188@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Suraj Jitindar Singh

commit 7c990728b99ed6fbe9c75fc202fce1172d9916da upstream.

During an online resize an array of s_flex_groups structures gets
replaced so it can get enlarged. If there is a concurrent access to the
array and this memory has been reused, this can lead to an invalid
memory access.

The s_flex_groups array has been converted into an array of pointers
rather than an array of structures. This ensures that the information
contained in the structures cannot get out of sync during a resize due
to an accessor updating the value in the old structure after it has
been copied but before the array pointer is updated. Since the
structures themselves are no longer copied but only the pointers to
them, this case is mitigated.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443
Link: https://lore.kernel.org/r/20200221053458.730016-4-tytso@mit.edu
Signed-off-by: Suraj Jitindar Singh
Signed-off-by: Theodore Ts'o
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman
---
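The core of the fix is the classic RCU publish pattern: readers dereference
an RCU-protected array of pointers, while a resize publishes a larger
pointer array and leaves the pointed-to structures in place. The
sbi_array_rcu_deref() and ext4_kvfree_array_rcu() helpers used in the diff
below are added by the companion patch earlier in this series. The sketch
that follows is illustrative only and is not part of the patch; every name
in it (struct table, struct stats, read_counter, grow_table) is invented:

/*
 * Illustrative sketch only -- not part of this patch. The real code
 * uses sbi_array_rcu_deref() and ext4_kvfree_array_rcu() instead of
 * the open-coded reader and synchronize_rcu() shown here.
 */
#include <linux/atomic.h>
#include <linux/mm.h>		/* kvzalloc(), kvfree() */
#include <linux/rcupdate.h>
#include <linux/string.h>

struct stats {
	atomic64_t free;		/* stands in for free_inodes etc. */
};

struct table {
	struct stats * __rcu *slots;	/* RCU-protected array of pointers */
	int nr_slots;			/* writer-side element count */
};

/*
 * Reader side: the per-slot structure is never copied during a resize,
 * so an atomic update made through the returned pointer can never land
 * in a stale copy of the element.
 */
static s64 read_counter(struct table *t, int i)
{
	struct stats *s;
	s64 v;

	rcu_read_lock();
	s = rcu_dereference(t->slots)[i];
	v = atomic64_read(&s->free);
	rcu_read_unlock();
	return v;
}

/*
 * Writer side (resize): build a larger pointer array, carry the old
 * elements over by pointer, publish it, and free the old array only
 * after a grace period. Allocation of the new elements is elided; the
 * patch does that in ext4_alloc_flex_bg_array(). Writers must be
 * serialized externally, as they are for an ext4 online resize.
 */
static int grow_table(struct table *t, int new_nr)
{
	struct stats **old, **new;

	new = kvzalloc(new_nr * sizeof(*new), GFP_KERNEL);
	if (!new)
		return -ENOMEM;
	old = rcu_dereference_protected(t->slots, 1);
	if (old)
		memcpy(new, old, t->nr_slots * sizeof(*new));
	rcu_assign_pointer(t->slots, new);
	t->nr_slots = new_nr;
	synchronize_rcu();		/* wait out readers of the old array */
	kvfree(old);
	return 0;
}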
 fs/ext4/ext4.h    |    2 -
 fs/ext4/ialloc.c  |   23 ++++++++++-------
 fs/ext4/mballoc.c |    9 ++++--
 fs/ext4/resize.c  |    7 +++--
 fs/ext4/super.c   |   72 +++++++++++++++++++++++++++++++++++++-----------------
 5 files changed, 76 insertions(+), 37 deletions(-)

--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1480,7 +1480,7 @@ struct ext4_sb_info {
 	unsigned int s_extent_max_zeroout_kb;
 
 	unsigned int s_log_groups_per_flex;
-	struct flex_groups *s_flex_groups;
+	struct flex_groups * __rcu *s_flex_groups;
 	ext4_group_t s_flex_groups_allocated;
 
 	/* workqueue for reserved extent conversions (buffered io) */
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -330,11 +330,13 @@ void ext4_free_inode(handle_t *handle, s
 
 		percpu_counter_inc(&sbi->s_freeinodes_counter);
 		if (sbi->s_log_groups_per_flex) {
-			ext4_group_t f = ext4_flex_group(sbi, block_group);
+			struct flex_groups *fg;
 
-			atomic_inc(&sbi->s_flex_groups[f].free_inodes);
+			fg = sbi_array_rcu_deref(sbi, s_flex_groups,
+					ext4_flex_group(sbi, block_group));
+			atomic_inc(&fg->free_inodes);
 			if (is_directory)
-				atomic_dec(&sbi->s_flex_groups[f].used_dirs);
+				atomic_dec(&fg->used_dirs);
 		}
 		BUFFER_TRACE(bh2, "call ext4_handle_dirty_metadata");
 		fatal = ext4_handle_dirty_metadata(handle, NULL, bh2);
@@ -370,12 +372,13 @@ static void get_orlov_stats(struct super
 			    int flex_size, struct orlov_stats *stats)
 {
 	struct ext4_group_desc *desc;
-	struct flex_groups *flex_group = EXT4_SB(sb)->s_flex_groups;
 
 	if (flex_size > 1) {
-		stats->free_inodes = atomic_read(&flex_group[g].free_inodes);
-		stats->free_clusters = atomic64_read(&flex_group[g].free_clusters);
-		stats->used_dirs = atomic_read(&flex_group[g].used_dirs);
+		struct flex_groups *fg = sbi_array_rcu_deref(EXT4_SB(sb),
+							     s_flex_groups, g);
+		stats->free_inodes = atomic_read(&fg->free_inodes);
+		stats->free_clusters = atomic64_read(&fg->free_clusters);
+		stats->used_dirs = atomic_read(&fg->used_dirs);
 		return;
 	}
@@ -1056,7 +1059,8 @@ got:
 		if (sbi->s_log_groups_per_flex) {
 			ext4_group_t f = ext4_flex_group(sbi, group);
 
-			atomic_inc(&sbi->s_flex_groups[f].used_dirs);
+			atomic_inc(&sbi_array_rcu_deref(sbi, s_flex_groups,
+							f)->used_dirs);
 		}
 	}
 	if (ext4_has_group_desc_csum(sb)) {
@@ -1079,7 +1083,8 @@ got:
 	if (sbi->s_log_groups_per_flex) {
 		flex_group = ext4_flex_group(sbi, group);
-		atomic_dec(&sbi->s_flex_groups[flex_group].free_inodes);
+		atomic_dec(&sbi_array_rcu_deref(sbi, s_flex_groups,
+						flex_group)->free_inodes);
 	}
 
 	inode->i_ino = ino + group * EXT4_INODES_PER_GROUP(sb);
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3036,7 +3036,8 @@ ext4_mb_mark_diskspace_used(struct ext4_
 		ext4_group_t flex_group = ext4_flex_group(sbi,
 							  ac->ac_b_ex.fe_group);
 		atomic64_sub(ac->ac_b_ex.fe_len,
-			     &sbi->s_flex_groups[flex_group].free_clusters);
+			     &sbi_array_rcu_deref(sbi, s_flex_groups,
+						  flex_group)->free_clusters);
 	}
 
 	err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
@@ -4930,7 +4931,8 @@ do_more:
 	if (sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
 		atomic64_add(count_clusters,
-			     &sbi->s_flex_groups[flex_group].free_clusters);
+			     &sbi_array_rcu_deref(sbi, s_flex_groups,
+						  flex_group)->free_clusters);
 	}
 
 	if (!(flags & EXT4_FREE_BLOCKS_NO_QUOT_UPDATE))
@@ -5079,7 +5081,8 @@ int ext4_group_add_blocks(handle_t *hand
 	if (sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
 		atomic64_add(clusters_freed,
-			     &sbi->s_flex_groups[flex_group].free_clusters);
+			     &sbi_array_rcu_deref(sbi, s_flex_groups,
+						  flex_group)->free_clusters);
 	}
 
 	ext4_mb_unload_buddy(&e4b);
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1452,11 +1452,14 @@ static void ext4_update_super(struct sup
 		   percpu_counter_read(&sbi->s_freeclusters_counter));
 	if (ext4_has_feature_flex_bg(sb) && sbi->s_log_groups_per_flex) {
 		ext4_group_t flex_group;
+		struct flex_groups *fg;
+
 		flex_group = ext4_flex_group(sbi, group_data[0].group);
+		fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group);
 		atomic64_add(EXT4_NUM_B2C(sbi, free_blocks),
-			     &sbi->s_flex_groups[flex_group].free_clusters);
+			     &fg->free_clusters);
 		atomic_add(EXT4_INODES_PER_GROUP(sb) * flex_gd->count,
-			   &sbi->s_flex_groups[flex_group].free_inodes);
+			   &fg->free_inodes);
 	}
 
 	/*
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -970,6 +970,7 @@ static void ext4_put_super(struct super_
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	struct ext4_super_block *es = sbi->s_es;
 	struct buffer_head **group_desc;
+	struct flex_groups **flex_groups;
 	int aborted = 0;
 	int i, err;
 
@@ -1005,8 +1006,13 @@ static void ext4_put_super(struct super_
 	for (i = 0; i < sbi->s_gdb_count; i++)
 		brelse(group_desc[i]);
 	kvfree(group_desc);
+	flex_groups = rcu_dereference(sbi->s_flex_groups);
+	if (flex_groups) {
+		for (i = 0; i < sbi->s_flex_groups_allocated; i++)
+			kvfree(flex_groups[i]);
+		kvfree(flex_groups);
+	}
 	rcu_read_unlock();
-	kvfree(sbi->s_flex_groups);
 	percpu_counter_destroy(&sbi->s_freeclusters_counter);
 	percpu_counter_destroy(&sbi->s_freeinodes_counter);
 	percpu_counter_destroy(&sbi->s_dirs_counter);
@@ -2291,8 +2297,8 @@ done:
 int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
-	struct flex_groups *new_groups;
-	int size;
+	struct flex_groups **old_groups, **new_groups;
+	int size, i;
 
 	if (!sbi->s_log_groups_per_flex)
 		return 0;
 
@@ -2301,22 +2307,37 @@ int ext4_alloc_flex_bg_array(struct supe
 	if (size <= sbi->s_flex_groups_allocated)
 		return 0;
 
-	size = roundup_pow_of_two(size * sizeof(struct flex_groups));
-	new_groups = kvzalloc(size, GFP_KERNEL);
+	new_groups = kvzalloc(roundup_pow_of_two(size *
+			      sizeof(*sbi->s_flex_groups)), GFP_KERNEL);
 	if (!new_groups) {
-		ext4_msg(sb, KERN_ERR, "not enough memory for %d flex groups",
-			 size / (int) sizeof(struct flex_groups));
+		ext4_msg(sb, KERN_ERR,
+			 "not enough memory for %d flex group pointers", size);
 		return -ENOMEM;
 	}
-
-	if (sbi->s_flex_groups) {
-		memcpy(new_groups, sbi->s_flex_groups,
-		       (sbi->s_flex_groups_allocated *
-			sizeof(struct flex_groups)));
-		kvfree(sbi->s_flex_groups);
+	for (i = sbi->s_flex_groups_allocated; i < size; i++) {
+		new_groups[i] = kvzalloc(roundup_pow_of_two(
+					 sizeof(struct flex_groups)),
+					 GFP_KERNEL);
+		if (!new_groups[i]) {
+			for (i--; i >= sbi->s_flex_groups_allocated; i--)
+				kvfree(new_groups[i]);
+			kvfree(new_groups);
+			ext4_msg(sb, KERN_ERR,
+				 "not enough memory for %d flex groups", size);
+			return -ENOMEM;
+		}
 	}
-	sbi->s_flex_groups = new_groups;
-	sbi->s_flex_groups_allocated = size / sizeof(struct flex_groups);
+	rcu_read_lock();
+	old_groups = rcu_dereference(sbi->s_flex_groups);
+	if (old_groups)
+		memcpy(new_groups, old_groups,
+		       (sbi->s_flex_groups_allocated *
+			sizeof(struct flex_groups *)));
+	rcu_read_unlock();
+	rcu_assign_pointer(sbi->s_flex_groups, new_groups);
+	sbi->s_flex_groups_allocated = size;
+	if (old_groups)
+		ext4_kvfree_array_rcu(old_groups);
 
 	return 0;
 }
@@ -2324,6 +2345,7 @@ static int ext4_fill_flex_info(struct su
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	struct ext4_group_desc *gdp = NULL;
+	struct flex_groups *fg;
 	ext4_group_t flex_group;
 	int i, err;
 
@@ -2341,12 +2363,11 @@ static int ext4_fill_flex_info(struct su
 		gdp = ext4_get_group_desc(sb, i, NULL);
 
 		flex_group = ext4_flex_group(sbi, i);
-		atomic_add(ext4_free_inodes_count(sb, gdp),
-			   &sbi->s_flex_groups[flex_group].free_inodes);
+		fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group);
+		atomic_add(ext4_free_inodes_count(sb, gdp), &fg->free_inodes);
 		atomic64_add(ext4_free_group_clusters(sb, gdp),
-			     &sbi->s_flex_groups[flex_group].free_clusters);
-		atomic_add(ext4_used_dirs_count(sb, gdp),
-			   &sbi->s_flex_groups[flex_group].used_dirs);
+			     &fg->free_clusters);
+		atomic_add(ext4_used_dirs_count(sb, gdp), &fg->used_dirs);
 	}
 
 	return 1;
@@ -3555,6 +3576,7 @@ static int ext4_fill_super(struct super_
 	struct buffer_head *bh, **group_desc;
 	struct ext4_super_block *es = NULL;
 	struct ext4_sb_info *sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
+	struct flex_groups **flex_groups;
 	ext4_fsblk_t block;
 	ext4_fsblk_t sb_block = get_sb_block(&data);
 	ext4_fsblk_t logical_sb_block;
@@ -4561,8 +4583,14 @@ failed_mount7:
 	ext4_unregister_li_request(sb);
 failed_mount6:
 	ext4_mb_release(sb);
-	if (sbi->s_flex_groups)
-		kvfree(sbi->s_flex_groups);
+	rcu_read_lock();
+	flex_groups = rcu_dereference(sbi->s_flex_groups);
+	if (flex_groups) {
+		for (i = 0; i < sbi->s_flex_groups_allocated; i++)
+			kvfree(flex_groups[i]);
+		kvfree(flex_groups);
+	}
+	rcu_read_unlock();
 	percpu_counter_destroy(&sbi->s_freeclusters_counter);
 	percpu_counter_destroy(&sbi->s_freeinodes_counter);
 	percpu_counter_destroy(&sbi->s_dirs_counter);
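The diff calls ext4_kvfree_array_rcu() without showing its definition; it
comes from the companion patch earlier in this series. The idea is that
the old pointer array must outlive any reader still traversing it, so its
kvfree() is deferred past an RCU grace period, asynchronously when
possible. The sketch below is modeled on that helper, not a verbatim copy
of it; names and details are approximations:

/*
 * Sketch of a deferred kvfree() -- an approximation of the series'
 * helper, not the exact upstream code. The rcu_head bookkeeping lets
 * the free run from the RCU callback; if even that small allocation
 * fails, fall back to a synchronous grace period.
 */
#include <linux/mm.h>		/* kvfree() */
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct deferred_kvfree {
	struct rcu_head rcu;
	void *ptr;
};

static void deferred_kvfree_cb(struct rcu_head *head)
{
	struct deferred_kvfree *d =
		container_of(head, struct deferred_kvfree, rcu);

	kvfree(d->ptr);
	kfree(d);
}

static void kvfree_array_rcu(void *to_free)
{
	struct deferred_kvfree *d = kzalloc(sizeof(*d), GFP_KERNEL);

	if (d) {
		d->ptr = to_free;
		call_rcu(&d->rcu, deferred_kvfree_cb);
		return;
	}
	synchronize_rcu();	/* no memory for the callback: wait inline */
	kvfree(to_free);
}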