Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp1485379ybv; Thu, 20 Feb 2020 21:35:41 -0800 (PST) X-Google-Smtp-Source: APXvYqzwVLhYLaByw43ThYwR7fWufLVhfeqSjhMzwtQo2EEZcZxOkkcbGzXcs9cgVbxqVL5TBkBM X-Received: by 2002:a9d:116:: with SMTP id 22mr3800719otu.149.1582263341210; Thu, 20 Feb 2020 21:35:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582263341; cv=none; d=google.com; s=arc-20160816; b=bPBkPyR7U+Ej3JjkA9Qzh66c51rXnud4skmlcc+dJ5Io4IZxBo9QW9y1AOMw6DvcLJ fjavquFx70wegqxnWS052wfd+6YlxDfV9w/AXUrsJ1h/GAXIVwuj0W7+XJoigrO7PA9Q MsLyJmVGO33YPQMCObFdUvU9PBFp6hunRrchSP3BPqVn9mcs6xt72p25R7vXAxGB4oBm p+I6jTsrb2v/O7Hr+wC9539CYxHsbNz2lGrr1Wc64S2N5IqDRkILZd1MO8Cj2kTVhJv4 ov4pWAOTrQQiv6jjrqTaRriEyWJ1xaoLpiD3IkmephTfy2+VfWeA5ZqQnSrycNdt84N4 qEFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=hMfHSkzeteuqfZBgQUjhkEJZMmTcaho0b+Z1wcSZArU=; b=Hrbog0PG4gTPW5bf2J3wlsjVDz6Xk6FYzxaZBzKr8PtsTZfAKUHS+NoTLriHvJpXwV 8XorI0ClgsSzZqpsMjvU90EsG/M3eZ4LzBDlEU2i7718dlxGfN0hjFglJGoXrGz9ie7k 6UIcynbOCvkj0lwDyI9Ydv3L6lZdweua+Ta+2PT8du3G213flfcb6Rd39d3+p8SlPkG8 Wk5+39OJDS1c2K2PofbBQaOMUpUNfwY1aesQPdOWWhEdZkNGq1Z6x3XJ9zy6WB8YWL0e gSXp93VGppETv352uiY7y6wTcRRnCrllzCPbrBNfqlHyLO128OCI6gYj0x0VvC4iRlUM i1Ug== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y144si337374oia.67.2020.02.20.21.35.24; Thu, 20 Feb 2020 21:35:41 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726100AbgBUFfU (ORCPT + 99 others); Fri, 21 Feb 2020 00:35:20 -0500 Received: from outgoing-auth-1.mit.edu ([18.9.28.11]:59709 "EHLO outgoing.mit.edu" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725973AbgBUFfU (ORCPT ); Fri, 21 Feb 2020 00:35:20 -0500 Received: from callcc.thunk.org (guestnat-104-133-8-109.corp.google.com [104.133.8.109] (may be forged)) (authenticated bits=0) (User authenticated as tytso@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 01L5Z8lH022068 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 21 Feb 2020 00:35:10 -0500 Received: by callcc.thunk.org (Postfix, from userid 15806) id 5987D4211F2; Fri, 21 Feb 2020 00:35:08 -0500 (EST) From: "Theodore Ts'o" To: Ext4 Developers List Cc: Suraj Jitindar Singh , "Theodore Ts'o" , stable@kernel.org Subject: [PATCH 3/3] ext4: fix potential race between s_flex_groups online resizing and access Date: Fri, 21 Feb 2020 00:34:58 -0500 Message-Id: <20200221053458.730016-4-tytso@mit.edu> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200221053458.730016-1-tytso@mit.edu> References: <20200221053458.730016-1-tytso@mit.edu> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org From: Suraj Jitindar Singh During an online resize an array of s_flex_groups structures gets replaced so it can get enlarged. If there is a concurrent access to the array and this memory has been reused then this can lead to an invalid memory access. The s_flex_group array has been converted into an array of pointers rather than an array of structures. This is to ensure that the information contained in the structures cannot get out of sync during a resize due to an accessor updating the value in the old structure after it has been copied but before the array pointer is updated. Since the structures them- selves are no longer copied but only the pointers to them this case is mitigated. Link: https://bugzilla.kernel.org/show_bug.cgi?id=206443 Previous-Patch-Link: https://lore.kernel.org/r/20200219030851.2678-4-surajjs@amazon.com Signed-off-by: Suraj Jitindar Singh Signed-off-by: Theodore Ts'o Cc: stable@kernel.org --- fs/ext4/ext4.h | 2 +- fs/ext4/ialloc.c | 23 +++++++++------ fs/ext4/mballoc.c | 9 ++++-- fs/ext4/resize.c | 7 +++-- fs/ext4/super.c | 72 ++++++++++++++++++++++++++++++++--------------- 5 files changed, 76 insertions(+), 37 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index b1ece5329738..614fefa7dc7a 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1512,7 +1512,7 @@ struct ext4_sb_info { unsigned int s_extent_max_zeroout_kb; unsigned int s_log_groups_per_flex; - struct flex_groups *s_flex_groups; + struct flex_groups * __rcu *s_flex_groups; ext4_group_t s_flex_groups_allocated; /* workqueue for reserved extent conversions (buffered io) */ diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index c66e8f9451a2..501118b9ba90 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -328,11 +328,13 @@ void ext4_free_inode(handle_t *handle, struct inode *inode) percpu_counter_inc(&sbi->s_freeinodes_counter); if (sbi->s_log_groups_per_flex) { - ext4_group_t f = ext4_flex_group(sbi, block_group); + struct flex_groups *fg; - atomic_inc(&sbi->s_flex_groups[f].free_inodes); + fg = sbi_array_rcu_deref(sbi, s_flex_groups, + ext4_flex_group(sbi, block_group)); + atomic_inc(&fg->free_inodes); if (is_directory) - atomic_dec(&sbi->s_flex_groups[f].used_dirs); + atomic_dec(&fg->used_dirs); } BUFFER_TRACE(bh2, "call ext4_handle_dirty_metadata"); fatal = ext4_handle_dirty_metadata(handle, NULL, bh2); @@ -368,12 +370,13 @@ static void get_orlov_stats(struct super_block *sb, ext4_group_t g, int flex_size, struct orlov_stats *stats) { struct ext4_group_desc *desc; - struct flex_groups *flex_group = EXT4_SB(sb)->s_flex_groups; + struct flex_groups *flex_group = sbi_array_rcu_deref(EXT4_SB(sb), + s_flex_groups, g); if (flex_size > 1) { - stats->free_inodes = atomic_read(&flex_group[g].free_inodes); - stats->free_clusters = atomic64_read(&flex_group[g].free_clusters); - stats->used_dirs = atomic_read(&flex_group[g].used_dirs); + stats->free_inodes = atomic_read(&flex_group->free_inodes); + stats->free_clusters = atomic64_read(&flex_group->free_clusters); + stats->used_dirs = atomic_read(&flex_group->used_dirs); return; } @@ -1054,7 +1057,8 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir, if (sbi->s_log_groups_per_flex) { ext4_group_t f = ext4_flex_group(sbi, group); - atomic_inc(&sbi->s_flex_groups[f].used_dirs); + atomic_inc(&sbi_array_rcu_deref(sbi, s_flex_groups, + f)->used_dirs); } } if (ext4_has_group_desc_csum(sb)) { @@ -1077,7 +1081,8 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir, if (sbi->s_log_groups_per_flex) { flex_group = ext4_flex_group(sbi, group); - atomic_dec(&sbi->s_flex_groups[flex_group].free_inodes); + atomic_dec(&sbi_array_rcu_deref(sbi, s_flex_groups, + flex_group)->free_inodes); } inode->i_ino = ino + group * EXT4_INODES_PER_GROUP(sb); diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 1b46fb63692a..51a78eb65f3c 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -3038,7 +3038,8 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac, ext4_group_t flex_group = ext4_flex_group(sbi, ac->ac_b_ex.fe_group); atomic64_sub(ac->ac_b_ex.fe_len, - &sbi->s_flex_groups[flex_group].free_clusters); + &sbi_array_rcu_deref(sbi, s_flex_groups, + flex_group)->free_clusters); } err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh); @@ -4936,7 +4937,8 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode, if (sbi->s_log_groups_per_flex) { ext4_group_t flex_group = ext4_flex_group(sbi, block_group); atomic64_add(count_clusters, - &sbi->s_flex_groups[flex_group].free_clusters); + &sbi_array_rcu_deref(sbi, s_flex_groups, + flex_group)->free_clusters); } /* @@ -5093,7 +5095,8 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb, if (sbi->s_log_groups_per_flex) { ext4_group_t flex_group = ext4_flex_group(sbi, block_group); atomic64_add(clusters_freed, - &sbi->s_flex_groups[flex_group].free_clusters); + &sbi_array_rcu_deref(sbi, s_flex_groups, + flex_group)->free_clusters); } ext4_mb_unload_buddy(&e4b); diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c index 536cc9f38091..a50b51270ea9 100644 --- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1430,11 +1430,14 @@ static void ext4_update_super(struct super_block *sb, percpu_counter_read(&sbi->s_freeclusters_counter)); if (ext4_has_feature_flex_bg(sb) && sbi->s_log_groups_per_flex) { ext4_group_t flex_group; + struct flex_groups *fg; + flex_group = ext4_flex_group(sbi, group_data[0].group); + fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group); atomic64_add(EXT4_NUM_B2C(sbi, free_blocks), - &sbi->s_flex_groups[flex_group].free_clusters); + &fg->free_clusters); atomic_add(EXT4_INODES_PER_GROUP(sb) * flex_gd->count, - &sbi->s_flex_groups[flex_group].free_inodes); + &fg->free_inodes); } /* diff --git a/fs/ext4/super.c b/fs/ext4/super.c index e00bcc19099f..6b7e628b7903 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1015,6 +1015,7 @@ static void ext4_put_super(struct super_block *sb) struct ext4_sb_info *sbi = EXT4_SB(sb); struct ext4_super_block *es = sbi->s_es; struct buffer_head **group_desc; + struct flex_groups **flex_groups; int aborted = 0; int i, err; @@ -1052,8 +1053,13 @@ static void ext4_put_super(struct super_block *sb) for (i = 0; i < sbi->s_gdb_count; i++) brelse(group_desc[i]); kvfree(group_desc); + flex_groups = rcu_dereference(sbi->s_flex_groups); + if (flex_groups) { + for (i = 0; i < sbi->s_flex_groups_allocated; i++) + kvfree(flex_groups[i]); + kvfree(flex_groups); + } rcu_read_unlock(); - kvfree(sbi->s_flex_groups); percpu_counter_destroy(&sbi->s_freeclusters_counter); percpu_counter_destroy(&sbi->s_freeinodes_counter); percpu_counter_destroy(&sbi->s_dirs_counter); @@ -2384,8 +2390,8 @@ static int ext4_setup_super(struct super_block *sb, struct ext4_super_block *es, int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup) { struct ext4_sb_info *sbi = EXT4_SB(sb); - struct flex_groups *new_groups; - int size; + struct flex_groups **old_groups, **new_groups; + int size, i; if (!sbi->s_log_groups_per_flex) return 0; @@ -2394,22 +2400,37 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup) if (size <= sbi->s_flex_groups_allocated) return 0; - size = roundup_pow_of_two(size * sizeof(struct flex_groups)); - new_groups = kvzalloc(size, GFP_KERNEL); + new_groups = kvzalloc(roundup_pow_of_two(size * + sizeof(*sbi->s_flex_groups)), GFP_KERNEL); if (!new_groups) { - ext4_msg(sb, KERN_ERR, "not enough memory for %d flex groups", - size / (int) sizeof(struct flex_groups)); + ext4_msg(sb, KERN_ERR, + "not enough memory for %d flex group pointers", size); return -ENOMEM; } - - if (sbi->s_flex_groups) { - memcpy(new_groups, sbi->s_flex_groups, - (sbi->s_flex_groups_allocated * - sizeof(struct flex_groups))); - kvfree(sbi->s_flex_groups); + for (i = sbi->s_flex_groups_allocated; i < size; i++) { + new_groups[i] = kvzalloc(roundup_pow_of_two( + sizeof(struct flex_groups)), + GFP_KERNEL); + if (!new_groups[i]) { + for (i--; i >= sbi->s_flex_groups_allocated; i--) + kvfree(new_groups[i]); + kvfree(new_groups); + ext4_msg(sb, KERN_ERR, + "not enough memory for %d flex groups", size); + return -ENOMEM; + } } - sbi->s_flex_groups = new_groups; - sbi->s_flex_groups_allocated = size / sizeof(struct flex_groups); + rcu_read_lock(); + old_groups = rcu_dereference(sbi->s_flex_groups); + if (old_groups) + memcpy(new_groups, old_groups, + (sbi->s_flex_groups_allocated * + sizeof(struct flex_groups *))); + rcu_read_unlock(); + rcu_assign_pointer(sbi->s_flex_groups, new_groups); + sbi->s_flex_groups_allocated = size; + if (old_groups) + ext4_kvfree_array_rcu(old_groups); return 0; } @@ -2417,6 +2438,7 @@ static int ext4_fill_flex_info(struct super_block *sb) { struct ext4_sb_info *sbi = EXT4_SB(sb); struct ext4_group_desc *gdp = NULL; + struct flex_groups *fg; ext4_group_t flex_group; int i, err; @@ -2434,12 +2456,11 @@ static int ext4_fill_flex_info(struct super_block *sb) gdp = ext4_get_group_desc(sb, i, NULL); flex_group = ext4_flex_group(sbi, i); - atomic_add(ext4_free_inodes_count(sb, gdp), - &sbi->s_flex_groups[flex_group].free_inodes); + fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group); + atomic_add(ext4_free_inodes_count(sb, gdp), &fg->free_inodes); atomic64_add(ext4_free_group_clusters(sb, gdp), - &sbi->s_flex_groups[flex_group].free_clusters); - atomic_add(ext4_used_dirs_count(sb, gdp), - &sbi->s_flex_groups[flex_group].used_dirs); + &fg->free_clusters); + atomic_add(ext4_used_dirs_count(sb, gdp), &fg->used_dirs); } return 1; @@ -3641,6 +3662,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) struct buffer_head *bh, **group_desc; struct ext4_super_block *es = NULL; struct ext4_sb_info *sbi = kzalloc(sizeof(*sbi), GFP_KERNEL); + struct flex_groups **flex_groups; ext4_fsblk_t block; ext4_fsblk_t sb_block = get_sb_block(&data); ext4_fsblk_t logical_sb_block; @@ -4692,8 +4714,14 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) ext4_unregister_li_request(sb); failed_mount6: ext4_mb_release(sb); - if (sbi->s_flex_groups) - kvfree(sbi->s_flex_groups); + rcu_read_lock(); + flex_groups = rcu_dereference(sbi->s_flex_groups); + if (flex_groups) { + for (i = 0; i < sbi->s_flex_groups_allocated; i++) + kvfree(flex_groups[i]); + kvfree(flex_groups); + } + rcu_read_unlock(); percpu_counter_destroy(&sbi->s_freeclusters_counter); percpu_counter_destroy(&sbi->s_freeinodes_counter); percpu_counter_destroy(&sbi->s_dirs_counter); -- 2.24.1