From: Jan Kara
To: Ted Tso
Cc: Ritesh Harjani, Jan Kara, stable@vger.kernel.org
Subject: [PATCH 05/10] ext4: Fix race when reusing xattr blocks
Date: Tue, 12 Jul 2022 12:54:24 +0200
Message-Id: <20220712105436.32204-5-jack@suse.cz>
X-Mailer: git-send-email 2.35.3
In-Reply-To: <20220712104519.29887-1-jack@suse.cz>
References: <20220712104519.29887-1-jack@suse.cz>
X-Mailing-List: linux-ext4@vger.kernel.org

When ext4_xattr_block_set() decides to remove an xattr block, the
following race can happen:

CPU1                                    CPU2
ext4_xattr_block_set()                  ext4_xattr_release_block()
  new_bh = ext4_xattr_block_cache_find()

                                          lock_buffer(bh);
                                          ref = le32_to_cpu(BHDR(bh)->h_refcount);
                                          if (ref == 1) {
                                            ...
                                            mb_cache_entry_delete();
                                            unlock_buffer(bh);
                                            ext4_free_blocks();
                                              ...
                                              ext4_forget(..., bh, ...);
                                                jbd2_journal_revoke(..., bh);

  ext4_journal_get_write_access(..., new_bh, ...)
    do_get_write_access()
      jbd2_journal_cancel_revoke(..., new_bh);

Later the code in ext4_xattr_block_set() finds out the block got freed
and cancels the reuse of the block, but the revoke stays canceled, so in
case of block reuse and journal replay the filesystem can get corrupted.
If the race works out slightly differently, we can also hit assertions
in the jbd2 code.

Fix the problem by making sure that once a matching mbcache entry is
found, code dropping the last xattr block reference (or trying to modify
the xattr block in place) waits until the mbcache entry reference is
dropped. This way code trying to reuse an xattr block is protected from
someone trying to drop the last reference to the xattr block.

Reported-and-tested-by: Ritesh Harjani
CC: stable@vger.kernel.org
Fixes: 82939d7999df ("ext4: convert to mbcache2")
Signed-off-by: Jan Kara
---
 fs/ext4/xattr.c | 67 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index aadfae53d055..3a0928c8720e 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -439,9 +439,16 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino,
 /* Remove entry from mbcache when EA inode is getting evicted */
 void ext4_evict_ea_inode(struct inode *inode)
 {
-	if (EA_INODE_CACHE(inode))
-		mb_cache_entry_delete(EA_INODE_CACHE(inode),
-			ext4_xattr_inode_get_hash(inode), inode->i_ino);
+	struct mb_cache_entry *oe;
+
+	if (!EA_INODE_CACHE(inode))
+		return;
+	/* Wait for entry to get unused so that we can remove it */
+	while ((oe = mb_cache_entry_delete_or_get(EA_INODE_CACHE(inode),
+			ext4_xattr_inode_get_hash(inode), inode->i_ino))) {
+		mb_cache_entry_wait_unused(oe);
+		mb_cache_entry_put(EA_INODE_CACHE(inode), oe);
+	}
 }
 
 static int
@@ -1229,6 +1236,7 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
 	if (error)
 		goto out;
 
+retry_ref:
 	lock_buffer(bh);
 	hash = le32_to_cpu(BHDR(bh)->h_hash);
 	ref = le32_to_cpu(BHDR(bh)->h_refcount);
@@ -1238,9 +1246,18 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
 		 * This must happen under buffer lock for
 		 * ext4_xattr_block_set() to reliably detect freed block
 		 */
-		if (ea_block_cache)
-			mb_cache_entry_delete(ea_block_cache, hash,
-					      bh->b_blocknr);
+		if (ea_block_cache) {
+			struct mb_cache_entry *oe;
+
+			oe = mb_cache_entry_delete_or_get(ea_block_cache, hash,
+							  bh->b_blocknr);
+			if (oe) {
+				unlock_buffer(bh);
+				mb_cache_entry_wait_unused(oe);
+				mb_cache_entry_put(ea_block_cache, oe);
+				goto retry_ref;
+			}
+		}
 		get_bh(bh);
 		unlock_buffer(bh);
 
@@ -1867,9 +1884,20 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 			 * ext4_xattr_block_set() to reliably detect modified
 			 * block
 			 */
-			if (ea_block_cache)
-				mb_cache_entry_delete(ea_block_cache, hash,
-						      bs->bh->b_blocknr);
+			if (ea_block_cache) {
+				struct mb_cache_entry *oe;
+
+				oe = mb_cache_entry_delete_or_get(ea_block_cache,
+					hash, bs->bh->b_blocknr);
+				if (oe) {
+					/*
+					 * Xattr block is getting reused. Leave
+					 * it alone.
+					 */
+					mb_cache_entry_put(ea_block_cache, oe);
+					goto clone_block;
+				}
+			}
 			ea_bdebug(bs->bh, "modifying in-place");
 			error = ext4_xattr_set_entry(i, s, handle, inode,
 						     true /* is_block */);
@@ -1885,6 +1913,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 				goto cleanup;
 			goto inserted;
 		}
+clone_block:
 		unlock_buffer(bs->bh);
 		ea_bdebug(bs->bh, "cloning");
 		s->base = kmalloc(bs->bh->b_size, GFP_NOFS);
@@ -1991,18 +2020,13 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 				lock_buffer(new_bh);
 				/*
 				 * We have to be careful about races with
-				 * freeing, rehashing or adding references to
-				 * xattr block. Once we hold buffer lock xattr
-				 * block's state is stable so we can check
-				 * whether the block got freed / rehashed or
-				 * not. Since we unhash mbcache entry under
-				 * buffer lock when freeing / rehashing xattr
-				 * block, checking whether entry is still
-				 * hashed is reliable. Same rules hold for
-				 * e_reusable handling.
+				 * adding references to xattr block. Once we
+				 * hold buffer lock xattr block's state is
+				 * stable so we can check the additional
+				 * reference fits.
 				 */
-				if (hlist_bl_unhashed(&ce->e_hash_list) ||
-				    !ce->e_reusable) {
+				ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
+				if (ref > EXT4_XATTR_REFCOUNT_MAX) {
 					/*
 					 * Undo everything and check mbcache
 					 * again.
@@ -2017,9 +2041,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
 					new_bh = NULL;
 					goto inserted;
 				}
-				ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
 				BHDR(new_bh)->h_refcount = cpu_to_le32(ref);
-				if (ref >= EXT4_XATTR_REFCOUNT_MAX)
+				if (ref == EXT4_XATTR_REFCOUNT_MAX)
 					ce->e_reusable = 0;
 				ea_bdebug(new_bh, "reusing; refcount now=%d",
 					  ref);
-- 
2.35.3
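
The "delete if unused, otherwise hand back a pinned entry" rule that the fix relies on can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_entry and the toy_*() helpers are hypothetical names made up for illustration, the model is single-threaded, and it uses no locking or atomics, so it is not how fs/mbcache.c implements mb_cache_entry_delete_or_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cache entry; NOT the kernel's struct mb_cache_entry. */
struct toy_entry {
	int users;	/* callers currently holding a reference */
	bool hashed;	/* true while the entry can still be found */
};

/* Reuse path: find the entry and pin it, or return NULL if it is gone. */
struct toy_entry *toy_find(struct toy_entry *e)
{
	if (!e->hashed)
		return NULL;
	e->users++;
	return e;
}

/* Drop a reference taken by toy_find() or toy_delete_or_get(). */
void toy_put(struct toy_entry *e)
{
	e->users--;
}

/*
 * Release path: unhash the entry only when nobody holds it. Otherwise
 * return it pinned, so the caller can wait for the other user and retry.
 */
struct toy_entry *toy_delete_or_get(struct toy_entry *e)
{
	if (!e->hashed)
		return NULL;		/* already removed */
	if (e->users > 0) {
		e->users++;		/* pin for the waiting caller */
		return e;
	}
	e->hashed = false;		/* unused: safe to remove */
	return NULL;
}
```

In this model, a toy_find() by the reusing side makes a later toy_delete_or_get() return the entry instead of removing it. Only after both references are put does a second toy_delete_or_get() unhash the entry, and after that toy_find() fails. That ordering is what keeps the releasing side from freeing a block that the reusing side has already looked up.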