Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp693470rwb; Thu, 1 Dec 2022 07:17:28 -0800 (PST) X-Google-Smtp-Source: AA0mqf6y1g9tiXz6g9Uo91Q17JZKofPndLIizAClwPvdIToKNP7A5WR3FWYihuJ5e78AqslSC1To X-Received: by 2002:a17:906:3ad8:b0:7a8:dddc:7ec6 with SMTP id z24-20020a1709063ad800b007a8dddc7ec6mr7185383ejd.734.1669907848404; Thu, 01 Dec 2022 07:17:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669907848; cv=none; d=google.com; s=arc-20160816; b=hu47o+cYAxiWaHMjDXfgNBiQBSn0f3l8lxhSw7uZQzoSWNZyBo615UJm1Zvqu283op 4ug9C8WHMRZ0w+My+znctn1ZtDQMyP3ySlZGjHvxbUUFbShmckk4OLze9x4WShs3T4Mh w0v+oH+fmqs9w0lxl1bjf1JtSkd5jrOAWngj6YSm4sOb9YXaU35MibzJZoIHCwaVvNXo I/siOKQkGo/f0j172CTNJ9yo5rgKsoPTxWE5yLGDbADPP/n2YjZjWScZQGaUTqLwL88O 1SRa+3wOJaMxvXVjbbnZbNbLMYvwilom2rFGVdFyS6z3ow8AcIVw85xP00IIFYjoPdI2 lD1w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature:dkim-filter; bh=maFYL7cyZxlxfLpUgiy/yjCpc5B3NqziJG6bC+0IrIw=; b=OSi7bZ5ZkYD2+pRBUFU61Fmgah0yokWvplWvZrsaEelSpnmUsyS8ijN97ygI/CtmiT T3BqduYMGz84F5nTJQImGOl53u2oSyl2cxgpDdDDiB0rOrXUX5/9slw6BJr47UMgWe8d kY09FFLQBqZzwR9f8MZ8vYWda486nmRQ6G48+FRYmhYoKSuwgesWJCGboAXB6ChI0lO4 dieFlDK9fdl025MVnNQq/N6FLWDKW09FBn1lPjNyMdyvhwZdhZlFQcCsrPGh0jI65HBp jzqXfwYUEHi+Jxxoh7NnCPd+s9esfvYrKw2pHaKqcCwBvfBqzHxwzu6S2Wk3tUGHhSuc zB1Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=fp91l4yH; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id wy7-20020a170906fe0700b0078d9b5792a0si3921471ejb.319.2022.12.01.07.16.56; Thu, 01 Dec 2022 07:17:28 -0800 (PST) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=fp91l4yH; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230463AbiLAPKX (ORCPT + 99 others); Thu, 1 Dec 2022 10:10:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230259AbiLAPKW (ORCPT ); Thu, 1 Dec 2022 10:10:22 -0500 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id D7DB64AF36; Thu, 1 Dec 2022 07:10:21 -0800 (PST) Received: by linux.microsoft.com (Postfix, from userid 1112) id A26E620B83C2; Thu, 1 Dec 2022 07:10:21 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com A26E620B83C2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1669907421; bh=maFYL7cyZxlxfLpUgiy/yjCpc5B3NqziJG6bC+0IrIw=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=fp91l4yHXQcGzBLEnbJye9jkRJdsi18FHM0+PV2h/HuKpJpnMTg45upKwvga1Gn6L uSrPjrfAuvS4hI1kTHPvWW4E6ziKiPkU0OnCsu1LM2eZ88HM0m24/4HGuXkOA3w1pl 9PVnbmSLN8/BefvnaJ0Wzzzc/KFPhBTzXy2tGCyw= Date: Thu, 1 Dec 2022 07:10:21 -0800 From: Jeremi Piotrowski To: Ted Tso Cc: Jan Kara , Andreas Gruenbacher , linux-ext4@vger.kernel.org, stable@vger.kernel.org, Thilo Fromm Subject: Re: [PATCH] ext4: Fix deadlock due to mbcache entry corruption Message-ID: <20221201151021.GA18380@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <20221123193950.16758-1-jack@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221123193950.16758-1-jack@suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-19.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_PASS,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Wed, Nov 23, 2022 at 08:39:50PM +0100, Jan Kara wrote: > When manipulating xattr blocks, we can deadlock infinitely looping > inside ext4_xattr_block_set() where we constantly keep finding xattr > block for reuse in mbcache but we are unable to reuse it because its > reference count is too big. This happens because cache entry for the > xattr block is marked as reusable (e_reusable set) although its > reference count is too big. When this inconsistency happens, this > inconsistent state is kept indefinitely and so ext4_xattr_block_set() > keeps retrying indefinitely. > > The inconsistent state is caused by non-atomic update of e_reusable bit. > e_reusable is part of a bitfield and e_reusable update can race with > update of e_referenced bit in the same bitfield resulting in loss of one > of the updates. Fix the problem by using atomic bitops instead. > > CC: stable@vger.kernel.org > Fixes: 6048c64b2609 ("mbcache: add reusable flag to cache entries") > Reported-and-tested-by: Jeremi Piotrowski > Reported-by: Thilo Fromm > Link: https://lore.kernel.org/r/c77bf00f-4618-7149-56f1-b8d1664b9d07@linux.microsoft.com/ > Signed-off-by: Jan Kara Hi Ted, Could it be that you didn't see this email? We have users who are hitting this and are very eager to see this bugfix get merged and backported to stable. Thanks, Jeremi > --- > fs/ext4/xattr.c | 4 ++-- > fs/mbcache.c | 14 ++++++++------ > include/linux/mbcache.h | 9 +++++++-- > 3 files changed, 17 insertions(+), 10 deletions(-) > > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c > index 800ce5cdb9d2..08043aa72cf1 100644 > --- a/fs/ext4/xattr.c > +++ b/fs/ext4/xattr.c > @@ -1281,7 +1281,7 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode, > ce = mb_cache_entry_get(ea_block_cache, hash, > bh->b_blocknr); > if (ce) { > - ce->e_reusable = 1; > + set_bit(MBE_REUSABLE_B, &ce->e_flags); > mb_cache_entry_put(ea_block_cache, ce); > } > } > @@ -2042,7 +2042,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode, > } > BHDR(new_bh)->h_refcount = cpu_to_le32(ref); > if (ref == EXT4_XATTR_REFCOUNT_MAX) > - ce->e_reusable = 0; > + clear_bit(MBE_REUSABLE_B, &ce->e_flags); > ea_bdebug(new_bh, "reusing; refcount now=%d", > ref); > ext4_xattr_block_csum_set(inode, new_bh); > diff --git a/fs/mbcache.c b/fs/mbcache.c > index e272ad738faf..2a4b8b549e93 100644 > --- a/fs/mbcache.c > +++ b/fs/mbcache.c > @@ -100,8 +100,9 @@ int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key, > atomic_set(&entry->e_refcnt, 2); > entry->e_key = key; > entry->e_value = value; > - entry->e_reusable = reusable; > - entry->e_referenced = 0; > + entry->e_flags = 0; > + if (reusable) > + set_bit(MBE_REUSABLE_B, &entry->e_flags); > head = mb_cache_entry_head(cache, key); > hlist_bl_lock(head); > hlist_bl_for_each_entry(dup, dup_node, head, e_hash_list) { > @@ -165,7 +166,8 @@ static struct mb_cache_entry *__entry_find(struct mb_cache *cache, > while (node) { > entry = hlist_bl_entry(node, struct mb_cache_entry, > e_hash_list); > - if (entry->e_key == key && entry->e_reusable && > + if (entry->e_key == key && > + test_bit(MBE_REUSABLE_B, &entry->e_flags) && > atomic_inc_not_zero(&entry->e_refcnt)) > goto out; > node = node->next; > @@ -284,7 +286,7 @@ EXPORT_SYMBOL(mb_cache_entry_delete_or_get); > void mb_cache_entry_touch(struct mb_cache *cache, > struct mb_cache_entry *entry) > { > - entry->e_referenced = 1; > + set_bit(MBE_REFERENCED_B, &entry->e_flags); > } > EXPORT_SYMBOL(mb_cache_entry_touch); > > @@ -309,9 +311,9 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache, > entry = list_first_entry(&cache->c_list, > struct mb_cache_entry, e_list); > /* Drop initial hash reference if there is no user */ > - if (entry->e_referenced || > + if (test_bit(MBE_REFERENCED_B, &entry->e_flags) || > atomic_cmpxchg(&entry->e_refcnt, 1, 0) != 1) { > - entry->e_referenced = 0; > + clear_bit(MBE_REFERENCED_B, &entry->e_flags); > list_move_tail(&entry->e_list, &cache->c_list); > continue; > } > diff --git a/include/linux/mbcache.h b/include/linux/mbcache.h > index 2da63fd7b98f..97e64184767d 100644 > --- a/include/linux/mbcache.h > +++ b/include/linux/mbcache.h > @@ -10,6 +10,12 @@ > > struct mb_cache; > > +/* Cache entry flags */ > +enum { > + MBE_REFERENCED_B = 0, > + MBE_REUSABLE_B > +}; > + > struct mb_cache_entry { > /* List of entries in cache - protected by cache->c_list_lock */ > struct list_head e_list; > @@ -26,8 +32,7 @@ struct mb_cache_entry { > atomic_t e_refcnt; > /* Key in hash - stable during lifetime of the entry */ > u32 e_key; > - u32 e_referenced:1; > - u32 e_reusable:1; > + unsigned long e_flags; > /* User provided value - stable during lifetime of the entry */ > u64 e_value; > }; > -- > 2.35.3