Date: Sun, 22 Aug 2021 13:17:34 +0800
From: Gao Xiang
To: Eric Whitney
Cc: Jeffle Xu, tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, joseph.qi@linux.alibaba.com
Subject: Re: [PATCH] ext4: fix reserved space counter leakage
References: <20210819091351.19297-1-jefflexu@linux.alibaba.com> <20210820164556.GA30851@localhost.localdomain>
In-Reply-To: <20210820164556.GA30851@localhost.localdomain>
X-Mailing-List: linux-ext4@vger.kernel.org

On Fri, Aug 20, 2021 at 12:45:56PM -0400, Eric Whitney wrote:
> * Jeffle Xu:
> > When ext4_es_insert_delayed_block() returns an error, e.g.,
> > ENOMEM, previously reserved space is not released as part of error handling,
> > in which case @s_dirtyclusters_counter is left over. Since this delayed
> > extent fails to be inserted into the extent status tree, when the inode is
> > written back, the extra @s_dirtyclusters_counter won't be subtracted and
> > remains there forever.
> >
> > This can lead to /sys/fs/ext4//delayed_allocation_blocks remaining
> > non-zero even when syncfs is executed on the filesystem.
> >
>
> Hi:
>
> I think the fix below looks fine. However, this comment doesn't look right
> to me. Are you really seeing delayed_allocation_blocks values that remain
> incorrectly elevated across last closes (or across file system unmounts and
> remounts)? s_dirtyclusters_counter isn't written out to stable storage -
> it's an in-memory only variable that's created when a file is first opened
> and destroyed on last close.

Hmm... let me explain a bit about this.

It can be reproduced easily by fault injection with the code modified
below:

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 9a3a8996aacf..29dc0da5960c 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -794,6 +794,9 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes)
 		}
 	}
 
+	if (!(ktime_get_ns() % 3)) {
+		return -ENOMEM;
+	}
 	es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len,
 				  newes->es_pblk);
 	if (!es)

and then run a loop:

while true; do dd if=/dev/zero of=aaa bs=8192 count=10000; sync; rm -rf aaa; done

Once "Cannot allocate memory" is reported, s_dirtyclusters_counter has
already leaked. This makes df and the free space accounting incorrect
for this mount.

If my understanding is correct, in principle we should also check with
"WARN_ON(ei->i_reserved_data_blocks)" in the inode evict path, since it
should be 0 at that point.
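To make the accounting concrete, here is a minimal user-space sketch in C (not kernel code) of the sequence described above. All names (model_fs, model_inode and the model_* helpers) are hypothetical stand-ins for illustration only. Writeback is simplified to "release one reservation per block tracked in the extent status tree", which is an assumption and not the actual ext4 logic.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for sbi->s_dirtyclusters_counter. */
struct model_fs {
	long dirty_clusters;
};

/* Hypothetical stand-in for the per-inode state. */
struct model_inode {
	long reserved_blocks;	/* models ei->i_reserved_data_blocks */
	long tracked_delayed;	/* delayed blocks recorded in the es tree */
};

/*
 * Models the unpatched insert path: space is reserved first, and the
 * reservation is kept even if the extent status insertion then fails.
 */
static int model_insert_unpatched(struct model_fs *fs,
				  struct model_inode *inode,
				  bool insert_fails)
{
	fs->dirty_clusters++;
	inode->reserved_blocks++;
	if (insert_fails)
		return -ENOMEM;	/* reservation is not rolled back */
	inode->tracked_delayed++;
	return 0;
}

/* Assumption: writeback only releases reservations it can find in the tree. */
static void model_writeback(struct model_fs *fs, struct model_inode *inode)
{
	fs->dirty_clusters -= inode->tracked_delayed;
	inode->reserved_blocks -= inode->tracked_delayed;
	inode->tracked_delayed = 0;
}

/* Models the suggested evict-time check: true means WARN_ON would fire. */
static bool model_evict_would_warn(const struct model_inode *inode)
{
	return inode->reserved_blocks != 0;
}

/*
 * Runs `good` successful and `failed` failing inserts, then writeback.
 * Returns the filesystem-wide counter that is left over afterwards.
 */
static long model_leak_after_sync(int good, int failed, bool *evict_warns)
{
	struct model_fs fs = { 0 };
	struct model_inode inode = { 0, 0 };
	int i;

	for (i = 0; i < good; i++)
		model_insert_unpatched(&fs, &inode, false);
	for (i = 0; i < failed; i++)
		model_insert_unpatched(&fs, &inode, true);
	model_writeback(&fs, &inode);
	if (evict_warns)
		*evict_warns = model_evict_would_warn(&inode);
	return fs.dirty_clusters;
}
```

Under these assumptions, every failed insertion leaves exactly one reservation behind after writeback, and the per-inode count is non-zero at evict time, which is the condition the suggested WARN_ON would catch.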
Thanks,
Gao Xiang

>
> Eric
>
> > Fixes: 51865fda28e5 ("ext4: let ext4 maintain extent status tree")
> > Cc:
> > Reported-by: Gao Xiang
> > Signed-off-by: Jeffle Xu
> > ---
> >  fs/ext4/inode.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 82087657860b..7f15da370281 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -1650,6 +1650,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
> >  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> >  	int ret;
> >  	bool allocated = false;
> > +	bool reserved = false;
> >
> >  	/*
> >  	 * If the cluster containing lblk is shared with a delayed,
> > @@ -1666,6 +1667,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
> >  		ret = ext4_da_reserve_space(inode);
> >  		if (ret != 0)   /* ENOSPC */
> >  			goto errout;
> > +		reserved = true;
> >  	} else {   /* bigalloc */
> >  		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
> >  			if (!ext4_es_scan_clu(inode,
> > @@ -1678,6 +1680,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
> >  					ret = ext4_da_reserve_space(inode);
> >  					if (ret != 0)   /* ENOSPC */
> >  						goto errout;
> > +					reserved = true;
> >  				} else {
> >  					allocated = true;
> >  				}
> > @@ -1688,6 +1691,8 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
> >  	}
> >
> >  	ret = ext4_es_insert_delayed_block(inode, lblk, allocated);
> > +	if (ret && reserved)
> > +		ext4_da_release_space(inode, 1);
> >
> >  errout:
> >  	return ret;
> > --
> > 2.27.0
> >
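The error handling added by the quoted patch can be sketched in user space as follows. This is only an illustration: model_counters and model_insert_patched() are hypothetical names, and the bigalloc decision about whether a new reservation is needed is collapsed into a single needs_reservation flag, which is a simplification.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters touched by the patch. */
struct model_counters {
	long dirty_clusters;	/* models s_dirtyclusters_counter */
	long tracked_delayed;	/* delayed blocks recorded in the es tree */
};

/*
 * Models the patched flow: remember whether this call made a
 * reservation, and undo only that reservation if the insertion fails.
 */
static int model_insert_patched(struct model_counters *c,
				bool needs_reservation,
				bool insert_fails)
{
	bool reserved = false;

	if (needs_reservation) {
		c->dirty_clusters++;	/* models ext4_da_reserve_space() */
		reserved = true;
	}

	if (insert_fails) {
		if (reserved)
			c->dirty_clusters--;	/* models ext4_da_release_space(inode, 1) */
		return -ENOMEM;
	}

	c->tracked_delayed++;
	return 0;
}
```

The point of the flag in this sketch is that a failed insertion which made no new reservation (for example a block in a cluster that is already accounted for) must not release anything, otherwise the counter would be decremented too far.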