Received: by 2002:a05:6a10:1d13:0:0:0:0 with SMTP id pp19csp398993pxb; Thu, 26 Aug 2021 05:56:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyzQzTyemTUNQCiKTldjt4vT7uTigGWkbRlHjeJC23y2ryhc3lpFNcltW6EYIPNSIQ9rqOE X-Received: by 2002:a17:907:1108:: with SMTP id qu8mr4246092ejb.58.1629982561111; Thu, 26 Aug 2021 05:56:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1629982561; cv=none; d=google.com; s=arc-20160816; b=RMUZlgPXpKKPHsyE9GFbQlEuV7oh0kA4UWMuTujKmmQ38JZD+byw2ul/h3zZRViTw4 KbA/0DXEvLvr6bfTkDwYBPDWH9Oi2NpDbloz0s4w2DSbN5pwdSZxDMFtlrJiqYgUYjYg I/nGMhAeFOwwzTAJGNCmNDPG1UVqo/X7LpwAk35REf3XDP0D5sL3eymUvksD13xHfLIc xycN+K2zyhfA6IzL3yCU7njGQTc77RtJGBPZXt/UlXEY/hrJAq0wE89B3f9FjHWn9LIZ 7q/90E7lU8/G1qYtRP7AaC17mE87Nne80kPVneyX0bcTZ2bNvk1qX+2S+JsXbi8UaqPq Pmtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=CfkMsleLg2YSWs1JrWdG5Tp4eeoQk4FaXsATm2+xXgQ=; b=qpVko0sKKKa8KXBf7xebT8mLW3rBEDT5HaNyHVpwxkJ/lvkuyh2GhTjeavZTgFdM9e p9onBCcbfI2YKYBUwK7NK077WSUu4EDpBWIhi2VSxvCReG7KhNsE7LSQEHiUSBSCj0rg uXRqtTfFjGaos+ynu/Os3EMtv6P3caYHEsAQoFjuf5hfk9u3ZZOChe62c5hQGk8XTgA+ WpBV93CtqQzge9WbsrvaxD9yIh1VU46ObV5A/Y3HJz2/WwwLP5bMpvL894M1e7gB2zZS DGmSs8lc9ZcyfCvglD2GNs3EY3I2bVdM6n4el04GewmEOZboRbvQbcuIkTTpzNv1/5bw xvYA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g16si2265865edp.400.2021.08.26.05.55.04; Thu, 26 Aug 2021 05:56:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232506AbhHZMzp (ORCPT + 99 others); Thu, 26 Aug 2021 08:55:45 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:18974 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241651AbhHZMzp (ORCPT ); Thu, 26 Aug 2021 08:55:45 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GwN4p2JkZzbbZj; Thu, 26 Aug 2021 20:51:06 +0800 (CST) Received: from [10.174.178.134] (10.174.178.134) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2308.8; Thu, 26 Aug 2021 20:54:56 +0800 Subject: Re: [PATCH v3 4/4] ext4: prevent getting empty inode buffer To: CC: , , , References: <20210821065450.1397451-1-yi.zhang@huawei.com> <20210821065450.1397451-5-yi.zhang@huawei.com> From: Zhang Yi Message-ID: Date: Thu, 26 Aug 2021 20:54:56 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1 MIME-Version: 1.0 In-Reply-To: <20210821065450.1397451-5-yi.zhang@huawei.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.178.134] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On 2021/8/21 14:54, Zhang Yi wrote: > In ext4_get_inode_loc(), we may skip IO and get an zero && uptodate > inode buffer when the inode monopolize an inode block for performance > reason. For most cases, ext4_mark_iloc_dirty() will fill the inode > buffer to make it fine, but we could miss this call if something bad > happened. Finally, __ext4_get_inode_loc_noinmem() may probably get an > empty inode buffer and trigger ext4 error. > > For example, if we remove a nonexistent xattr on inode A, > ext4_xattr_set_handle() will return ENODATA before invoking > ext4_mark_iloc_dirty(), it will left an uptodate but zero buffer. We > will get checksum error message in ext4_iget() when getting inode again. > > EXT4-fs error (device sda): ext4_lookup:1784: inode #131074: comm cat: iget: checksum invalid > > Even worse, if we allocate another inode B at the same inode block, it > will corrupt the inode A on disk when write back inode B. > > So this patch postpone the initialization and mark buffer uptodate logic > until shortly before we fill correct inode data in ext4_do_update_inode() > if skip read I/O, ensure the buffer is really uptodate. > > Signed-off-by: Zhang Yi > Reviewed-by: Jan Kara > --- > fs/ext4/inode.c | 27 ++++++++++++++++++++++++--- > 1 file changed, 24 insertions(+), 3 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 8323d3e8f393..000abb5696b0 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -4367,9 +4367,12 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino, > } > brelse(bitmap_bh); > if (i == start + inodes_per_block) { > - /* all other inodes are free, so skip I/O */ > - memset(bh->b_data, 0, bh->b_size); > - set_buffer_uptodate(bh); > + /* > + * All other inodes are free, skip I/O. Return > + * uninitialized buffer immediately, initialization > + * is postponed until shortly before we fill inode > + * contents. > + */ > unlock_buffer(bh); > goto has_buffer; > } > @@ -5028,6 +5031,24 @@ static int ext4_do_update_inode(handle_t *handle, > gid_t i_gid; > projid_t i_projid; > > + /* > + * If the buffer is not uptodate, it means all information of the > + * inode is in memory and we got this buffer without reading the > + * block. We must be cautious that once we mark the buffer as > + * uptodate, we rely on filling in the correct inode data later > + * in this function. Otherwise if we left uptodate buffer without > + * copying proper inode contents, we could corrupt the inode on > + * disk after allocating another inode in the same block. > + */ > + if (!buffer_uptodate(bh)) { > + lock_buffer(bh); > + if (!buffer_uptodate(bh)) { > + memset(bh->b_data, 0, bh->b_size); > + set_buffer_uptodate(bh); > + } > + unlock_buffer(bh); > + } Hi, Jan. I notice that above solution is not correct. The problem is still in ext4_xattr_set_handle(), if we set a new xattr entry in a pure inode, the above hunk may zero out the ibody xattr entry we just set up in ext4_xattr_ibody_set(). I guess we could not 'zero out buffer && mark buffer uptodate' here, maybe __ext4_get_inode_loc() should return a really initialized buffer, or else it's still fragile and hard to guarantee that the 'zero out' and 'postponed set_buffer_uptodate()' will not zero out something we just set or overwrite something we updated concurrently. How about factor out the filling inode contents from ext4_do_update_inode() into maybe ext4_fill_raw_inode(), and call it in __ext4_get_inode_loc() ? Please see my v4 patchset. Thanks, Yi.