If we failed to add new entry on rename whiteout, we cannot reset the
old->de entry directly, because the old->de could have moved from under
us during make indexed dir. So find the old entry again before reset is
needed, otherwise it may corrupt the filesystem as below.
/dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED.
/dev/sda: Unattached inode 75
/dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
Fixes: 6b4b8e6b4ad ("ext4: fix bug for rename with RENAME_WHITEOUT")
Cc: [email protected]
Signed-off-by: zhangyi (F) <[email protected]>
---
fs/ext4/namei.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index cf652ba3e74d..17d9400563fa 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -3602,6 +3602,31 @@ static int ext4_setent(handle_t *handle, struct ext4_renament *ent,
return retval;
}
+static void ext4_resetent(handle_t *handle, struct ext4_renament *ent,
+ unsigned ino, unsigned file_type)
+{
+ struct ext4_renament old = *ent;
+ int retval = 0;
+
+ /*
+ * old->de could have moved from under us during make indexed dir,
+ * so the old->de may no longer valid and need to find it again
+ * before reset old inode info.
+ */
+ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
+ if (IS_ERR(old.bh))
+ retval = PTR_ERR(old.bh);
+ if (!old.bh)
+ retval = -ENOENT;
+ if (retval) {
+ ext4_std_error(old.dir->i_sb, retval);
+ return;
+ }
+
+ ext4_setent(handle, &old, ino, file_type);
+ brelse(old.bh);
+}
+
static int ext4_find_delete_entry(handle_t *handle, struct inode *dir,
const struct qstr *d_name)
{
@@ -3924,8 +3949,8 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
end_rename:
if (whiteout) {
if (retval) {
- ext4_setent(handle, &old,
- old.inode->i_ino, old_file_type);
+ ext4_resetent(handle, &old,
+ old.inode->i_ino, old_file_type);
drop_nlink(whiteout);
}
unlock_new_inode(whiteout);
--
2.25.4
On Wed, Mar 03, 2021 at 09:17:02PM +0800, zhangyi (F) wrote:
> If we failed to add new entry on rename whiteout, we cannot reset the
> old->de entry directly, because the old->de could have moved from under
> us during make indexed dir. So find the old entry again before reset is
> needed, otherwise it may corrupt the filesystem as below.
>
> /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED.
> /dev/sda: Unattached inode 75
> /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
>
> ....
>
> + /*
> + * old->de could have moved from under us during make indexed dir,
> + * so the old->de may no longer valid and need to find it again
> + * before reset old inode info.
> + */
> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> + if (IS_ERR(old.bh))
> + retval = PTR_ERR(old.bh);
> + if (!old.bh)
> + retval = -ENOENT;
> + if (retval) {
> + ext4_std_error(old.dir->i_sb, retval);
So if the directory entry may have been deleted out from under us, an
ENOENT failure might happen under normal circumstances, shouldn't it?
In that case, ext4_std_error() will declare that the file system is
inconsistent, potentially resulting in the file system to be remounted
read-only, or causing the system to panic. So calling
ext4_std_error() needs to be done carefully.
Are we sure that calling ext4_std_error() is the right thing to do in
the case where ext4_find_entry() returns ENOENT?
- Ted
On 2021/3/11 23:43, Theodore Ts'o wrote:
> On Wed, Mar 03, 2021 at 09:17:02PM +0800, zhangyi (F) wrote:
>> If we failed to add new entry on rename whiteout, we cannot reset the
>> old->de entry directly, because the old->de could have moved from under
>> us during make indexed dir. So find the old entry again before reset is
>> needed, otherwise it may corrupt the filesystem as below.
>>
>> /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED.
>> /dev/sda: Unattached inode 75
>> /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
>>
>> ....
>>
>> + /*
>> + * old->de could have moved from under us during make indexed dir,
>> + * so the old->de may no longer valid and need to find it again
>> + * before reset old inode info.
>> + */
>> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
>> + if (IS_ERR(old.bh))
>> + retval = PTR_ERR(old.bh);
>> + if (!old.bh)
>> + retval = -ENOENT;
>> + if (retval) {
>> + ext4_std_error(old.dir->i_sb, retval);
>
>
> So if the directory entry may have been deleted out from under us, an
> ENOENT failure might happen under normal circumstances, shouldn't it?
> > In that case, ext4_std_error() will declare that the file system is
> inconsistent, potentially resulting in the file system to be remounted
> read-only, or causing the system to panic. So calling
> ext4_std_error() needs to be done carefully.
>
> Are we sure that calling ext4_std_error() is the right thing to do in
> the case where ext4_find_entry() returns ENOENT?
>
Hi, Ted
In this error path of whiteout rename, we want to restore the old inode
number and old name back to the old entry, it's just a rollback operation.
The old entry will stay where it was in common cases, but it can be moved
from the first block to the leaf block during make indexed dir for one
special case, but it cannot be deleted in theory. So if we cannot find it
again, there must some bad thing happen and the filesystem may probably
inconsistency. So I calling ext4_std_error() here,or am I missing something?
Thanks,
Yi.
On Fri, Mar 12, 2021 at 10:01:50AM +0800, zhangyi (F) wrote:
>
> In this error path of whiteout rename, we want to restore the old inode
> number and old name back to the old entry, it's just a rollback operation.
> The old entry will stay where it was in common cases, but it can be moved
> from the first block to the leaf block during make indexed dir for one
> special case, but it cannot be deleted in theory. So if we cannot find it
> again, there must some bad thing happen and the filesystem may probably
> inconsistency. So I calling ext4_std_error() here,or am I missing something?
After looking at this more closely, I agree, this should be OK. The
directory is going to be locked, so it shouldn't be changing out from
under us.
Thanks, applied.
- Ted