2023-02-10 17:33:16

by Eric Whitney

[permalink] [raw]
Subject: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

A significant number of xfstests can cause ext4 to log one or more
warning messages when they are run on a test file system where the
inline_data feature has been enabled. An example:

"EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
#16385: comm fsstress: No space for directory leaf checksum. Please
run e2fsck -D."

The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
589, 626, 631, and 650.

In this situation, the warning message indicates a bug in the code that
performs the RENAME_WHITEOUT operation on a directory entry that has
been stored inline. It doesn't detect that the directory is stored
inline, and incorrectly attempts to compute a dirent block checksum on
the whiteout inode when creating it. This attempt fails as a result
of the integrity checking in get_dirent_tail (usually due to a failure
to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
is then emitted.

Fix this by simply collecting the inlined data state at the time the
search for the source directory entry is performed. Existing code
handles the rest, and this is sufficient to eliminate all spurious
warning messages produced by the tests above. Go one step further
and do the same in the code that resets the source directory entry in
the event of failure. The inlined state should be present in the
"old" struct, but given the possibility of a race there's no harm
in taking a conservative approach and getting that information again
since the directory entry is being reread anyway.

Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")

Signed-off-by: Eric Whitney <[email protected]>
---
fs/ext4/namei.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index dd28453d6ea3..924e16b239e0 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
int has_inline_data = 1;
ret = ext4_find_inline_entry(dir, fname, res_dir,
&has_inline_data);
- if (has_inline_data) {
- if (inlined)
- *inlined = 1;
+ if (inlined)
+ *inlined = has_inline_data;
+ if (has_inline_data)
goto cleanup_and_exit;
- }
}

if ((namelen <= 2) && (name[0] == '.') &&
@@ -3646,7 +3645,8 @@ static void ext4_resetent(handle_t *handle, struct ext4_renament *ent,
* so the old->de may no longer valid and need to find it again
* before reset old inode info.
*/
- old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
+ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
+ &old.inlined);
if (IS_ERR(old.bh))
retval = PTR_ERR(old.bh);
if (!old.bh)
@@ -3813,7 +3813,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
return retval;
}

- old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
+ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
+ &old.inlined);
if (IS_ERR(old.bh))
return PTR_ERR(old.bh);
/*
--
2.30.2



2023-02-16 12:09:44

by Ritesh Harjani (IBM)

[permalink] [raw]
Subject: Re: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

Eric Whitney <[email protected]> writes:

> A significant number of xfstests can cause ext4 to log one or more
> warning messages when they are run on a test file system where the
> inline_data feature has been enabled. An example:
>
> "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
> #16385: comm fsstress: No space for directory leaf checksum. Please
> run e2fsck -D."
>
> The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
> 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
> 589, 626, 631, and 650.

So, I guess since these were only ext4 warnings hence maybe these were
getting ignored? Because the tests were never failing?
Should we do something for such cases? Maybe adding this warning
detection in xfstests to fail the test case when these warnings are not
intended? e.g. such warnings should make the test fail by saying
something detected in dmesg. Except when these are expected for I/O error
injection tests, etc...

>
> In this situation, the warning message indicates a bug in the code that
> performs the RENAME_WHITEOUT operation on a directory entry that has
> been stored inline. It doesn't detect that the directory is stored
> inline, and incorrectly attempts to compute a dirent block checksum on
> the whiteout inode when creating it. This attempt fails as a result
> of the integrity checking in get_dirent_tail (usually due to a failure
> to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
> is then emitted.
>
> Fix this by simply collecting the inlined data state at the time the
> search for the source directory entry is performed. Existing code
> handles the rest, and this is sufficient to eliminate all spurious
> warning messages produced by the tests above. Go one step further
> and do the same in the code that resets the source directory entry in
> the event of failure. The inlined state should be present in the
> "old" struct, but given the possibility of a race there's no harm
> in taking a conservative approach and getting that information again
> since the directory entry is being reread anyway.

Thanks for the detailed explaination. This makes sense to me.

>
> Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")

So for your changes in ext4_resetent(), your above fixes tags make sense.
But what about the changes in ext4_rename() function. That was always
passing NULL as the last argument since the begining no?
Thinking from the backport perspective if and when required ;)


>
> Signed-off-by: Eric Whitney <[email protected]>
> ---
> fs/ext4/namei.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index dd28453d6ea3..924e16b239e0 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
> int has_inline_data = 1;
> ret = ext4_find_inline_entry(dir, fname, res_dir,
> &has_inline_data);
> - if (has_inline_data) {
> - if (inlined)
> - *inlined = 1;
> + if (inlined)
> + *inlined = has_inline_data;
> + if (has_inline_data)
> goto cleanup_and_exit;
> - }
> }

This looks like a nice cleanup!!

>
> if ((namelen <= 2) && (name[0] == '.') &&
> @@ -3646,7 +3645,8 @@ static void ext4_resetent(handle_t *handle, struct ext4_renament *ent,
> * so the old->de may no longer valid and need to find it again
> * before reset old inode info.
> */
> - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> + &old.inlined);
> if (IS_ERR(old.bh))
> retval = PTR_ERR(old.bh);
> if (!old.bh)
> @@ -3813,7 +3813,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
> return retval;
> }
>
> - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> + &old.inlined);
> if (IS_ERR(old.bh))
> return PTR_ERR(old.bh);
> /*
> --
> 2.30.2

2023-02-21 19:52:44

by Eric Whitney

[permalink] [raw]
Subject: Re: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

* Ritesh Harjani <[email protected]>:
> Eric Whitney <[email protected]> writes:
>
> > A significant number of xfstests can cause ext4 to log one or more
> > warning messages when they are run on a test file system where the
> > inline_data feature has been enabled. An example:
> >
> > "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
> > #16385: comm fsstress: No space for directory leaf checksum. Please
> > run e2fsck -D."
> >
> > The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
> > 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
> > 589, 626, 631, and 650.
>
> So, I guess since these were only ext4 warnings hence maybe these were
> getting ignored? Because the tests were never failing?
> Should we do something for such cases? Maybe adding this warning
> detection in xfstests to fail the test case when these warnings are not
> intended? e.g. such warnings should make the test fail by saying
> something detected in dmesg. Except when these are expected for I/O error
> injection tests, etc...
>

Hi, Ritesh:

Thanks for taking a look at this patch.

Right, the tests never failed. I was aware of the warning messages because
I routinely check the captured system log output from my upstream regression
runs. The messages weren't so much ignored as being set aside for the time
being. They have been appearing for some years, and I'd mentioned them in
past concalls. Since the warning messages simply suggest a recovery action
that's appropriate in some cases - running "e2fsck -D" - there wasn't much
interest in pursuing them, given there was no evidence of actual file system
damage or misbehavior. After becoming much more familiar with the inline_data
code myself recently I got suspicious and took a closer look.

I don't know that I've got a strong opinion about this, but I think that adding
the EXT4-fs warning and error message prefixes to the set of strings searched
for by _check_dmesg, say, to force a test failure might be more trouble than
it's worth (at least, in comparison with periodically grepping through the
logs). Adding ext4-specific filters to individual xfstests as needed,
including maintaining them over time and extending the coverage to new tests as
they appear, sounds like a lot of ongoing work for what might be a modest
return. IIRC, we haven't had a significant number of bugs associated with
EXT4-fs messages without test failures in the last several years, at least.

> >
> > In this situation, the warning message indicates a bug in the code that
> > performs the RENAME_WHITEOUT operation on a directory entry that has
> > been stored inline. It doesn't detect that the directory is stored
> > inline, and incorrectly attempts to compute a dirent block checksum on
> > the whiteout inode when creating it. This attempt fails as a result
> > of the integrity checking in get_dirent_tail (usually due to a failure
> > to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
> > is then emitted.
> >
> > Fix this by simply collecting the inlined data state at the time the
> > search for the source directory entry is performed. Existing code
> > handles the rest, and this is sufficient to eliminate all spurious
> > warning messages produced by the tests above. Go one step further
> > and do the same in the code that resets the source directory entry in
> > the event of failure. The inlined state should be present in the
> > "old" struct, but given the possibility of a race there's no harm
> > in taking a conservative approach and getting that information again
> > since the directory entry is being reread anyway.
>
> Thanks for the detailed explaination. This makes sense to me.
>
> >
> > Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")
>
> So for your changes in ext4_resetent(), your above fixes tags make sense.
> But what about the changes in ext4_rename() function. That was always
> passing NULL as the last argument since the begining no?
> Thinking from the backport perspective if and when required ;)
>

I'm guessing the intersection of the set of inline data and whiteout (overlayfs)
users is sufficiently small that this patch won't need backporting anytime
soon. :-)

The reason I picked that tag is that it's a fix for a fix to the patch that
originally added whiteout support to ext4. I wanted to convey that those
fixes should be applied in addition to this patch to get fully functional code.

Thanks,
Eric

>
> >
> > Signed-off-by: Eric Whitney <[email protected]>
> > ---
> > fs/ext4/namei.c | 13 +++++++------
> > 1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> > index dd28453d6ea3..924e16b239e0 100644
> > --- a/fs/ext4/namei.c
> > +++ b/fs/ext4/namei.c
> > @@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
> > int has_inline_data = 1;
> > ret = ext4_find_inline_entry(dir, fname, res_dir,
> > &has_inline_data);
> > - if (has_inline_data) {
> > - if (inlined)
> > - *inlined = 1;
> > + if (inlined)
> > + *inlined = has_inline_data;
> > + if (has_inline_data)
> > goto cleanup_and_exit;
> > - }
> > }
>
> This looks like a nice cleanup!!
>
> >
> > if ((namelen <= 2) && (name[0] == '.') &&
> > @@ -3646,7 +3645,8 @@ static void ext4_resetent(handle_t *handle, struct ext4_renament *ent,
> > * so the old->de may no longer valid and need to find it again
> > * before reset old inode info.
> > */
> > - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> > + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> > + &old.inlined);
> > if (IS_ERR(old.bh))
> > retval = PTR_ERR(old.bh);
> > if (!old.bh)
> > @@ -3813,7 +3813,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
> > return retval;
> > }
> >
> > - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> > + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> > + &old.inlined);
> > if (IS_ERR(old.bh))
> > return PTR_ERR(old.bh);
> > /*
> > --
> > 2.30.2

2023-02-23 14:27:07

by Ritesh Harjani (IBM)

[permalink] [raw]
Subject: Re: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

Eric Whitney <[email protected]> writes:

> * Ritesh Harjani <[email protected]>:
>> Eric Whitney <[email protected]> writes:
>>
>> > A significant number of xfstests can cause ext4 to log one or more
>> > warning messages when they are run on a test file system where the
>> > inline_data feature has been enabled. An example:
>> >
>> > "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
>> > #16385: comm fsstress: No space for directory leaf checksum. Please
>> > run e2fsck -D."
>> >
>> > The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
>> > 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
>> > 589, 626, 631, and 650.
>>
>> So, I guess since these were only ext4 warnings hence maybe these were
>> getting ignored? Because the tests were never failing?
>> Should we do something for such cases? Maybe adding this warning
>> detection in xfstests to fail the test case when these warnings are not
>> intended? e.g. such warnings should make the test fail by saying
>> something detected in dmesg. Except when these are expected for I/O error
>> injection tests, etc...
>>
>
> Hi, Ritesh:
>
> Thanks for taking a look at this patch.
>
> Right, the tests never failed. I was aware of the warning messages because
> I routinely check the captured system log output from my upstream regression
> runs. The messages weren't so much ignored as being set aside for the time
> being. They have been appearing for some years, and I'd mentioned them in
> past concalls. Since the warning messages simply suggest a recovery action
> that's appropriate in some cases - running "e2fsck -D" - there wasn't much
> interest in pursuing them, given there was no evidence of actual file system
> damage or misbehavior. After becoming much more familiar with the inline_data
> code myself recently I got suspicious and took a closer look.
>
> I don't know that I've got a strong opinion about this, but I think that adding
> the EXT4-fs warning and error message prefixes to the set of strings searched
> for by _check_dmesg, say, to force a test failure might be more trouble than
> it's worth (at least, in comparison with periodically grepping through the
> logs). Adding ext4-specific filters to individual xfstests as needed,
> including maintaining them over time and extending the coverage to new tests as
> they appear, sounds like a lot of ongoing work for what might be a modest

ok, sure. But let me keep an eye out for this... Let me watch out for any
such bugs in my internal tests run to see whether adding such a check can
help us catch any hidden problems. I was thinking this need not be done
in one shot but can be done incrementally/individually for many tests.
Hence it should be relatively easy if we do that on the need basis
maybe.
I am not sure though of the returns/benefits from this work at this point in
time, until I have reviewed the list of failures.

> return. IIRC, we haven't had a significant number of bugs associated with
> EXT4-fs messages without test failures in the last several years, at least.

ok. Let me also take a look at it. Thanks!

>
>> >
>> > In this situation, the warning message indicates a bug in the code that
>> > performs the RENAME_WHITEOUT operation on a directory entry that has
>> > been stored inline. It doesn't detect that the directory is stored
>> > inline, and incorrectly attempts to compute a dirent block checksum on
>> > the whiteout inode when creating it. This attempt fails as a result
>> > of the integrity checking in get_dirent_tail (usually due to a failure
>> > to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
>> > is then emitted.
>> >
>> > Fix this by simply collecting the inlined data state at the time the
>> > search for the source directory entry is performed. Existing code
>> > handles the rest, and this is sufficient to eliminate all spurious
>> > warning messages produced by the tests above. Go one step further
>> > and do the same in the code that resets the source directory entry in
>> > the event of failure. The inlined state should be present in the
>> > "old" struct, but given the possibility of a race there's no harm
>> > in taking a conservative approach and getting that information again
>> > since the directory entry is being reread anyway.
>>
>> Thanks for the detailed explaination. This makes sense to me.
>>
>> >
>> > Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")
>>
>> So for your changes in ext4_resetent(), your above fixes tags make sense.
>> But what about the changes in ext4_rename() function. That was always
>> passing NULL as the last argument since the begining no?
>> Thinking from the backport perspective if and when required ;)
>>
>
> I'm guessing the intersection of the set of inline data and whiteout (overlayfs)
> users is sufficiently small that this patch won't need backporting anytime
> soon. :-)
>
> The reason I picked that tag is that it's a fix for a fix to the patch that
> originally added whiteout support to ext4. I wanted to convey that those
> fixes should be applied in addition to this patch to get fully functional code.

Sure. Thanks for the explaination.

-ritesh

2023-03-07 10:54:19

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

On Fri 10-02-23 12:32:44, Eric Whitney wrote:
> A significant number of xfstests can cause ext4 to log one or more
> warning messages when they are run on a test file system where the
> inline_data feature has been enabled. An example:
>
> "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
> #16385: comm fsstress: No space for directory leaf checksum. Please
> run e2fsck -D."
>
> The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068,
> 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585,
> 589, 626, 631, and 650.
>
> In this situation, the warning message indicates a bug in the code that
> performs the RENAME_WHITEOUT operation on a directory entry that has
> been stored inline. It doesn't detect that the directory is stored
> inline, and incorrectly attempts to compute a dirent block checksum on
> the whiteout inode when creating it. This attempt fails as a result
> of the integrity checking in get_dirent_tail (usually due to a failure
> to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message
> is then emitted.
>
> Fix this by simply collecting the inlined data state at the time the
> search for the source directory entry is performed. Existing code
> handles the rest, and this is sufficient to eliminate all spurious
> warning messages produced by the tests above. Go one step further
> and do the same in the code that resets the source directory entry in
> the event of failure. The inlined state should be present in the
> "old" struct, but given the possibility of a race there's no harm
> in taking a conservative approach and getting that information again
> since the directory entry is being reread anyway.
>
> Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout")
>
> Signed-off-by: Eric Whitney <[email protected]>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> fs/ext4/namei.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index dd28453d6ea3..924e16b239e0 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
> int has_inline_data = 1;
> ret = ext4_find_inline_entry(dir, fname, res_dir,
> &has_inline_data);
> - if (has_inline_data) {
> - if (inlined)
> - *inlined = 1;
> + if (inlined)
> + *inlined = has_inline_data;
> + if (has_inline_data)
> goto cleanup_and_exit;
> - }
> }
>
> if ((namelen <= 2) && (name[0] == '.') &&
> @@ -3646,7 +3645,8 @@ static void ext4_resetent(handle_t *handle, struct ext4_renament *ent,
> * so the old->de may no longer valid and need to find it again
> * before reset old inode info.
> */
> - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> + &old.inlined);
> if (IS_ERR(old.bh))
> retval = PTR_ERR(old.bh);
> if (!old.bh)
> @@ -3813,7 +3813,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
> return retval;
> }
>
> - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
> + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
> + &old.inlined);
> if (IS_ERR(old.bh))
> return PTR_ERR(old.bh);
> /*
> --
> 2.30.2
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-03-08 04:33:46

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] ext4: fix RENAME_WHITEOUT handling for inline directories

On Fri, 10 Feb 2023 12:32:44 -0500, Eric Whitney wrote:
> A significant number of xfstests can cause ext4 to log one or more
> warning messages when they are run on a test file system where the
> inline_data feature has been enabled. An example:
>
> "EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode
> #16385: comm fsstress: No space for directory leaf checksum. Please
> run e2fsck -D."
>
> [...]

Applied, thanks!

[1/1] ext4: fix RENAME_WHITEOUT handling for inline directories
commit: c9f62c8b2dbf7240536c0cc9a4529397bb8bf38e

Best regards,
--
Theodore Ts'o <[email protected]>