2023-07-31 01:07:50

by Stephen Zhang

Subject: [PATCH v2] ext4: Fix rec_len verify error

From: Shida Zhang <[email protected]>

With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
a problem occurred when more than 13 million files were directly created
under a directory:

EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum

When enough files are created, fake_dirent->rec_len will be 0xffff,
which does not equal the blocksize 65536, i.e. 0x10000.

This does not happen when the blocksize is 4k: when enough files are
created, fake_dirent->rec_len will be 0x1000, which equals the
blocksize 4k, i.e. 0x1000.

The problem seems to be related to the limitation of the 16-bit field
when the blocksize is set to 64k. To address this, modify the check so
that it handles this case properly.

Signed-off-by: Shida Zhang <[email protected]>
---
v1->v2:
Use a better way to check the condition, as suggested by Andreas.

fs/ext4/namei.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0caf6c730ce3..fffed95f8531 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
 	struct ext4_dir_entry *dp;
 	struct dx_root_info *root;
 	int count_offset;
+	int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
 
-	if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
+	if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
 		count_offset = 8;
 	else if (le16_to_cpu(dirent->rec_len) == 12) {
 		dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
--
2.27.0
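
For readers following along, the mismatch described in the commit message
can be reproduced in userspace. The sketch below is only an illustration:
rec_len_plain(), rec_len_decode() and MAX_REC_LEN are hypothetical names,
the decoding is modelled on the large-page branch of
ext4_rec_len_from_disk() as I understand it, and the le16_to_cpu() step is
left out by assuming the value is already in host byte order. Check
fs/ext4/ext4.h for the authoritative helper.

```c
#include <stdint.h>

/* Hypothetical name: the largest value a 16-bit rec_len field can hold. */
#define MAX_REC_LEN 0xffffu

/*
 * Plain read of the 16-bit field, which is what the old check
 * effectively compared against the block size.
 */
static unsigned int rec_len_plain(uint16_t disk)
{
	return disk;
}

/*
 * Decoded read (assumed behaviour for 64k blocks): 0xffff and 0 both
 * mean "the whole block"; otherwise the low two bits carry bits 16-17
 * of the real length, since record lengths are 4-byte aligned.
 */
static unsigned int rec_len_decode(uint16_t disk, unsigned int blocksize)
{
	unsigned int len = disk;

	if (len == MAX_REC_LEN || len == 0)
		return blocksize;
	return (len & 65532) | ((len & 3) << 16);
}
```

With a 64k block, rec_len_plain(0xffff) yields 65535 and so never matches
the block size, while rec_len_decode(0xffff, 65536) yields 65536. For a 4k
block, both functions return 4096 for a stored 0x1000, which is why the
problem does not show up there.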



2023-07-31 15:54:31

by Darrick J. Wong

Subject: Re: [PATCH v2] ext4: Fix rec_len verify error

On Mon, Jul 31, 2023 at 09:01:04AM +0800, zhangshida wrote:
> From: Shida Zhang <[email protected]>
>
> With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> a problem occurred when more than 13 million files were directly created
> under a directory:
>
> EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
>
> When enough files are created, fake_dirent->rec_len will be 0xffff,
> which does not equal the blocksize 65536, i.e. 0x10000.
>
> This does not happen when the blocksize is 4k: when enough files are
> created, fake_dirent->rec_len will be 0x1000, which equals the
> blocksize 4k, i.e. 0x1000.
>
> The problem seems to be related to the limitation of the 16-bit field
> when the blocksize is set to 64k. To address this, modify the check so
> that it handles this case properly.

urughghahrhrhr<shudder>

Sorry that I missed that rec_len is an encoded number, not a plain le16
integer...

> Signed-off-by: Shida Zhang <[email protected]>
> ---
> v1->v2:
> Use a better way to check the condition, as suggested by Andreas.
>
> fs/ext4/namei.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 0caf6c730ce3..fffed95f8531 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
> struct ext4_dir_entry *dp;
> struct dx_root_info *root;
> int count_offset;
> + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
>
> - if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> + if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
> count_offset = 8;
> else if (le16_to_cpu(dirent->rec_len) == 12) {

...but what about all the other le16_to_cpu(ext4_dir_entry{,_2}.rec_len)
accesses in this file? Don't those also need to be converted to
ext4_rec_len_from_disk calls?

Also,
Fixes: dbe89444042ab ("ext4: Calculate and verify checksums for htree nodes")

--D

> dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
> --
> 2.27.0
>

2023-08-01 02:33:18

by Stephen Zhang

Subject: Re: [PATCH v2] ext4: Fix rec_len verify error

On Mon, Jul 31, 2023 at 23:41, Darrick J. Wong <[email protected]> wrote:
>
> On Mon, Jul 31, 2023 at 09:01:04AM +0800, zhangshida wrote:
> > From: Shida Zhang <[email protected]>
> >
> > With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> > a problem occurred when more than 13 million files were directly created
> > under a directory:
> >
> > EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> > EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> > EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
> >
> > When enough files are created, fake_dirent->rec_len will be 0xffff,
> > which does not equal the blocksize 65536, i.e. 0x10000.
> >
> > This does not happen when the blocksize is 4k: when enough files are
> > created, fake_dirent->rec_len will be 0x1000, which equals the
> > blocksize 4k, i.e. 0x1000.
> >
> > The problem seems to be related to the limitation of the 16-bit field
> > when the blocksize is set to 64k. To address this, modify the check so
> > that it handles this case properly.
>
> urughghahrhrhr<shudder>
>
> Sorry that I missed that rec_len is an encoded number, not a plain le16
> integer...
>

Yep, that's really a point that is easy to forget...

> > Signed-off-by: Shida Zhang <[email protected]>
> > ---
> > v1->v2:
> > Use a better way to check the condition, as suggested by Andreas.
> >
> > fs/ext4/namei.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> > index 0caf6c730ce3..fffed95f8531 100644
> > --- a/fs/ext4/namei.c
> > +++ b/fs/ext4/namei.c
> > @@ -445,8 +445,9 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
> > struct ext4_dir_entry *dp;
> > struct dx_root_info *root;
> > int count_offset;
> > + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
> >
> > - if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> > + if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
> > count_offset = 8;
> > else if (le16_to_cpu(dirent->rec_len) == 12) {
>
> ...but what about all the other le16_to_cpu(ext4_dir_entry{,_2}.rec_len)
> accesses in this file? Don't those also need to be converted to
> ext4_rec_len_from_disk calls?
>
> Also,
> Fixes: dbe89444042ab ("ext4: Calculate and verify checksums for htree nodes")
>

Thanks for the suggestion. I will convert all the other rec_len accesses
in this file in v3.

Cheers,
Shida





> --D
>
> > dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
> > --
> > 2.27.0
> >