ext4_handle_error
  EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
  if remount-ro
    ext4_commit_super(sb);
As the call path above shows, when the kernel hits a filesystem error, the
final superblock commit is not recorded in the journal, so sb->s_state will
be overwritten by journal recovery.
In some cases the metadata update and the superblock update are placed in
two separate transactions. If the earlier transaction is already in the
journal and ext4_handle_error is called while updating the superblock, the
filesystem is still in an error state even after the journal is recovered.
(I know this situation should not occur in theory, but I hit it while
testing quota, so I think we cannot fully rely on the kernel.)
So if the filesystem is marked as having errors before journal recovery,
keep the error state and perform a deep check later.
Signed-off-by: zhanchengbin <[email protected]>
---
e2fsck/journal.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/e2fsck/journal.c b/e2fsck/journal.c
index c7868d89..6f49321d 100644
--- a/e2fsck/journal.c
+++ b/e2fsck/journal.c
@@ -1683,6 +1683,7 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
errcode_t retval, recover_retval;
io_stats stats = 0;
unsigned long long kbytes_written = 0;
+ __u16 state = ctx->fs->super->s_state;
printf(_("%s: recovering journal\n"), ctx->device_name);
if (ctx->options & E2F_OPT_READONLY) {
@@ -1722,6 +1723,9 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
ctx->fs->flags |= EXT2_FLAG_MASTER_SB_ONLY;
ctx->fs->super->s_kbytes_written += kbytes_written;
+ if (EXT2_ERROR_FS | state)
+ ctx->fs->super->s_state = state | EXT2_ERROR_FS;
+
/* Set the superblock flags */
e2fsck_clear_recover(ctx, recover_retval != 0);
--
2.31.1
On Fri, Jun 02, 2023 at 04:27:59PM +0800, zhanchengbin wrote:
> ext4_handle_error
>   EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
>   if remount-ro
>     ext4_commit_super(sb);
> As the call path above shows, when the kernel hits a filesystem error, the
> final superblock commit is not recorded in the journal, so sb->s_state will
> be overwritten by journal recovery.
> In some cases the metadata update and the superblock update are placed in
> two separate transactions. If the earlier transaction is already in the
> journal and ext4_handle_error is called while updating the superblock, the
> filesystem is still in an error state even after the journal is recovered.
> (I know this situation should not occur in theory, but I hit it while
> testing quota, so I think we cannot fully rely on the kernel.)
> So if the filesystem is marked as having errors before journal recovery,
> keep the error state and perform a deep check later.
>
> Signed-off-by: zhanchengbin <[email protected]>
> ---
> e2fsck/journal.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/e2fsck/journal.c b/e2fsck/journal.c
> index c7868d89..6f49321d 100644
> --- a/e2fsck/journal.c
> +++ b/e2fsck/journal.c
> @@ -1683,6 +1683,7 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
> errcode_t retval, recover_retval;
> io_stats stats = 0;
> unsigned long long kbytes_written = 0;
> + __u16 state = ctx->fs->super->s_state;
>
> printf(_("%s: recovering journal\n"), ctx->device_name);
> if (ctx->options & E2F_OPT_READONLY) {
> @@ -1722,6 +1723,9 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
> ctx->fs->flags |= EXT2_FLAG_MASTER_SB_ONLY;
> ctx->fs->super->s_kbytes_written += kbytes_written;
>
> + if (EXT2_ERROR_FS | state)
Isn't this ^^^^^^^^^^^^^^^^^^^^^ expression always nonzero?
> + ctx->fs->super->s_state = state | EXT2_ERROR_FS;
/me doesn't understand this bit logic at all.
--D
> +
> /* Set the superblock flags */
> e2fsck_clear_recover(ctx, recover_retval != 0);
>
> --
> 2.31.1
>
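To make the review comment above concrete, here is a minimal standalone sketch of the difference between OR-ing and AND-ing a flag in a condition. The `DEMO_*` constants and the function names are hypothetical and exist only for this illustration; the values are assumed to mirror EXT2_VALID_FS and EXT2_ERROR_FS. This is not e2fsprogs code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration, mirroring EXT2_VALID_FS / EXT2_ERROR_FS. */
#define DEMO_VALID_FS 0x0001
#define DEMO_ERROR_FS 0x0002

/* The condition as posted: OR with a nonzero constant is never zero. */
static int posted_test(uint16_t state)
{
	return (DEMO_ERROR_FS | state) != 0;
}

/* A conventional flag test: AND isolates the error bit. */
static int masked_test(uint16_t state)
{
	return (state & DEMO_ERROR_FS) != 0;
}

/* Small self-check showing how the two conditions differ. */
static void demo_self_check(void)
{
	assert(posted_test(0));                              /* true with no flags set */
	assert(posted_test(DEMO_VALID_FS));                  /* true for a clean state */
	assert(!masked_test(DEMO_VALID_FS));                 /* clean state: not flagged */
	assert(masked_test(DEMO_VALID_FS | DEMO_ERROR_FS));  /* error bit set: flagged */
}
```

With the OR form the branch is taken for every possible `state`, so the guarded assignment would run unconditionally.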
On 2023/6/2 23:18, Darrick J. Wong wrote:
> On Fri, Jun 02, 2023 at 04:27:59PM +0800, zhanchengbin wrote:
>> ext4_handle_error
>>   EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
>>   if remount-ro
>>     ext4_commit_super(sb);
>> As the call path above shows, when the kernel hits a filesystem error, the
>> final superblock commit is not recorded in the journal, so sb->s_state will
>> be overwritten by journal recovery.
>> In some cases the metadata update and the superblock update are placed in
>> two separate transactions. If the earlier transaction is already in the
>> journal and ext4_handle_error is called while updating the superblock, the
>> filesystem is still in an error state even after the journal is recovered.
>> (I know this situation should not occur in theory, but I hit it while
>> testing quota, so I think we cannot fully rely on the kernel.)
>> So if the filesystem is marked as having errors before journal recovery,
>> keep the error state and perform a deep check later.
>>
>> Signed-off-by: zhanchengbin <[email protected]>
>> ---
>> e2fsck/journal.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/e2fsck/journal.c b/e2fsck/journal.c
>> index c7868d89..6f49321d 100644
>> --- a/e2fsck/journal.c
>> +++ b/e2fsck/journal.c
>> @@ -1683,6 +1683,7 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
>> errcode_t retval, recover_retval;
>> io_stats stats = 0;
>> unsigned long long kbytes_written = 0;
>> + __u16 state = ctx->fs->super->s_state;
>>
>> printf(_("%s: recovering journal\n"), ctx->device_name);
>> if (ctx->options & E2F_OPT_READONLY) {
>> @@ -1722,6 +1723,9 @@ errcode_t e2fsck_run_ext3_journal(e2fsck_t ctx)
>> ctx->fs->flags |= EXT2_FLAG_MASTER_SB_ONLY;
>> ctx->fs->super->s_kbytes_written += kbytes_written;
>>
>> + if (EXT2_ERROR_FS | state)
>
> Isn't this ^^^^^^^^^^^^^^^^^^^^^ expression always nonzero?
>
>> + ctx->fs->super->s_state = state | EXT2_ERROR_FS;
>
> /me doesn't understand this bit logic at all.
You can check this stack:
ext4_handle_error
  ext4_commit_super
    ext4_update_super
      if (sbi->s_add_error_count > 0) {
        es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
- bin.
>
> --D
>
>> +
>> /* Set the superblock flags */
>> e2fsck_clear_recover(ctx, recover_retval != 0);
>>
>> --
>> 2.31.1
>>
> .
>
Okay, I found that this patch series fixes my problem:
https://patchwork.ozlabs.org/project/linux-ext4/list/?series=342467
- bin.
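
For reference, the behaviour the original e2fsck patch was aiming for (remember the error flag across journal replay) might be sketched as below. Everything here is an assumption made for illustration: `struct demo_super`, `demo_replay`, `demo_recover_keep_error` and the `DEMO_*` constants are hypothetical stand-ins, not e2fsprogs APIs. The sketch also uses a masked `&` test and re-applies only the error bit, which differs from the assignment in the posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration, mirroring EXT2_VALID_FS / EXT2_ERROR_FS. */
#define DEMO_VALID_FS 0x0001
#define DEMO_ERROR_FS 0x0002

/* Hypothetical stand-in for the in-memory superblock. */
struct demo_super {
	uint16_t s_state;
};

/* Hypothetical replay: the journaled superblock copy replaces s_state. */
static void demo_replay(struct demo_super *sb, uint16_t journaled_state)
{
	assert(sb);
	sb->s_state = journaled_state;
}

/*
 * Save the state before replay, then put the error bit back afterwards
 * if it was set, so that a later full check is still triggered.
 */
static void demo_recover_keep_error(struct demo_super *sb,
				    uint16_t journaled_state)
{
	uint16_t saved;

	assert(sb);
	saved = sb->s_state;

	demo_replay(sb, journaled_state);
	if (saved & DEMO_ERROR_FS)
		sb->s_state |= DEMO_ERROR_FS;
}
```

If the on-disk state carried the error bit but the journaled copy did not, the state after `demo_recover_keep_error` still has the error bit set; a state that was clean before replay is left exactly as the journal wrote it.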