From: Ye Bin <[email protected]>
Diff v2 vs v1:
Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
ext4_load_journal() to jbd2_journal_recover().
When doing fault injection testing, we hit the following issue:
EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
Without any file system check in between, the file system is clean on the
second mount. In theory, the kernel should never clear the fs error flag.
In errors=remount-ro mode the last super block is committed directly, so the
super block in the journal is not up to date. During journal recovery, the
up-to-date super block is overwritten by journal data. If every super block
write after journal recovery fails, the file system error flag is lost, and
a later "fsck -a" will not check and repair the file system thoroughly.
To solve the above issue we need extra handling when the super block is
recovered from the journal.
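
For illustration only, the failure mode described above can be modelled in
user space. This is a minimal sketch, not kernel code: struct toy_super,
TOY_ERROR_FS, toy_replay() and toy_flag_survives_replay() are hypothetical
stand-ins, and journal replay is reduced to a plain overwrite of the super
block with the stale copy held in the journal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define TOY_ERROR_FS 0x0002u

struct toy_super {
	uint16_t state;       /* carries TOY_ERROR_FS once an error was seen */
	uint32_t error_count; /* stands in for the error information region */
};

/* Journal replay modelled as a plain overwrite of the on-disk block. */
static void toy_replay(struct toy_super *on_disk,
		       const struct toy_super *journal_copy)
{
	assert(on_disk && journal_copy);
	memcpy(on_disk, journal_copy, sizeof(*on_disk));
}

/* Returns 1 if the error flag survives replay, 0 if it is lost. */
static int toy_flag_survives_replay(void)
{
	/* The journal holds a super block copy from before the error. */
	struct toy_super journal_copy = { .state = 0, .error_count = 0 };
	struct toy_super on_disk = journal_copy;

	/* Error handling writes the super block directly, not via the journal. */
	on_disk.state |= TOY_ERROR_FS;
	on_disk.error_count = 1;

	/* The next mount replays the journal over the newer super block. */
	toy_replay(&on_disk, &journal_copy);

	return (on_disk.state & TOY_ERROR_FS) != 0;
}
```

In this model toy_flag_survives_replay() reports that the flag is gone after
replay, which matches the "clean on second mount" symptom in the log above.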
Ye Bin (6):
jbd2: introduce callback for recovery journal
ext4: introudce helper for jounral recover handle
jbd2: do extra handle when do journal recovery
ext4: remove backup for super block when recovery journal
ext4: fix super block checksum error
ext4: make sure fs error flag setted before clear journal error
fs/ext4/ext4_jbd2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/ext4_jbd2.h | 2 ++
fs/ext4/super.c | 18 ++++--------
fs/jbd2/recovery.c | 27 ++++++++++++++++++
include/linux/jbd2.h | 11 ++++++++
5 files changed, 112 insertions(+), 12 deletions(-)
--
2.31.1
From: Ye Bin <[email protected]>
The EXT4 super block may be written through the journal, but it may also
be written directly during error handling and in other scenarios. So the
super block in the journal is not necessarily up to date, and some extra
handling is needed during journal recovery.
Signed-off-by: Ye Bin <[email protected]>
---
include/linux/jbd2.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 5962072a4b19..ab0e1a435a50 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1308,6 +1308,17 @@ struct journal_s
struct buffer_head *bh,
enum passtype pass, int off,
tid_t expected_commit_id);
+ /*
+ * The EXT4 super block may be written through the journal, but it may
+ * also be written directly during error handling. So the super block in
+ * the journal may not be up to date, and extra handling is needed during
+ * journal recovery.
+ */
+ void *j_replay_private_data;
+ int (*j_replay_prepare_callback)(struct journal_s *journal);
+ int (*j_replay_callback)(struct journal_s *journal,
+ struct buffer_head *bh);
+ void (*j_replay_end_callback)(struct journal_s *journal);
};
#define jbd2_might_wait_for_commit(j) \
--
2.31.1
From: Ye Bin <[email protected]>
For now, the ext4 file system only needs to handle the super block during
journal recovery.
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/ext4_jbd2.c | 65 +++++++++++++++++++++++++++++++++++++++++++++
fs/ext4/ext4_jbd2.h | 2 ++
fs/ext4/super.c | 1 +
3 files changed, 68 insertions(+)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 77f318ec8abb..af03035606e1 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -395,3 +395,68 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
}
return err;
}
+
+static void ext4_replay_end_callback(struct journal_s *journal)
+{
+ kfree(journal->j_replay_private_data);
+ journal->j_replay_private_data = NULL;
+ journal->j_replay_callback = NULL;
+ journal->j_replay_end_callback = NULL;
+}
+
+static int ext4_replay_callback(struct journal_s *journal,
+ struct buffer_head *bh)
+{
+ struct super_block *sb = journal->j_private;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct ext4_super_block *es = sbi->s_es;
+ struct ext4_super_block *nes;
+ unsigned long offset;
+
+ if (likely(sbi->s_sbh != bh))
+ return 0;
+
+ offset = (void*)es - (void*)sbi->s_sbh->b_data;
+ nes = (struct ext4_super_block*)(bh->b_data + offset);
+ /*
+ * If super block has error flag in journal record, there isn't need to
+ * cover error information, as in this case is errors=continue mode,
+ * error handle submit super block through journal.
+ */
+ if (le16_to_cpu(nes->s_state) & EXT4_ERROR_FS)
+ return 0;
+
+ memcpy(((char *)es) + EXT4_S_ERR_START,
+ journal->j_replay_private_data, EXT4_S_ERR_LEN);
+ if (sbi->s_mount_state & EXT4_ERROR_FS)
+ es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
+
+ return 0;
+}
+
+static int ext4_replay_prepare_callback(struct journal_s *journal)
+{
+ struct super_block *sb = journal->j_private;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ char *private;
+ struct ext4_super_block *es = sbi->s_es;
+
+ if (!(sbi->s_mount_state & EXT4_ERROR_FS))
+ return 0;
+
+ private = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
+ if (!private)
+ return -ENOMEM;
+ memcpy(private, ((char *)es) + EXT4_S_ERR_START, EXT4_S_ERR_LEN);
+
+ journal->j_replay_private_data = private;
+ journal->j_replay_callback = ext4_replay_callback;
+ journal->j_replay_end_callback = ext4_replay_end_callback;
+
+ return 0;
+}
+
+void ext4_init_replay(journal_t *journal)
+{
+ journal->j_replay_prepare_callback = ext4_replay_prepare_callback;
+}
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 0c77697d5e90..8dcc7ef5028c 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -513,4 +513,6 @@ static inline int ext4_should_dioread_nolock(struct inode *inode)
return 1;
}
+void ext4_init_replay(journal_t *journal);
+
#endif /* _EXT4_JBD2_H */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dc3907dff13a..ea0fea04907c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5677,6 +5677,7 @@ static void ext4_init_journal_params(struct super_block *sb, journal_t *journal)
journal->j_commit_interval = sbi->s_commit_interval;
journal->j_min_batch_time = sbi->s_min_batch_time;
journal->j_max_batch_time = sbi->s_max_batch_time;
+ ext4_init_replay(journal);
ext4_fc_init(sb, journal);
write_lock(&journal->j_state_lock);
--
2.31.1
From: Ye Bin <[email protected]>
The ext4 super block in the journal may not be up to date. When the file
system has an error, we need to set the error information when recovering
the super block.
Signed-off-by: Ye Bin <[email protected]>
---
fs/jbd2/recovery.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8286a9ec122f..83b1a9689984 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -309,6 +309,15 @@ int jbd2_journal_recover(journal_t *journal)
return 0;
}
+ if (journal->j_replay_prepare_callback) {
+ err = journal->j_replay_prepare_callback(journal);
+ if (err) {
+ jbd2_debug(1, "JBD2: failed to prepare replay %d",
+ err);
+ return err;
+ }
+ }
+
err = do_one_pass(journal, &info, PASS_SCAN);
if (!err)
err = do_one_pass(journal, &info, PASS_REVOKE);
@@ -335,6 +344,10 @@ int jbd2_journal_recover(journal_t *journal)
if (!err)
err = err2;
}
+
+ if (journal->j_replay_end_callback)
+ journal->j_replay_end_callback(journal);
+
return err;
}
@@ -687,6 +700,20 @@ static int do_one_pass(journal_t *journal,
*((__be32 *)nbh->b_data) =
cpu_to_be32(JBD2_MAGIC_NUMBER);
}
+ if (unlikely(journal->j_replay_callback)) {
+ err = journal->j_replay_callback(
+ journal, nbh);
+ if (err) {
+ printk(KERN_ERR
+ "JBD2: replay "
+ "call back "
+ "failed.\n");
+ unlock_buffer(nbh);
+ brelse(obh);
+ brelse(nbh);
+ goto failed;
+ }
+ }
BUFFER_TRACE(nbh, "marking dirty");
set_buffer_uptodate(nbh);
--
2.31.1
From: Ye Bin <[email protected]>
The previous commit "jbd2: do extra handle when do journal recovery"
already does the extra handling for the super block, so there is no need
to do it in ext4_load_journal(). Remove it.
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/super.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ea0fea04907c..d86ee5af2db9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5916,17 +5916,8 @@ static int ext4_load_journal(struct super_block *sb,
if (!ext4_has_feature_journal_needs_recovery(sb))
err = jbd2_journal_wipe(journal, !really_read_only);
- if (!err) {
- char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
- if (save)
- memcpy(save, ((char *) es) +
- EXT4_S_ERR_START, EXT4_S_ERR_LEN);
+ if (!err)
err = jbd2_journal_load(journal);
- if (save)
- memcpy(((char *) es) + EXT4_S_ERR_START,
- save, EXT4_S_ERR_LEN);
- kfree(save);
- }
if (err) {
ext4_msg(sb, KERN_ERR, "error loading journal");
--
2.31.1
From: Ye Bin <[email protected]>
Commit ("ext4: fix error flag covered by journal recovery") updates the
error record during journal recovery. The super block checksum must be
recalculated after the error record is updated, otherwise the checksum
will not match the super block data.
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/ext4_jbd2.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index af03035606e1..ffcb0d58d407 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -430,6 +430,7 @@ static int ext4_replay_callback(struct journal_s *journal,
journal->j_replay_private_data, EXT4_S_ERR_LEN);
if (sbi->s_mount_state & EXT4_ERROR_FS)
es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
+ ext4_superblock_csum_set(sb);
return 0;
}
--
2.31.1
From: Ye Bin <[email protected]>
Currently, the journal error number may be cleared even though
ext4_commit_super() failed. This may cause the error flag to be lost, and
fsck will then skip the thorough file system check.
Signed-off-by: Ye Bin <[email protected]>
---
fs/ext4/super.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d86ee5af2db9..b458af1cbf5c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6143,11 +6143,13 @@ static int ext4_clear_journal_err(struct super_block *sb,
errstr = ext4_decode_error(sb, j_errno, nbuf);
ext4_warning(sb, "Filesystem error recorded "
"from previous mount: %s", errstr);
- ext4_warning(sb, "Marking fs in need of filesystem check.");
EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
- ext4_commit_super(sb);
+ j_errno = ext4_commit_super(sb);
+ if (j_errno)
+ return j_errno;
+ ext4_warning(sb, "Marked fs in need of filesystem check.");
jbd2_journal_clear_err(journal);
jbd2_journal_update_sb_errno(journal);
--
2.31.1
Hello!
On Fri 10-02-23 11:20:38, Ye Bin wrote:
> From: Ye Bin <[email protected]>
>
> Diff v2 vs v1:
> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
> ext4_load_journal() to jbd2_journal_recover().
>
> When do fault injection test, got issue as follows:
> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>
> Without do file system check, file system is clean when do second mount.
> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
> mode the last super block is commit directly. So super block in journal is
> not uptodate. When do jounral recovery, the uptodate super block will be
> covered by jounral data. If super block submit all failed after recover
> journal, then file system error flag is lost. When do "fsck -a" couldn't
> repair file system deeply.
> To solve above issue we need to do extra handle when do super block journal
> recovery.
Thanks for the patches. Looking through the patches, I think this is a bit
of overengineering for the problem at hand. The only thing that is
really worth preserving so that it is not lost after journal replay is the
error information. So in ext4_load_journal() I would just save that if
EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
after journal replay. Sure, if the superblock write during journal replay
succeeds but the write restoring the error information fails, we will lose
the error information but that is so unlikely in practice that I don't
think it is really worth complicating the code for it. Also the only
downside is we will lose the information that there is some error in the
filesystem - we'll soon find that out again anyway :).
Honza
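
As a rough user-space sketch of the save/restore idea described above (an
illustration under stated assumptions, not the actual ext4 change): the
error information is saved only when the error flag is set before replay,
and put back afterwards. All names here (struct toy_super, TOY_ERROR_FS,
TOY_ERR_LEN, toy_load_journal(), toy_demo_restore()) are hypothetical, and
replay is again modelled as a plain overwrite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define TOY_ERROR_FS 0x0002u
#define TOY_ERR_LEN  8

struct toy_super {
	uint16_t state;
	unsigned char err_info[TOY_ERR_LEN]; /* stands in for the error region */
};

/* Journal replay modelled as a plain overwrite of the super block. */
static void toy_replay(struct toy_super *sb,
		       const struct toy_super *journal_copy)
{
	memcpy(sb, journal_copy, sizeof(*sb));
}

/* Save the error info if the flag is set, replay, then restore it. */
static void toy_load_journal(struct toy_super *sb,
			     const struct toy_super *journal_copy)
{
	unsigned char save[TOY_ERR_LEN] = { 0 };
	int had_error;

	assert(sb && journal_copy);
	had_error = (sb->state & TOY_ERROR_FS) != 0;
	if (had_error)
		memcpy(save, sb->err_info, TOY_ERR_LEN);

	toy_replay(sb, journal_copy);

	if (had_error) {
		memcpy(sb->err_info, save, TOY_ERR_LEN);
		sb->state |= TOY_ERROR_FS;
	}
}

/* Returns 1 if both the flag and the error info survive the replay. */
static int toy_demo_restore(void)
{
	struct toy_super journal_copy;
	struct toy_super sb;

	memset(&journal_copy, 0, sizeof(journal_copy));
	memset(&sb, 0, sizeof(sb));
	sb.state = TOY_ERROR_FS;
	memcpy(sb.err_info, "EIO@1234", TOY_ERR_LEN);

	toy_load_journal(&sb, &journal_copy);

	return (sb.state & TOY_ERROR_FS) &&
	       memcmp(sb.err_info, "EIO@1234", TOY_ERR_LEN) == 0;
}
```

The sketch only covers the in-memory copy. Whether the restored data then
reaches disk is the separate, unlikely failure case mentioned above.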
>
> Ye Bin (6):
> jbd2: introduce callback for recovery journal
> ext4: introudce helper for jounral recover handle
> jbd2: do extra handle when do journal recovery
> ext4: remove backup for super block when recovery journal
> ext4: fix super block checksum error
> ext4: make sure fs error flag setted before clear journal error
>
> fs/ext4/ext4_jbd2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
> fs/ext4/ext4_jbd2.h | 2 ++
> fs/ext4/super.c | 18 ++++--------
> fs/jbd2/recovery.c | 27 ++++++++++++++++++
> include/linux/jbd2.h | 11 ++++++++
> 5 files changed, 112 insertions(+), 12 deletions(-)
>
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
>
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <[email protected]>
>>
>> Diff v2 vs v1:
>> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
>> ext4_load_journal() to jbd2_journal_recover().
>>
>> When do fault injection test, got issue as follows:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without do file system check, file system is clean when do second mount.
>> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
>> mode the last super block is commit directly. So super block in journal is
>> not uptodate. When do jounral recovery, the uptodate super block will be
>> covered by jounral data. If super block submit all failed after recover
>> journal, then file system error flag is lost. When do "fsck -a" couldn't
>> repair file system deeply.
>> To solve above issue we need to do extra handle when do super block journal
>> recovery.
>
> Thanks for the patches. Looking through the patches, I think this is a bit
> of an overengineering for the problem at hand. The only thing that is
> really worth preserving so that it is not lost after journal replay is the
> error information. So in ext4_load_journal() I would just save that if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will loose
> the error information but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also the only
> downside is we will loose the information there is some error in the
> filesystem - we'll soon find that out again anyway :).
>
I think so too. We should also add an error message if we fail to restore
the error information, so that we know what happened.
Thanks,
Yi.
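
A minimal sketch of what such a message could look like, in user space and
with hypothetical names throughout (toy_commit_fn, toy_restore_and_report(),
toy_commit_ok(), toy_commit_eio()). It assumes the write that persists the
restored error information returns 0 or a negative errno, and it uses
fprintf() where kernel code would use its own logging helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical write hook: returns 0 on success or a negative errno. */
typedef int (*toy_commit_fn)(void *ctx);

/* Persist the restored error information and report a failed write. */
static int toy_restore_and_report(toy_commit_fn commit, void *ctx)
{
	int err;

	assert(commit);
	err = commit(ctx);
	if (err)
		fprintf(stderr,
			"toy-fs: failed to write restored error information (%d)\n",
			err);
	return err;
}

/* Two stub writers used to exercise both outcomes. */
static int toy_commit_ok(void *ctx)
{
	(void)ctx;
	return 0;
}

static int toy_commit_eio(void *ctx)
{
	(void)ctx;
	return -EIO;
}
```

Returning the error to the caller as well as logging it leaves the decision
about how to continue the mount to the caller.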
On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
>
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <[email protected]>
>>
>> Diff v2 vs v1:
>> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
>> ext4_load_journal() to jbd2_journal_recover().
>>
>> When do fault injection test, got issue as follows:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without do file system check, file system is clean when do second mount.
>> Theoretically, the kernel will not clear fs error flag. In errors=remount-ro
>> mode the last super block is commit directly. So super block in journal is
>> not uptodate. When do jounral recovery, the uptodate super block will be
>> covered by jounral data. If super block submit all failed after recover
>> journal, then file system error flag is lost. When do "fsck -a" couldn't
>> repair file system deeply.
>> To solve above issue we need to do extra handle when do super block journal
>> recovery.
> Thanks for the patches. Looking through the patches, I think this is a bit
> of an overengineering for the problem at hand. The only thing that is
> really worth preserving so that it is not lost after journal replay is the
> error information. So in ext4_load_journal() I would just save that if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will loose
> the error information but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also the only
> downside is we will loose the information there is some error in the
> filesystem - we'll soon find that out again anyway :).
>
> Honza
Yes, this solution seems a little cumbersome, but to solve the problem of
error information loss, it is the only solution I can think of.
I re-analyzed the issue scenario. The error information was not recorded in
the last super block written to the journal, so the error flag is not
updated when the super block is submitted subsequently. However, while
processing the orphan list, the file system errors were recorded in memory
and the orphan list was cleared directly, resulting in file system
inconsistencies. To solve the above issue, I sent a V3 patch.
>> Ye Bin (6):
>> jbd2: introduce callback for recovery journal
>> ext4: introudce helper for jounral recover handle
>> jbd2: do extra handle when do journal recovery
>> ext4: remove backup for super block when recovery journal
>> ext4: fix super block checksum error
>> ext4: make sure fs error flag setted before clear journal error
>>
>> fs/ext4/ext4_jbd2.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
>> fs/ext4/ext4_jbd2.h | 2 ++
>> fs/ext4/super.c | 18 ++++--------
>> fs/jbd2/recovery.c | 27 ++++++++++++++++++
>> include/linux/jbd2.h | 11 ++++++++
>> 5 files changed, 112 insertions(+), 12 deletions(-)
>>
>> --
>> 2.31.1
>>