2022-08-02 11:27:00

by zhanchengbin

[permalink] [raw]
Subject: [BUG] Tune2fs and fuse2fs do not change j_tail_sequence in journal superblock

There are two recover_ext3_journal functions in e2fsprogs, the
recover_ext3_journal
function in debugfs/journal.c is called when the programs tune2fs and
fuse2fs do
log replay, but in this recover_ext3_journal function, after the log
replay is over,
the j_tail_sequence in journal superblock is not changed to the value of
the last
transaction sequence, this will cause subsequent log commitids to count
from the
commitid in last time.
```
e2fsck/journal.c
static errcode_t recover_ext3_journal(e2fsck_t ctx)
{
... ...
journal->j_superblock->s_errno = -EINVAL;
mark_buffer_dirty(journal->j_sb_buffer);
}

journal->j_tail_sequence = journal->j_transaction_sequence;

errout:
journal_destroy_revoke(journal);
... ...
}
```
```
debugfs/journal.c
static errcode_t recover_ext3_journal(ext2_filsys fs)
{
... ...
journal->j_superblock->s_errno = -EINVAL;
mark_buffer_dirty(journal->j_sb_buffer);
}

errout:
journal_destroy_revoke(journal);
... ...
}
```
result:
After tune2fs -e, the log commitid is counted from the commitid in last
time, if
the log ID of the current operation overlaps with that of the last
operation, this
will cause logs that were previously replayed by tune2fs to be replayed
here. The
code used by tune2fs to determine whether to replay the transaction is
as follows:
```
static int do_one_pass(journal_t *journal,
struct recovery_info *info, enum passtype pass)
{
... ...
while (1) {
... ...
if (sequence != next_commit_ID) {
brelse(bh);
break;
}
... ...
next_commit_ID++;
}
... ...
}
```
This moment, commitid is "34 c7", commit time stamp is "62 e0 f7 a5"
004aa000 c0 3b 39 98 00 00 00 02 00 00 34 c7 00 00 00 00
|.;9.......4.....|
004aa010 4e 93 2f fb 00 00 00 00 00 00 00 00 00 00 00 00
|N./.............|
004aa020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|................|
004aa030 00 00 00 00 62 e0 f7 a5 0a 16 56 20 00 00 00 00
|....b.....V ....|
--
This moment, commitid is "34 c8", commit time stamp is "62 e0 e7 1e"
004ad000 c0 3b 39 98 00 00 00 02 00 00 34 c8 00 00 00 00
|.;9.......4.....|
004ad010 75 a6 12 01 00 00 00 00 00 00 00 00 00 00 00 00
|u...............|
004ad020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|................|
004ad030 00 00 00 00 62 e0 e7 1e 18 32 cf 18 00 00 00 00
|....b....2......|
--
The probability of this happening is very small, but we do, and I think
it's a problem.

There are two existing solutions:
1) Add "journal->j_tail_sequence = journal->j_transaction_sequence" in
to the
recover_ext3_journal in debugfs/journal.c.
2) There is a timestamp in the commit block, so we can add timestamp
check when
the log is replayed.


2022-08-03 04:54:29

by Alexey Lyashkov

[permalink] [raw]
Subject: Re: [BUG] Tune2fs and fuse2fs do not change j_tail_sequence in journal superblock

Hi

Thanks for you report.
Problem much bigger than it. Debugs based code lack many parts of journal handling, including fast commit.
Journal tag v3, and other.

Alex

> On 2 Aug 2022, at 14:23, zhanchengbin <[email protected]> wrote:
>
> There are two recover_ext3_journal functions in e2fsprogs, the recover_ext3_journal
> function in debugfs/journal.c is called when the programs tune2fs and fuse2fs do
> log replay, but in this recover_ext3_journal function, after the log replay is over,
> the j_tail_sequence in journal superblock is not changed to the value of the last
> transaction sequence, this will cause subsequent log commitids to count from the
> commitid in last time.
> ```
> e2fsck/journal.c
> static errcode_t recover_ext3_journal(e2fsck_t ctx)
> {
> ... ...
> journal->j_superblock->s_errno = -EINVAL;
> mark_buffer_dirty(journal->j_sb_buffer);
> }
>
> journal->j_tail_sequence = journal->j_transaction_sequence;
>
> errout:
> journal_destroy_revoke(journal);
> ... ...
> }
> ```
> ```
> debugfs/journal.c
> static errcode_t recover_ext3_journal(ext2_filsys fs)
> {
> ... ...
> journal->j_superblock->s_errno = -EINVAL;
> mark_buffer_dirty(journal->j_sb_buffer);
> }
>
> errout:
> journal_destroy_revoke(journal);
> ... ...
> }
> ```
> result:
> After tune2fs -e, the log commitid is counted from the commitid in last time, if
> the log ID of the current operation overlaps with that of the last operation, this
> will cause logs that were previously replayed by tune2fs to be replayed here. The
> code used by tune2fs to determine whether to replay the transaction is as follows:
> ```
> static int do_one_pass(journal_t *journal,
> struct recovery_info *info, enum passtype pass)
> {
> ... ...
> while (1) {
> ... ...
> if (sequence != next_commit_ID) {
> brelse(bh);
> break;
> }
> ... ...
> next_commit_ID++;
> }
> ... ...
> }
> ```
> This moment, commitid is "34 c7", commit time stamp is "62 e0 f7 a5"
> 004aa000 c0 3b 39 98 00 00 00 02 00 00 34 c7 00 00 00 00 |.;9.......4.....|
> 004aa010 4e 93 2f fb 00 00 00 00 00 00 00 00 00 00 00 00 |N./.............|
> 004aa020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 004aa030 00 00 00 00 62 e0 f7 a5 0a 16 56 20 00 00 00 00 |....b.....V ....|
> --
> This moment, commitid is "34 c8", commit time stamp is "62 e0 e7 1e"
> 004ad000 c0 3b 39 98 00 00 00 02 00 00 34 c8 00 00 00 00 |.;9.......4.....|
> 004ad010 75 a6 12 01 00 00 00 00 00 00 00 00 00 00 00 00 |u...............|
> 004ad020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 004ad030 00 00 00 00 62 e0 e7 1e 18 32 cf 18 00 00 00 00 |....b....2......|
> --
> The probability of this happening is very small, but we do, and I think it's a problem.
>
> There are two existing solutions:
> 1) Add "journal->j_tail_sequence = journal->j_transaction_sequence" in to the
> recover_ext3_journal in debugfs/journal.c.
> 2) There is a timestamp in the commit block, so we can add timestamp check when
> the log is replayed.