2023-03-22 01:35:19

by Zhang Yi

Subject: [PATCH v5 0/3] ext4, jbd2: journal cycled record transactions between each mount

From: Zhang Yi <[email protected]>

v4->v5:
- Update the documentation about the journal superblock in journal.rst.
v3->v4:
- Remove the journal_cycle_record mount option; always enable it on ext4.
v2->v3:
- Prevent a warning when mounting an old image with journal_cycle_record
  enabled.
- Limit this mount option to ext4 images only.
v1->v2:
- Fix the format type warning.
- Add more checks of the journal_cycle_record mount option on remount.

Hello!

This patch set adds a new journal option 'JBD2_CYCLE_RECORD' and always
enables it on ext4. It saves the journal head of a cleanly unmounted file
system in the journal superblock, which lets us record journal
transactions continuously across mounts. This helps us backtrack through
the journal and find the root cause of a filesystem corruption.
Currently, analyzing filesystem corruption is difficult because little
useful information is available, especially on real products. The
retained journal history is useful to some extent, especially for fuzz
testing and for deployments in short-running products.

I've sent out the corresponding e2fsprogs part (v2) separately [1]. Both
parts have been run through the test cases below and also passed xfstests
in auto mode.
- Mount a filesystem with an empty journal.
- Mount a filesystem whose journal ends in an unrecovered complete
  transaction.
- Mount a filesystem whose journal ends in an incomplete transaction.
- Mount a corrupted filesystem with an out-of-bounds journal s_head.
- Mount an old filesystem without the journal s_head set.
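
The clean-journal cases in the list above (old image without s_head, and
an out-of-bounds s_head) can be illustrated with a small userspace
sketch. This is not the code from the patches: the struct and function
names are hypothetical, the fields are assumed to be already converted to
host byte order, and the valid range [s_first, s_maxlen) is an assumption
based on how jbd2 lays out the log area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the journal superblock (host order). */
struct sketch_jsb {
	uint32_t s_first;	/* first usable block of the log area */
	uint32_t s_maxlen;	/* total blocks in the journal */
	uint32_t s_head;	/* head saved at clean unmount, 0 if never set */
};

/*
 * Pick the block where a clean journal continues recording after mount.
 * Assumption: a saved head outside [s_first, s_maxlen) is treated as
 * "not set" or corrupted, and recording restarts at s_first.
 */
static uint32_t sketch_pick_head(const struct sketch_jsb *sb)
{
	assert(sb != NULL);

	if (sb->s_head < sb->s_first || sb->s_head >= sb->s_maxlen)
		return sb->s_first;	/* old image or out-of-bounds head */

	return sb->s_head;		/* continue after the previous mount */
}
```

With s_first = 1 and s_maxlen = 1024, a saved head of 300 is kept, while
0 (old image) and 5000 (out of bounds) both fall back to block 1.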

Any comments are welcome.

[1] https://lore.kernel.org/linux-ext4/[email protected]

Thanks!
Yi.

v4: https://lore.kernel.org/linux-ext4/[email protected]/
v3: https://lore.kernel.org/linux-ext4/[email protected]/
v2: https://lore.kernel.org/linux-ext4/[email protected]/
v1: https://lore.kernel.org/linux-ext4/[email protected]/

Zhang Yi (3):
  jbd2: continue to record log between each mount
  ext4: add journal cycled recording support
  ext4: update doc about journal superblock description

 Documentation/filesystems/ext4/journal.rst |  7 ++++++-
 fs/ext4/super.c                            |  5 +++++
 fs/jbd2/journal.c                          | 18 ++++++++++++++++--
 fs/jbd2/recovery.c                         | 22 +++++++++++++++++-----
 include/linux/jbd2.h                       |  9 +++++++--
 5 files changed, 51 insertions(+), 10 deletions(-)

--
2.31.1


2023-03-22 21:40:54

by Andreas Dilger

Subject: Re: [PATCH v5 0/3] ext4, jbd2: journal cycled record transactions between each mount

On Mar 21, 2023, at 7:33 PM, Zhang Yi <[email protected]> wrote:
> This patch set adds a new journal option 'JBD2_CYCLE_RECORD' and always
> enables it on ext4. It saves the journal head of a cleanly unmounted file
> system in the journal superblock, which lets us record journal
> transactions continuously across mounts. This helps us backtrack through
> the journal and find the root cause of a filesystem corruption.
> Currently, analyzing filesystem corruption is difficult because little
> useful information is available, especially on real products. The
> retained journal history is useful to some extent, especially for fuzz
> testing and for deployments in short-running products.

Another interesting side benefit of this change is that it gets a step
closer to the "lazy ext4" (log-structured optimization) that had been
described some time ago at FAST:

https://lwn.net/Articles/720226/
https://www.usenix.org/system/files/conference/fast17/fast17-aghayev.pdf
https://lists.openwall.net/linux-ext4/2017/04/11/1

Essentially, free space in the filesystem (or a large external device)
could be used as a continuous journal, and metadata would only rarely
be checkpointed to the actual filesystem. If the "journal" is close to
wrapping to the start, either the meta/data is checkpointed (if it is
no longer actively used or can make a large write), or re-journaled to
the end of the journal. At remount time, the full journal is read into
memory (discarding old copies of blocks) and this is used to identify
the current metadata rather than reading from the filesystem itself.

This would allow e.g. very efficient flash caching of metadata (and also
journaled data for small writes) for an HDD (or QLC) device.
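
The "read the full journal into memory, discarding old copies" step can
be sketched as follows. This is only an illustration of that one idea,
not the design from the FAST paper or any existing jbd2 code: the record
layout, the fixed-size table, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLOTS 64		/* hypothetical capacity, power of two */

/* One journaled copy of a filesystem block (hypothetical layout). */
struct sketch_log_rec {
	uint64_t fs_block;	/* home location in the filesystem */
	uint32_t log_block;	/* where this copy lives in the journal */
};

/* Map from fs_block to its newest journal copy; must start zeroed. */
struct sketch_map {
	struct {
		uint64_t fs_block;
		uint32_t log_block;
		int used;
	} slot[SKETCH_SLOTS];
};

/* Insert or overwrite with linear probing; -1 if the table is full. */
static int sketch_map_put(struct sketch_map *m, uint64_t fs_block,
			  uint32_t log_block)
{
	size_t start = (size_t)(fs_block & (SKETCH_SLOTS - 1));

	for (size_t n = 0; n < SKETCH_SLOTS; n++) {
		size_t j = (start + n) & (SKETCH_SLOTS - 1);

		if (!m->slot[j].used || m->slot[j].fs_block == fs_block) {
			m->slot[j].fs_block = fs_block;
			m->slot[j].log_block = log_block; /* newer copy wins */
			m->slot[j].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Returns 1 and sets *log_block if the journal holds the block, else 0. */
static int sketch_map_get(const struct sketch_map *m, uint64_t fs_block,
			  uint32_t *log_block)
{
	size_t start = (size_t)(fs_block & (SKETCH_SLOTS - 1));

	for (size_t n = 0; n < SKETCH_SLOTS; n++) {
		size_t j = (start + n) & (SKETCH_SLOTS - 1);

		if (!m->slot[j].used)
			return 0;	/* not journaled: read the filesystem */
		if (m->slot[j].fs_block == fs_block) {
			*log_block = m->slot[j].log_block;
			return 1;
		}
	}
	return 0;
}

/* Scan records oldest to newest so later copies replace earlier ones. */
static int sketch_scan(struct sketch_map *m,
		       const struct sketch_log_rec *recs, size_t count)
{
	assert(m != NULL && recs != NULL);

	for (size_t i = 0; i < count; i++)
		if (sketch_map_put(m, recs[i].fs_block, recs[i].log_block))
			return -1;
	return 0;
}
```

After the scan, a lookup either returns the journal location of the
newest copy of a block or reports that the block must be read from the
filesystem itself.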

Cheers, Andreas







2023-03-23 08:25:17

by Zhang Yi

Subject: Re: [PATCH v5 0/3] ext4, jbd2: journal cycled record transactions between each mount

On 2023/3/23 5:34, Andreas Dilger wrote:
> On Mar 21, 2023, at 7:33 PM, Zhang Yi <[email protected]> wrote:
>> This patch set adds a new journal option 'JBD2_CYCLE_RECORD' and always
>> enables it on ext4. It saves the journal head of a cleanly unmounted file
>> system in the journal superblock, which lets us record journal
>> transactions continuously across mounts. This helps us backtrack through
>> the journal and find the root cause of a filesystem corruption.
>> Currently, analyzing filesystem corruption is difficult because little
>> useful information is available, especially on real products. The
>> retained journal history is useful to some extent, especially for fuzz
>> testing and for deployments in short-running products.
>
> Another interesting side benefit of this change is that it gets a step
> closer to the "lazy ext4" (log-structured optimization) that had been
> described some time ago at FAST:
>
> https://lwn.net/Articles/720226/
> https://www.usenix.org/system/files/conference/fast17/fast17-aghayev.pdf
> https://lists.openwall.net/linux-ext4/2017/04/11/1
>
> Essentially, free space in the filesystem (or a large external device)
> could be used as a continuous journal, and metadata would only rarely
> be checkpointed to the actual filesystem. If the "journal" is close to
> wrapping to the start, either the meta/data is checkpointed (if it is
> no longer actively used or can make a large write), or re-journaled to
> the end of the journal. At remount time, the full journal is read into
> memory (discarding old copies of blocks) and this is used to identify
> the current metadata rather than reading from the filesystem itself.
>
> This would allow e.g. very efficient flash caching of metadata (and also
> journaled data for small writes) for an HDD (or QLC) device.
>

This is interesting, but the current change looks like just one small
step. It has been almost 6 years since the last discussion I can find [1].
Is anyone still working on it?

[1] https://lore.kernel.org/linux-ext4/[email protected]/

Thanks,
Yi.

2023-06-15 15:28:44

by Theodore Ts'o

Subject: Re: [PATCH v5 0/3] ext4, jbd2: journal cycled record transactions between each mount


On Wed, 22 Mar 2023 09:33:50 +0800, Zhang Yi wrote:
> v4->v5:
> - Update the documentation about the journal superblock in journal.rst.
> v3->v4:
> - Remove the journal_cycle_record mount option; always enable it on ext4.
> v2->v3:
> - Prevent a warning when mounting an old image with journal_cycle_record
>   enabled.
> - Limit this mount option to ext4 images only.
> v1->v2:
> - Fix the format type warning.
> - Add more checks of the journal_cycle_record mount option on remount.
>
> [...]

Applied, thanks!

[1/3] jbd2: continue to record log between each mount
commit: 0311c8729c0a35114d64a64f8977e7d9bec926df
[2/3] ext4: add journal cycled recording support
commit: b956fe38a26861bfe13e7e83fbeadf9d2e159366
[3/3] ext4: update doc about journal superblock description
commit: ecdae6e9d63414b263ab2848ba3835e727eef2f9

Best regards,
--
Theodore Ts'o <[email protected]>