Date:   Wed, 29 Dec 2021 20:37:21 -0500
From:   "Theodore Ts'o" <tytso@mit.edu>
To:     Manfred Spraul <manfred@colorfullife.com>
Cc:     adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, 1vier1@web.de
Subject: Re: JBD2: journal transaction 6943 on loop0-8 is corrupt.
Message-ID: <Yc0NUYyRhLdtapq+@mit.edu>
References: <baa3101d-e2f7-823e-040f-8739ab610419@colorfullife.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <baa3101d-e2f7-823e-040f-8739ab610419@colorfullife.com>
Precedence: bulk

On Tue, Dec 28, 2021 at 09:36:22PM +0100, Manfred Spraul wrote:
> Hi,
> 
> with simulated power failures, I see a corrupted journal
> 
> [39056.200845] JBD2: journal transaction 6943 on loop0-8 is corrupt.
> [39056.200851] EXT4-fs (loop0): error loading journal

This means that the journal replay found a commit which was *not* the
last commit, and which contained a CRC error.  If it's the last commit
(i.e., there is no valid subsequent commit block), then it's possible
that the journal commit was never completed before the system crashed
--- i.e., it was an interrupted commit.

Your test is aborting the commit at various points in the write I/O
stream, so it should be simulating an interrupted commit (assuming
that it's not corrupting any I/O).  So the jbd2 layer should have
understood it was the last commit in the journal, and been OK with the
checksum failure.

But what can happen is that if there is a commit block in the right
place at the end of the transaction, left over from the previous
journalling session, this can confuse the jbd2 layer into thinking
that it is *not* the last transaction, and then it will make the
"journal transaction is corrupt" report.

How does the jbd2 layer determine whether there is a valid "subsequent
commit"?  It treats the subsequent commit block as valid if it meets
the following two criteria:

	* the commit id is the correct, expected one (the previous
	  commit id plus one).
	* the commit time (seconds since January 1, 1970) in the
	  commit block is greater than the commit time in the previous
	  commit block.
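
Roughly, the two checks above amount to something like the following.
This is only a sketch in Python, not the actual jbd2 code; CommitBlock
and looks_like_next_commit are made-up names for illustration, and the
timestamps are arbitrary example values.

```python
from dataclasses import dataclass


@dataclass
class CommitBlock:
    # Hypothetical stand-in for the relevant fields of a commit block
    sequence: int    # commit id
    commit_sec: int  # commit time, seconds since January 1, 1970


def looks_like_next_commit(prev: CommitBlock, cand: CommitBlock) -> bool:
    """Apply the two criteria: id is prev + 1, and commit time is later."""
    return (cand.sequence == prev.sequence + 1
            and cand.commit_sec > prev.commit_sec)


prev = CommitBlock(sequence=6943, commit_sec=1640723782)
# Left over from an earlier session: expected id, but an older timestamp.
stale = CommitBlock(sequence=6944, commit_sec=1640700000)
print(looks_like_next_commit(prev, stale))   # False
```

With a sane clock, a left-over block normally fails the second check,
because it was written earlier than the commit that precedes it in the
journal.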

So if your test setup doesn't correctly set the time (say, it always
leaves the clock at January 1, 1970 on bootup), and the workload is
extremely regular, it's possible that the test interrupted a journal
commit, but there was a left-over commit block that *looked* valid, and
it triggered the failure.

If this is what happened, it's not a disaster --- the journal replay
will have correctly stopped where it should have, but it thought it
was an exceptional abort, as opposed to a normal journal replay
completion.  So the "file system is corrupted" flag will be set,
forcing an fsck, but the fsck shouldn't find any problems with the
file system.
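
To make the distinction concrete, here is a toy model of the scan, in
Python.  It is a simplified sketch of the behaviour described in this
mail, not the real recovery code; Commit, ReplayEnd and scan are
hypothetical names, and the commit times are small made-up numbers such
as you might get from a clock that restarts at 1970 on every boot.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ReplayEnd(Enum):
    CLEAN = "normal end of journal"
    CORRUPT = "journal transaction is corrupt"


@dataclass
class Commit:
    sequence: int    # commit id
    commit_sec: int  # commit time, seconds since January 1, 1970
    crc_ok: bool     # did the transaction's checksum verify?


def follows(prev: Commit, nxt: Commit) -> bool:
    # The two criteria: expected id, and a later commit time.
    return (nxt.sequence == prev.sequence + 1
            and nxt.commit_sec > prev.commit_sec)


def scan(commits: List[Commit]) -> Tuple[int, ReplayEnd]:
    """Return (transactions replayed, how the scan ended)."""
    for i, cur in enumerate(commits):
        if i > 0 and not follows(commits[i - 1], cur):
            return i, ReplayEnd.CLEAN      # stale block: end of journal
        if cur.crc_ok:
            continue
        # Checksum failure: only an error if a valid commit follows.
        nxt = commits[i + 1] if i + 1 < len(commits) else None
        if nxt is not None and follows(cur, nxt):
            return i, ReplayEnd.CORRUPT
        return i, ReplayEnd.CLEAN          # interrupted last commit
    return len(commits), ReplayEnd.CLEAN


good = Commit(6942, 100, True)
torn = Commit(6943, 110, False)            # interrupted by the power cut
leftover = Commit(6944, 120, True)         # stale, but *looks* valid
print(scan([good, torn]))                  # stops after 1, CLEAN
print(scan([good, torn, leftover]))        # stops after 1, CORRUPT
```

In both calls the scan stops after the same number of transactions;
only the way the stop is classified differs.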

Does this explanation seem to fit with how your test setup is
arranged?

     	  	      	      	       - Ted