2024-04-12 01:28:40

by yebin (H)

Subject: Re: [PATCH] jbd2: avoid mount failed when commit block is partial submitted



On 2024/4/11 22:55, Theodore Ts'o wrote:
> On Thu, Apr 11, 2024 at 03:37:18PM +0200, Jan Kara wrote:
>>> The vendor has confirmed that only 512-byte atomicity can be ensured
>>> in the firmware. Although the valid data is only 60 bytes, the entire
>>> commit block is used for calculating the checksum.
>>> jbd2_commit_block_csum_verify:
>>> ...
>>> calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
>>> ...
>> Ah, indeed. This is the bit I've missed. Thanks for the explanation! Still, I
>> think trying to somehow automatically deal with a wrong commit block checksum
>> is too dangerous because it can result in fs corruption in some (unlikely)
>> cases. OTOH I understand journal replay failure after a power fail isn't
>> great either so we need to think how to fix this...
> Unfortunately, the only fix I can think of would require changing how
> we do the checksum to only include the portion of the jbd2 block which
> contains valid data, per the header field. This would be a format
> change which means that if a new kernel writes the new jbd2 format
> (using a journal incompat flag, or a new checksum type), older kernels
> and older versions of e2fsprogs wouldn't be able to validate the
> journal. So rollout of the fix would have to be carefully managed.
>
> - Ted
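
(For reference, the whole-block verification mentioned above lives in
fs/jbd2/recovery.c and looks roughly like this; paraphrased, so the
details may differ between kernel versions.)

static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
{
        struct commit_header *h;
        __be32 provided;
        __u32 calculated;

        if (!jbd2_journal_has_csum_v2or3(j))
                return 1;

        h = buf;
        /* Zero the stored checksum while computing, then restore it. */
        provided = h->h_chksum[0];
        h->h_chksum[0] = 0;
        /* The checksum always covers the full journal block. */
        calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
        h->h_chksum[0] = provided;

        return provided == cpu_to_be32(calculated);
}
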
I thought of a solution: when the commit block checksum is incorrect,
retain the first 512 bytes of data, zero the rest of the block, and
then recompute the checksum to see whether it matches. This can
determine whether the commit is complete on devices that guarantee
write atomicity of 512 bytes or more. For HDDs it may not be able to
make that distinction, but it should still alleviate the problem to
some extent.
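
Something like the following untested sketch (the function name and the
'keep' argument are made up for illustration; it follows the pattern of
the existing jbd2_commit_block_csum_verify(), and the proposal above
would call it with keep = 512):

/*
 * Untested sketch only: re-verify the commit block checksum with
 * everything past the first 'keep' bytes treated as zero.  Work on a
 * copy so the on-disk buffer is left untouched.  The function name and
 * the 'keep' parameter are hypothetical; jbd2_chksum(),
 * jbd2_journal_has_csum_v2or3() and struct commit_header are the
 * existing jbd2 definitions.
 */
static int jbd2_commit_block_csum_verify_partial(journal_t *j, void *buf,
                                                 size_t keep)
{
        struct commit_header *h;
        __be32 provided;
        __u32 calculated;
        void *tmp;
        int ret;

        if (!jbd2_journal_has_csum_v2or3(j))
                return 1;

        tmp = kzalloc(j->j_blocksize, GFP_KERNEL);
        if (!tmp)
                return 0;
        /* Keep only what the device is assumed to have written atomically. */
        memcpy(tmp, buf, keep);

        h = tmp;
        provided = h->h_chksum[0];
        h->h_chksum[0] = 0;     /* no need to restore it in the copy */
        calculated = jbd2_chksum(j, j->j_csum_seed, tmp, j->j_blocksize);

        ret = (provided == cpu_to_be32(calculated));
        kfree(tmp);
        return ret;
}

The retry would only run after the normal whole-block verification has
already failed, so the common case stays exactly as it is today.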



2024-04-12 03:56:10

by Theodore Ts'o

Subject: Re: [PATCH] jbd2: avoid mount failed when commit block is partial submitted

On Fri, Apr 12, 2024 at 09:27:55AM +0800, yebin (H) wrote:
> I thought of a solution: when the commit block checksum is incorrect,
> retain the first 512 bytes of data, zero the rest of the block, and
> then recompute the checksum to see whether it matches. This can
> determine whether the commit is complete on devices that guarantee
> write atomicity of 512 bytes or more. For HDDs it may not be able to
> make that distinction, but it should still alleviate the problem to
> some extent.

Yeah, we discussed something similar at the weekly ext4 call; the idea
was to change the kernel to zero out the jbd2 block before we fill in
any jbd2 tags (including in the commit block) when writing the
journal. Then in the journal replay path, if the checksum doesn't
match, we can try zeroing out everything beyond the size in the header
struct, and then retry the checksum and see if it matches.
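
Roughly, as an untested illustration only (the _partial() helper is the
hypothetical one sketched earlier in the thread, and 'the size in the
header struct' is read here as sizeof(struct commit_header)), the scan
pass check for the commit block could become:

        /* In do_one_pass(), PASS_SCAN handling of a commit block: */
        if (pass == PASS_SCAN &&
            !jbd2_commit_block_csum_verify(journal, bh->b_data) &&
            !jbd2_commit_block_csum_verify_partial(journal, bh->b_data,
                                        sizeof(struct commit_header))) {
                /*
                 * Both the whole-block checksum and the zeroed-tail
                 * retry failed; fall into the existing checksum
                 * mismatch handling unchanged.
                 */
        }

Since the write side would guarantee that everything past the tags (or
past the commit header) goes to disk as zeros, the retry only succeeds
when the tail really was zero in the block the checksum was computed
over, so real corruption in the header portion still fails both checks.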

This also has the benefit of making sure that we aren't leaking stale
(uninitialized) kernel memory to disk, which could be considered a
security vulnerability in some cases --- although the likelihood that
something truly sensitive could be leaked is quite low; the attack
requires raw access to the storage device, and the exposure is similar
to what gets written to the swap device.  Still, there are people who
do worry about such things.

- Ted