2023-09-18 21:43:13

by Andreas Dilger

Subject: Re: [e2fsprogs PATCH v2] resize2fs: use directio when reading superblock

On Sep 11, 2023, at 12:39 PM, Krister Johansen <[email protected]> wrote:
>
> Invocations of resize2fs intermittently report failure due to superblock
> checksum mismatches in this author's environment. This might happen a few
> times a week. The following script can make this happen within minutes.
> (It assumes /dev/nvme1n1 is available and not in use by anything else).

Krister,
thanks for submitting the patch. This particular issue was already fixed
in commit v1.46.6-16-g43a498e93888, apparently based on your previous report:

commit 43a498e938887956f393b5e45ea6ac79cc5f4b84
Author: Theodore Ts'o <[email protected]>
AuthorDate: Thu Jun 15 00:17:01 2023 -0400
Commit: Theodore Ts'o <[email protected]>
CommitDate: Thu Jun 15 00:17:01 2023 -0400

resize2fs: use Direct I/O when reading the superblock for online resizes

If the file system is mounted, the superblock can be changing while
resize2fs is trying to read the superblock, resulting in checksum
failures. One way of avoiding this problem is read the superblock
using Direct I/O, since the kernel makes sure that what gets written
to disk is self-consistent.

Suggested-by: Krister Johansen <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>

So it has landed on the e2fsprogs maint branch, but there has not been a
maintenance release since the patch landed.

Cheers, Andreas

> #!/usr/bin/bash
> set -euxo pipefail
>
> while true
> do
>     parted /dev/nvme1n1 mklabel gpt mkpart primary 2048s 2099200s
>     sleep .5
>     mkfs.ext4 /dev/nvme1n1p1
>     mount -t ext4 /dev/nvme1n1p1 /mnt
>     stress-ng --temp-path /mnt -D 4 &
>     STRESS_PID=$!
>     sleep 1
>     growpart /dev/nvme1n1 1
>     resize2fs /dev/nvme1n1p1
>     kill $STRESS_PID
>     wait $STRESS_PID
>     umount /mnt
>     wipefs -a /dev/nvme1n1p1
>     wipefs -a /dev/nvme1n1
> done
>
> After trying a few possible solutions, adding an O_DIRECT read to the open
> path in resize2fs eliminated the occurrences on test systems. ext2fs_open2
> uses a negative count value when calling io_channel_read_blk to get the
> superblock. According to unix_read_block, negative counts are to be read
> direct. However, when strace-ing a program without this fix, the
> underlying device was opened without O_DIRECT. Adding the flags in the
> patch ensures the device is opened with O_DIRECT and that the superblock
> read appears consistent.
>
> Signed-off-by: Krister Johansen <[email protected]>
> ---
> v2:
> - Only set DIRECT_IO flag when resizing a mounted filesystem. (Feedback from
> Theodore Ts'o)
> ---
> resize/main.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/resize/main.c b/resize/main.c
> index 94f5ec6d..f914c050 100644
> --- a/resize/main.c
> +++ b/resize/main.c
> @@ -409,6 +409,8 @@ int main (int argc, char ** argv)
>
>  	if (!(mount_flags & EXT2_MF_MOUNTED) && !print_min_size)
>  		io_flags = EXT2_FLAG_RW | EXT2_FLAG_EXCLUSIVE;
> +	if (mount_flags & EXT2_MF_MOUNTED)
> +		io_flags |= EXT2_FLAG_DIRECT_IO;
>
>  	io_flags |= EXT2_FLAG_64BITS | EXT2_FLAG_THREADS;
>  	if (undo_file) {
> --
> 2.25.1
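
For readers following along, here is a minimal standalone sketch of the
idea in the commit message: read the superblock through O_DIRECT so the
data comes from the device rather than from cached pages that a mounted
file system may be updating. This is not the code in resize2fs or
libext2fs, which sets EXT2_FLAG_DIRECT_IO and goes through the unix_io
layer. The names read_superblock_direct, sb_has_ext_magic and DIO_ALIGN
are hypothetical, the 4096-byte alignment is an assumption rather than
something queried from the device, and the buffered fallback exists only
so the sketch still runs where O_DIRECT is rejected.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* unavailable here: only buffered reads happen */
#endif

#define SB_OFFSET 1024          /* primary ext2/3/4 superblock starts at byte 1024 */
#define SB_SIZE   1024          /* the on-disk superblock is 1024 bytes long */
#define DIO_ALIGN 4096          /* ASSUMED alignment, not queried from the device */

/*
 * Hypothetical helper: copy the primary superblock of `path` into `sb`
 * (SB_SIZE bytes). Returns 0 on success, -1 with errno set on failure.
 * If `used_direct` is non-NULL, it reports whether O_DIRECT was in effect.
 */
int read_superblock_direct(const char *path, unsigned char *sb, int *used_direct)
{
    void *buf = NULL;
    int rc = -1, saved = EIO;

    assert(path != NULL && sb != NULL);

    /* O_DIRECT wants an aligned buffer, offset and length. */
    int err = posix_memalign(&buf, DIO_ALIGN, DIO_ALIGN);
    if (err != 0) {
        errno = err;
        return -1;
    }

    /* First attempt uses O_DIRECT; a second, buffered attempt is made
     * only if the first one is rejected with EINVAL. */
    for (int direct = (O_DIRECT != 0); direct >= 0; direct--) {
        int fd = open(path, O_RDONLY | (direct ? O_DIRECT : 0));
        if (fd < 0) {
            saved = errno;
            if (direct && saved == EINVAL)
                continue;
            break;
        }
        /* Read the first aligned chunk and slice the superblock out of it. */
        ssize_t n = pread(fd, buf, DIO_ALIGN, 0);
        saved = (n < 0) ? errno : EIO;
        close(fd);
        if (n < 0 && direct && saved == EINVAL)
            continue;
        if (n >= SB_OFFSET + SB_SIZE) {
            memcpy(sb, (unsigned char *)buf + SB_OFFSET, SB_SIZE);
            if (used_direct)
                *used_direct = direct;
            rc = 0;
        }
        break;
    }

    free(buf);
    if (rc != 0)
        errno = saved;
    return rc;
}

/* s_magic is 0xEF53, stored little-endian at byte 56 of the superblock. */
int sb_has_ext_magic(const unsigned char *sb)
{
    return sb[56] == 0x53 && sb[57] == 0xEF;
}
```

A caller would pass a device or image path and a 1024-byte buffer, then
check sb_has_ext_magic() before trusting the contents. A real tool would
probably prefer to fail rather than silently fall back to a buffered
read, since the fallback brings back the inconsistency being avoided.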



2023-09-18 23:59:14

by Krister Johansen

Subject: Re: [e2fsprogs PATCH v2] resize2fs: use directio when reading superblock

Hi Andreas,

On Mon, Sep 18, 2023 at 03:20:01PM -0600, Andreas Dilger wrote:
> On Sep 11, 2023, at 12:39 PM, Krister Johansen <[email protected]> wrote:
> >
> > Invocations of resize2fs intermittently report failure due to superblock
> > checksum mismatches in this author's environment. This might happen a few
> > times a week. The following script can make this happen within minutes.
> > (It assumes /dev/nvme1n1 is available and not in use by anything else).
>
> Krister,
> thanks for submitting the patch. This particular issue was already fixed
> in commit v1.46.6-16-g43a498e93888, apparently based on your previous report:
>
> commit 43a498e938887956f393b5e45ea6ac79cc5f4b84
> Author: Theodore Ts'o <[email protected]>
> AuthorDate: Thu Jun 15 00:17:01 2023 -0400
> Commit: Theodore Ts'o <[email protected]>
> CommitDate: Thu Jun 15 00:17:01 2023 -0400
>
> resize2fs: use Direct I/O when reading the superblock for online resizes
>
> If the file system is mounted, the superblock can be changing while
> resize2fs is trying to read the superblock, resulting in checksum
> failures. One way of avoiding this problem is read the superblock
> using Direct I/O, since the kernel makes sure that what gets written
> to disk is self-consistent.
>
> Suggested-by: Krister Johansen <[email protected]>
> Signed-off-by: Theodore Ts'o <[email protected]>
>
> So it has landed on the e2fsprogs maint branch, but there has not been a
> maintenance release since the patch landed.

Thanks for the response. My apologies for resubmitting this. I had
thought that I checked the git trees before sending this out, but I
must've looked at the wrong one. Sorry about that.

Thanks to Ted for applying his reworked patch, it's much appreciated.

-K