2017-03-14 10:34:34

by Daniel Schultz

Subject: Corrupt ext4 fs after creation

Hi,

we use the Yocto Project to create custom BSPs for an AM335x SoC. After
they upgraded e2fsprogs from 1.42.9 to 1.43, we ran into problems
with our ext4 root file system. During the first boot, systemd checks the
rootfs with fsck.ext4 and requires a restart because of an error.

The fs is created with mkfs.ext4 and direct I/O and is available on
ftp://ftp.phytec.de/pub/Test/phytec-headless-image-phyboard-wega-am335x-2.ext4

$ du -ks /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
109708
$ truncate /var/tmp/wic/build/rootfs_root.2.ext4 -s 159674777
$ mkfs.ext4 -F -i 8192 /var/tmp/wic/build/rootfs_root.2.ext4 -L root -d /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
mke2fs 1.43 (17-May-2016)
Discarding device blocks: done
Creating filesystem with 155932 1k blocks and 19520 inodes
Filesystem UUID: 6728344f-aa6d-4bbb-a06d-e649e36024d3
Superblock backups stored on blocks:
8193, 24577, 40961, 57345, 73729

Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Copying files into the device: done
Writing superblocks and filesystem accounting information: done
$ du -Lbks /var/tmp/wic/build/rootfs_root.2.ext4
155933


I figured out that when I run fsck.ext4, it performs a directory
optimization, which leads to a non-zero exit code of 1 (File system errors
corrected). I also found that this optimization only occurs on
the first boot and not after creating a lot of new files and dirs. After
the check, the fs contains more blocks than before.

root@phyboard-wega-am335x-2:~# fsck.ext4 -V
e2fsck 1.43 (17-May-2016)
Using EXT2FS Library version 1.43, 17-May-2016
root@phyboard-wega-am335x-2:~# fsck.ext4 /dev/disk/by-id/mmc-NCard_0x2519026e-part2
e2fsck 1.43 (17-May-2016)
Superblock last write time (Mon Mar 13 10:53:39 2017,
now = Wed Jan 25 11:07:23 2017) is in the future.
Fix<y>? yes
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information

root: ***** FILE SYSTEM WAS MODIFIED *****
root: 5221/19520 files (0.9% non-contiguous), 105503/155932 blocks
root@phyboard-wega-am335x-2:~# fsck.ext4 /dev/disk/by-id/mmc-NCard_0x2519026e-part2
e2fsck 1.43 (17-May-2016)
root: clean, 5221/19520 files, 105503/155932 blocks


Can anyone give me more information about what's wrong with our fs, and
whether I can safely disable the directory optimization?

Thanks

--
Mit freundlichen Grüßen,
With best regards,
Daniel Schultz


2017-03-14 17:54:43

by Darrick J. Wong

Subject: Re: Corrupt ext4 fs after creation

On Tue, Mar 14, 2017 at 11:34:32AM +0100, Daniel Schultz wrote:
> Hi,
>
> we use the Yocto Project to create custom BSPs for an AM335x SoC. After they
> upgraded e2fsprogs from 1.42.9 to 1.43, we ran into problems with our
> ext4 root file system. During the first boot, systemd checks the rootfs with
> fsck.ext4 and requires a restart because of an error.
>
> The fs is created with mkfs.ext4 and direct I/O and is available on ftp://ftp.phytec.de/pub/Test/phytec-headless-image-phyboard-wega-am335x-2.ext4
>
> $ du -ks /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
> 109708
> $ truncate /var/tmp/wic/build/rootfs_root.2.ext4 -s 159674777
> $ mkfs.ext4 -F -i 8192 /var/tmp/wic/build/rootfs_root.2.ext4 -L root -d /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
> mke2fs 1.43 (17-May-2016)
> Discarding device blocks: done
> Creating filesystem with 155932 1k blocks and 19520 inodes
> Filesystem UUID: 6728344f-aa6d-4bbb-a06d-e649e36024d3
> Superblock backups stored on blocks:
> 8193, 24577, 40961, 57345, 73729
>
> Allocating group tables: done
> Writing inode tables: done
> Creating journal (4096 blocks): done
> Copying files into the device: done
> Writing superblocks and filesystem accounting information: done
> $ du -Lbks /var/tmp/wic/build/rootfs_root.2.ext4
> 155933
>
>
> I figured out that when I run fsck.ext4, it performs a directory
> optimization, which leads to a non-zero exit code of 1 (File system errors
> corrected). I also found that this optimization only occurs on the
> first boot and not after creating a lot of new files and dirs. After
> the check, the fs contains more blocks than before.
>
> root@phyboard-wega-am335x-2:~# fsck.ext4 -V
> e2fsck 1.43 (17-May-2016)
> Using EXT2FS Library version 1.43, 17-May-2016
> root@phyboard-wega-am335x-2:~# fsck.ext4
> /dev/disk/by-id/mmc-NCard_0x2519026e-part2
> e2fsck 1.43 (17-May-2016)
> Superblock last write time (Mon Mar 13 10:53:39 2017,
> now = Wed Jan 25 11:07:23 2017) is in the future.
> Fix<y>? yes
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 3A: Optimizing directories
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> root: ***** FILE SYSTEM WAS MODIFIED *****
> root: 5221/19520 files (0.9% non-contiguous), 105503/155932 blocks
> root@phyboard-wega-am335x-2:~# fsck.ext4
> /dev/disk/by-id/mmc-NCard_0x2519026e-part2
> e2fsck 1.43 (17-May-2016)
> root: clean, 5221/19520 files, 105503/155932 blocks
>
>
> Can anyone give me more information about what's wrong with our fs, and
> whether I can safely disable the directory optimization?

Somewhere in that newly created image is a directory large enough to
warrant an htree index to speed up directory access. Pass 3A is e2fsck
creating the directory index, hence the "FS was modified" message.
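
To see which directory is likely responsible, one rough approach is to rank the directories in the staging rootfs by entry count before the image is built. This is only a sketch: `largest_dirs` is a hypothetical helper name, it counts entries rather than directory blocks, and it assumes a `find` that supports `-mindepth`/`-maxdepth` and file names without embedded newlines.

```shell
#!/bin/sh
# List the directories under a staging rootfs that have the most direct
# entries, largest first. Usage: largest_dirs ROOTFS_DIR [LIMIT]
largest_dirs() {
    rootfs_dir=$1
    limit=${2:-10}
    find "$rootfs_dir" -type d | while IFS= read -r dir; do
        # Count direct children only ("." and ".." are not reported by find).
        count=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$count" "$dir"
    done | sort -rn | head -n "$limit"
}
```

Calling `largest_dirs /path/to/rootfs 5` prints the five directories with the most entries; those are the candidates for indexing.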

The htree (hash tree) indexes directory entries by hash to speed up
random directory accesses. e2fsck can regenerate the indices, but the
rest of e2fsprogs cannot create or maintain them. You can turn them
off, at some cost to performance.

So there are a number of options here -- (1) reduce directory sizes,
which might not be feasible.

(2) You could increase the block size since block size == page size has
less runtime overhead... unless you really do need to have a 155MB
image with as little slack as possible.

(3) You could turn off directory indexing (mkfs -O ^dir_index) which
removes the fsck surprise but also makes directory access much slower.

(4) Run e2fsck immediately after mkfs so that the directory
optimizations are baked into the root image.

I'd probably do #4 and/or #2, personally.
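
A minimal sketch of option 4, assuming the image is built by a shell step as in the original report. The function name and the `MKFS`/`E2FSCK` override variables are hypothetical; the mkfs.ext4 options mirror the ones quoted above, and `-f`, `-y`, and `-D` are the e2fsck options for forcing a check, answering yes, and optimizing directories.

```shell
#!/bin/sh
# Create the image, then let e2fsck build the directory indexes at build
# time so that the first boot finds nothing left to optimize.
# Usage: make_indexed_image IMAGE ROOTFS_DIR SIZE_BYTES
make_indexed_image() {
    img=$1
    rootfs_dir=$2
    size_bytes=$3
    mkfs_cmd=${MKFS:-mkfs.ext4}     # hypothetical override hooks
    fsck_cmd=${E2FSCK:-e2fsck}

    truncate -s "$size_bytes" "$img" || return 1
    "$mkfs_cmd" -F -i 8192 -L root -d "$rootfs_dir" "$img" || return 1

    # -f: check even if marked clean, -y: answer yes, -D: optimize directories
    "$fsck_cmd" -f -y -D "$img"
    status=$?

    # e2fsck 1.43 exits with 1 when it modified the file system, including
    # when it only optimized directories, so accept 0 and 1.
    [ "$status" -le 1 ]
}
```

For option 3 instead, the e2fsck step can be dropped and `-O ^dir_index` added to the mkfs.ext4 invocation.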

--D

>
> Thanks
>
> --
> Mit freundlichen Grüßen,
> With best regards,
> Daniel Schultz

2017-03-14 20:06:07

by Andreas Dilger

Subject: Re: Corrupt ext4 fs after creation

On Mar 14, 2017, at 4:34 AM, Daniel Schultz <[email protected]> wrote:
>
> Hi,
>
> we use the Yocto Project to create custom BSPs for an AM335x SoC. After they upgraded e2fsprogs from 1.42.9 to 1.43, we ran into problems with our ext4 root file system. During the first boot, systemd checks the rootfs with fsck.ext4 and requires a restart because of an error.

This isn't really an "error"...

> The fs is created with mkfs.ext4 and direct I/O and is available on ftp://ftp.phytec.de/pub/Test/phytec-headless-image-phyboard-wega-am335x-2.ext4
>
> $ du -ks /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
> 109708
> $ truncate /var/tmp/wic/build/rootfs_root.2.ext4 -s 159674777
> $ mkfs.ext4 -F -i 8192 /var/tmp/wic/build/rootfs_root.2.ext4 -L root -d /home/schultz/yocto/PD17.1.0/build/tmp/work/phyboard_wega_am335x_2-phytec-linux-gnueabi/phytec-headless-image/1.0-r0/rootfs
> mke2fs 1.43 (17-May-2016)
> Discarding device blocks: done
> Creating filesystem with 155932 1k blocks and 19520 inodes
> Filesystem UUID: 6728344f-aa6d-4bbb-a06d-e649e36024d3
> Superblock backups stored on blocks:
> 8193, 24577, 40961, 57345, 73729
>
> Allocating group tables: done
> Writing inode tables: done
> Creating journal (4096 blocks): done
> Copying files into the device: done
> Writing superblocks and filesystem accounting information: done
> $ du -Lbks /var/tmp/wic/build/rootfs_root.2.ext4
> 155933
>
>
> I figured out that when I run fsck.ext4, it performs a directory optimization, which leads to a non-zero exit code of 1 (File system errors corrected).

> I also found that this optimization only occurs on the first boot and not after creating a lot of new files and dirs. After the check, the fs contains more blocks than before.
>
> root@phyboard-wega-am335x-2:~# fsck.ext4 -V
> e2fsck 1.43 (17-May-2016)
> Using EXT2FS Library version 1.43, 17-May-2016
> root@phyboard-wega-am335x-2:~# fsck.ext4 /dev/disk/by-id/mmc-NCard_0x2519026e-part2
> e2fsck 1.43 (17-May-2016)
> Superblock last write time (Mon Mar 13 10:53:39 2017,
> now = Wed Jan 25 11:07:23 2017) is in the future.

This is the real error - your system clock is at Jan 25, but the superblock
was modified yesterday (which is likely correct), so e2fsck thinks the
superblock is wrong.

Fix your clock before running e2fsck, or use the magic setting for embedded
systems so that this check is skipped. I believe something like the following
in /etc/e2fsck.conf would fix this (see e2fsck.conf(5) man page):

[options]
broken_system_clock=true

I suspect once this is fixed, e2fsck will not complain about the directory
optimization step anymore.
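
If the setting should ship inside the image, one way is to write it into the staging rootfs before mkfs.ext4 copies the tree. This is a sketch under assumptions: `add_broken_clock_option` is a hypothetical helper, the rootfs path is whatever your build uses, and the function overwrites any existing etc/e2fsck.conf in that tree.

```shell
#!/bin/sh
# Write an e2fsck.conf with the broken_system_clock option into a staging
# rootfs. Usage: add_broken_clock_option ROOTFS_DIR
add_broken_clock_option() {
    rootfs_dir=$1
    mkdir -p "$rootfs_dir/etc" || return 1
    # Overwrites an existing file; merge by hand if you already ship one.
    cat > "$rootfs_dir/etc/e2fsck.conf" <<'EOF'
[options]
broken_system_clock=true
EOF
}
```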

Cheers, Andreas

> Fix<y>? yes
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 3A: Optimizing directories
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> root: ***** FILE SYSTEM WAS MODIFIED *****
> root: 5221/19520 files (0.9% non-contiguous), 105503/155932 blocks
> root@phyboard-wega-am335x-2:~# fsck.ext4 /dev/disk/by-id/mmc-NCard_0x2519026e-part2
> e2fsck 1.43 (17-May-2016)
> root: clean, 5221/19520 files, 105503/155932 blocks
>
>
> Can anyone give me more information about what's wrong with our fs, and whether I can safely disable the directory optimization?
>
> Thanks
>
> --
> Mit freundlichen Grüßen,
> With best regards,
> Daniel Schultz



2017-03-15 05:11:34

by Theodore Ts'o

Subject: Re: Corrupt ext4 fs after creation

On Tue, Mar 14, 2017 at 11:34:32AM +0100, Daniel Schultz wrote:
>
> I figured out that when I run fsck.ext4, it performs a directory
> optimization, which leads to a non-zero exit code of 1 (File system errors
> corrected). I also found that this optimization only occurs on the
> first boot and not after creating a lot of new files and dirs. After
> the check, the fs contains more blocks than before.

So the exit status was 1 because the file system was modified. It's
true that the formal definition of exit status 1 is "file system
errors corrected", so this will be changed in the next release of
e2fsprogs:

commit bf9f3b6d5b10d19218b4ed904c12b22e36ec57dd
Author: Theodore Ts'o <[email protected]>
Date: Thu Feb 16 22:02:35 2017 -0500

e2fsck: exit with exit status 0 if no errors were fixed

Previously, e2fsck would exit with a status code of 1 even though the
only changes that it made to the file system were various
optimizations and not fixing file system corruption. Since the man
page states that an exit status of 1 means "file system errors
corrected", fix e2fsck to return an exit status of 0.

Signed-off-by: Theodore Ts'o <[email protected]>
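
For scripts that need to react to the exit status, it helps to remember that e2fsck(8) documents it as a sum of conditions rather than a single code. A small sketch that decodes it follows; `describe_fsck_status` is a hypothetical helper name, and the bit meanings are taken from the man page.

```shell
#!/bin/sh
# Print one line per condition encoded in an e2fsck exit status.
# Usage: describe_fsck_status STATUS
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in \
        "1:file system errors corrected" \
        "2:system should be rebooted" \
        "4:file system errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared library error"
    do
        bit=${pair%%:*}     # number before the first colon
        text=${pair#*:}     # description after it
        if [ $((status & bit)) -ne 0 ]; then
            echo "$text"
        fi
    done
    return 0
}
```

A boot script could run `e2fsck -p "$dev"; describe_fsck_status $?` and request a reboot only when the "system should be rebooted" line appears.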

Cheers,

- Ted