2008-05-18 11:07:33

by Bas van Schaik

[permalink] [raw]
Subject: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

Hi all,

Lately I encountered some vague corruptions on my filesystem which were
probably caused several weeks ago. My setup is quite extraordinary
(using ATA over Ethernet, RAID5, LVM, LUKS), a simple undetected network
failure did most probably cause these corruptions.

After I recovered the filesystem I decided to set up a weekly cron job
to perform an automatic e2fsck on a snapshot. This way, no
inconsistencies go undetected for longer than a week and problems can be
fixed before additional corruption occurs. However, there is a slight
problem with scripting e2fsck: it seems that e2fsck /always/ exits with
exit code 1 just because of the fact that the snapshot journal has been
replayed. Because of this, the script cannot tell whether there is a
real problem or not and keeps e-mailing me. This is a typical output of
such an e2fsck run:
> # e2fsck.static -f -y -v /dev/loop1
> e2fsck 1.40.8 (13-Mar-2008)
> /dev/loop1: recovering journal
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> /dev/loop1: ***** FILE SYSTEM WAS MODIFIED *****
>
> 12609263 inodes used (4.58%)
> 258729 non-contiguous inodes (2.1%)
> # of inodes with ind/dind/tind blocks: 1174647/20098/15
> 485401393 blocks used (88.17%)
> 0 bad blocks
> 108 large files
>
> 8730706 regular files
> 3867308 directories
> 0 character device files
> 3 block device files
> 12 fifos
> 95342096 links
> 11090 symbolic links (9640 fast symbolic links)
> 135 sockets
> --------
> 107951350 files

As you can see: no additional errors. Does anyone have any ideas on how
to work around this?

Cheers,

Bas


2008-05-18 12:29:08

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

On Sun, May 18, 2008 at 12:37:37PM +0200, Bas van Schaik wrote:
> However, there is a slight
> problem with scripting e2fsck: it seems that e2fsck /always/ exits with
> exit code 1 just because of the fact that the snapshot journal has been
> replayed. Because of this, the script cannot tell whether there is a
> real problem or not and keeps e-mailing me. This is a typical output of
> such an e2fsck run:

Simply replying the journal should not cause e2fsck to have an exit
code of 1. It must have done something else. The common one was
clearing the LARGE_FILES feature flag if the filesystem didn't have
one, but that was removed as of 1.40.7. Can you take a snapshot, run
dumpe2fs, run e2fsck -fy /dev/loop1, and then run dumpe2fs again, and
send me a before and after?

- Ted

2008-05-18 16:18:41

by Bas van Schaik

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

Theodore Tso wrote:
> On Sun, May 18, 2008 at 12:37:37PM +0200, Bas van Schaik wrote:
>
>> However, there is a slight
>> problem with scripting e2fsck: it seems that e2fsck /always/ exits with
>> exit code 1 just because of the fact that the snapshot journal has been
>> replayed. Because of this, the script cannot tell whether there is a
>> real problem or not and keeps e-mailing me. This is a typical output of
>> such an e2fsck run:
>>
>
> Simply replying the journal should not cause e2fsck to have an exit
> code of 1. It must have done something else. The common one was
> clearing the LARGE_FILES feature flag if the filesystem didn't have
> one, but that was removed as of 1.40.7. Can you take a snapshot, run
> dumpe2fs, run e2fsck -fy /dev/loop1, and then run dumpe2fs again, and
> send me a before and after?
For now, the dumpe2fs before the e2fsck. As you will probably remember,
the filesystem is quite large and and the check takes a few hours... I
will send the other dumpe2fs ASAP.


> # dumpe2fs /dev/loop1
> dumpe2fs 1.40-WIP (14-Nov-2006)
> Filesystem volume name: <none>
> Last mounted on: <not available>
> Filesystem UUID: 5e561184-65a5-4e19-9b57-7acf31ef209b
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal dir_index filetype
> needs_recovery sparse_super large_file
> Filesystem flags: signed directory hash
> Default mount options: journal_data_writeback
> Filesystem state: clean
> Errors behavior: Remount read-only
> Filesystem OS type: Linux
> Inode count: 275251200
> Block count: 550502400
> Reserved block count: 0
> Free blocks: 64737800
> Free inodes: 262708704
> First block: 0
> Block size: 4096
> Fragment size: 4096
> Blocks per group: 32768
> Fragments per group: 32768
> Inodes per group: 16384
> Inode blocks per group: 512
> Filesystem created: Fri Oct 6 20:46:50 2006
> Last mount time: Tue May 13 00:30:58 2008
> Last write time: Tue May 13 00:30:58 2008
> Mount count: 1
> Maximum mount count: 24
> Last checked: Mon May 12 15:38:20 2008
> Check interval: 15552000 (6 months)
> Next check after: Sat Nov 8 14:38:20 2008
> Reserved blocks uid: 0 (user root)
> Reserved blocks gid: 0 (group root)
> First inode: 11
> Inode size: 128
> Journal inode: 8
> Default directory hash: tea
> Directory Hash Seed: 46c1768d-baa8-44f8-a823-200942db69b5
> Journal backup: inode blocks
> Journal size: 32M

-- Bas

2008-05-19 11:32:55

by Bas van Schaik

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

Bas van Schaik wrote:
> Theodore Tso wrote:
>
>> On Sun, May 18, 2008 at 12:37:37PM +0200, Bas van Schaik wrote:
>>
>>
>>> However, there is a slight
>>> problem with scripting e2fsck: it seems that e2fsck /always/ exits with
>>> exit code 1 just because of the fact that the snapshot journal has been
>>> replayed. Because of this, the script cannot tell whether there is a
>>> real problem or not and keeps e-mailing me. This is a typical output of
>>> such an e2fsck run:
>>>
>>>
>> Simply replying the journal should not cause e2fsck to have an exit
>> code of 1. It must have done something else. The common one was
>> clearing the LARGE_FILES feature flag if the filesystem didn't have
>> one, but that was removed as of 1.40.7. Can you take a snapshot, run
>> dumpe2fs, run e2fsck -fy /dev/loop1, and then run dumpe2fs again, and
>> send me a before and after?
>>
> For now, the dumpe2fs before the e2fsck. As you will probably remember,
> the filesystem is quite large and and the check takes a few hours... I
> will send the other dumpe2fs ASAP.
>
>
>
>> # dumpe2fs /dev/loop1
>> dumpe2fs 1.40-WIP (14-Nov-2006)
>> Filesystem volume name: <none>
>> Last mounted on: <not available>
>> Filesystem UUID: 5e561184-65a5-4e19-9b57-7acf31ef209b
>> Filesystem magic number: 0xEF53
>> Filesystem revision #: 1 (dynamic)
>> Filesystem features: has_journal dir_index filetype
>> needs_recovery sparse_super large_file
>> Filesystem flags: signed directory hash
>> Default mount options: journal_data_writeback
>> Filesystem state: clean
>> Errors behavior: Remount read-only
>> Filesystem OS type: Linux
>> Inode count: 275251200
>> Block count: 550502400
>> Reserved block count: 0
>> Free blocks: 64737800
>> Free inodes: 262708704
>> First block: 0
>> Block size: 4096
>> Fragment size: 4096
>> Blocks per group: 32768
>> Fragments per group: 32768
>> Inodes per group: 16384
>> Inode blocks per group: 512
>> Filesystem created: Fri Oct 6 20:46:50 2006
>> Last mount time: Tue May 13 00:30:58 2008
>> Last write time: Tue May 13 00:30:58 2008
>> Mount count: 1
>> Maximum mount count: 24
>> Last checked: Mon May 12 15:38:20 2008
>> Check interval: 15552000 (6 months)
>> Next check after: Sat Nov 8 14:38:20 2008
>> Reserved blocks uid: 0 (user root)
>> Reserved blocks gid: 0 (group root)
>> First inode: 11
>> Inode size: 128
>> Journal inode: 8
>> Default directory hash: tea
>> Directory Hash Seed: 46c1768d-baa8-44f8-a823-200942db69b5
>> Journal backup: inode blocks
>> Journal size: 32M
>>

And now the (complete) output of dumpe2fs after the e2fsck:

> Filesystem volume name: <none>
> Last mounted on: <not available>
> Filesystem UUID: 5e561184-65a5-4e19-9b57-7acf31ef209b
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal dir_index filetype sparse_super
> large_file
> Filesystem flags: signed directory hash
> Default mount options: journal_data_writeback
> Filesystem state: clean
> Errors behavior: Remount read-only
> Filesystem OS type: Linux
> Inode count: 275251200
> Block count: 550502400
> Reserved block count: 0
> Free blocks: 76760569
> Free inodes: 262667659
> First block: 0
> Block size: 4096
> Fragment size: 4096
> Blocks per group: 32768
> Fragments per group: 32768
> Inodes per group: 16384
> Inode blocks per group: 512
> Filesystem created: Fri Oct 6 20:46:50 2006
> Last mount time: Mon May 19 02:15:24 2008
> Last write time: Mon May 19 02:16:08 2008
> Mount count: 0
> Maximum mount count: 24
> Last checked: Mon May 19 02:16:08 2008
> Check interval: 15552000 (6 months)
> Next check after: Sat Nov 15 01:16:08 2008
> Reserved blocks uid: 0 (user root)
> Reserved blocks gid: 0 (group root)
> First inode: 11
> Inode size: 128
> Journal inode: 8
> Default directory hash: tea
> Directory Hash Seed: 46c1768d-baa8-44f8-a823-200942db69b5
> Journal backup: inode blocks
> Journal size: 32M

Does this tell you anything?

-- Bas


2008-05-19 11:50:22

by Bas van Schaik

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

Bas van Schaik wrote:
> Theodore Tso wrote:
>
>> On Sun, May 18, 2008 at 12:37:37PM +0200, Bas van Schaik wrote:
>>
>>
>>> However, there is a slight
>>> problem with scripting e2fsck: it seems that e2fsck /always/ exits with
>>> exit code 1 just because of the fact that the snapshot journal has been
>>> replayed. Because of this, the script cannot tell whether there is a
>>> real problem or not and keeps e-mailing me. This is a typical output of
>>> such an e2fsck run:
>>>
>>>
>> Simply replying the journal should not cause e2fsck to have an exit
>> code of 1. It must have done something else. The common one was
>> clearing the LARGE_FILES feature flag if the filesystem didn't have
>> one, but that was removed as of 1.40.7. Can you take a snapshot, run
>> dumpe2fs, run e2fsck -fy /dev/loop1, and then run dumpe2fs again, and
>> send me a before and after?
>>
> For now, the dumpe2fs before the e2fsck. As you will probably remember,
> the filesystem is quite large and and the check takes a few hours... I
> will send the other dumpe2fs ASAP.
>
>
>
>> # dumpe2fs /dev/loop1
>> dumpe2fs 1.40-WIP (14-Nov-2006)
>> Filesystem volume name: <none>
>> Last mounted on: <not available>
>> Filesystem UUID: 5e561184-65a5-4e19-9b57-7acf31ef209b
>> Filesystem magic number: 0xEF53
>> Filesystem revision #: 1 (dynamic)
>> Filesystem features: has_journal dir_index filetype
>> needs_recovery sparse_super large_file
>> Filesystem flags: signed directory hash
>> Default mount options: journal_data_writeback
>> Filesystem state: clean
>> Errors behavior: Remount read-only
>> Filesystem OS type: Linux
>> Inode count: 275251200
>> Block count: 550502400
>> Reserved block count: 0
>> Free blocks: 64737800
>> Free inodes: 262708704
>> First block: 0
>> Block size: 4096
>> Fragment size: 4096
>> Blocks per group: 32768
>> Fragments per group: 32768
>> Inodes per group: 16384
>> Inode blocks per group: 512
>> Filesystem created: Fri Oct 6 20:46:50 2006
>> Last mount time: Tue May 13 00:30:58 2008
>> Last write time: Tue May 13 00:30:58 2008
>> Mount count: 1
>> Maximum mount count: 24
>> Last checked: Mon May 12 15:38:20 2008
>> Check interval: 15552000 (6 months)
>> Next check after: Sat Nov 8 14:38:20 2008
>> Reserved blocks uid: 0 (user root)
>> Reserved blocks gid: 0 (group root)
>> First inode: 11
>> Inode size: 128
>> Journal inode: 8
>> Default directory hash: tea
>> Directory Hash Seed: 46c1768d-baa8-44f8-a823-200942db69b5
>> Journal backup: inode blocks
>> Journal size: 32M
>>

And now the output of dumpe2fs after the e2fsck:
> Filesystem state: clean
> Errors behavior: Remount read-only
> Filesystem OS type: Linux
> Inode count: 275251200
> Block count: 550502400
> Reserved block count: 0
> Free blocks: 76760569
> Free inodes: 262667659
> First block: 0
> Block size: 4096
> Fragment size: 4096
> Blocks per group: 32768
> Fragments per group: 32768
> Inodes per group: 16384
> Inode blocks per group: 512
> Filesystem created: Fri Oct 6 20:46:50 2006
> Last mount time: Mon May 19 02:15:24 2008
> Last write time: Mon May 19 02:16:08 2008
> Mount count: 0
> Maximum mount count: 24
> Last checked: Mon May 19 02:16:08 2008
> Check interval: 15552000 (6 months)
> Next check after: Sat Nov 15 01:16:08 2008
> Reserved blocks uid: 0 (user root)
> Reserved blocks gid: 0 (group root)
> First inode: 11
> Inode size: 128
> Journal inode: 8
> Default directory hash: tea
> Directory Hash Seed: 46c1768d-baa8-44f8-a823-200942db69b5
> Journal backup: inode blocks
> Journal size: 32M

Does this tell you anything?

-- Bas


2008-05-20 03:58:55

by Andreas Dilger

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

On May 18, 2008 12:37 +0200, Bas van Schaik wrote:
> ... there is a slight
> problem with scripting e2fsck: it seems that e2fsck /always/ exits with
> exit code 1 just because of the fact that the snapshot journal has been
> replayed.

> > e2fsck 1.40.8 (13-Mar-2008)
> > /dev/loop1: recovering journal
> > Pass 1: Checking inodes, blocks, and sizes
> > Pass 2: Checking directory structure
> > Pass 3: Checking directory connectivity
> > Pass 4: Checking reference counts
> > Pass 5: Checking group summary information
> >
> > /dev/loop1: ***** FILE SYSTEM WAS MODIFIED *****

You shouldn't get journal replay after a proper LVM snapshot. The
LVM code should freeze the filesystem temporarily, flush out the journal,
mark the filesystem clean, take the snapshot of the clean fileysstem,
and then unfreeze the filesystem again.

If you are using LVM for snapshots, and this is not happening, then
the snapshot code is broken. If you are NOT using LVM, then you
should look into the "freeze ioctl" patches that were on linux-ext4 and
linux-fsdevel recently "[RFC PATCH 1/3] Implement generic freeze feature"
from Takashi Sato.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2008-05-20 12:10:56

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

On Mon, May 19, 2008 at 01:32:53PM +0200, Bas van Schaik wrote:
>
> Does this tell you anything?
>

Unfortunately comparing the two dumpe2fs outputs don't show anything
interesting. It did rule out a few cases where e2fsck can silently
mark the filesystem has having been modified (setting the directory
hash hint, moving the journal inode, which it does silently without
informing the user --- and I should fix that one of these days; I'll
create some bug reports to remind myself they need to be fixed), but I
don't see why it happened for your case.

It's definitely not normal; doing a journal replay does not cause fsck
to exit with a non-zero status, if it didn't make any other changes.
I just tested that with e2fsprogs 1.40.8 just in case something had
gotten screwed up, and it worked as expected.

I know how to debug it if you are really motiviated to get to the
bottom of this. It would involve running a modified e2fsck/e2fsprogs
which changes ext2fs_mark_changed() and ext2fs_mark_super_dirty() to
be real functions, and setting breakpoints in gdb so we can trap any
calls made to those functions and dump out a stack backtrace, and then
continuing the e2fsck run, and then reporting to me the stack
backtraces where gdb trapped calls to ext2fs_mark_changed() and/or
ext2fs_mark_super_dirty().

Andreas is right though that if you are taking a proper snapshot, the
disk really should be quiesced and no journal replay should be
required at all. That's how a devicemapper snapshot works in LVM ---
so one good question to explore is how *are* you doing your snapshots.

- Ted

2008-05-20 21:19:41

by Bas van Schaik

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

Theodore Tso wrote:
> On Mon, May 19, 2008 at 01:32:53PM +0200, Bas van Schaik wrote:
>
>> Does this tell you anything?
>>
>>
>
> Unfortunately comparing the two dumpe2fs outputs don't show anything
> interesting. It did rule out a few cases where e2fsck can silently
> mark the filesystem has having been modified (setting the directory
> hash hint, moving the journal inode, which it does silently without
> informing the user --- and I should fix that one of these days; I'll
> create some bug reports to remind myself they need to be fixed), but I
> don't see why it happened for your case.
>
> It's definitely not normal; doing a journal replay does not cause fsck
> to exit with a non-zero status, if it didn't make any other changes.
> I just tested that with e2fsprogs 1.40.8 just in case something had
> gotten screwed up, and it worked as expected.
>
Actually I also wouldn't expect e2fsck to do so. Maybe I'm overseeing
something really stupid, this is the bash code I'm running:
> e2fsck.static -f -y -v /dev/loop1 &> $TMPLOGFILE
> retcode="$?"
>
> (...)
>
> if [ ! "$retcode" = "0" ]; then
> echo "e2fsck had nonzero exitcode $retcode, aborting!"
> (...)
> fi


> I know how to debug it if you are really motiviated to get to the
> bottom of this. It would involve running a modified e2fsck/e2fsprogs
> which changes ext2fs_mark_changed() and ext2fs_mark_super_dirty() to
> be real functions, and setting breakpoints in gdb so we can trap any
> calls made to those functions and dump out a stack backtrace, and then
> continuing the e2fsck run, and then reporting to me the stack
> backtraces where gdb trapped calls to ext2fs_mark_changed() and/or
> ext2fs_mark_super_dirty().
>
To be honest, I'm currently trying to find out the cause of all these
filesystem corruptions. Maybe I'll try to sort this out later using gdb
and so on.

> Andreas is right though that if you are taking a proper snapshot, the
> disk really should be quiesced and no journal replay should be
> required at all. That's how a devicemapper snapshot works in LVM ---
> so one good question to explore is how *are* you doing your snapshots.
Exactly my thoughts, but apparently something is wrong here too. Maybe I
should note that my journal commit interval is set to something like 5
or 10 seconds, is that relevant? Again a small snippet of bash
responsible for snapshotting:
> snapshot_stamp=`date +%Y%m%d-%H%M%S`
> lvcreate --snapshot --size 50G --name backups-snapshot-$snapshot_stamp
> $LV &> $TMPLOGFILE

This is not a weird way to snapshot, is it?

-- Bas

2008-05-21 16:14:34

by Andreas Dilger

[permalink] [raw]
Subject: Re: Scripting e2fsck: no errors, but still exit code 1 "FILE SYSTEM WAS MODIFIED"

On May 20, 2008 23:19 +0200, Bas van Schaik wrote:
> Theodore Tso wrote:
> > Andreas is right though that if you are taking a proper snapshot, the
> > disk really should be quiesced and no journal replay should be
> > required at all. That's how a devicemapper snapshot works in LVM ---
> > so one good question to explore is how *are* you doing your snapshots.
>
> Exactly my thoughts, but apparently something is wrong here too. Maybe I
> should note that my journal commit interval is set to something like 5
> or 10 seconds, is that relevant?

No, this is normal.

> Again a small snippet of bash responsible for snapshotting:
> > snapshot_stamp=`date +%Y%m%d-%H%M%S`
> > lvcreate --snapshot --size 50G --name backups-snapshot-$snapshot_stamp
> > $LV &> $TMPLOGFILE
>
> This is not a weird way to snapshot, is it?

No, this is normal, yet it doesn't seem to work correctly. Can you please
add some sort of debug message into ext3_write_super_lockfs() and
ext3_unlockfs() to make sure these functions are being called for your
filesystem when you are doing the snapshot.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.