2009-09-10 12:55:36

by Stephan Kulow

Subject: buggy_init_scritps and e2fsprogs 1.41.9

Hi,

I have been facing a bug in openSUSE since I updated to 1.41.9: you have
to manually fix your file system if you happened to mount the
file system in the wrong timezone on a machine using a localtime
hardware clock.

Now this happens very easily if you boot a live cd and mount
your system from the live cd - but fsck will _not_ correct the
problem ;(

The release notes of 1.41.9 talk only about the exact opposite
case: "Fix e2fsck's buggy_init_scritps=1 so that the if the last
write and/or last mount times are in the future, they are corrected
even if buggy_init_scripts is set."

I don't want to set buggy_init_scripts for openSUSE as the init
scripts are not buggy, but a live cd is a live cd and has no idea
which timezone the system is configured for, and even a ro
mount will destroy your file system ;(

Greetings, Stephan



2009-09-10 13:03:16

by Stephan Kulow

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Thursday 10 September 2009, Stephan Kulow wrote:
>
> The release notes of 1.41.9 talk only about the exact opposite
> case: "Fix e2fsck's buggy_init_scritps=1 so that the if the last
> write and/or last mount times are in the future, they are corrected
> even if buggy_init_scripts is set."
>
If I read the code correctly, the problem is that time_fudge is 0 for
buggy_init_scripts=0 and as such there is no difference between
PR_0_FUTURE_SB_LAST_MOUNT_FUDGED and PR_0_FUTURE_SB_LAST_MOUNT
and so openSUSE always falls into the PR_0_FUTURE_SB_LAST_MOUNT case

-> no PR_PREEN_OK
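
To make the reasoning above concrete, here is a minimal sketch of the
selection logic as I understand it from this thread. It is not the
e2fsck source: the enum values and the function name are hypothetical
stand-ins for the PR_0_FUTURE_SB_LAST_MOUNT* problem codes, and the
86400-second fudge used in the note below is only an illustrative value.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the e2fsck problem codes discussed above. */
enum sb_time_problem {
    SB_TIME_OK,             /* timestamp is not in the future */
    SB_TIME_FUTURE_FUDGED,  /* in the future, but within the fudge window */
    SB_TIME_FUTURE          /* in the future, beyond the fudge window */
};

/*
 * Classify a superblock timestamp against the current time.
 * time_fudge is assumed to be 0 when buggy_init_scripts is not set.
 */
enum sb_time_problem classify_sb_time(uint32_t sb_time, uint32_t now,
                                      uint32_t time_fudge)
{
    if (sb_time <= now)
        return SB_TIME_OK;
    /* Widen to 64 bits so now + time_fudge cannot wrap around. */
    if ((uint64_t)sb_time <= (uint64_t)now + time_fudge)
        return SB_TIME_FUTURE_FUDGED;
    return SB_TIME_FUTURE;
}
```

With time_fudge equal to 0, a timestamp two hours ahead (sb_time = now +
7200) can never land in the fudged case, so it is always reported as
SB_TIME_FUTURE; with a fudge of 86400 the same timestamp would be
reported as SB_TIME_FUTURE_FUDGED.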

Greetings, Stephan

2009-09-10 13:11:56

by Stephan Kulow

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Thursday 10 September 2009, Stephan Kulow wrote:
> On Thursday 10 September 2009, Stephan Kulow wrote:
> > The release notes of 1.41.9 talk only about the exact opposite
> > case: "Fix e2fsck's buggy_init_scritps=1 so that the if the last
> > write and/or last mount times are in the future, they are corrected
> > even if buggy_init_scripts is set."
>
> If I read the code correctly, the problem is that time_fudge is 0 for
> buggy_init_scripts=0 and as such there is no difference between
> PR_0_FUTURE_SB_LAST_MOUNT_FUDGED and PR_0_FUTURE_SB_LAST_MOUNT
> and so openSUSE always falls into the PR_0_FUTURE_SB_LAST_MOUNT case
>
> -> no PR_PREEN_OK
>
Attached is my attempt at fixing the problem (note that I also noticed
another problem: the code reports LAST_MOUNT in the wtime case).

Greetings, Stephan


Attachments:
fix_super.diff (1.24 kB)

2009-09-10 20:29:31

by Theodore Ts'o

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Thu, Sep 10, 2009 at 02:55:38PM +0200, Stephan Kulow wrote:
>
> I have been facing a bug in openSUSE since I updated to 1.41.9: you have
> to manually fix your file system if you happened to mount the
> file system in the wrong timezone on a machine using a localtime
> hardware clock.

Yeah, it just so happened that I was working with Scott Remnant at
Canonical on this problem. The problem arises if you (a) live east
of the GMT time zone (or it's British Summer Time :-), (b) configure
your hardware clock to tick localtime instead of GMT for Windows
bug-for-bug compatibility (meaning that you have to deal with problems
if your system happens to be down during the Daylight Saving Time
adjustment, and many other annoyances inflicted on us thanks to
Microsoft), and (c) suffer an unclean shutdown of your system,
requiring the system to perform journal recovery on the root
filesystem. *Then* we run into a problem where the system clock ---
which is always defined to tick GMT, but which is set from the
hardware clock at boot time and is incorrect thanks to (b) --- is in
the future, thanks to (a). Part of the journal replay requires
the kernel to update the superblock, and this causes the superblock's
last write time to be set in the future. This causes e2fsck to decide
something must be seriously wrong, and it forces a full file system
check.

This problem has been around for a long time, but my best guess as to
why it hasn't been noticed is that many Europeans (especially Germans
:-) who are running SLES tend to be hard-core Linux folk, so they do
the One True Correct Thing, which is to make their hardware CMOS clocks
tick GMT. (I believe that SLES, being written by Germans who tend to
want to do the correct thing from an engineering point of view, also
strongly encourages the use of GMT for the CMOS clock.) Others tend
to use Ubuntu, which up until recently has set buggy_init_scripts=1
by default, which tends to paper over this problem.

Linux users in the US and others who live west of GMT don't see this
problem at all, since those who run in Windows bug-compatibility mode
simply have sb->s_wtime set in the past, and this is largely harmless.
Hence, no one has really noticed this problem until recently, when
Ubuntu's Karmic Koala release exposed the problem by removing the
buggy_init_scripts config.
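
The east/west asymmetry described above can be sketched with a little
arithmetic. This is only an illustration under stated assumptions: the
function names are made up for this example, and the offsets (+7200
seconds for CEST, -14400 seconds for EDT) are sample values, not
anything read from a real system.

```c
#include <stdint.h>

/*
 * Epoch time a system ends up with when the hardware clock holds local
 * time but is interpreted as GMT. utc_offset_seconds is positive east
 * of Greenwich (e.g. +7200 for CEST) and negative to the west.
 */
int64_t clock_misread_as_gmt(int64_t true_gmt, int32_t utc_offset_seconds)
{
    return true_gmt + utc_offset_seconds;
}

/* Nonzero if a timestamp written with the skewed clock is in the future. */
int stamp_is_in_future(int64_t stamped, int64_t true_gmt)
{
    return stamped > true_gmt;
}
```

East of GMT the skewed clock runs ahead, so anything stamped with it
(such as the superblock write time) looks like it comes from the future
once the clock is corrected; west of GMT the same mistake only produces
timestamps in the past.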

> I don't want to set buggy_init_scripts for openSUSE as the init
> scripts are not buggy, but a live cd is a live cd and has no idea
> which timezone the system is configured for, and even a ro
> mount will destroy your file system ;(

Well, you're over-dramatizing things; it doesn't destroy your
filesystem --- it just forces an unnecessary fsck. And it only does
this if the system had been shut down uncleanly before the boot.

Here's a kernel patch which should fix things for ext4. I'll follow
up with a patch for ext3. Scott, Stephan, do you want to give this
patch a spin?

- Ted

commit a55a57d6da66f80f4de1f8451602395bbdc49c26
Author: Theodore Ts'o <[email protected]>
Date: Thu Sep 10 14:33:20 2009 -0400

ext4: Don't update superblock write time when filesystem is read-only

This avoids updating the superblock write time when we are mounting
the root file system read/only but we need to replay the journal; at
that point, for people who are east of GMT and who make their clock
tick in localtime for Windows bug-for-bug compatibility, the clock is
set in the future, and this will cause e2fsck to complain and force a
full file system check.

Signed-off-by: "Theodore Ts'o" <[email protected]>

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index f644a5c..33837c7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3244,7 +3244,18 @@ static int ext4_commit_super(struct super_block *sb, int sync)
 		clear_buffer_write_io_error(sbh);
 		set_buffer_uptodate(sbh);
 	}
-	es->s_wtime = cpu_to_le32(get_seconds());
+	/*
+	 * If the file system is mounted read-only, don't update the
+	 * superblock write time. This avoids updating the superblock
+	 * write time when we are mounting the root file system
+	 * read/only but we need to replay the journal; at that point,
+	 * for people who are east of GMT and who make their clock
+	 * tick in localtime for Windows bug-for-bug compatibility,
+	 * the clock is set in the future, and this will cause e2fsck
+	 * to complain and force a full file system check.
+	 */
+	if (!(sb->s_flag & MS_RDONLY))
+		es->s_wtime = cpu_to_le32(get_seconds());
 	es->s_kbytes_written =
 		cpu_to_le64(EXT4_SB(sb)->s_kbytes_written +
 			    ((part_stat_read(sb->s_bdev->bd_part, sectors[1]) -

2009-09-10 20:57:55

by Stephan Kulow

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Thursday 10 September 2009 22:29:27 Theodore Tso wrote:
>
> > I don't want to set buggy_init_scripts for openSUSE as the init
> > scripts are not buggy, but a live cd is a live cd and has no idea
> > which timezone the system is configured for, and even a ro
> > mount will destroy your file system ;(
>
> Well, you're over-dramatizing things; it doesn't destroy your
> filesystem --- it just forces an unnecessary fsck. And it only does
> this if the system had been shut down uncleanly before the boot.
>
Yes, it doesn't destroy your file system, but it forces you to do a manual
fsck, and no, there is no need for an unclean shutdown; it will always happen.

The work flow is as such:
- boot into live cd, live cd thinks system is UTC, mounts sda1 (ro)
at 9am, umounts cleanly, updates mount time to 11am (hardware clock)
- boot into real system, time is set to CEST and hardware clock is read
as 9am. fsck will say the file system has uncorrectable errors and abort.

This does _not_ happen with 1.41.8, as PR_0_FUTURE_SB_LAST_MOUNT
contained PR_PREEN_OK and that makes it continue.
Now 1.41.9 does preenhalt on the two-hour difference in mount time
even though the problem is just as harmless - and it will not appear
on distributions with buggy init scripts.
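
As a rough sketch of the behavioural difference described above (not
the actual e2fsck code): in preen mode, a problem whose flags include
PR_PREEN_OK is fixed and the check continues, while one without it
halts and asks for a manual fsck. The macro, enum, and function names
below are hypothetical, and the flag value is illustrative only.

```c
/* Hypothetical stand-in for e2fsck's PR_PREEN_OK flag; value is illustrative. */
#define SKETCH_PREEN_OK 0x0001u

enum preen_outcome {
    PREEN_FIX_AND_CONTINUE,     /* fixed automatically, boot goes on */
    PREEN_HALT_FOR_MANUAL_FSCK  /* "preenhalt": admin must run fsck by hand */
};

/* Decide what happens to a problem when fsck runs in preen mode. */
enum preen_outcome preen_outcome_for(unsigned int problem_flags)
{
    if (problem_flags & SKETCH_PREEN_OK)
        return PREEN_FIX_AND_CONTINUE;
    return PREEN_HALT_FOR_MANUAL_FSCK;
}
```

Under this reading, the regression is simply that the future-mount-time
problem moved from the first outcome to the second for systems that do
not set buggy_init_scripts.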

Your kernel patch makes sense, but it won't solve the problem, because the
problem is mtime, not wtime.

Greetings, Stephan

2009-09-10 21:32:29

by Theodore Ts'o

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

Oops, I sent the wrong version of the patch; it had a stupid typo in
it... (s/s_flag/s_flags).

- Ted

commit 54c90e88b955b94de94d7d730ee8d53daa64453f
Author: Theodore Ts'o <[email protected]>
Date: Thu Sep 10 17:31:04 2009 -0400

ext4: Don't update superblock write time when filesystem is read-only

This avoids updating the superblock write time when we are mounting
the root file system read/only but we need to replay the journal; at
that point, for people who are east of GMT and who make their clock
tick in localtime for Windows bug-for-bug compatibility, the clock is
set in the future, and this will cause e2fsck to complain and force a
full file system check.

Signed-off-by: "Theodore Ts'o" <[email protected]>

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index f644a5c..9f87707 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3244,7 +3244,18 @@ static int ext4_commit_super(struct super_block *sb, int sync)
 		clear_buffer_write_io_error(sbh);
 		set_buffer_uptodate(sbh);
 	}
-	es->s_wtime = cpu_to_le32(get_seconds());
+	/*
+	 * If the file system is mounted read-only, don't update the
+	 * superblock write time. This avoids updating the superblock
+	 * write time when we are mounting the root file system
+	 * read/only but we need to replay the journal; at that point,
+	 * for people who are east of GMT and who make their clock
+	 * tick in localtime for Windows bug-for-bug compatibility,
+	 * the clock is set in the future, and this will cause e2fsck
+	 * to complain and force a full file system check.
+	 */
+	if (!(sb->s_flags & MS_RDONLY))
+		es->s_wtime = cpu_to_le32(get_seconds());
 	es->s_kbytes_written =
 		cpu_to_le64(EXT4_SB(sb)->s_kbytes_written +
 			    ((part_stat_read(sb->s_bdev->bd_part, sectors[1]) -

2009-09-11 00:55:07

by Theodore Ts'o

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Thu, Sep 10, 2009 at 10:57:54PM +0200, Stephan Kulow wrote:
>
> The work flow is as such:
> - boot into live cd, live cd thinks system is UTC, mounts sda1 (ro)
> at 9am, umounts cleanly, updates mount time to 11am (hardware clock)

That doesn't make any sense. The mount time is only set if the
filesystem is mounted read/write. So if you only mounted the
filesystem read/only, the mount time wouldn't be changed. Watch:

# dumpe2fs -h /dev/closure/test | grep time:
dumpe2fs 1.41.9 (22-Aug-2009)
Last mount time: Tue Sep 1 00:00:00 2009
Last write time: Tue Sep 1 00:00:00 2009
# mount -t ext3 -o ro /dev/closure/test /mnt
# umount /mnt
# dumpe2fs -h /dev/closure/test | grep time:
dumpe2fs 1.41.9 (22-Aug-2009)
Last mount time: Tue Sep 1 00:00:00 2009
Last write time: Tue Sep 1 00:00:00 2009
# mount -t ext4 -o ro /dev/closure/test /mnt
# umount /mnt
# dumpe2fs -h /dev/closure/test | grep time:
dumpe2fs 1.41.9 (22-Aug-2009)
Last mount time: Tue Sep 1 00:00:00 2009
Last write time: Tue Sep 1 00:00:00 2009
# date
Thu Sep 10 20:51:48 EDT 2009

See?

Now, if the Live CD is going to be repairing a filesystem, it should
mount the root filesystem read/only, extract the time zone, fix the
system clock, and only then mount the filesystem read/write.

Or the Live CD should fix the system clock over the network before it
tries mounting any hard drives. The point is, there are ways for the
Live CD to do the right thing. If it's not willing to do that, then
its init scripts are buggy. :-)

- Ted

2009-09-11 10:38:51

by Stephan Kulow

Subject: Re: buggy_init_scritps and e2fsprogs 1.41.9

On Friday 11 September 2009 02:55:03 Theodore Tso wrote:
> On Thu, Sep 10, 2009 at 10:57:54PM +0200, Stephan Kulow wrote:
> > The work flow is as such:
> > - boot into live cd, live cd thinks system is UTC, mounts sda1 (ro)
> > at 9am, umounts cleanly, updates mount time to 11am (hardware clock)
>
> That doesn't make any sense. The mount time is only set if the
> filesystem is mounted read/write. So if you only mounted the
> filesystem read/only, the mount time wouldn't be changed. Watch:
>
I wasn't certain if it also happens with ro, so I put the ro in ().

>
> Now, if the Live CD is going to be repairing a filesystem, it should
> mount the root filesystem read/only, extract the time zone, fix the
> system clock, and only then mount the filesystem read/write.
>
> Or the Live CD should fix the system clock over the network before it
> tries mounting any hard drives. The point is, there are ways for the
> Live CD to do the right thing. If it's not willing to do that, then
> its init scripts are buggy. :-)
>
It doesn't matter if the live init scripts are buggy or not, because the
real system's fsck will be the one to complain about future mounts.

Ok, perhaps it's clearer if you think about it this way:
- install openSUSE using local time
- boot ubuntu live cd (assuming utc), mount /data (in the future),
umount cleanly
- reboot into openSUSE (without buggy init scripts)
-> UNEXPECTED INCONSISTENCY, need to repair

I think this is pretty much expected, as Ubuntu has little chance of figuring
out the timezone of the system that normally uses /data.

What you're saying is basically: screw everyone living east of GMT and using
dual boot. fsck repairs far more serious problems without complaint, but if I
share data between Ubuntu and openSUSE I need to go into single-user mode
afterwards?

At least consider this one:

--- a/e2fsck/super.c
+++ b/e2fsck/super.c
@@ -836,7 +836,7 @@ void check_super_block(e2fsck_t ctx)
 		pctx.num = fs->super->s_wtime;
 		problem = PR_0_FUTURE_SB_LAST_WRITE;
 		if (fs->super->s_wtime <= (__u32) ctx->now + ctx->time_fudge)
-			problem = PR_0_FUTURE_SB_LAST_MOUNT_FUDGED;
+			problem = PR_0_FUTURE_SB_LAST_WRITE_FUDGED;
 		if (fix_problem(ctx, problem, &pctx)) {
 			fs->super->s_wtime = ctx->now;
 			ext2fs_mark_super_dirty(fs);

Greetings, Stephan