2007-04-09 00:05:58

by Samuel Thibault

Subject: Add a norecovery option to ext3/4?

Hi,

Distribution installers usually try to probe OSes to build a suitable
grub menu. Unfortunately, mounting an ext3 partition, even in read-only
mode, does perform some operations on the filesystem (log recovery).
This is not a good idea since it may silently corrupt data. XFS has a
norecovery option that disables this; I'd say ext3/4 should
have it too.

Samuel


2007-04-09 03:24:54

by Eric Sandeen

Subject: Re: Add a norecovery option to ext3/4?

Samuel Thibault wrote:
> Hi,
>
> Distribution installers usually try to probe OSes to build a suitable
> grub menu. Unfortunately, mounting an ext3 partition, even in read-only
> mode, does perform some operations on the filesystem (log recovery).
> This is not a good idea since it may silently corrupt data.

Can you elaborate? Under what circumstances is log replay going to harm
data? Do you mean that the installer mounts partitions, looking for
what OS is installed? How is that harmful?

Ohhh... this is http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=417407
isn't it?

Hm, so the root cause there seems that the installer found 2 legs of a
mirror and mounted them independently, recovering them independently...
But why did that cause problems?

> XFS has a
> norecovery option that disables this; I'd say ext3/4 should
> have it too.

The xfs mount option is useful on a purely read-only device, or if the
log is corrupted to the point where it can't be replayed... It was put
in place 9+ years ago. :) I'd have to ask the sgi guys to dig & see
what the original use for...

It'd be easy enough to add to ext3/4, I suppose. Other options you may
have in the installer, though, are to check for md superblocks before
mounting bare partitions, or maybe use the BLKROSET ioctl to set the
block device to read-only prior to mount, for added insurance...

-Eric
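
The md superblock check suggested in the message above could look roughly like the following Python sketch. It is only an illustration: the function name `looks_like_md_member` is hypothetical, and the magic number and superblock locations are assumptions taken from the commonly documented md 0.90, 1.1 and 1.2 layouts (version 1.0, stored near the end of the device, is left out for brevity). A real installer would more likely rely on mdadm --examine or blkid.

```python
import io
import struct

MD_SB_MAGIC = 0xA92B4EFC        # assumed md superblock magic number
MD_RESERVED_BYTES = 64 * 1024   # v0.90 superblock sits in the last aligned 64 KiB


def _magic_at(dev, offset):
    """Return the 4 raw bytes at `offset`, or b'' if the offset is negative."""
    if offset < 0:
        return b""
    dev.seek(offset)
    return dev.read(4)


def looks_like_md_member(dev, size):
    """Heuristically report whether a device of `size` bytes carries an md superblock.

    `dev` is any seekable binary file object (hypothetical helper, not an mdadm API).
    """
    le = struct.pack("<I", MD_SB_MAGIC)
    be = struct.pack(">I", MD_SB_MAGIC)
    # v0.90: start of the last 64 KiB-aligned 64 KiB block, stored host-endian,
    # so both byte orders are accepted here.
    v090 = (size & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
    if _magic_at(dev, v090) in (le, be):
        return True
    # v1.1 at offset 0 and v1.2 at 4 KiB, stored little-endian.
    return any(_magic_at(dev, off) == le for off in (0, 4096))


# Example with an in-memory image instead of a real partition:
blank = io.BytesIO(bytes(128 * 1024))
print(looks_like_md_member(blank, 128 * 1024))
```

An installer could skip any partition for which this returns True and leave it to the RAID layer instead of mounting it as a bare ext3 filesystem.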

2007-04-09 03:31:38

by Samuel Thibault

Subject: Re: Add a norecovery option to ext3/4?

Eric Sandeen wrote on Sun 08 Apr 2007 22:24:50 -0500:
> Samuel Thibault wrote:
> >Distribution installers usually try to probe OSes to build a suitable
> >grub menu. Unfortunately, mounting an ext3 partition, even in read-only
> >mode, does perform some operations on the filesystem (log recovery).
> >This is not a good idea since it may silently corrupt data.
>
> Can you elaborate? Under what circumstances is log replay going to harm
> data? Do you mean that the installer mounts partitions, looking for
> what OS is installed? How is that harmful?
>
> Ohhh... this is http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=417407
> isn't it?

Yes.

> Hm, so the root cause there seems that the installer found 2 legs of a
> mirror and mounted them independently, recovering them independently...
> But why did that cause problems?

Because that trashed his data (or at least it didn't help to keep data
safe).

> Other options you may have in the installer, though, are to check for
> md superblocks before mounting bare partitions, or maybe use the
> BLKROSET ioctl to set the block device to read-only prior to mount,
> for added insurance...

That's one of the things proposed in the bug report, yes.

Samuel

2007-04-09 03:42:07

by Eric Sandeen

Subject: Re: Add a norecovery option to ext3/4?

Samuel Thibault wrote:

>> Hm, so the root cause there seems that the installer found 2 legs of a
>> mirror and mounted them independently, recovering them independently...
>> But why did that cause problems?
>
> Because that trashed his data (or at least it didn't help to keep data
> safe).
>
>> Other options you may have in the installer, though, are to check for
>> md superblocks before mounting bare partitions, or maybe use the
>> BLKROSET ioctl to set the block device to read-only prior to mount,
>> for added insurance...
>
> That's one of the things proposed in the bug report, yes.

The reason I suggest other options is because intentionally mounting a
corrupted FS may not really be the way you want to go... norecovery on
xfs at least is an option of last resort, not something to use by default.

-Eric

2007-04-09 04:34:25

by Brad Campbell

Subject: Re: Add a norecovery option to ext3/4?

Eric Sandeen wrote:
> Samuel Thibault wrote:
>> Hi,
>>
>> Distribution installers usually try to probe OSes to build a suitable
>> grub menu. Unfortunately, mounting an ext3 partition, even in read-only
>> mode, does perform some operations on the filesystem (log recovery).
>> This is not a good idea since it may silently corrupt data.
>
> Can you elaborate? Under what circumstances is log replay going to harm
> data? Do you mean that the installer mounts partitions, looking for
> what OS is installed? How is that harmful?
>

It'll wreak havoc on my hibernated system when I've suspended it to do a test OS install on one of
my spare partitions. The log replay will go fine, but then when the system resumes its idea of
what's on the disk won't match what is really there and ugly, ugly things happen.


Brad
--
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams

2007-04-09 10:14:42

by Andreas Dilger

Subject: Re: Add a norecovery option to ext3/4?

On Apr 08, 2007 22:24 -0500, Eric Sandeen wrote:
> Samuel Thibault wrote:
> >Distribution installers usually try to probe OSes to build a suitable
> >grub menu. Unfortunately, mounting an ext3 partition, even in read-only
> >mode, does perform some operations on the filesystem (log recovery).
> >This is not a good idea since it may silently corrupt data.
>
> Can you elaborate? Under what circumstances is log replay going to harm
> data? Do you mean that the installer mounts partitions, looking for
> what OS is installed? How is that harmful?

If that disk was actually in use on another system but just exported
via a SAN to this node you've potentially corrupted the filesystem.

It's a bad idea to just go ahead and mount filesystems that you aren't
told to mount.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2007-04-09 14:00:58

by Theodore Ts'o

Subject: Re: Add a norecovery option to ext3/4?

On Sun, Apr 08, 2007 at 10:42:03PM -0500, Eric Sandeen wrote:
> Samuel Thibault wrote:
>
> >>Hm, so the root cause there seems that the installer found 2 legs of a
> >>mirror and mounted them independently, recovering them independently...
> >>But why did that cause problems?
> >
> >Because that trashed his data (or at least it didn't help to keep data
> >safe).

Actually, reading through the Debian bug report, there is no proof
that is what actually caused the data loss. I certainly can't think
of any explanation for why that would have happened. See the summary
from Steve Langasek:

>Checkpoint of the IRC discussion:
>
>- The submitter says that after reboot, the RAID was reported as out of
> sync.
>- The logs show that the ext3 filesystem was automatically mounted rw for
> journal recovery by the kernel driver.
>- There is no evidence in the logs that the RAID was ever assembled within
> d-i, so it shouldn't be the case that the RAID superblocks were out of
> sync as a result of d-i itself.
>- This leaves two possible reasons for the out-of-sync state of the RAID:
> either mounting the individual partitions as ext3 filesystems somehow
> overwrote the RAID superblock just the right way (unlikely since it would
> require the ext3 driver to write past the end of the declared filesystem),
> or the RAID superblocks were out of sync /before/ booting d-i. The latter
> is consistent with the fact that the ext3 driver had to do a journal
> recovery, suggesting that both the ext3 fs and the RAID were not cleanly
> shut down.
>- If mounting as ext3 overwrote the RAID superblock, that seems to be a
> kernel bug, and we have no good explanation for how that would happen.
>- If the RAID was unclean before booting d-i, all bets are off as to the
> state of the filesystem at the beginning of this journal recovery, and it
> may be difficult to ever reproduce this bug.

> The reason I suggest other options is because intentionally mounting a
> corrupted FS may not really be the way you want to go... norecovery on
> xfs at least is an option of last resort, not something to use by default.

This would also be true for ext3; I am extremely uncomfortable with
people thinking that a norecovery option is something that should be
routinely used by programs. It's something that should only be used
by experts, who know what they are doing and who are willing to accept
the potential risks.

- Ted

2007-04-09 13:42:19

by Valdis Klētnieks

Subject: Re: Add a norecovery option to ext3/4?

On Sun, 08 Apr 2007 22:24:50 CDT, Eric Sandeen said:
> Can you elaborate? Under what circumstances is log replay going to harm
> data? Do you mean that the installer mounts partitions, looking for
> what OS is installed? How is that harmful?

Another usage case that really wants to avoid the log replay is if you're
looking at an unknown disk image with a forensics CD such as Helix:

http://www.e-fense.com/helix/

Yes, good forensics always clones the disk image twice (the first clone being
used for nothing but creating second-gen clones for analysis), and in most
cases the forensic analyst can work around the fact that you *do* cause some
changes to the disk image by mounting. But sometimes, you'd rather be looking
at a possibly inconsistent image than replaying the log - particularly if
you're looking at a "seized and power plug pulled" image, and you actually
care about things that may have been in the log, like just-erased files.



2007-04-09 15:55:40

by Phillip Susi

Subject: Re: Add a norecovery option to ext3/4?

Samuel Thibault wrote:
> Hi,
>
> Distribution installers usually try to probe OSes to build a suitable
> grub menu. Unfortunately, mounting an ext3 partition, even in read-only
> mode, does perform some operations on the filesystem (log recovery).
> This is not a good idea since it may silently corrupt data. XFS has a
> norecovery option that disables this; I'd say ext3/4 should
> have it too.

When the filesystem is told to mount the disk read only, that means it
should not write to it. The fact that ext3 goes ahead and does anyway
is a bug and should be fixed. There is no need for a norecovery option,
because read only is a sufficient directive to tell the filesystem not
to write to the disk.

As someone else pointed out, this behavior causes havoc if you hibernate
a system and then boot up another system which mounts the disk of the
hibernated system. Under all conditions it should be safe to mount a
disk read only, but here it is not because the journal playback trashes
the disk out from under the hibernated system.

2007-04-09 16:37:58

by Jan Engelhardt

Subject: Re: Add a norecovery option to ext3/4?


On Apr 8 2007 22:24, Eric Sandeen wrote:
> Samuel Thibault wrote:
>
> Can you elaborate? Under what circumstances is log replay going to harm data?
> Do you mean that the installer mounts partitions, looking for what OS is
> installed? How is that harmful?
>
> Hm, so the root cause there seems that the installer found 2 legs of a mirror
> and mounted them independently, recovering them independently... But why did
> that cause problems?

Because, for whatever unlikely reason there could possibly be, it may
have been repaired differently [depending on sunshine, daytime, rand(),
or so]?


Jan
--

2007-04-09 16:20:58

by Kyle Moffett

Subject: Re: Add a norecovery option to ext3/4?

On Apr 09, 2007, at 11:43:15, Phillip Susi wrote:
> Samuel Thibault wrote:
>> Hi,
>> Distribution installers usually try to probe OSes to build a
>> suitable grub menu. Unfortunately, mounting an ext3 partition, even
>> in read-only mode, does perform some operations on the filesystem
>> (log recovery). This is not a good idea since it may silently
>> corrupt data. XFS has a norecovery option that disables
>> this; I'd say ext3/4 should have it too.
>
> When the filesystem is told to mount the disk read only, that means
> it should not write to it. The fact that ext3 goes ahead and does
> anyway is a bug and should be fixed. There is no need for a
> norecovery option, because read only is a sufficient directive to
> tell the filesystem not to write to the disk.
>
> As someone else pointed out, this behavior causes havoc if you
> hibernate a system and then boot up another system which mounts the
> disk of the hibernated system. Under all conditions it should be
> safe to mount a disk read only, but here it is not because the
> journal playback trashes the disk out from under the hibernated
> system.

Well IIRC it is possible to prevent that by switching the blockdev to
read-only mode first:

[email protected]:~# mount /dev/hda6 /mnt
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda6, internal journal
EXT3-fs: mounted filesystem with ordered data mode
[email protected]:~# umount /mnt
[email protected]:~# blockdev --setro /dev/hda6
[email protected]:~# mount /dev/hda6 /mnt
mount: block device /dev/loop0 is write-protected, mounting read-only
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode

Cheers,
Kyle Moffett
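
What blockdev --setro does in the transcript above can also be done programmatically with the BLKROSET ioctl mentioned earlier in the thread. A minimal Python sketch follows, assuming Linux and root privileges. The helper names are hypothetical, and the request numbers are derived from the _IO(0x12, 93) and _IO(0x12, 94) definitions in <linux/fs.h> rather than taken from a Python module; the simplified `_io` helper is an assumption that holds on x86 and ARM but not on every architecture.

```python
import fcntl
import os
import struct


def _io(ioc_type, nr):
    """Simplified equivalent of the kernel's _IO() macro (no direction, zero size)."""
    return (ioc_type << 8) | nr


BLKROSET = _io(0x12, 93)   # set or clear the device read-only flag
BLKROGET = _io(0x12, 94)   # query the device read-only flag


def set_blockdev_readonly(path, readonly=True):
    """Mark the block device at `path` read-only (or writable again)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKROSET, struct.pack("i", 1 if readonly else 0))
    finally:
        os.close(fd)


def blockdev_is_readonly(path):
    """Return True if the kernel reports the device as read-only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        raw = fcntl.ioctl(fd, BLKROGET, struct.pack("i", 0))
    finally:
        os.close(fd)
    return struct.unpack("i", raw)[0] != 0
```

Calling set_blockdev_readonly("/dev/hda6") before the mount would correspond to the blockdev --setro step in the transcript.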

2007-04-09 17:23:52

by Eric Sandeen

Subject: Re: Add a norecovery option to ext3/4?

Phillip Susi wrote:
> Samuel Thibault wrote:
>> Hi,
>>
>> Distribution installers usually try to probe OSes to build a suitable
>> grub menu. Unfortunately, mounting an ext3 partition, even in read-only
>> mode, does perform some operations on the filesystem (log recovery).
>> This is not a good idea since it may silently corrupt data. XFS has a
>> norecovery option that disables this; I'd say ext3/4 should
>> have it too.
>
> When the filesystem is told to mount the disk read only, that means it
> should not write to it.

It means the filesystem should not be writeable when it is mounted.
This is not the same as saying that the filesystem itself should do no
IO in the course of making that read-only mount available.

> The fact that ext3 goes ahead and does anyway
> is a bug and should be fixed. There is no need for a norecovery option,
> because read only is a sufficient directive to tell the filesystem not
> to write to the disk.

I respectfully disagree, see above.

> As someone else pointed out, this behavior causes havoc if you hibernate
> a system and then boot up another system which mounts the disk of the
> hibernated system.

In that case you are mounting the same filesystem under 2 different
operating systems simultaneously, which is, and always has been, a
recipe for disaster. Flagging the fs as "mounted already" would
probably be a better solution, though it's harder than it sounds at
first glance.

> Under all conditions it should be safe to mount a
> disk read only, but here it is not because the journal playback trashes
> the disk out from under the hibernated system.

Under all conditions it should be safe to mount a read-only block
device, but that is not the same as mounting a filesystem read-only.

-Eric

2007-04-10 07:26:49

by Jörn Engel

Subject: Re: Add a norecovery option to ext3/4?

On Mon, 9 April 2007 12:21:15 -0500, Eric Sandeen wrote:
> Phillip Susi wrote:
> >
> > When the filesystem is told to mount the disk read only, that means it
> > should not write to it.
>
> It means the filesystem should not be writeable when it is mounted.
> This is not the same as saying that the filesystem itself should do no
> IO in the course of making that read-only mount available.

The filesystem has two interfaces. One to the device underneath, one to
userspace. Read-only should certainly mean that no writes cross the
userspace interface. Traditionally it has implicitly also meant that
no writes are crossing the device interface. Whether that was/is an
explicit requirement - who knows.

Journaling filesystems have introduced this thing called "journal
replay". And I have to admit, it makes things _a lot_ easier to always
replay the journal, even when being mounted read-only.

But "it is easier" is a pretty lame excuse.

> Under all conditions it should be safe to mount a read-only block
> device, but that is not the same as mounting a filesystem read-only.

In particular, it is a lame excuse when this claim is true. If the
block-device is read-only, then journal replay will not work as expected
and all the "not so easy" work has to be done anyway.

Did I miss anything? Is it actually easier to mount a read-only device
with unclean journal than mounting a read-write device and not replay
the journal?

Jörn

--
Joern's library part 8:
http://citeseer.ist.psu.edu/plank97tutorial.html

2007-04-10 11:27:51

by Theodore Ts'o

Subject: Re: Add a norecovery option to ext3/4?

On Tue, Apr 10, 2007 at 09:22:53AM +0200, Jörn Engel wrote:
> > Under all conditions it should be safe to mount a read-only block
> > device, but that is not the same as mounting a filesystem read-only.
>
> In particular, it is a lame excuse when this claim is true. If the
> block-device is read-only, then journal replay will not work as expected
> and all the "not so easy" work has to be done anyway.
>
> Did I miss anything? Is it actually easier to mount a read-only device
> with unclean journal than mounting a read-write device and not replay
> the journal?

The problem is that ext3 defers writes even more than ext2 did in
order to make journalling (a) possible, and (b) more efficient. So if
you mount the filesystem read-only without replaying the journal, you
may get incorrect data; you could get data belonging to another user's
file; the kernel could detect filesystem inconsistencies and decide
that the filesystem has errors. Now, at least in theory the kernel
will not oops when it operates on an arbitrarily corrupted filesystem
(which is what a filesystem whose journal has not been run can look
like), BUT #1, this hasn't been as well tested as we would probably like,
and #2, if the filesystem is marked with an errors-behavior of "reboot
on error", then the system will reboot, because that's what you asked it
to do!

I suppose what you could do is to read in the journal, and use it to
create a remapping table so that when you want to read block #5126,
and block number 5126 is in the journal, to read the journal version
of the block instead of the one on disk. That would allow for safe
access to a filesystem being mounted read-only without the journal
being present.

Patches gratefully accepted....

- Ted
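
The remapping table described in the message above can be modelled in a few lines. The sketch below is a toy in-memory model, not ext3 code: `JournalOverlay`, its arguments, and the representation of the journal as a list of (block number, data) pairs from committed transactions are all assumptions made for illustration. Parsing the real on-disk journal format is the hard part and is not shown.

```python
class JournalOverlay:
    """Read-only view of a disk in which journalled blocks shadow on-disk blocks."""

    def __init__(self, disk_blocks, journal_entries):
        # disk_blocks: mapping of block number -> bytes as currently on disk.
        # journal_entries: iterable of (block_number, data) in commit order.
        self._disk = disk_blocks
        self._remap = {}
        for block_number, data in journal_entries:
            # A later committed copy of the same block replaces an earlier one,
            # which mirrors the end state a real replay would produce.
            self._remap[block_number] = data

    def read_block(self, block_number):
        """Return the journal copy if one exists, else the on-disk block."""
        if block_number in self._remap:
            return self._remap[block_number]
        return self._disk[block_number]


disk = {5125: b"old-5125", 5126: b"old-5126"}
journal = [(5126, b"new-5126 v1"), (5126, b"new-5126 v2")]
view = JournalOverlay(disk, journal)
print(view.read_block(5126))  # served from the journal copy
print(view.read_block(5125))  # served from the unmodified disk
```

Nothing in this model ever writes to `disk`, which is the property the thread is asking for from a read-only mount.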

2007-04-10 12:08:26

by Jörn Engel

Subject: Re: Add a norecovery option to ext3/4?

On Tue, 10 April 2007 07:27:18 -0400, Theodore Tso wrote:
>
> I suppose what you could do is to read in the journal, and use it to
> create a remapping table so that when you want to read block #5126,
> and block number 5126 is in the journal, to read the journal version
> of the block instead of the one on disk. That would allow for safe
> access to a filesystem being mounted read-only without the journal
> being present.

Another option would be to access the medium through a mapping inode,
replay the journal into the mapping inode and _not_ flush the dirty
pages. But as long as a remapping table is sufficient for ext3 journal
format, such a table should be simpler and faster.

> Patches gratefully accepted....

Not likely to come from me anytime soon. There's a certain other
filesystem I have to finish first that still suffers from the same
problem.

Jörn

--
Do not stop an army on its way home.
-- Sun Tzu

2007-04-10 16:44:51

by Matt Mackall

Subject: Re: Add a norecovery option to ext3/4?

On Tue, Apr 10, 2007 at 02:08:26PM +0200, Jörn Engel wrote:
> On Tue, 10 April 2007 07:27:18 -0400, Theodore Tso wrote:
> >
> > I suppose what you could do is to read in the journal, and use it to
> > create a remapping table so that when you want to read block #5126,
> > and block number 5126 is in the journal, to read the journal version
> > of the block instead of the one on disk. That would allow for safe
> > access to a filesystem being mounted read-only without the journal
> > being present.
>
> Another option would be to access the medium through a mapping inode,
> replay the journal into the mapping inode and _not_ flush the dirty
> pages. But as long as a remapping table is sufficient for ext3 journal
> format, such a table should be simpler and faster.

Or you could make a snapshot with device-mapper and then mount it.
Requires some free disk space somewhere (or a hack with loop on
tmpfs), but should be doable today.

--
Mathematics is the supreme nostalgia of our time.
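
The snapshot approach in the message above works because writes made during journal replay land in a separate copy-on-write store while the origin device stays untouched. A toy Python model of that idea follows; the class `CowSnapshot` and its methods are hypothetical and say nothing about how device-mapper implements snapshots internally.

```python
class CowSnapshot:
    """Writable view over a read-only origin; changed blocks go to a side store."""

    def __init__(self, origin):
        self._origin = origin   # mapping of block number -> bytes, never modified
        self._cow = {}          # blocks written through the snapshot

    def read_block(self, n):
        """Prefer the snapshot's own copy, fall back to the origin."""
        if n in self._cow:
            return self._cow[n]
        return self._origin[n]

    def write_block(self, n, data):
        """Record a write in the side store; journal replay would land here."""
        self._cow[n] = data


origin = {0: b"superblock", 7: b"stale inode table"}
snap = CowSnapshot(origin)
snap.write_block(7, b"replayed inode table")
print(snap.read_block(7), origin[7])
```

The side store is what needs the "free disk space somewhere" mentioned above; discarding it returns the system to the original, unreplayed state.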

2007-04-10 18:53:49

by Phillip Susi

Subject: Re: Add a norecovery option to ext3/4?

Eric Sandeen wrote:
> It means the filesystem should not be writeable when it is mounted.
> This is not the same as saying that the filesystem itself should do no
> IO in the course of making that read-only mount available.

I disagree.

> I respectfully disagree, see above.

Based on what? I argue that historically the primary use of the read
only mount flag was to prevent the underlying filesystem from being
modified and possibly damaged further before it can be fsck'ed. It
became common practice to mount the root filesystem read only and run a
fsck on it, then either reboot or remount read-write depending on if
fsck had to make changes.

In this context, the meaning of the read only mount flag was clear: do
not write to the disk. If you wish to redefine it as "do not allow me
write access to any files" then you fly in the face of convention, and
the onus is on you to provide a compelling argument to make such a change.

> In that case you are mounting the same filesystem under 2 different
> operating systems simultaneously, which is, and always has been, a
> recipe for disaster. Flagging the fs as "mounted already" would
> probably be a better solution, though it's harder than it sounds at
> first glance.

No, it has not been. Prior to poorly behaved journal playback, it was
perfectly safe to mount a filesystem read only even if it was mounted
read-write by another system ( possibly fsck or defrag ). You might not
read the correct data from it, but you would not damage the underlying
data simply by mounting it read-only.

> Under all conditions it should be safe to mount a read-only block
> device, but that is not the same as mounting a filesystem read-only.

Historically it was the same thing. I see no reason to change that
behavior, do you?

2007-04-10 19:21:30

by Eric Sandeen

Subject: Re: Add a norecovery option to ext3/4?

Phillip Susi wrote:
> Eric Sandeen wrote:
>> It means the filesystem should not be writeable when it is mounted.
>> This is not the same as saying that the filesystem itself should do no
>> IO in the course of making that read-only mount available.
>
> I disagree.
>
>> I respectfully disagree, see above.
>
> Based on what? I argue that historically the primary use of the read
> only mount flag was to prevent the underlying filesystem from being
> modified and possibly damaged further before it can be fsck'ed. It
> became common practice to mount the root filesystem read only and run a
> fsck on it, then either reboot or remount read-write depending on if
> fsck had to make changes.

except in the case of a journaling filesystem, where the journal in
theory obviates the need for a fsck. (yes, I know... fsck still has a
place...) But, fsck is largely meaningless until the journal has been
recovered anyway (fs can only be consistent if it includes uncommitted
transactions in the journal), so isn't this new territory?

I guess looking to the man page for clarification of intent is no help...

       ro     Mount the file system read-only.


> In this context, the meaning of the read only mount flag was clear: do
> not write to the disk. If you wish to redefine it as "do not allow me
> write access to any files" then you fly in the face of convention, and
> the onus is on you to provide a compelling argument to make such a change.

I'm admittedly playing devil's advocate here :) but what, in the
historical non-journalled filesystem case, would be writing to the
device anyway, if all IO from the vfs were stopped? Without the
journal, isn't vfs-ro the same as bdev-ro, largely?

As a counter example, if you had a filesystem which saves its last
mount time in the superblock; should a ro mount not update that time?
(perhaps not, depending on how that timestamp was intended to be used.)

>> In that case you are mounting the same filesystem under 2 different
>> operating systems simultaneously, which is, and always has been, a
>> recipe for disaster. Flagging the fs as "mounted already" would
>> probably be a better solution, though it's harder than it sounds at
>> first glance.
>
> No, it has not been. Prior to poorly behaved journal playback, it was
> perfectly safe to mount a filesystem read only even if it was mounted
> read-write by another system ( possibly fsck or defrag ). You might not
> read the correct data from it, but you would not damage the underlying
> data simply by mounting it read-only.

You might not damage the underlying filesystem, but you could sure go
off in the weeds trying to read it, if you stumbled upon some
half-updated metadata... so while it may be safe for the filesystem, I'm
not convinced that it's safe for the host reading the filesystem.

>> Under all conditions it should be safe to mount a read-only block
>> device, but that is not the same as mounting a filesystem read-only.
>
> Historically it was the same thing. I see no reason to change that
> behavior, do you?

but it's already changed, and has been in linux since ext3 came on the
scene. mount -o ro -does- replay the journal. Surely readonly does not
imply that we want a corrupted filesystem if it was not cleanly shut
down. I suppose there is a place for the argument that a readonly mount
of a journaled filesystem -should- present a recovered filesystem to the
user, without actually recovering the log to disk. I guess to me, it
hardly seems worth the effort, as the precedent is long set for doing
recovery on a read-only mount.

-Eric

2007-04-10 22:04:55

by Phillip Susi

Subject: Re: Add a norecovery option to ext3/4?

Eric Sandeen wrote:
> except in the case of a journaling filesystem, where the journal in
> theory obviates the need for a fsck. (yes, I know... fsck still has a
> place...) But, fsck is largely meaningless until the journal has been
> recovered anyway (fs can only be consistent if it includes uncommitted
> transactions in the journal), so isn't this new territory?

In a way, yes, this is new territory, but not entirely. It is merely an
extension of the existing system, and in the existing system the ro flag
clearly meant do not write to the disk. You don't extend an existing
system by breaking old expectations.

> I'm admittedly playing devil's advocate here :) but what, in the
> historical non-journalled filesystem case, would be writing to the
> device anyway, if all IO from the vfs were stopped? Without the
> journal, isn't vfs-ro the same as bdev-ro, largely?

Yes, they were the same, and that is my point: users got used to using
ro to indicate they did not want the block device written to, and were
doing this long before there was a read only flag on the block device
itself. Excepting journal playback from this breaks user expectation
and leads to data loss.

> As a counter example, if you had a filesystem which saves its last
> mount time in the superblock; should a ro mount not update that time?
> (perhaps not, depending on how that timestamp was intended to be used.)

This is another good example arguing against writing to the disk. Ext2
would write to the super block to mark the volume as dirty and update
the mount count used to decide when it was time for a fsck. It did not
do this if mounted read only because it understood that the meaning of
the ro flag was not to write to the disk. Likewise, it did not update
a file's atime. If ro just meant "don't let me write to files" then the
atime would be updated even when mounted read only.

> You might not damage the underlying filesystem, but you could sure go
> off in the weeds trying to read it, if you stumbled upon some
> half-updated metadata... so while it may be safe for the filesystem, I'm
> not convinced that it's safe for the host reading the filesystem.

And this is exactly what you get without journals. You mount a damaged
filesystem read only, you take what you can get. You might get bad
data, or fail to open or read certain files or parts of files, but you
knew for sure that you wouldn't cause any more damage to the data on
disk, and you could run a fsck or defrag or partimage or anything else
to read and/or modify the disk.

> but it's already changed, and has been in linux since ext3 came on the
> scene. mount -o ro -does- replay the journal. Surely readonly does not
> imply that we want a corrupted filesystem if it was not cleanly shut
> down. I suppose there is a place for the argument that a readonly mount
> of a journaled filesystem -should- present a recovered filesystem to the
> user, without actually recovering the log to disk. I guess to me, it
> hardly seems worth the effort, as the precedent is long set for doing
> recovery on a read-only mount.

Just because a bug has been around for a long time does not mean it is
not a bug. In the pre-journal world, if you mounted a dirty filesystem
read only you expected the possibility of errors reading it. Why should
that expectation not hold true with journals? It might be nice to do
the journal playback in ram or otherwise without writing to the disk,
but as far as the user setting the ro flag cares, they just want you not
to update the disk and if there are inconsistencies that cause errors,
then you need to fsck or mount rw.

2007-04-11 20:22:23

by Bill Davidsen

Subject: Re: Add a norecovery option to ext3/4?

Eric Sandeen wrote:
> Phillip Susi wrote:
>> Eric Sandeen wrote:

>>> In that case you are mounting the same filesystem under 2 different
>>> operating systems simultaneously, which is, and always has been, a
>>> recipe for disaster. Flagging the fs as "mounted already" would
>>> probably be a better solution, though it's harder than it sounds at
>>> first glance.
>> No, it has not been. Prior to poorly behaved journal playback, it was
>> perfectly safe to mount a filesystem read only even if it was mounted
>> read-write by another system ( possibly fsck or defrag ). You might not
>> read the correct data from it, but you would not damage the underlying
>> data simply by mounting it read-only.
>
> You might not damage the underlying filesystem, but you could sure go
> off in the weeds trying to read it, if you stumbled upon some
> half-updated metadata... so while it may be safe for the filesystem, I'm
> not convinced that it's safe for the host reading the filesystem.
>
Exactly. If the data are protected you can use other software to access
it. For ext3 an explicit ext2 mount might do it... but if you corrupt
the underlying information, there's no going back.

In practice Linux has had lots of practice mounting garbage, and isn't
likely to suffer terminal damage.

I wonder what happens if the device is really read-only and the o/s
tries to replay the journal as part of a r/o mount? I suspect the system
will refuse totally with an i/o error, not what you want.

--
Bill Davidsen <[email protected]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

2007-04-12 11:18:52

by Pavel Machek

Subject: Re: Add a norecovery option to ext3/4?

Hi!

> >Distribution installers usually try to probe OSes to build a suitable
> >grub menu. Unfortunately, mounting an ext3 partition, even in read-only
> >mode, does perform some operations on the filesystem (log recovery).
> >This is not a good idea since it may silently corrupt data.
>
> Can you elaborate? Under what circumstances is log
> replay going to harm data? Do you mean that the

Suspend machine, boot from CD trying to read from HDD, resume. People
lost data because of this trap.

Imagine a _broken_ disk, with the hardware dying. Would you rather replay the log,
possibly corrupting it even more, or read the few files you do care about?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-04-12 13:54:49

by Benny Amorsen

Subject: Re: Add a norecovery option to ext3/4?

>>>>> "BD" == Bill Davidsen <[email protected]> writes:

BD> In practice Linux has had lots of practice mounting garbage, and
BD> isn't likely to suffer terminal damage.

These days, with exposed USB ports and automount, it is rather
important that the kernel doesn't suffer terminal damage when mounting
garbage. It is too easy to exploit.


/Benny

2007-04-15 19:57:40

by Pavel Machek

Subject: Re: Add a norecovery option to ext3/4?

Hi!

> >You might not damage the underlying filesystem, but you could sure go
> >off in the weeds trying to read it, if you stumbled upon some
> >half-updated metadata... so while it may be safe for the filesystem, I'm
> >not convinced that it's safe for the host reading the filesystem.
> >
> Exactly. If the data are protected you can use other
> software to access it. For ext3 an explicit ext2 mount
> might do it...

It does not :-(. A dirty ext3 filesystem is marked incompatible with ext2.

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
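
The point made in the message above can be checked without mounting anything: a dirty ext3 filesystem carries a "needs recovery" bit in the superblock's incompatible-feature field, which is why ext2 refuses it. The Python sketch below reads that bit directly. The field offsets and flag values are assumptions taken from the commonly documented ext2/ext3 superblock layout, and `ext3_journal_state` is a hypothetical helper name; in practice, dumpe2fs -h reports the same feature flags.

```python
import io
import struct

SUPERBLOCK_OFFSET = 1024             # superblock starts 1 KiB into the device
EXT_MAGIC = 0xEF53                   # s_magic, at byte 56 of the superblock
FEATURE_COMPAT_HAS_JOURNAL = 0x0004  # bit in s_feature_compat, at byte 92
FEATURE_INCOMPAT_RECOVER = 0x0004    # bit in s_feature_incompat, at byte 96


def ext3_journal_state(dev):
    """Return (has_journal, needs_recovery) for a seekable binary file object.

    Raises ValueError if no ext2/3/4 magic number is found.
    """
    dev.seek(SUPERBLOCK_OFFSET)
    sb = dev.read(1024)
    if len(sb) < 100:
        raise ValueError("short read, no superblock")
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    compat, incompat = struct.unpack_from("<II", sb, 92)
    return (bool(compat & FEATURE_COMPAT_HAS_JOURNAL),
            bool(incompat & FEATURE_INCOMPAT_RECOVER))


# Example with a fabricated in-memory image of a dirty ext3 superblock:
image = bytearray(4096)
struct.pack_into("<H", image, SUPERBLOCK_OFFSET + 56, EXT_MAGIC)
struct.pack_into("<II", image, SUPERBLOCK_OFFSET + 92, 0x0004, 0x0004)
print(ext3_journal_state(io.BytesIO(bytes(image))))
```

An installer probing for other OSes could use such a check to skip, or at least warn about, partitions whose journal would be replayed by a mount.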