2011-04-12 17:25:14

by Mark Lord

[permalink] [raw]
Subject: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

Ted et al.

I've only just noticed this, so I have no idea how long it has been this way.

When I build a kernel with CONFIG_EXT4_USE_FOR_EXT23=y and boot from it,
the ext4 root filesystem shows up as "ext2" mode, rather than "ext4".

This looks very wrong to me, and quite dangerous.

Eg. I test it by building my own kernel (2.6.38.2), with ext4 built-in,
no initramfs required, and boot:

root=/dev/sda1 init=/bin/bash
...
$ mount /proc
$ cat /proc/mounts
/dev/root / ext2 ro,relatime,barrier=1,data=ordered 0 0

So.. it shows "ext2" instead of "ext4". That really looks like a bug.
Especially since it appears to be using journaling regardless.

Building the kernel without CONFIG_EXT4_USE_FOR_EXT23
results in a proper "ext4" mount entry in /proc/mounts.


2011-04-13 00:49:43

by Theodore Ts'o

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On Tue, Apr 12, 2011 at 01:25:08PM -0400, Mark Lord wrote:
> Ted et al.
>
> I've only just noticed this, so I have no idea how long it has been this way.
>
> When I build a kernel with CONFIG_EXT4_USE_FOR_EXT23=y and boot from it,
> the ext4 root filesystem shows up as "ext2" mode, rather than "ext4".
>
> This looks very wrong to me, and quite dangerous.

It's a cosemtic bug, I agree, but I'm not sure why you consider it
dangerous.

CONFIG_EXT4_USE_FOR_EXT23 means that ext4 registers itself as ext2
and/or ext3, if ext2 and/or ext3 are not configured into the kernel.
Since the kernel tries to mount the file system as ext2, ext3, and
then ext4, and uses whichever one works first.

- Ted

2011-04-13 14:05:30

by Mark Lord

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On 11-04-12 08:49 PM, Ted Ts'o wrote:
>
> It's a cosemtic bug, I agree

Great. Got a patch to fix it for me to try?

Since it currently misidentifies the filesystem type,
some of the disk management tools I have get confused.
I don't like taking any chances with user data like that.

Thanks.

2011-04-13 14:10:09

by Mark Lord

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On 11-04-12 08:49 PM, Ted Ts'o wrote:
> On Tue, Apr 12, 2011 at 01:25:08PM -0400, Mark Lord wrote:
>> Ted et al.
>>
>> I've only just noticed this, so I have no idea how long it has been this way.
>>
>> When I build a kernel with CONFIG_EXT4_USE_FOR_EXT23=y and boot from it,
>> the ext4 root filesystem shows up as "ext2" mode, rather than "ext4".
>>
>> This looks very wrong to me, and quite dangerous.
>
> It's a cosemtic bug, I agree, but I'm not sure why you consider it
> dangerous.

It is dangerous if it really has mounted an ext4 as ext2;
the on-disk formats are not 100% compatible.

I don't know what it is doing, so it looks quite dangerous.
Perhaps it really did mount as ext4 though, in which case not.

Thanks.

2011-04-13 16:52:59

by John Stoffel

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

>>>>> "Ted" == Ted Ts'o <[email protected]> writes:

Ted> On Tue, Apr 12, 2011 at 01:25:08PM -0400, Mark Lord wrote:
>> Ted et al.
>>
>> I've only just noticed this, so I have no idea how long it has been this way.
>>
>> When I build a kernel with CONFIG_EXT4_USE_FOR_EXT23=y and boot from it,
>> the ext4 root filesystem shows up as "ext2" mode, rather than "ext4".
>>
>> This looks very wrong to me, and quite dangerous.

Ted> It's a cosemtic bug, I agree, but I'm not sure why you consider it
Ted> dangerous.

Ted> CONFIG_EXT4_USE_FOR_EXT23 means that ext4 registers itself as
Ted> ext2 and/or ext3, if ext2 and/or ext3 are not configured into the
Ted> kernel. Since the kernel tries to mount the file system as ext2,
Ted> ext3, and then ext4, and uses whichever one works first.

Should it instead be: CONFIG_EXT4_MASQUERADE_AS_EXT23 instead, so
it's blindingly obvious what's going on here.

John

2011-04-13 18:17:17

by Ric Wheeler

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On 04/13/2011 12:45 PM, John Stoffel wrote:
>>>>>> "Ted" == Ted Ts'o<[email protected]> writes:
> Ted> On Tue, Apr 12, 2011 at 01:25:08PM -0400, Mark Lord wrote:
>>> Ted et al.
>>>
>>> I've only just noticed this, so I have no idea how long it has been this way.
>>>
>>> When I build a kernel with CONFIG_EXT4_USE_FOR_EXT23=y and boot from it,
>>> the ext4 root filesystem shows up as "ext2" mode, rather than "ext4".
>>>
>>> This looks very wrong to me, and quite dangerous.
> Ted> It's a cosemtic bug, I agree, but I'm not sure why you consider it
> Ted> dangerous.
>
> Ted> CONFIG_EXT4_USE_FOR_EXT23 means that ext4 registers itself as
> Ted> ext2 and/or ext3, if ext2 and/or ext3 are not configured into the
> Ted> kernel. Since the kernel tries to mount the file system as ext2,
> Ted> ext3, and then ext4, and uses whichever one works first.
>
> Should it instead be: CONFIG_EXT4_MASQUERADE_AS_EXT23 instead, so
> it's blindingly obvious what's going on here.
>
> John

No - the idea is to have use eliminate duplicate code and still be able to run
ext2 and ext3 in the same way.

How close we come to that goal will probably vary between ext2 and ext3 now but
it would be great to get that done eventually.

We did discuss this at LSF last week and we had some enthusiasm for keeping ext2
code around forever more or less as "the simple" file system template :)

Ric

2011-04-13 21:00:45

by Theodore Ts'o

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4


On Apr 13, 2011, at 10:10 AM, Mark Lord wrote:

>
> It is dangerous if it really has mounted an ext4 as ext2;
> the on-disk formats are not 100% compatible.
>
> I don't know what it is doing, so it looks quite dangerous.
> Perhaps it really did mount as ext4 though, in which case not.


As the config option name: CONFIG_EXT4_USE_FOR_EXT23
might mave suggested to you, what this means is that we are
using the ext4 file system for ext2/3 file systems. So yes, we are
using the ext4 file system driver when the ext2 file system is
requested. Since the kernel tries mounting the "ext2" file system
first, since we are using the ext4 file system driver, it succeeds,
and since it was mounted when the requested name was "ext2",
that is what is recorded in the mount table.

It's not dangerous at all. Again, as you can tell if you were to actually
look at your .config carefully, the ext2 file system was not compiled
into your kernel at all. Ext4 will only try masquerading as ext2 if
CONFIG_EXT2_FS is disabled. So it couldn't have possibly been
using ext2 code, if you thought about it for a second.

I can write up a patch which explicitly tests for feature flags that go
beyond ext2 as of a particular version, and if so, refuse the mount
when ext4 is masquerading as ext2, and do the same for ext3. I
probably will do this to avoid user questions, when I have some
spare time.

However, please note that this will have no actual effect on how
anything int he kernel will behave --- none ---- except for a one
character change in /proc/mounts: s/2/4/. (This is because the
kernel will now try ext2, and fail, try ext3 and fail, and then succeed
when the exact same file system driver is used. Oh, it might also
extend the boot time by a few milliseconds.)

Have I made it clear enough now to assuage your fears?

-- Ted

2011-04-13 22:30:14

by Joel Becker

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On Wed, Apr 13, 2011 at 05:00:34PM -0400, Theodore Tso wrote:
> I can write up a patch which explicitly tests for feature flags that go
> beyond ext2 as of a particular version, and if so, refuse the mount
> when ext4 is masquerading as ext2, and do the same for ext3. I
> probably will do this to avoid user questions, when I have some
> spare time.

This is the correct behavior. Few people understand the
filesystem type test ordering, and fewer (these days) modify their own
.config. They expect that the name in /proc/mounts reflects the format
on the platter. If we say 'ext2', they think it's a non-journaled FFS.
Errors in the other direction are less confusing. If you wanted
a quick hack, you could just have ext4 always fail ext2/3 mounts and
report itself as ext4 no matter what the physical disk looks like.
People would understand that 'ext4' in /proc/mounts means that the ext4
driver has mounted an extN filesystem much faster than they would
understand that 'ext2' means the ext4 driver has mounted an ext4
filesystem but with the scanning name of ext2.

Joel

--

"If at first you don't succeed, cover all traces that you tried."
-Unknown

http://www.jlbec.org/
[email protected]

2011-04-14 01:34:23

by Mark Lord

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4

On 11-04-13 05:00 PM, Theodore Tso wrote:
>
> It's not dangerous at all. Again, as you can tell if you were to actually
> look at your .config carefully, the ext2 file system was not compiled
> into your kernel at all. Ext4 will only try masquerading as ext2 if
> CONFIG_EXT2_FS is disabled. So it couldn't have possibly been
> using ext2 code, if you thought about it for a second.
>
> I can write up a patch which explicitly tests for feature flags that go
> beyond ext2 as of a particular version, and if so, refuse the mount
> when ext4 is masquerading as ext2, and do the same for ext3. I
> probably will do this to avoid user questions, when I have some
> spare time.
>
> However, please note that this will have no actual effect on how
> anything int he kernel will behave --- none ---- except for a one
> character change in /proc/mounts: s/2/4/. (This is because the
> kernel will now try ext2, and fail, try ext3 and fail, and then succeed
> when the exact same file system driver is used. Oh, it might also
> extend the boot time by a few milliseconds.)
>
> Have I made it clear enough now to assuage your fears?

No, very much the contrary, despite the condescending
attempt at "explanation".

config EXT4_USE_FOR_EXT23
bool "Use ext4 for ext2/ext3 file systems"
depends on EXT4_FS
depends on EXT3_FS=n || EXT2_FS=n
default y
help
Allow the ext4 file system driver code to be used for ext2 or
ext3 file system mounts. This allows users to reduce their
compiled kernel size by using one file system driver for
ext2, ext3, and ext4 file systems.

So the option is supposed to allow ext4 code to be used
to access ext2 and ext3 filesystems, saving some space
in the binary kernel. That much is obvious.

But far far less obvious is why the kernel reports
having mounted my ext4 filesystem as "ext2",
and what exactly it means by that.

And I've been a kernel developer as long as you have, Ted,
so just imagine the confusion this might be causing lots
of other less experienced people.

Thanks.

2011-04-14 12:47:22

by Theodore Ts'o

[permalink] [raw]
Subject: Re: CONFIG_EXT4_USE_FOR_EXT23: rootfs shows as ext2 instead of ext4


On Apr 13, 2011, at 9:34 PM, Mark Lord wrote:

> But far far less obvious is why the kernel reports
> having mounted my ext4 filesystem as "ext2",
> and what exactly it means by that.

Because the kernel reports the type that it successfully matched against when the mount succeeded, and in order for ext4 to take over ext2, it calls register_filesystem() as ext2. Since the kernel (a) tries mounting the root file system as ext2 (it tries blindly until it succeeds), (b) when ext4 is taking care of ext2 mounts, an filesystem with ext4 features can be mounted by the ext4 file system driver, (c) the base VFS layer controls what types is listed, and since it succeeded when it tried mounting as ext2, it reports it as ext2.

In other words, what the kernel reports is not what file system driver is currently servicing your block device, but the name of the fs type first succeeded when the root file system mount did a brute-force search. (BTW, some distributions may use an initrd that uses a more sophisticated userspace fs probing logic that will correctly determine that the file system is really an ext4 file system, and explicitly request an ext4 mount, and then the kernel will report that it was mounted with an ext4 fs type.) The brute-force search nature of the kernel can potentially get you in trouble, since if a block device was originally formatted as a FAT file system, and then later reformatted as some other file system, such as btrfs, ext4, xfs, etc., in theory, depending on the order the kernel tries its brute force search, it could conceivably get things wrong. This is a long-known problem, and we mostly work around it by making the mkfs programs zero out enough data blocks that it's unlikely that a reused block device might get mistaken as something else.

I repeat; this is a cosmetic issue _only_. I suspect most users won't be looking that closely at what the kernel outputs, but I agree it's something that we should fix. OK?

-- Ted

2011-04-14 15:41:25

by Milton Miller

[permalink] [raw]
Subject: [PATCH] fs: synchronize_rcu when unregister_filesystem success not failure

While checking unregister_filesystem for saftey vs extra calls for
"ext4: register ext2 and ext3 alias after ext4" I realized that
the synchronize_rcu() was called on the error path but not on
the success path.

Should we call it in both?
Cc: stable (2.6.38)
Signed-off-by: Milton Miller <[email protected]>

Index: work.git/fs/filesystems.c
===================================================================
--- work.git.orig/fs/filesystems.c 2011-04-14 10:06:44.360068116 -0500
+++ work.git/fs/filesystems.c 2011-04-14 10:08:41.880061794 -0500
@@ -110,14 +110,13 @@ int unregister_filesystem(struct file_sy
*tmp = fs->next;
fs->next = NULL;
write_unlock(&file_systems_lock);
+ synchronize_rcu();
return 0;
}
tmp = &(*tmp)->next;
}
write_unlock(&file_systems_lock);

- synchronize_rcu();
-
return -EINVAL;
}

2011-04-14 15:41:33

by Milton Miller

[permalink] [raw]
Subject: [PATCH] ext4: register ext2 and ext3 alias after ext4

The intent of CONFIG_EXT4_USE_FOR_EXT23 is to provide an alias so
that the ext4 module is sufficient even if user space calls out ext2.
However, it currently registers the alias names before registering ext4.
This means they show up first in /proc/filesystems and ext3 and ext4
formatted file systems show up in the mount list as ext4.

Register the filesystem aliases after ext4 so they show as ext4 until the
filesystems is filtered. This will avoid users or tools configuring a
kernel with just ext2 and wondering why there system broke. Also register
in the same order as listed in fs/Makefile.

Signed-off-by: Milton Miller <[email protected]>
---

Untested but simple code movement. Mark will you please test?

Is it safe to initialize ext4_li_info and ext4_li_mtx after we register
the filesystem? What keeps the init after the first call to mount
a filesystem?

I checked unregister_filesystem and it is safe when the file system
type is and is not registered.

Index: work.git/fs/ext4/super.c
===================================================================
--- work.git.orig/fs/ext4/super.c 2011-04-14 09:26:43.250066794 -0500
+++ work.git/fs/ext4/super.c 2011-04-14 09:29:53.220061521 -0500
@@ -4898,11 +4898,11 @@ static int __init ext4_init_fs(void)
err = init_inodecache();
if (err)
goto out1;
- register_as_ext2();
- register_as_ext3();
err = register_filesystem(&ext4_fs_type);
if (err)
goto out;
+ register_as_ext3();
+ register_as_ext2();

ext4_li_info = NULL;
mutex_init(&ext4_li_mtx);

2011-04-14 15:53:34

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] fs: synchronize_rcu when unregister_filesystem success not failure

On Thu, Apr 14, 2011 at 8:41 AM, Milton Miller <[email protected]> wrote:
>
> While checking unregister_filesystem for saftey vs extra calls for
> "ext4: register ext2 and ext3 alias after ext4" I realized that
> the synchronize_rcu() was called on the error path but not on
> the success path.

Good catch.

I think this is the bug that then caused us to do commit d863b50ab013
("vfs: call rcu_barrier after ->kill_sb()")

That said, that commit says that "synchronize_rcu()" isn't enough, and
uses rcu_barrier().

Which _should_ mean that there are no actual users that care about RCU
events by the time you actually hit "unregister_filesystem()".

So I think your patch is correct, but won't actually matter. But maybe
I'm missing something.

> Should we call it in both?

No, I think the success path is the one that would matter.

Comments?

Linus

2011-04-14 17:04:49

by Marco Stornelli

[permalink] [raw]
Subject: Re: [PATCH] fs: synchronize_rcu when unregister_filesystem success not failure

Il 14/04/2011 17:52, Linus Torvalds ha scritto:
> On Thu, Apr 14, 2011 at 8:41 AM, Milton Miller<[email protected]> wrote:
>>
>> While checking unregister_filesystem for saftey vs extra calls for
>> "ext4: register ext2 and ext3 alias after ext4" I realized that
>> the synchronize_rcu() was called on the error path but not on
>> the success path.
>
> Good catch.
>
> I think this is the bug that then caused us to do commit d863b50ab013
> ("vfs: call rcu_barrier after ->kill_sb()")
>
> That said, that commit says that "synchronize_rcu()" isn't enough, and
> uses rcu_barrier().
>
> Which _should_ mean that there are no actual users that care about RCU
> events by the time you actually hit "unregister_filesystem()".
>
> So I think your patch is correct, but won't actually matter. But maybe
> I'm missing something.
>
>> Should we call it in both?
>
> No, I think the success path is the one that would matter.
>
> Comments?
>

If I well remember the rcu_barrier was needed for the fs module
unloading problem. In that case synchronize_rcu() wasn't enough. That
said, I agree with you, it won't have any impact.

Marco

2011-04-15 00:49:42

by Mark Lord

[permalink] [raw]
Subject: Re: [PATCH] fs: synchronize_rcu when unregister_filesystem success not failure

On 11-04-14 11:52 AM, Linus Torvalds wrote:
> On Thu, Apr 14, 2011 at 8:41 AM, Milton Miller <[email protected]> wrote:
>>
>> While checking unregister_filesystem for saftey vs extra calls for
>> "ext4: register ext2 and ext3 alias after ext4" I realized that
>> the synchronize_rcu() was called on the error path but not on
>> the success path.
>
> Good catch.
>
> I think this is the bug that then caused us to do commit d863b50ab013
> ("vfs: call rcu_barrier after ->kill_sb()")
>
> That said, that commit says that "synchronize_rcu()" isn't enough, and
> uses rcu_barrier().
>
> Which _should_ mean that there are no actual users that care about RCU
> events by the time you actually hit "unregister_filesystem()".

Is that true of older kernels? (eg. 2.6.38 has the same bug)

IOW, is this a -stable candidate?

2011-04-15 01:07:36

by Mark Lord

[permalink] [raw]
Subject: Re: [PATCH] ext4: register ext2 and ext3 alias after ext4

On 11-04-14 11:41 AM, Milton Miller wrote:
> The intent of CONFIG_EXT4_USE_FOR_EXT23 is to provide an alias so
> that the ext4 module is sufficient even if user space calls out ext2.
> However, it currently registers the alias names before registering ext4.
> This means they show up first in /proc/filesystems and ext3 and ext4
> formatted file systems show up in the mount list as ext4.

Shouldn't that say:

This means they show up first in /proc/filesystems, causing ext3 and
ext4 formatted file systems to show up in the mount list as ext2.

Thanks. Other than that, it all looks good here with ext2 and ext4
file systems. I didn't try an ext3 filesystem, but with it being between
the other two, I expect it to also be fine.

> Register the filesystem aliases after ext4 so they show as ext4 until the
> filesystems is filtered. This will avoid users or tools configuring a
> kernel with just ext2 and wondering why there system broke. Also register
> in the same order as listed in fs/Makefile.
>
> Signed-off-by: Milton Miller <[email protected]>
> ---
>
> Untested but simple code movement. Mark will you please test?
>
> Is it safe to initialize ext4_li_info and ext4_li_mtx after we register
> the filesystem? What keeps the init after the first call to mount
> a filesystem?
>
> I checked unregister_filesystem and it is safe when the file system
> type is and is not registered.
>
> Index: work.git/fs/ext4/super.c
> ===================================================================
> --- work.git.orig/fs/ext4/super.c 2011-04-14 09:26:43.250066794 -0500
> +++ work.git/fs/ext4/super.c 2011-04-14 09:29:53.220061521 -0500
> @@ -4898,11 +4898,11 @@ static int __init ext4_init_fs(void)
> err = init_inodecache();
> if (err)
> goto out1;
> - register_as_ext2();
> - register_as_ext3();
> err = register_filesystem(&ext4_fs_type);
> if (err)
> goto out;
> + register_as_ext3();
> + register_as_ext2();
>
> ext4_li_info = NULL;
> mutex_init(&ext4_li_mtx);