2013-03-02 01:22:46

by Kees Cook

Subject: user ns: arbitrary module loading

The rearranging done for user ns has resulted in allowing arbitrary
kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
by what is assumed to be an unprivileged process.

At present, it does look to require at least CAP_SETUID along the way
to set up the uidmap (but things like the setuid helper newuidmap
might soon start providing such a thing by default).

It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
examines module symbols to verify that request_module() for a
filesystem only loads a module that defines "register_filesystem"
(among other things).

-Kees

[1] https://twitter.com/grsecurity/status/307473816672665600

--
Kees Cook
Chrome OS Security


2013-03-03 00:56:21

by Serge E. Hallyn

Subject: Re: user ns: arbitrary module loading

Quoting Kees Cook ([email protected]):
> The rearranging done for user ns has resulted in allowing arbitrary
> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
> by what is assumed to be an unprivileged process.
>
> At present, it does look to require at least CAP_SETUID along the way
> to set up the uidmap (but things like the setuid helper newuidmap
> might soon start providing such a thing by default).
>
> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
> examines module symbols to verify that request_module() for a
> filesystem only loads a module that defines "register_filesystem"
> (among other things).
>
> -Kees
>
> [1] https://twitter.com/grsecurity/status/307473816672665600

So the concern is root in a child user namespace doing

mount -t randomfs <...>

in which case do_new_mount() checks ns_capable(), not capable(),
before trying to load a module for randomfs.

As well as (secondly) the fact that there is no enforcement on
the format of the module names (i.e. fs-*).

Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
At least Eric Paris is strongly against it. But how about if we
add a check for 'current_user_ns() == &init_user_ns' at that place
instead?

Eric Biederman, do you have any objections to that?

thanks,
-serge

2013-03-03 01:18:23

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Sat, Mar 2, 2013 at 4:57 PM, Serge E. Hallyn <[email protected]> wrote:
> Quoting Kees Cook ([email protected]):
>> The rearranging done for user ns has resulted in allowing arbitrary
>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>> by what is assumed to be an unprivileged process.
>>
>> At present, it does look to require at least CAP_SETUID along the way
>> to set up the uidmap (but things like the setuid helper newuidmap
>> might soon start providing such a thing by default).
>>
>> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
>> examines module symbols to verify that request_module() for a
>> filesystem only loads a module that defines "register_filesystem"
>> (among other things).
>>
>> -Kees
>>
>> [1] https://twitter.com/grsecurity/status/307473816672665600
>
> So the concern is root in a child user namespace doing
>
> mount -t randomfs <...>
>
> in which case do_new_mount() checks ns_capable(), not capable(),
> before trying to load a module for randomfs.

Well, not just randomfs. Any module that modprobe in the init ns can find.

> As well as (secondly) the fact that there is no enforcement on
> the format of the module names (i.e. fs-*).
>
> Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
> At least Eric Paris is strongly against it.

I'd be curious to hear the objections. It seems pretty nice to me to
add a new argument to every request_module() that specifies the
"subsystem" it expects a module to load from. Maybe pass
"request_module=filesystem" or "...=netdev" to the modprobe call. And
then in init_module(), check the userargs for which subsystem was
requested and look up in a table for the entry point module symbol for
that subsystem to require. e.g. for "request_module=filesystem",
require that the module contains the "register_filesystem" symbol,
etc.
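
The table lookup described above can be sketched in user-space C as follows. This is only an illustration of the idea, not grsecurity's code or any kernel API: the struct, the table, and both function names are hypothetical, and the pairing of "netdev" with register_netdev is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: the symbol a module must reference to be loaded
 * for a request tagged with the given subsystem. */
struct subsys_rule {
	const char *subsystem;
	const char *required_symbol;
};

static const struct subsys_rule subsys_rules[] = {
	{ "filesystem", "register_filesystem" },
	{ "netdev",     "register_netdev" },	/* assumed pairing */
};

/* Return the symbol required for a subsystem, or NULL if it is unknown. */
static const char *required_symbol_for(const char *subsystem)
{
	size_t i;

	for (i = 0; i < sizeof(subsys_rules) / sizeof(subsys_rules[0]); i++)
		if (strcmp(subsys_rules[i].subsystem, subsystem) == 0)
			return subsys_rules[i].required_symbol;
	return NULL;
}

/* Allow the load only if the module's symbol list contains the symbol
 * required for the requested subsystem. Unknown subsystems are refused. */
static bool module_allowed(const char *subsystem,
			   const char *const *symbols, size_t nsyms)
{
	const char *need = required_symbol_for(subsystem);
	size_t i;

	if (!need)
		return false;
	for (i = 0; i < nsyms; i++)
		if (strcmp(symbols[i], need) == 0)
			return true;
	return false;
}
```

With this shape, a request tagged "filesystem" for a module whose symbols include only register_netdev would be refused, which is the property the proposal is after.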

> But how about if we
> add a check for 'current_user_ns() == &init_user_ns' at that place
> instead?

Well, we'd need to mostly revert
57eccb830f1cc93d4b506ba306d8dfa685e0c88f ("mount: consolidate
permission checks") since get_fs_type() is being called before
may_mount() right now. (And then, as you suggest, we should strengthen
the test.) I think this will require either more plumbing into
get_fs_type (something like "bool load_module_if_missing") or the
subsystem verification stuff in request_module. I think the latter is
MUCH nicer as it covers this problem in all places, not just this
"mount" case.

> Eric Biederman, do you have any objections to that?
>
> thanks,
> -serge

-Kees

--
Kees Cook
Chrome OS Security

2013-03-03 03:55:29

by Serge E. Hallyn

Subject: Re: user ns: arbitrary module loading

Quoting Kees Cook ([email protected]):
> On Sat, Mar 2, 2013 at 4:57 PM, Serge E. Hallyn <[email protected]> wrote:
> > Quoting Kees Cook ([email protected]):
> >> The rearranging done for user ns has resulted in allowing arbitrary
> >> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
> >> by what is assumed to be an unprivileged process.
> >>
> >> At present, it does look to require at least CAP_SETUID along the way
> >> to set up the uidmap (but things like the setuid helper newuidmap
> >> might soon start providing such a thing by default).
> >>
> >> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
> >> examines module symbols to verify that request_module() for a
> >> filesystem only loads a module that defines "register_filesystem"
> >> (among other things).
> >>
> >> -Kees
> >>
> >> [1] https://twitter.com/grsecurity/status/307473816672665600
> >
> > So the concern is root in a child user namespace doing
> >
> > mount -t randomfs <...>
> >
> > in which case do_new_mount() checks ns_capable(), not capable(),
> > before trying to load a module for randomfs.
>
> Well, not just randomfs. Any module that modprobe in the init ns can find.

right

> > As well as (secondly) the fact that there is no enforcement on
> > the format of the module names (i.e. fs-*).
> >
> > Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
> > At least Eric Paris is strongly against it.
>
> I'd be curious to hear the objections. It seems pretty nice to me to

Wait, sorry, I mis-spoke. The objection would have been to requiring
CAP_SYS_MODULE, which is different. Sorry!

> add a new argument to every request_module() that specifies the
> "subsystem" it expects a module to load from. Maybe pass
> "request_module=filesystem" or "...=netdev" to the modprobe call. And

That would be useful for adding to the separation of privileges,
i.e. helping contain the leaking of posix caps. It sounds good to
me.

> then in init_module(), check the userargs for which subsystem was
> requested and look up in a table for the entry point module symbol for
> that subsystem to require. e.g. for "request_module=filesystem",
> require that the module contains the "register_filesystem" symbol,
> etc.
>
> > But how about if we
> > add a check for 'current_user_ns() == &init_user_ns' at that place
> > instead?
>
> Well, we'd need to mostly revert
> 57eccb830f1cc93d4b506ba306d8dfa685e0c88f ("mount: consolidate
> permission checks") since get_fs_type() is being called before
> may_mount() right now. (And then, as you suggest, we should strengthen
> the test.) I think this will require either more plumbing into
> get_fs_type (something like "bool load_module_if_missing") or the
> subsystem verification stuff in request_module. I think the latter is
> MUCH nicer as it covers this problem in all places, not just this
> "mount" case.

My first instinct was to say I'd like to have the kernel 100% belonging
to the init_user_ns, with child user namespaces having zero ability to
induce loading of any kernel modules, period. So a check for current
being in init_user_ns at request_module itself.

However (thinking more) that seems maybe wrong. You don't need privs to
induce the loading of a new binfmt module right? The host's
/lib/modules and module blacklists should be set up right by the admin
(or distro)... If we require that the host admin manually modprobe
every module which a task in a child user namespace might need, that
goes counter to the goal of kernel modules.

> > Eric Biederman, do you have any objections to that?

-serge

2013-03-03 04:12:35

by Eric W. Biederman

Subject: Re: user ns: arbitrary module loading

"Serge E. Hallyn" <[email protected]> writes:

> Quoting Kees Cook ([email protected]):
>> The rearranging done for user ns has resulted in allowing arbitrary
>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>> by what is assumed to be an unprivileged process.
>>
>> At present, it does look to require at least CAP_SETUID along the way
>> to set up the uidmap (but things like the setuid helper newuidmap
>> might soon start providing such a thing by default).

CAP_SETUID is not needed.

>> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
>> examines module symbols to verify that request_module() for a
>> filesystem only loads a module that defines "register_filesystem"
>> (among other things).
>>
>> -Kees
>>
>> [1] https://twitter.com/grsecurity/status/307473816672665600
>
> So the concern is root in a child user namespace doing
>
> mount -t randomfs <...>
>
> in which case do_new_mount() checks ns_capable(), not capable(),
> before trying to load a module for randomfs.
>
> As well as (secondly) the fact that there is no enforcement on
> the format of the module names (i.e. fs-*).
>
> Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
> At least Eric Paris is strongly against it.

What is wrong with GRKERNSEC_MODHARDEN? I took a quick look and the
code is far from clean. But that would not be a fundamental reason for
keeping code like that out of the kernel.

It is also entertaining to read security code that won't even build with
CONFIG_UIDGID_STRICT_TYPE_CHECKS enabled.

> But how about if we
> add a check for 'current_user_ns() == &init_user_ns' at that place
> instead?
>
> Eric Biederman, do you have any objections to that?

The obvious solution here is to test for CAP_SYS_ADMIN rather than
current_user_ns() == &init_user_ns before we request the module here, as
that is what was previously required on this path.

Reading the comments, the concerns are:
- Non-root users are allowed to load obscure and possibly buggy kernel
modules.
- get_fs_type can trigger the load of any kernel module.

At a practical level I don't see adding a capable(CAP_SYS_ADMIN) check
as having much effect for the functionality currently present in user
namespaces today, as the filesystems that are legal to mount in a user
namespace (ramfs, tmpfs, mqueuefs, sysfs, proc, devpts) are so common
that most of them cannot even be built as modules, and even if they are
modules, the modules will already be loaded. So I will see about adding
a capable(CAP_SYS_ADMIN) check to shore things up for the short term.
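
To make the difference between the two checks concrete, here is a deliberately simplified user-space model. It is a sketch under stated assumptions, not the kernel's implementation: the real capable() and ns_capable() act on the current task and take no task argument, and the real rules also grant capabilities to privileged tasks in ancestor namespaces, which this model leaves out. The struct layouts are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define CAP_SYS_ADMIN 21	/* value from linux/capability.h */

/* Hypothetical, minimal stand-ins for the kernel structures. */
struct user_ns {
	const struct user_ns *parent;
};

struct task {
	const struct user_ns *ns;	/* namespace the task lives in */
	unsigned long long caps;	/* capability bits held in that namespace */
};

static const struct user_ns init_user_ns = { NULL };

/* Simplified rule: a capability only counts in the task's own namespace. */
static bool ns_capable(const struct task *t, const struct user_ns *target,
		       int cap)
{
	return t->ns == target && (t->caps & (1ULL << cap)) != 0;
}

/* capable() is the same test pinned to the initial user namespace. */
static bool capable(const struct task *t, int cap)
{
	return ns_capable(t, &init_user_ns, cap);
}
```

In this model, root inside a child user namespace passes ns_capable() against its own namespace but fails capable(), so gating request_module() on capable(CAP_SYS_ADMIN) keeps such a task from triggering a module load.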

In the longer term I very much would like to get loopback devices
and mounts of filesystems on those loopback devices working, and being
able to mount filesystems from usb sticks that people commonly plug in,
and remove the need for privileged daemons to do that work. At that
point manually having to do something that was automatic before will
either mean a regression in functionality or bugs as people manually
load things.


So I am wondering what a good policy should be. Should we trust
kernel modules to not be buggy (especially if they were signed as part
of the build process)? Do we add some defense in depth and add
filesystem registration magic? Thinking...

We can limit the request_module in get_fs_type to just filesystems
fairly easily.

In include/linux/fs.h:

#define MODULE_ALIAS_FS(type) MODULE_ALIAS("fs-" __stringify(type))

In fs/filesystems.c:

if (request_module("fs-%.*s", len, name) == 0)

Then just add the appropriate MODULE_ALIAS_FS lines in all of the
filesystems. This also allows user space to set the module loading
policy for filesystems using the blacklist and alias keywords
in /etc/modprobe.d/*.conf.

That seems a whole lot simpler, more powerful and more maintainable than
what little I saw in GRKERNSEC_MODHARDEN to prevent loading of
non-filesystem modules from get_fs_type.
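
As a small user-space illustration of the name that would be handed to modprobe under this scheme, the sketch below reuses the length computation from get_fs_type() (only the text before the first '.' is used) and the "fs-%.*s" format. The helper name fs_module_alias and its return convention are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Build the module alias requested for a filesystem type name.
 * Only the part of the name before the first '.' is used, as in
 * get_fs_type(). Returns 0 on success, -1 if `out` is too small. */
static int fs_module_alias(const char *name, char *out, size_t outlen)
{
	const char *dot = strchr(name, '.');
	int len = dot ? (int)(dot - name) : (int)strlen(name);
	int n = snprintf(out, outlen, "fs-%.*s", len, name);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under these assumptions a mount of type "fuse.sshfs" asks for the alias "fs-fuse", and a type name such as "ext4" asks for "fs-ext4", so a module that carries no fs- alias cannot be reached through this path.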

Eric

p.s. This is the patch I am looking at pushing to Linus in the near
future.

diff --git a/fs/filesystems.c b/fs/filesystems.c
index da165f6..5b0644d 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -273,7 +273,8 @@ struct file_system_type *get_fs_type(const char *name)
int len = dot ? dot - name : strlen(name);

fs = __get_fs_type(name, len);
- if (!fs && (request_module("%.*s", len, name) == 0))
+ if (!fs && capable(CAP_SYS_ADMIN) &&
+ (request_module("%.*s", len, name) == 0))
fs = __get_fs_type(name, len);

if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {


2013-03-03 10:15:06

by Eric W. Biederman

Subject: [RFC][PATCH] fs: Limit sys_mount to only loading filesystem modules.


Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.

A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.

Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems, trivially
making things safer with no real cost.

Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives, allowing simple, safe,
well understood work-arounds to known problematic software.

This also addresses a rare but unfortunate problem where the filesystem
name is not the same as its module name and module auto-loading
would not work. While writing this patch I saw a handful of such
cases, the most significant being autofs, which lives in the module
autofs4.

Signed-off-by: "Eric W. Biederman" <[email protected]>
---
arch/ia64/kernel/perfmon.c | 1 +
arch/powerpc/platforms/cell/spufs/inode.c | 1 +
arch/s390/hypfs/inode.c | 1 +
drivers/firmware/efivars.c | 1 +
drivers/infiniband/hw/ipath/ipath_fs.c | 1 +
drivers/infiniband/hw/qib/qib_fs.c | 1 +
drivers/misc/ibmasm/ibmasmfs.c | 1 +
drivers/mtd/mtdchar.c | 1 +
drivers/oprofile/oprofilefs.c | 1 +
drivers/staging/ccg/f_fs.c | 1 +
drivers/usb/gadget/f_fs.c | 1 +
drivers/usb/gadget/inode.c | 1 +
drivers/xen/xenfs/super.c | 1 +
fs/9p/vfs_super.c | 1 +
fs/adfs/super.c | 1 +
fs/affs/super.c | 1 +
fs/afs/super.c | 1 +
fs/autofs4/init.c | 1 +
fs/befs/linuxvfs.c | 1 +
fs/bfs/inode.c | 1 +
fs/binfmt_misc.c | 1 +
fs/btrfs/super.c | 1 +
fs/ceph/super.c | 1 +
fs/coda/inode.c | 1 +
fs/configfs/mount.c | 1 +
fs/cramfs/inode.c | 1 +
fs/debugfs/inode.c | 1 +
fs/devpts/inode.c | 1 +
fs/ecryptfs/main.c | 1 +
fs/efs/super.c | 1 +
fs/exofs/super.c | 1 +
fs/ext2/super.c | 1 +
fs/ext3/super.c | 1 +
fs/ext4/super.c | 5 +++--
fs/f2fs/super.c | 1 +
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
fs/filesystems.c | 2 +-
fs/freevxfs/vxfs_super.c | 2 +-
fs/fuse/control.c | 1 +
fs/fuse/inode.c | 2 ++
fs/gfs2/ops_fstype.c | 4 +++-
fs/hfs/super.c | 1 +
fs/hfsplus/super.c | 1 +
fs/hppfs/hppfs.c | 1 +
fs/hugetlbfs/inode.c | 1 +
fs/isofs/inode.c | 3 +--
fs/jffs2/super.c | 1 +
fs/jfs/super.c | 1 +
fs/logfs/super.c | 1 +
fs/minix/inode.c | 1 +
fs/ncpfs/inode.c | 1 +
fs/nfs/super.c | 3 ++-
fs/nfsd/nfsctl.c | 1 +
fs/nilfs2/super.c | 1 +
fs/ntfs/super.c | 1 +
fs/ocfs2/dlmfs/dlmfs.c | 1 +
fs/omfs/inode.c | 1 +
fs/openpromfs/inode.c | 1 +
fs/qnx4/inode.c | 1 +
fs/qnx6/inode.c | 1 +
fs/reiserfs/super.c | 1 +
fs/romfs/super.c | 1 +
fs/sysv/super.c | 3 ++-
fs/ubifs/super.c | 1 +
fs/ufs/super.c | 1 +
fs/xfs/xfs_super.c | 1 +
include/linux/fs.h | 2 ++
net/sunrpc/rpc_pipe.c | 4 +---
69 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index ea39eba..db6e866 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -619,6 +619,7 @@ static struct file_system_type pfm_fs_type = {
.mount = pfmfs_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("pfmfs");

DEFINE_PER_CPU(unsigned long, pfm_syst_info);
DEFINE_PER_CPU(struct task_struct *, pmu_owner);
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index dba1ce2..db6d080 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -771,6 +771,7 @@ static struct file_system_type spufs_type = {
.mount = spufs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("spufs");

static int __init spufs_init(void)
{
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index 06ea69b..50d9bde 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -458,6 +458,7 @@ static struct file_system_type hypfs_type = {
.mount = hypfs_mount,
.kill_sb = hypfs_kill_super
};
+MODULE_ALIAS_FS("s390_hypfs");

static const struct super_operations hypfs_s_ops = {
.statfs = simple_statfs,
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index fed08b6..6881e2e 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -1116,6 +1116,7 @@ static struct file_system_type efivarfs_type = {
.mount = efivarfs_mount,
.kill_sb = efivarfs_kill_sb,
};
+MODULE_ALIAS_FS("efivarfs");

static const struct inode_operations efivarfs_dir_inode_operations = {
.lookup = simple_lookup,
diff --git a/drivers/infiniband/hw/ipath/ipath_fs.c b/drivers/infiniband/hw/ipath/ipath_fs.c
index a4de9d5..2c082cd 100644
--- a/drivers/infiniband/hw/ipath/ipath_fs.c
+++ b/drivers/infiniband/hw/ipath/ipath_fs.c
@@ -410,6 +410,7 @@ static struct file_system_type ipathfs_fs_type = {
.mount = ipathfs_mount,
.kill_sb = ipathfs_kill_super,
};
+MODULE_ALIAS_FS("ipathfs");

int __init ipath_init_ipathfs(void)
{
diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c
index 65a2a23..9566ceb 100644
--- a/drivers/infiniband/hw/qib/qib_fs.c
+++ b/drivers/infiniband/hw/qib/qib_fs.c
@@ -604,6 +604,7 @@ static struct file_system_type qibfs_fs_type = {
.mount = qibfs_mount,
.kill_sb = qibfs_kill_super,
};
+MODULE_ALIAS_FS("ipathfs");

int __init qib_init_qibfs(void)
{
diff --git a/drivers/misc/ibmasm/ibmasmfs.c b/drivers/misc/ibmasm/ibmasmfs.c
index 6673e57..ce5b756 100644
--- a/drivers/misc/ibmasm/ibmasmfs.c
+++ b/drivers/misc/ibmasm/ibmasmfs.c
@@ -110,6 +110,7 @@ static struct file_system_type ibmasmfs_type = {
.mount = ibmasmfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("ibmasmfs");

static int ibmasmfs_fill_super (struct super_block *sb, void *data, int silent)
{
diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index 82c0616..92ab30a 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -1238,6 +1238,7 @@ static struct file_system_type mtd_inodefs_type = {
.mount = mtd_inodefs_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("mtd_inodefs");

static int __init init_mtdchar(void)
{
diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c
index 849357c..55e3646 100644
--- a/drivers/oprofile/oprofilefs.c
+++ b/drivers/oprofile/oprofilefs.c
@@ -266,6 +266,7 @@ static struct file_system_type oprofilefs_type = {
.mount = oprofilefs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("oprofilefs");


int __init oprofilefs_register(void)
diff --git a/drivers/staging/ccg/f_fs.c b/drivers/staging/ccg/f_fs.c
index 8adc79d..f6373da 100644
--- a/drivers/staging/ccg/f_fs.c
+++ b/drivers/staging/ccg/f_fs.c
@@ -1223,6 +1223,7 @@ static struct file_system_type ffs_fs_type = {
.mount = ffs_fs_mount,
.kill_sb = ffs_fs_kill_sb,
};
+MODULE_ALIAS_FS("functionfs");


/* Driver's main init/cleanup functions *************************************/
diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
index 38388d7..c377ff8 100644
--- a/drivers/usb/gadget/f_fs.c
+++ b/drivers/usb/gadget/f_fs.c
@@ -1235,6 +1235,7 @@ static struct file_system_type ffs_fs_type = {
.mount = ffs_fs_mount,
.kill_sb = ffs_fs_kill_sb,
};
+MODULE_ALIAS_FS("functionfs");


/* Driver's main init/cleanup functions *************************************/
diff --git a/drivers/usb/gadget/inode.c b/drivers/usb/gadget/inode.c
index 8ac840f..e2b2e9c 100644
--- a/drivers/usb/gadget/inode.c
+++ b/drivers/usb/gadget/inode.c
@@ -2105,6 +2105,7 @@ static struct file_system_type gadgetfs_type = {
.mount = gadgetfs_mount,
.kill_sb = gadgetfs_kill_sb,
};
+MODULE_ALIAS_FS("gadgetfs");

/*----------------------------------------------------------------------*/

diff --git a/drivers/xen/xenfs/super.c b/drivers/xen/xenfs/super.c
index 459b9ac..891d83f 100644
--- a/drivers/xen/xenfs/super.c
+++ b/drivers/xen/xenfs/super.c
@@ -119,6 +119,7 @@ static struct file_system_type xenfs_type = {
.mount = xenfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("xenfs");

static int __init xenfs_init(void)
{
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 137d503..17368f4 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -365,3 +365,4 @@ struct file_system_type v9fs_fs_type = {
.owner = THIS_MODULE,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT,
};
+MODULE_ALIAS_FS("9p");
diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index d571229..0ff4bae 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -524,6 +524,7 @@ static struct file_system_type adfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("adfs");

static int __init init_adfs_fs(void)
{
diff --git a/fs/affs/super.c b/fs/affs/super.c
index b84dc73..45161a8 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -622,6 +622,7 @@ static struct file_system_type affs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("affs");

static int __init init_affs_fs(void)
{
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 7c31ec3..c486155 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -45,6 +45,7 @@ struct file_system_type afs_fs_type = {
.kill_sb = afs_kill_super,
.fs_flags = 0,
};
+MODULE_ALIAS_FS("afs");

static const struct super_operations afs_super_ops = {
.statfs = afs_statfs,
diff --git a/fs/autofs4/init.c b/fs/autofs4/init.c
index cddc74b..b3db517 100644
--- a/fs/autofs4/init.c
+++ b/fs/autofs4/init.c
@@ -26,6 +26,7 @@ static struct file_system_type autofs_fs_type = {
.mount = autofs_mount,
.kill_sb = autofs4_kill_sb,
};
+MODULE_ALIAS_FS("autofs");

static int __init init_autofs4_fs(void)
{
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 2b3bda8..8613785 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -951,6 +951,7 @@ static struct file_system_type befs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("befs");

static int __init
init_befs_fs(void)
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index 737aaa3..5e376bb 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -473,6 +473,7 @@ static struct file_system_type bfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("bfs");

static int __init init_bfs_fs(void)
{
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 0c8869f..65f91ec 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -720,6 +720,7 @@ static struct file_system_type bm_fs_type = {
.mount = bm_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("binfmt_misc");

static int __init init_misc_binfmt(void)
{
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d8982e9..fe51afd 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1518,6 +1518,7 @@ static struct file_system_type btrfs_fs_type = {
.kill_sb = btrfs_kill_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("btrfs");

/*
* used by btrfsctl to scan devices when no FS is mounted
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index e86aa994..0a25c04 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -947,6 +947,7 @@ static struct file_system_type ceph_fs_type = {
.kill_sb = ceph_kill_sb,
.fs_flags = FS_RENAME_DOES_D_MOVE,
};
+MODULE_ALIAS_FS("ceph");

#define _STRINGIFY(x) #x
#define STRINGIFY(x) _STRINGIFY(x)
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index cf674e9..5075f81 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -329,4 +329,5 @@ struct file_system_type coda_fs_type = {
.kill_sb = kill_anon_super,
.fs_flags = FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("coda");

diff --git a/fs/configfs/mount.c b/fs/configfs/mount.c
index aee0a7e..7f26c3c 100644
--- a/fs/configfs/mount.c
+++ b/fs/configfs/mount.c
@@ -114,6 +114,7 @@ static struct file_system_type configfs_fs_type = {
.mount = configfs_do_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("configfs");

struct dentry *configfs_pin_fs(void)
{
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index c6c3f91..3f79599 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -573,6 +573,7 @@ static struct file_system_type cramfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("cramfs");

static int __init init_cramfs_fs(void)
{
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 0c4f80b..4888cb3 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -299,6 +299,7 @@ static struct file_system_type debug_fs_type = {
.mount = debug_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("debugfs");

static struct dentry *__create_file(const char *name, umode_t mode,
struct dentry *parent, void *data,
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index f0f0faa..741626f 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -380,6 +380,7 @@ static struct file_system_type devpts_fs_type = {
.kill_sb = devpts_kill_sb,
.fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
};
+MODULE_ALIAS_FS("devpts");

struct vfsmount *devpts_mntget(struct file *filp)
{
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 4e0886c..e924cf4 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -629,6 +629,7 @@ static struct file_system_type ecryptfs_fs_type = {
.kill_sb = ecryptfs_kill_block_super,
.fs_flags = 0
};
+MODULE_ALIAS_FS("ecryptfs");

/**
* inode_info_init_once
diff --git a/fs/efs/super.c b/fs/efs/super.c
index 2002431..c6f57a7 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -33,6 +33,7 @@ static struct file_system_type efs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("efs");

static struct pt_types sgi_pt_types[] = {
{0x00, "SGI vh"},
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 5e59280..9d97633 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -1010,6 +1010,7 @@ static struct file_system_type exofs_type = {
.mount = exofs_mount,
.kill_sb = generic_shutdown_super,
};
+MODULE_ALIAS_FS("exofs");

static int __init init_exofs(void)
{
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index c25c56b..6438912 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -1542,6 +1542,7 @@ static struct file_system_type ext2_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV | FS_USERNS_MOUNT,
};
+MODULE_ALIAS_FS("ext2");

static int __init init_ext2_fs(void)
{
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 4ba2683..d59852d 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -3059,6 +3059,7 @@ static struct file_system_type ext3_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext3");

static int __init init_ext3_fs(void)
{
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3d4fb81..b3264ba 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -92,6 +92,7 @@ static struct file_system_type ext2_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext2");
#define IS_EXT2_SB(sb) ((sb)->s_bdev->bd_holder == &ext2_fs_type)
#else
#define IS_EXT2_SB(sb) (0)
@@ -106,6 +107,7 @@ static struct file_system_type ext3_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext3");
#define IS_EXT3_SB(sb) ((sb)->s_bdev->bd_holder == &ext3_fs_type)
#else
#define IS_EXT3_SB(sb) (0)
@@ -5194,7 +5196,6 @@ static inline int ext2_feature_set_ok(struct super_block *sb)
return 0;
return 1;
}
-MODULE_ALIAS("ext2");
#else
static inline void register_as_ext2(void) { }
static inline void unregister_as_ext2(void) { }
@@ -5227,7 +5228,6 @@ static inline int ext3_feature_set_ok(struct super_block *sb)
return 0;
return 1;
}
-MODULE_ALIAS("ext3");
#else
static inline void register_as_ext3(void) { }
static inline void unregister_as_ext3(void) { }
@@ -5241,6 +5241,7 @@ static struct file_system_type ext4_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext4");

static int __init ext4_init_feat_adverts(void)
{
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 37fad04..45ebdb6 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -639,6 +639,7 @@ static struct file_system_type f2fs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("f2fs");

static int __init init_inodecache(void)
{
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index e2cfda9..081b759 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -668,6 +668,7 @@ static struct file_system_type msdos_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("msdos");

static int __init init_msdos_fs(void)
{
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index ac959d6..2da9520 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -1073,6 +1073,7 @@ static struct file_system_type vfat_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("vfat");

static int __init init_vfat_fs(void)
{
diff --git a/fs/filesystems.c b/fs/filesystems.c
index 71f9be2..490f48c 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -274,7 +274,7 @@ struct file_system_type *get_fs_type(const char *name)

fs = __get_fs_type(name, len);
if (!fs && capable(CAP_SYS_ADMIN) &&
- (request_module("%.*s", len, name) == 0))
+ (request_module("fs-%.*s", len, name) == 0))
fs = __get_fs_type(name, len);

if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
index fed2c8a..4550743 100644
--- a/fs/freevxfs/vxfs_super.c
+++ b/fs/freevxfs/vxfs_super.c
@@ -52,7 +52,6 @@ MODULE_AUTHOR("Christoph Hellwig");
MODULE_DESCRIPTION("Veritas Filesystem (VxFS) driver");
MODULE_LICENSE("Dual BSD/GPL");

-MODULE_ALIAS("vxfs"); /* makes mount -t vxfs autoload the module */


static void vxfs_put_super(struct super_block *);
@@ -258,6 +257,7 @@ static struct file_system_type vxfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("vxfs"); /* makes mount -t vxfs autoload the module */

static int __init
vxfs_init(void)
diff --git a/fs/fuse/control.c b/fs/fuse/control.c
index 75a20c0..895cc91 100644
--- a/fs/fuse/control.c
+++ b/fs/fuse/control.c
@@ -341,6 +341,7 @@ static struct file_system_type fuse_ctl_fs_type = {
.mount = fuse_ctl_mount,
.kill_sb = fuse_ctl_kill_sb,
};
+MODULE_ALIAS_FS("fusectl");

int __init fuse_ctl_init(void)
{
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d022569..6ab1666 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1127,6 +1127,7 @@ static struct file_system_type fuse_fs_type = {
.mount = fuse_mount,
.kill_sb = fuse_kill_sb_anon,
};
+MODULE_ALIAS_FS("fuse");

#ifdef CONFIG_BLOCK
static struct dentry *fuse_mount_blk(struct file_system_type *fs_type,
@@ -1156,6 +1157,7 @@ static struct file_system_type fuseblk_fs_type = {
.kill_sb = fuse_kill_sb_blk,
.fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE,
};
+MODULE_ALIAS_FS("fuseblk");

static inline int register_fuseblk(void)
{
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 1b612be..60ede2a 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -20,6 +20,7 @@
#include <linux/gfs2_ondisk.h>
#include <linux/quotaops.h>
#include <linux/lockdep.h>
+#include <linux/module.h>

#include "gfs2.h"
#include "incore.h"
@@ -1425,6 +1426,7 @@ struct file_system_type gfs2_fs_type = {
.kill_sb = gfs2_kill_sb,
.owner = THIS_MODULE,
};
+MODULE_ALIAS_FS("gfs2");

struct file_system_type gfs2meta_fs_type = {
.name = "gfs2meta",
@@ -1432,4 +1434,4 @@ struct file_system_type gfs2meta_fs_type = {
.mount = gfs2_mount_meta,
.owner = THIS_MODULE,
};
-
+MODULE_ALIAS_FS("gfs2meta");
diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index e93ddaa..bbaaa8a 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -466,6 +466,7 @@ static struct file_system_type hfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("hfs");

static void hfs_init_once(void *p)
{
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index 796198d..d2e1718 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -618,6 +618,7 @@ static struct file_system_type hfsplus_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("hfsplus");

static void hfsplus_init_once(void *p)
{
diff --git a/fs/hppfs/hppfs.c b/fs/hppfs/hppfs.c
index 43b315f..3eefbcc 100644
--- a/fs/hppfs/hppfs.c
+++ b/fs/hppfs/hppfs.c
@@ -748,6 +748,7 @@ static struct file_system_type hppfs_type = {
.kill_sb = kill_anon_super,
.fs_flags = 0,
};
+MODULE_ALIAS_FS("hppfs");

static int __init init_hppfs(void)
{
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 78bde32..81d1cb4 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -896,6 +896,7 @@ static struct file_system_type hugetlbfs_fs_type = {
.mount = hugetlbfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("hugetlbfs");

static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE];

diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 67ce525..a67f16e 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1556,6 +1556,7 @@ static struct file_system_type iso9660_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("iso9660");

static int __init init_iso9660_fs(void)
{
@@ -1593,5 +1594,3 @@ static void __exit exit_iso9660_fs(void)
module_init(init_iso9660_fs)
module_exit(exit_iso9660_fs)
MODULE_LICENSE("GPL");
-/* Actual filesystem name is iso9660, as requested in filesystems.c */
-MODULE_ALIAS("iso9660");
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index d3d8799..0defb1c 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -356,6 +356,7 @@ static struct file_system_type jffs2_fs_type = {
.mount = jffs2_mount,
.kill_sb = jffs2_kill_sb,
};
+MODULE_ALIAS_FS("jffs2");

static int __init init_jffs2_fs(void)
{
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 060ba63..2003e83 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -833,6 +833,7 @@ static struct file_system_type jfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("jfs");

static void init_once(void *foo)
{
diff --git a/fs/logfs/super.c b/fs/logfs/super.c
index 345c24b..5436029 100644
--- a/fs/logfs/super.c
+++ b/fs/logfs/super.c
@@ -608,6 +608,7 @@ static struct file_system_type logfs_fs_type = {
.fs_flags = FS_REQUIRES_DEV,

};
+MODULE_ALIAS_FS("logfs");

static int __init logfs_init(void)
{
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 99541cc..df12249 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -660,6 +660,7 @@ static struct file_system_type minix_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("minix");

static int __init init_minix_fs(void)
{
diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
index e2be336..38b4fe1 100644
--- a/fs/ncpfs/inode.c
+++ b/fs/ncpfs/inode.c
@@ -1051,6 +1051,7 @@ static struct file_system_type ncp_fs_type = {
.kill_sb = kill_anon_super,
.fs_flags = FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("ncpfs");

static int __init init_ncp_fs(void)
{
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index befbae0..e11a863 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -293,6 +293,7 @@ struct file_system_type nfs_fs_type = {
.kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("nfs");
EXPORT_SYMBOL_GPL(nfs_fs_type);

struct file_system_type nfs_xdev_fs_type = {
@@ -332,6 +333,7 @@ struct file_system_type nfs4_fs_type = {
.kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("nfs4");
EXPORT_SYMBOL_GPL(nfs4_fs_type);

static int __init register_nfs4_fs(void)
@@ -2716,6 +2718,5 @@ module_param(send_implementation_id, ushort, 0644);
MODULE_PARM_DESC(send_implementation_id,
"Send implementation ID with NFSv4.1 exchange_id");
MODULE_PARM_DESC(nfs4_unique_id, "nfs_client_id4 uniquifier string");
-MODULE_ALIAS("nfs4");

#endif /* CONFIG_NFS_V4 */
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 7493428..939bfd9 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1052,6 +1052,7 @@ static struct file_system_type nfsd_fs_type = {
.mount = nfsd_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("nfsd");

#ifdef CONFIG_PROC_FS
static int create_proc_exports_entry(void)
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index 3c991dc..c7d1f9f 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -1361,6 +1361,7 @@ struct file_system_type nilfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("nilfs2");

static void nilfs_inode_init_once(void *obj)
{
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index 4a8289f8..82650d5 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -3079,6 +3079,7 @@ static struct file_system_type ntfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ntfs");

/* Stable names for the slab caches. */
static const char ntfs_index_ctx_cache_name[] = "ntfs_index_ctx_cache";
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index 16b712d..9259d78 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -640,6 +640,7 @@ static struct file_system_type dlmfs_fs_type = {
.mount = dlmfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("ocfs2_dlmfs");

static int __init init_dlmfs_fs(void)
{
diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c
index 25d715c..d8b0afd 100644
--- a/fs/omfs/inode.c
+++ b/fs/omfs/inode.c
@@ -572,6 +572,7 @@ static struct file_system_type omfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("omfs");

static int __init init_omfs_fs(void)
{
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index 2ad080f..66abc26 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -432,6 +432,7 @@ static struct file_system_type openprom_fs_type = {
.mount = openprom_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("openpromfs");

static void op_inode_init_once(void *data)
{
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index 43098bb..2e8caa6 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -412,6 +412,7 @@ static struct file_system_type qnx4_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("qnx4");

static int __init init_qnx4_fs(void)
{
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 57199a5..8d941ed 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -672,6 +672,7 @@ static struct file_system_type qnx6_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("qnx6");

static int __init init_qnx6_fs(void)
{
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 418bdc3..194113b 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -2434,6 +2434,7 @@ struct file_system_type reiserfs_fs_type = {
.kill_sb = reiserfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("reiserfs");

MODULE_DESCRIPTION("ReiserFS journaled filesystem");
MODULE_AUTHOR("Hans Reiser <[email protected]>");
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index fd7c5f6..42e3d06 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -599,6 +599,7 @@ static struct file_system_type romfs_fs_type = {
.kill_sb = romfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("romfs");

/*
* inode storage initialiser
diff --git a/fs/sysv/super.c b/fs/sysv/super.c
index a38e87b..a39938b 100644
--- a/fs/sysv/super.c
+++ b/fs/sysv/super.c
@@ -545,6 +545,7 @@ static struct file_system_type sysv_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("sysv");

static struct file_system_type v7_fs_type = {
.owner = THIS_MODULE,
@@ -553,6 +554,7 @@ static struct file_system_type v7_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("v7");

static int __init init_sysv_fs(void)
{
@@ -586,5 +588,4 @@ static void __exit exit_sysv_fs(void)

module_init(init_sysv_fs)
module_exit(exit_sysv_fs)
-MODULE_ALIAS("v7");
MODULE_LICENSE("GPL");
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index ddc0f6a..ac838b8 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2174,6 +2174,7 @@ static struct file_system_type ubifs_fs_type = {
.mount = ubifs_mount,
.kill_sb = kill_ubifs_super,
};
+MODULE_ALIAS_FS("ubifs");

/*
* Inode slab cache constructor.
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index dc8e3a8..329f2f5 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1500,6 +1500,7 @@ static struct file_system_type ufs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ufs");

static int __init init_ufs_fs(void)
{
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c407121..ea341ce 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1561,6 +1561,7 @@ static struct file_system_type xfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("xfs");

STATIC int __init
xfs_init_zones(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d0a246d..f4b68f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1829,6 +1829,8 @@ struct file_system_type {
struct lock_class_key i_mutex_dir_key;
};

+#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
+
extern struct dentry *mount_ns(struct file_system_type *fs_type, int flags,
void *data, int (*fill_super)(struct super_block *, void *, int));
extern struct dentry *mount_bdev(struct file_system_type *fs_type,
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index fd10981..6e86b2c 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1174,6 +1174,7 @@ static struct file_system_type rpc_pipe_fs_type = {
.mount = rpc_mount,
.kill_sb = rpc_kill_sb,
};
+MODULE_ALIAS_FS("rpc_pipefs");

static void
init_once(void *foo)
@@ -1218,6 +1219,3 @@ void unregister_rpc_pipefs(void)
kmem_cache_destroy(rpc_inode_cachep);
unregister_filesystem(&rpc_pipe_fs_type);
}
-
-/* Make 'mount -t rpc_pipefs ...' autoload this module. */
-MODULE_ALIAS("rpc_pipefs");
--
1.7.5.4
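
For readers following the fs/filesystems.c and include/linux/fs.h hunks, here is a minimal userspace sketch of how the requested alias is formed. It is not kernel code: fs_module_alias() and MODULE_ALIAS_FS_STRING() are hypothetical names used only for illustration, and the subtype handling (dropping everything after the first '.') is assumed from the "dot" variable visible in the get_fs_type() context lines.

```c
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the macro added to include/linux/fs.h.
 * In the kernel it wraps MODULE_ALIAS(); here it only yields the
 * alias string through compile-time literal concatenation. */
#define MODULE_ALIAS_FS_STRING(NAME) "fs-" NAME

/*
 * Build the alias string that would be handed to request_module().
 * Anything after the first '.' is treated as a subtype (for example
 * "fuseblk.ntfs") and left out, which is what the precision argument
 * of "%.*s" achieves. Returns the snprintf() result.
 */
int fs_module_alias(const char *name, char *buf, size_t size)
{
	const char *dot = strchr(name, '.');
	int len = dot ? (int)(dot - name) : (int)strlen(name);

	return snprintf(buf, size, "fs-%.*s", len, name);
}
```

Calling fs_module_alias("vfat", buf, sizeof(buf)) fills buf with "fs-vfat", the same string that MODULE_ALIAS_FS("vfat") records in the module, so modprobe can only resolve the request to a module that declares itself a filesystem.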

2013-03-03 15:28:56

by Serge E. Hallyn

Subject: Re: [RFC][PATCH] fs: Limit sys_mount to only loading filesystem modules.

Quoting Eric W. Biederman ([email protected]):
>
> Modify the request_module to prefix the file system type with "fs-"
> and add aliases to all of the filesystems that can be built as modules
> to match.
>
> A common practice is to build all of the kernel code and leave code
> that is not commonly needed as modules, with the result that many
> users are exposed to any bug anywhere in the kernel.
>
> Looking for filesystems with a fs- prefix limits the pool of possible
> modules that can be loaded by mount to just filesystems, trivially
> making things safer with no real cost.
>
> Using aliases means user space can control the policy of which
> filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
> with blacklist and alias directives, allowing simple, safe,
> well-understood workarounds for known problematic software.
>
> This also addresses a rare but unfortunate problem where the filesystem
> name is not the same as its module name and module auto-loading
> would not work. While writing this patch I saw a handful of such
> cases. The most significant is autofs, which lives in the module
> autofs4.
>
> Signed-off-by: "Eric W. Biederman" <[email protected]>

Acked-by: Serge Hallyn <[email protected]>
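
As an illustration of the modprobe.d policy the description above refers to, a minimal sketch follows. The file name is hypothetical and the two modules are picked only as examples from the patch; nothing here is part of the patch itself.

```
# /etc/modprobe.d/fs-policy.conf (hypothetical file name)

# Ignore the aliases built into the freevxfs module, so a request for
# "fs-vxfs" from mount no longer resolves to it.
blacklist freevxfs

# Map a filesystem name to a module whose name differs from it.
alias fs-autofs autofs4
```

Note that blacklist only affects alias resolution; an explicit "modprobe freevxfs" by the administrator still loads the module.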

> ---
> arch/ia64/kernel/perfmon.c | 1 +
> arch/powerpc/platforms/cell/spufs/inode.c | 1 +
> arch/s390/hypfs/inode.c | 1 +
> drivers/firmware/efivars.c | 1 +
> drivers/infiniband/hw/ipath/ipath_fs.c | 1 +
> drivers/infiniband/hw/qib/qib_fs.c | 1 +
> drivers/misc/ibmasm/ibmasmfs.c | 1 +
> drivers/mtd/mtdchar.c | 1 +
> drivers/oprofile/oprofilefs.c | 1 +
> drivers/staging/ccg/f_fs.c | 1 +
> drivers/usb/gadget/f_fs.c | 1 +
> drivers/usb/gadget/inode.c | 1 +
> drivers/xen/xenfs/super.c | 1 +
> fs/9p/vfs_super.c | 1 +
> fs/adfs/super.c | 1 +
> fs/affs/super.c | 1 +
> fs/afs/super.c | 1 +
> fs/autofs4/init.c | 1 +
> fs/befs/linuxvfs.c | 1 +
> fs/bfs/inode.c | 1 +
> fs/binfmt_misc.c | 1 +
> fs/btrfs/super.c | 1 +
> fs/ceph/super.c | 1 +
> fs/coda/inode.c | 1 +
> fs/configfs/mount.c | 1 +
> fs/cramfs/inode.c | 1 +
> fs/debugfs/inode.c | 1 +
> fs/devpts/inode.c | 1 +
> fs/ecryptfs/main.c | 1 +
> fs/efs/super.c | 1 +
> fs/exofs/super.c | 1 +
> fs/ext2/super.c | 1 +
> fs/ext3/super.c | 1 +
> fs/ext4/super.c | 5 +++--
> fs/f2fs/super.c | 1 +
> fs/fat/namei_msdos.c | 1 +
> fs/fat/namei_vfat.c | 1 +
> fs/filesystems.c | 2 +-
> fs/freevxfs/vxfs_super.c | 2 +-
> fs/fuse/control.c | 1 +
> fs/fuse/inode.c | 2 ++
> fs/gfs2/ops_fstype.c | 4 +++-
> fs/hfs/super.c | 1 +
> fs/hfsplus/super.c | 1 +
> fs/hppfs/hppfs.c | 1 +
> fs/hugetlbfs/inode.c | 1 +
> fs/isofs/inode.c | 3 +--
> fs/jffs2/super.c | 1 +
> fs/jfs/super.c | 1 +
> fs/logfs/super.c | 1 +
> fs/minix/inode.c | 1 +
> fs/ncpfs/inode.c | 1 +
> fs/nfs/super.c | 3 ++-
> fs/nfsd/nfsctl.c | 1 +
> fs/nilfs2/super.c | 1 +
> fs/ntfs/super.c | 1 +
> fs/ocfs2/dlmfs/dlmfs.c | 1 +
> fs/omfs/inode.c | 1 +
> fs/openpromfs/inode.c | 1 +
> fs/qnx4/inode.c | 1 +
> fs/qnx6/inode.c | 1 +
> fs/reiserfs/super.c | 1 +
> fs/romfs/super.c | 1 +
> fs/sysv/super.c | 3 ++-
> fs/ubifs/super.c | 1 +
> fs/ufs/super.c | 1 +
> fs/xfs/xfs_super.c | 1 +
> include/linux/fs.h | 2 ++
> net/sunrpc/rpc_pipe.c | 4 +---
> 69 files changed, 77 insertions(+), 12 deletions(-)
>
> diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
> index ea39eba..db6e866 100644
> --- a/arch/ia64/kernel/perfmon.c
> +++ b/arch/ia64/kernel/perfmon.c
> @@ -619,6 +619,7 @@ static struct file_system_type pfm_fs_type = {
> .mount = pfmfs_mount,
> .kill_sb = kill_anon_super,
> };
> +MODULE_ALIAS_FS("pfmfs");
>
> DEFINE_PER_CPU(unsigned long, pfm_syst_info);
> DEFINE_PER_CPU(struct task_struct *, pmu_owner);
> diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> index dba1ce2..db6d080 100644
> --- a/arch/powerpc/platforms/cell/spufs/inode.c
> +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> @@ -771,6 +771,7 @@ static struct file_system_type spufs_type = {
> .mount = spufs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("spufs");
>
> static int __init spufs_init(void)
> {
> diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
> index 06ea69b..50d9bde 100644
> --- a/arch/s390/hypfs/inode.c
> +++ b/arch/s390/hypfs/inode.c
> @@ -458,6 +458,7 @@ static struct file_system_type hypfs_type = {
> .mount = hypfs_mount,
> .kill_sb = hypfs_kill_super
> };
> +MODULE_ALIAS_FS("s390_hypfs");
>
> static const struct super_operations hypfs_s_ops = {
> .statfs = simple_statfs,
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
> index fed08b6..6881e2e 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -1116,6 +1116,7 @@ static struct file_system_type efivarfs_type = {
> .mount = efivarfs_mount,
> .kill_sb = efivarfs_kill_sb,
> };
> +MODULE_ALIAS_FS("efivarfs");
>
> static const struct inode_operations efivarfs_dir_inode_operations = {
> .lookup = simple_lookup,
> diff --git a/drivers/infiniband/hw/ipath/ipath_fs.c b/drivers/infiniband/hw/ipath/ipath_fs.c
> index a4de9d5..2c082cd 100644
> --- a/drivers/infiniband/hw/ipath/ipath_fs.c
> +++ b/drivers/infiniband/hw/ipath/ipath_fs.c
> @@ -410,6 +410,7 @@ static struct file_system_type ipathfs_fs_type = {
> .mount = ipathfs_mount,
> .kill_sb = ipathfs_kill_super,
> };
> +MODULE_ALIAS_FS("ipathfs");
>
> int __init ipath_init_ipathfs(void)
> {
> diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c
> index 65a2a23..9566ceb 100644
> --- a/drivers/infiniband/hw/qib/qib_fs.c
> +++ b/drivers/infiniband/hw/qib/qib_fs.c
> @@ -604,6 +604,7 @@ static struct file_system_type qibfs_fs_type = {
> .mount = qibfs_mount,
> .kill_sb = qibfs_kill_super,
> };
> +MODULE_ALIAS_FS("ipathfs");
>
> int __init qib_init_qibfs(void)
> {
> diff --git a/drivers/misc/ibmasm/ibmasmfs.c b/drivers/misc/ibmasm/ibmasmfs.c
> index 6673e57..ce5b756 100644
> --- a/drivers/misc/ibmasm/ibmasmfs.c
> +++ b/drivers/misc/ibmasm/ibmasmfs.c
> @@ -110,6 +110,7 @@ static struct file_system_type ibmasmfs_type = {
> .mount = ibmasmfs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("ibmasmfs");
>
> static int ibmasmfs_fill_super (struct super_block *sb, void *data, int silent)
> {
> diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
> index 82c0616..92ab30a 100644
> --- a/drivers/mtd/mtdchar.c
> +++ b/drivers/mtd/mtdchar.c
> @@ -1238,6 +1238,7 @@ static struct file_system_type mtd_inodefs_type = {
> .mount = mtd_inodefs_mount,
> .kill_sb = kill_anon_super,
> };
> +MODULE_ALIAS_FS("mtd_inodefs");
>
> static int __init init_mtdchar(void)
> {
> diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c
> index 849357c..55e3646 100644
> --- a/drivers/oprofile/oprofilefs.c
> +++ b/drivers/oprofile/oprofilefs.c
> @@ -266,6 +266,7 @@ static struct file_system_type oprofilefs_type = {
> .mount = oprofilefs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("oprofilefs");
>
>
> int __init oprofilefs_register(void)
> diff --git a/drivers/staging/ccg/f_fs.c b/drivers/staging/ccg/f_fs.c
> index 8adc79d..f6373da 100644
> --- a/drivers/staging/ccg/f_fs.c
> +++ b/drivers/staging/ccg/f_fs.c
> @@ -1223,6 +1223,7 @@ static struct file_system_type ffs_fs_type = {
> .mount = ffs_fs_mount,
> .kill_sb = ffs_fs_kill_sb,
> };
> +MODULE_ALIAS_FS("functionfs");
>
>
> /* Driver's main init/cleanup functions *************************************/
> diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
> index 38388d7..c377ff8 100644
> --- a/drivers/usb/gadget/f_fs.c
> +++ b/drivers/usb/gadget/f_fs.c
> @@ -1235,6 +1235,7 @@ static struct file_system_type ffs_fs_type = {
> .mount = ffs_fs_mount,
> .kill_sb = ffs_fs_kill_sb,
> };
> +MODULE_ALIAS_FS("functionfs");
>
>
> /* Driver's main init/cleanup functions *************************************/
> diff --git a/drivers/usb/gadget/inode.c b/drivers/usb/gadget/inode.c
> index 8ac840f..e2b2e9c 100644
> --- a/drivers/usb/gadget/inode.c
> +++ b/drivers/usb/gadget/inode.c
> @@ -2105,6 +2105,7 @@ static struct file_system_type gadgetfs_type = {
> .mount = gadgetfs_mount,
> .kill_sb = gadgetfs_kill_sb,
> };
> +MODULE_ALIAS_FS("gadgetfs");
>
> /*----------------------------------------------------------------------*/
>
> diff --git a/drivers/xen/xenfs/super.c b/drivers/xen/xenfs/super.c
> index 459b9ac..891d83f 100644
> --- a/drivers/xen/xenfs/super.c
> +++ b/drivers/xen/xenfs/super.c
> @@ -119,6 +119,7 @@ static struct file_system_type xenfs_type = {
> .mount = xenfs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("xenfs");
>
> static int __init xenfs_init(void)
> {
> diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
> index 137d503..17368f4 100644
> --- a/fs/9p/vfs_super.c
> +++ b/fs/9p/vfs_super.c
> @@ -365,3 +365,4 @@ struct file_system_type v9fs_fs_type = {
> .owner = THIS_MODULE,
> .fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT,
> };
> +MODULE_ALIAS_FS("9p");
> diff --git a/fs/adfs/super.c b/fs/adfs/super.c
> index d571229..0ff4bae 100644
> --- a/fs/adfs/super.c
> +++ b/fs/adfs/super.c
> @@ -524,6 +524,7 @@ static struct file_system_type adfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("adfs");
>
> static int __init init_adfs_fs(void)
> {
> diff --git a/fs/affs/super.c b/fs/affs/super.c
> index b84dc73..45161a8 100644
> --- a/fs/affs/super.c
> +++ b/fs/affs/super.c
> @@ -622,6 +622,7 @@ static struct file_system_type affs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("affs");
>
> static int __init init_affs_fs(void)
> {
> diff --git a/fs/afs/super.c b/fs/afs/super.c
> index 7c31ec3..c486155 100644
> --- a/fs/afs/super.c
> +++ b/fs/afs/super.c
> @@ -45,6 +45,7 @@ struct file_system_type afs_fs_type = {
> .kill_sb = afs_kill_super,
> .fs_flags = 0,
> };
> +MODULE_ALIAS_FS("afs");
>
> static const struct super_operations afs_super_ops = {
> .statfs = afs_statfs,
> diff --git a/fs/autofs4/init.c b/fs/autofs4/init.c
> index cddc74b..b3db517 100644
> --- a/fs/autofs4/init.c
> +++ b/fs/autofs4/init.c
> @@ -26,6 +26,7 @@ static struct file_system_type autofs_fs_type = {
> .mount = autofs_mount,
> .kill_sb = autofs4_kill_sb,
> };
> +MODULE_ALIAS_FS("autofs");
>
> static int __init init_autofs4_fs(void)
> {
> diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
> index 2b3bda8..8613785 100644
> --- a/fs/befs/linuxvfs.c
> +++ b/fs/befs/linuxvfs.c
> @@ -951,6 +951,7 @@ static struct file_system_type befs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("befs");
>
> static int __init
> init_befs_fs(void)
> diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
> index 737aaa3..5e376bb 100644
> --- a/fs/bfs/inode.c
> +++ b/fs/bfs/inode.c
> @@ -473,6 +473,7 @@ static struct file_system_type bfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("bfs");
>
> static int __init init_bfs_fs(void)
> {
> diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
> index 0c8869f..65f91ec 100644
> --- a/fs/binfmt_misc.c
> +++ b/fs/binfmt_misc.c
> @@ -720,6 +720,7 @@ static struct file_system_type bm_fs_type = {
> .mount = bm_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("binfmt_misc");
>
> static int __init init_misc_binfmt(void)
> {
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index d8982e9..fe51afd 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1518,6 +1518,7 @@ static struct file_system_type btrfs_fs_type = {
> .kill_sb = btrfs_kill_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("btrfs");
>
> /*
> * used by btrfsctl to scan devices when no FS is mounted
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index e86aa994..0a25c04 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -947,6 +947,7 @@ static struct file_system_type ceph_fs_type = {
> .kill_sb = ceph_kill_sb,
> .fs_flags = FS_RENAME_DOES_D_MOVE,
> };
> +MODULE_ALIAS_FS("ceph");
>
> #define _STRINGIFY(x) #x
> #define STRINGIFY(x) _STRINGIFY(x)
> diff --git a/fs/coda/inode.c b/fs/coda/inode.c
> index cf674e9..5075f81 100644
> --- a/fs/coda/inode.c
> +++ b/fs/coda/inode.c
> @@ -329,4 +329,5 @@ struct file_system_type coda_fs_type = {
> .kill_sb = kill_anon_super,
> .fs_flags = FS_BINARY_MOUNTDATA,
> };
> +MODULE_ALIAS_FS("coda");
>
> diff --git a/fs/configfs/mount.c b/fs/configfs/mount.c
> index aee0a7e..7f26c3c 100644
> --- a/fs/configfs/mount.c
> +++ b/fs/configfs/mount.c
> @@ -114,6 +114,7 @@ static struct file_system_type configfs_fs_type = {
> .mount = configfs_do_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("configfs");
>
> struct dentry *configfs_pin_fs(void)
> {
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index c6c3f91..3f79599 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -573,6 +573,7 @@ static struct file_system_type cramfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("cramfs");
>
> static int __init init_cramfs_fs(void)
> {
> diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
> index 0c4f80b..4888cb3 100644
> --- a/fs/debugfs/inode.c
> +++ b/fs/debugfs/inode.c
> @@ -299,6 +299,7 @@ static struct file_system_type debug_fs_type = {
> .mount = debug_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("debugfs");
>
> static struct dentry *__create_file(const char *name, umode_t mode,
> struct dentry *parent, void *data,
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index f0f0faa..741626f 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -380,6 +380,7 @@ static struct file_system_type devpts_fs_type = {
> .kill_sb = devpts_kill_sb,
> .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
> };
> +MODULE_ALIAS_FS("devpts");
>
> struct vfsmount *devpts_mntget(struct file *filp)
> {
> diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> index 4e0886c..e924cf4 100644
> --- a/fs/ecryptfs/main.c
> +++ b/fs/ecryptfs/main.c
> @@ -629,6 +629,7 @@ static struct file_system_type ecryptfs_fs_type = {
> .kill_sb = ecryptfs_kill_block_super,
> .fs_flags = 0
> };
> +MODULE_ALIAS_FS("ecryptfs");
>
> /**
> * inode_info_init_once
> diff --git a/fs/efs/super.c b/fs/efs/super.c
> index 2002431..c6f57a7 100644
> --- a/fs/efs/super.c
> +++ b/fs/efs/super.c
> @@ -33,6 +33,7 @@ static struct file_system_type efs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("efs");
>
> static struct pt_types sgi_pt_types[] = {
> {0x00, "SGI vh"},
> diff --git a/fs/exofs/super.c b/fs/exofs/super.c
> index 5e59280..9d97633 100644
> --- a/fs/exofs/super.c
> +++ b/fs/exofs/super.c
> @@ -1010,6 +1010,7 @@ static struct file_system_type exofs_type = {
> .mount = exofs_mount,
> .kill_sb = generic_shutdown_super,
> };
> +MODULE_ALIAS_FS("exofs");
>
> static int __init init_exofs(void)
> {
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index c25c56b..6438912 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -1542,6 +1542,7 @@ static struct file_system_type ext2_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV | FS_USERNS_MOUNT,
> };
> +MODULE_ALIAS_FS("ext2");
>
> static int __init init_ext2_fs(void)
> {
> diff --git a/fs/ext3/super.c b/fs/ext3/super.c
> index 4ba2683..d59852d 100644
> --- a/fs/ext3/super.c
> +++ b/fs/ext3/super.c
> @@ -3059,6 +3059,7 @@ static struct file_system_type ext3_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ext3");
>
> static int __init init_ext3_fs(void)
> {
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 3d4fb81..b3264ba 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -92,6 +92,7 @@ static struct file_system_type ext2_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ext2");
> #define IS_EXT2_SB(sb) ((sb)->s_bdev->bd_holder == &ext2_fs_type)
> #else
> #define IS_EXT2_SB(sb) (0)
> @@ -106,6 +107,7 @@ static struct file_system_type ext3_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ext3");
> #define IS_EXT3_SB(sb) ((sb)->s_bdev->bd_holder == &ext3_fs_type)
> #else
> #define IS_EXT3_SB(sb) (0)
> @@ -5194,7 +5196,6 @@ static inline int ext2_feature_set_ok(struct super_block *sb)
> return 0;
> return 1;
> }
> -MODULE_ALIAS("ext2");
> #else
> static inline void register_as_ext2(void) { }
> static inline void unregister_as_ext2(void) { }
> @@ -5227,7 +5228,6 @@ static inline int ext3_feature_set_ok(struct super_block *sb)
> return 0;
> return 1;
> }
> -MODULE_ALIAS("ext3");
> #else
> static inline void register_as_ext3(void) { }
> static inline void unregister_as_ext3(void) { }
> @@ -5241,6 +5241,7 @@ static struct file_system_type ext4_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ext4");
>
> static int __init ext4_init_feat_adverts(void)
> {
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 37fad04..45ebdb6 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -639,6 +639,7 @@ static struct file_system_type f2fs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("f2fs");
>
> static int __init init_inodecache(void)
> {
> diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
> index e2cfda9..081b759 100644
> --- a/fs/fat/namei_msdos.c
> +++ b/fs/fat/namei_msdos.c
> @@ -668,6 +668,7 @@ static struct file_system_type msdos_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("msdos");
>
> static int __init init_msdos_fs(void)
> {
> diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
> index ac959d6..2da9520 100644
> --- a/fs/fat/namei_vfat.c
> +++ b/fs/fat/namei_vfat.c
> @@ -1073,6 +1073,7 @@ static struct file_system_type vfat_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("vfat");
>
> static int __init init_vfat_fs(void)
> {
> diff --git a/fs/filesystems.c b/fs/filesystems.c
> index 71f9be2..490f48c 100644
> --- a/fs/filesystems.c
> +++ b/fs/filesystems.c
> @@ -274,7 +274,7 @@ struct file_system_type *get_fs_type(const char *name)
>
> fs = __get_fs_type(name, len);
> if (!fs && capable(CAP_SYS_ADMIN) &&
> - (request_module("%.*s", len, name) == 0))
> + (request_module("fs-%.*s", len, name) == 0))
> fs = __get_fs_type(name, len);
>
> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
> diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
> index fed2c8a..4550743 100644
> --- a/fs/freevxfs/vxfs_super.c
> +++ b/fs/freevxfs/vxfs_super.c
> @@ -52,7 +52,6 @@ MODULE_AUTHOR("Christoph Hellwig");
> MODULE_DESCRIPTION("Veritas Filesystem (VxFS) driver");
> MODULE_LICENSE("Dual BSD/GPL");
>
> -MODULE_ALIAS("vxfs"); /* makes mount -t vxfs autoload the module */
>
>
> static void vxfs_put_super(struct super_block *);
> @@ -258,6 +257,7 @@ static struct file_system_type vxfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("vxfs"); /* makes mount -t vxfs autoload the module */
>
> static int __init
> vxfs_init(void)
> diff --git a/fs/fuse/control.c b/fs/fuse/control.c
> index 75a20c0..895cc91 100644
> --- a/fs/fuse/control.c
> +++ b/fs/fuse/control.c
> @@ -341,6 +341,7 @@ static struct file_system_type fuse_ctl_fs_type = {
> .mount = fuse_ctl_mount,
> .kill_sb = fuse_ctl_kill_sb,
> };
> +MODULE_ALIAS_FS("fusectl");
>
> int __init fuse_ctl_init(void)
> {
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index d022569..6ab1666 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1127,6 +1127,7 @@ static struct file_system_type fuse_fs_type = {
> .mount = fuse_mount,
> .kill_sb = fuse_kill_sb_anon,
> };
> +MODULE_ALIAS_FS("fuse");
>
> #ifdef CONFIG_BLOCK
> static struct dentry *fuse_mount_blk(struct file_system_type *fs_type,
> @@ -1156,6 +1157,7 @@ static struct file_system_type fuseblk_fs_type = {
> .kill_sb = fuse_kill_sb_blk,
> .fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE,
> };
> +MODULE_ALIAS_FS("fuseblk");
>
> static inline int register_fuseblk(void)
> {
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index 1b612be..60ede2a 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -20,6 +20,7 @@
> #include <linux/gfs2_ondisk.h>
> #include <linux/quotaops.h>
> #include <linux/lockdep.h>
> +#include <linux/module.h>
>
> #include "gfs2.h"
> #include "incore.h"
> @@ -1425,6 +1426,7 @@ struct file_system_type gfs2_fs_type = {
> .kill_sb = gfs2_kill_sb,
> .owner = THIS_MODULE,
> };
> +MODULE_ALIAS_FS("gfs2");
>
> struct file_system_type gfs2meta_fs_type = {
> .name = "gfs2meta",
> @@ -1432,4 +1434,4 @@ struct file_system_type gfs2meta_fs_type = {
> .mount = gfs2_mount_meta,
> .owner = THIS_MODULE,
> };
> -
> +MODULE_ALIAS_FS("gfs2meta");
> diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> index e93ddaa..bbaaa8a 100644
> --- a/fs/hfs/super.c
> +++ b/fs/hfs/super.c
> @@ -466,6 +466,7 @@ static struct file_system_type hfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("hfs");
>
> static void hfs_init_once(void *p)
> {
> diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
> index 796198d..d2e1718 100644
> --- a/fs/hfsplus/super.c
> +++ b/fs/hfsplus/super.c
> @@ -618,6 +618,7 @@ static struct file_system_type hfsplus_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("hfsplus");
>
> static void hfsplus_init_once(void *p)
> {
> diff --git a/fs/hppfs/hppfs.c b/fs/hppfs/hppfs.c
> index 43b315f..3eefbcc 100644
> --- a/fs/hppfs/hppfs.c
> +++ b/fs/hppfs/hppfs.c
> @@ -748,6 +748,7 @@ static struct file_system_type hppfs_type = {
> .kill_sb = kill_anon_super,
> .fs_flags = 0,
> };
> +MODULE_ALIAS_FS("hppfs");
>
> static int __init init_hppfs(void)
> {
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 78bde32..81d1cb4 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -896,6 +896,7 @@ static struct file_system_type hugetlbfs_fs_type = {
> .mount = hugetlbfs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("hugetlbfs");
>
> static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE];
>
> diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
> index 67ce525..a67f16e 100644
> --- a/fs/isofs/inode.c
> +++ b/fs/isofs/inode.c
> @@ -1556,6 +1556,7 @@ static struct file_system_type iso9660_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("iso9660");
>
> static int __init init_iso9660_fs(void)
> {
> @@ -1593,5 +1594,3 @@ static void __exit exit_iso9660_fs(void)
> module_init(init_iso9660_fs)
> module_exit(exit_iso9660_fs)
> MODULE_LICENSE("GPL");
> -/* Actual filesystem name is iso9660, as requested in filesystems.c */
> -MODULE_ALIAS("iso9660");
> diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
> index d3d8799..0defb1c 100644
> --- a/fs/jffs2/super.c
> +++ b/fs/jffs2/super.c
> @@ -356,6 +356,7 @@ static struct file_system_type jffs2_fs_type = {
> .mount = jffs2_mount,
> .kill_sb = jffs2_kill_sb,
> };
> +MODULE_ALIAS_FS("jffs2");
>
> static int __init init_jffs2_fs(void)
> {
> diff --git a/fs/jfs/super.c b/fs/jfs/super.c
> index 060ba63..2003e83 100644
> --- a/fs/jfs/super.c
> +++ b/fs/jfs/super.c
> @@ -833,6 +833,7 @@ static struct file_system_type jfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("jfs");
>
> static void init_once(void *foo)
> {
> diff --git a/fs/logfs/super.c b/fs/logfs/super.c
> index 345c24b..5436029 100644
> --- a/fs/logfs/super.c
> +++ b/fs/logfs/super.c
> @@ -608,6 +608,7 @@ static struct file_system_type logfs_fs_type = {
> .fs_flags = FS_REQUIRES_DEV,
>
> };
> +MODULE_ALIAS_FS("logfs");
>
> static int __init logfs_init(void)
> {
> diff --git a/fs/minix/inode.c b/fs/minix/inode.c
> index 99541cc..df12249 100644
> --- a/fs/minix/inode.c
> +++ b/fs/minix/inode.c
> @@ -660,6 +660,7 @@ static struct file_system_type minix_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("minix");
>
> static int __init init_minix_fs(void)
> {
> diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
> index e2be336..38b4fe1 100644
> --- a/fs/ncpfs/inode.c
> +++ b/fs/ncpfs/inode.c
> @@ -1051,6 +1051,7 @@ static struct file_system_type ncp_fs_type = {
> .kill_sb = kill_anon_super,
> .fs_flags = FS_BINARY_MOUNTDATA,
> };
> +MODULE_ALIAS_FS("ncpfs");
>
> static int __init init_ncp_fs(void)
> {
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index befbae0..e11a863 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -293,6 +293,7 @@ struct file_system_type nfs_fs_type = {
> .kill_sb = nfs_kill_super,
> .fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
> };
> +MODULE_ALIAS_FS("nfs");
> EXPORT_SYMBOL_GPL(nfs_fs_type);
>
> struct file_system_type nfs_xdev_fs_type = {
> @@ -332,6 +333,7 @@ struct file_system_type nfs4_fs_type = {
> .kill_sb = nfs_kill_super,
> .fs_flags = FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
> };
> +MODULE_ALIAS_FS("nfs4");
> EXPORT_SYMBOL_GPL(nfs4_fs_type);
>
> static int __init register_nfs4_fs(void)
> @@ -2716,6 +2718,5 @@ module_param(send_implementation_id, ushort, 0644);
> MODULE_PARM_DESC(send_implementation_id,
> "Send implementation ID with NFSv4.1 exchange_id");
> MODULE_PARM_DESC(nfs4_unique_id, "nfs_client_id4 uniquifier string");
> -MODULE_ALIAS("nfs4");
>
> #endif /* CONFIG_NFS_V4 */
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 7493428..939bfd9 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1052,6 +1052,7 @@ static struct file_system_type nfsd_fs_type = {
> .mount = nfsd_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("nfsd");
>
> #ifdef CONFIG_PROC_FS
> static int create_proc_exports_entry(void)
> diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> index 3c991dc..c7d1f9f 100644
> --- a/fs/nilfs2/super.c
> +++ b/fs/nilfs2/super.c
> @@ -1361,6 +1361,7 @@ struct file_system_type nilfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("nilfs2");
>
> static void nilfs_inode_init_once(void *obj)
> {
> diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
> index 4a8289f8..82650d5 100644
> --- a/fs/ntfs/super.c
> +++ b/fs/ntfs/super.c
> @@ -3079,6 +3079,7 @@ static struct file_system_type ntfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ntfs");
>
> /* Stable names for the slab caches. */
> static const char ntfs_index_ctx_cache_name[] = "ntfs_index_ctx_cache";
> diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
> index 16b712d..9259d78 100644
> --- a/fs/ocfs2/dlmfs/dlmfs.c
> +++ b/fs/ocfs2/dlmfs/dlmfs.c
> @@ -640,6 +640,7 @@ static struct file_system_type dlmfs_fs_type = {
> .mount = dlmfs_mount,
> .kill_sb = kill_litter_super,
> };
> +MODULE_ALIAS_FS("ocfs2_dlmfs");
>
> static int __init init_dlmfs_fs(void)
> {
> diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c
> index 25d715c..d8b0afd 100644
> --- a/fs/omfs/inode.c
> +++ b/fs/omfs/inode.c
> @@ -572,6 +572,7 @@ static struct file_system_type omfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("omfs");
>
> static int __init init_omfs_fs(void)
> {
> diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
> index 2ad080f..66abc26 100644
> --- a/fs/openpromfs/inode.c
> +++ b/fs/openpromfs/inode.c
> @@ -432,6 +432,7 @@ static struct file_system_type openprom_fs_type = {
> .mount = openprom_mount,
> .kill_sb = kill_anon_super,
> };
> +MODULE_ALIAS_FS("openpromfs");
>
> static void op_inode_init_once(void *data)
> {
> diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
> index 43098bb..2e8caa6 100644
> --- a/fs/qnx4/inode.c
> +++ b/fs/qnx4/inode.c
> @@ -412,6 +412,7 @@ static struct file_system_type qnx4_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("qnx4");
>
> static int __init init_qnx4_fs(void)
> {
> diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
> index 57199a5..8d941ed 100644
> --- a/fs/qnx6/inode.c
> +++ b/fs/qnx6/inode.c
> @@ -672,6 +672,7 @@ static struct file_system_type qnx6_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("qnx6");
>
> static int __init init_qnx6_fs(void)
> {
> diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
> index 418bdc3..194113b 100644
> --- a/fs/reiserfs/super.c
> +++ b/fs/reiserfs/super.c
> @@ -2434,6 +2434,7 @@ struct file_system_type reiserfs_fs_type = {
> .kill_sb = reiserfs_kill_sb,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("reiserfs");
>
> MODULE_DESCRIPTION("ReiserFS journaled filesystem");
> MODULE_AUTHOR("Hans Reiser <[email protected]>");
> diff --git a/fs/romfs/super.c b/fs/romfs/super.c
> index fd7c5f6..42e3d06 100644
> --- a/fs/romfs/super.c
> +++ b/fs/romfs/super.c
> @@ -599,6 +599,7 @@ static struct file_system_type romfs_fs_type = {
> .kill_sb = romfs_kill_sb,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("romfs");
>
> /*
> * inode storage initialiser
> diff --git a/fs/sysv/super.c b/fs/sysv/super.c
> index a38e87b..a39938b 100644
> --- a/fs/sysv/super.c
> +++ b/fs/sysv/super.c
> @@ -545,6 +545,7 @@ static struct file_system_type sysv_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("sysv");
>
> static struct file_system_type v7_fs_type = {
> .owner = THIS_MODULE,
> @@ -553,6 +554,7 @@ static struct file_system_type v7_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("v7");
>
> static int __init init_sysv_fs(void)
> {
> @@ -586,5 +588,4 @@ static void __exit exit_sysv_fs(void)
>
> module_init(init_sysv_fs)
> module_exit(exit_sysv_fs)
> -MODULE_ALIAS("v7");
> MODULE_LICENSE("GPL");
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index ddc0f6a..ac838b8 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2174,6 +2174,7 @@ static struct file_system_type ubifs_fs_type = {
> .mount = ubifs_mount,
> .kill_sb = kill_ubifs_super,
> };
> +MODULE_ALIAS_FS("ubifs");
>
> /*
> * Inode slab cache constructor.
> diff --git a/fs/ufs/super.c b/fs/ufs/super.c
> index dc8e3a8..329f2f5 100644
> --- a/fs/ufs/super.c
> +++ b/fs/ufs/super.c
> @@ -1500,6 +1500,7 @@ static struct file_system_type ufs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("ufs");
>
> static int __init init_ufs_fs(void)
> {
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index c407121..ea341ce 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1561,6 +1561,7 @@ static struct file_system_type xfs_fs_type = {
> .kill_sb = kill_block_super,
> .fs_flags = FS_REQUIRES_DEV,
> };
> +MODULE_ALIAS_FS("xfs");
>
> STATIC int __init
> xfs_init_zones(void)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index d0a246d..f4b68f5 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1829,6 +1829,8 @@ struct file_system_type {
> struct lock_class_key i_mutex_dir_key;
> };
>
> +#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
> +
> extern struct dentry *mount_ns(struct file_system_type *fs_type, int flags,
> void *data, int (*fill_super)(struct super_block *, void *, int));
> extern struct dentry *mount_bdev(struct file_system_type *fs_type,
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index fd10981..6e86b2c 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1174,6 +1174,7 @@ static struct file_system_type rpc_pipe_fs_type = {
> .mount = rpc_mount,
> .kill_sb = rpc_kill_sb,
> };
> +MODULE_ALIAS_FS("rpc_pipefs");
>
> static void
> init_once(void *foo)
> @@ -1218,6 +1219,3 @@ void unregister_rpc_pipefs(void)
> kmem_cache_destroy(rpc_inode_cachep);
> unregister_filesystem(&rpc_pipe_fs_type);
> }
> -
> -/* Make 'mount -t rpc_pipefs ...' autoload this module. */
> -MODULE_ALIAS("rpc_pipefs");
> --
> 1.7.5.4

2013-03-03 17:48:53

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Sat, Mar 2, 2013 at 7:56 PM, Serge E. Hallyn <[email protected]> wrote:
> Quoting Kees Cook ([email protected]):
>> On Sat, Mar 2, 2013 at 4:57 PM, Serge E. Hallyn <[email protected]> wrote:
>> > Quoting Kees Cook ([email protected]):
>> >> The rearranging done for user ns has resulted in allowing arbitrary
>> >> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>> >> by what is assumed to be an unprivileged process.
>> >>
>> >> At present, it does look to require at least CAP_SETUID along the way
>> >> to set up the uidmap (but things like the setuid helper newuidmap
>> >> might soon start providing such a thing by default).
>> >>
>> >> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
>> >> examines module symbols to verify that request_module() for a
>> >> filesystem only loads a module that defines "register_filesystem"
>> >> (among other things).
>> >>
>> >> -Kees
>> >>
>> >> [1] https://twitter.com/grsecurity/status/307473816672665600
>> >
>> > So the concern is root in a child user namespace doing
>> >
>> > mount -t randomfs <...>
>> >
>> > in which case do_new_mount() checks ns_capable(), not capable(),
>> > before trying to load a module for randomfs.
>>
>> Well, not just randomfs. Any module that modprobe in the init ns can find.
>
> right
>
>> > As well as (secondly) the fact that there is no enforcement on
>> > the format of the module names (i.e. fs-*).
>> >
>> > Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
>> > At least Eric Paris is strongly against it.
>>
>> I'd be curious to hear the objections. It seems pretty nice to me to
>
> Wait, sorry, I mis-spoke. The objection would have been to requiring
> CAP_SYS_MODULE, which is different. Sorry!
>
>> add a new argument to every request_module() that specifies the
>> "subsystem" it expects a module to load from. Maybe pass
>> "request_module=filesystem" or "...=netdev" to the modprobe call. And
>
> That would be useful for adding to the separation of privileges,
> i.e. helping contain the leaking of posix caps. It sounds good to
> me.
>
>> then in init_module(), check the userargs for which subsystem was
>> requested and look up in a table for the entry point module symbol for
>> that subsystem to require. e.g. for "request_module=filesystem",
>> require that the module contains the "register_filesystem" symbol,
>> etc.
>>
>> > But how about if we
>> > add a check for 'current_user_ns() == &init_user_ns' at that place
>> > instead?
>>
>> Well, we'd need to mostly revert
>> 57eccb830f1cc93d4b506ba306d8dfa685e0c88f ("mount: consolidate
>> permission checks") since get_fs_type() is being called before
>> may_mount() right now. (And then, as you suggest, we should strengthen
>> the test.) I think this will require either more plumbing into
>> get_fs_type (something like "bool load_module_if_missing") or the
>> subsystem verification stuff in request_module. I think the latter is
>> MUCH nicer as it covers this problem in all places, not just this
>> "mount" case.
>
> My first instinct was to say I'd like to have the kernel 100% belonging
> to the init_user_ns, with child user namespaces having zero ability to
> induce loading of any kernel modules, period. So a check for current
> being in init_user_ns at request_module itself.
>
> However (thinking more) that seems maybe wrong. You don't need privs to
> induce the loading of a new binfmt module right? The host's
> /lib/modules and module blacklists should be set up right by the admin
> (or distro)... If we require that the host admin manually modprobe
> every module which a task in a child user namespace might need, that
> goes counter to the goal of kernel modules.

Several subsystems already have an implicit subsystem restriction
because they load with aliases. (e.g. binfmt-XXXX, net-pf-NNN,
snd-card-NNN, FOO-iosched, etc). This isn't the case for filesystems
and a few others, unfortunately:

$ git grep 'request_module("%.*s"' | grep -vi prefix
crypto/api.c: request_module("%s", name);
drivers/mtd/chips/chipreg.c: if (!drv && !request_module("%s", name))
drivers/mtd/mtdpart.c: if (!parser && !request_module("%s", *types))
drivers/net/wireless/iwlwifi/iwl-drv.c: request_module("%s", op->name);
drivers/staging/rtl8192e/rtllib_wx.c: request_module("%s", tempbuf);
fs/filesystems.c: if (!fs && (request_module("%.*s", len, name) == 0))
net/core/dev_ioctl.c: if (!request_module("%s", name))

Several of these come from hardcoded values, though (e.g. crypto, chipreg).

>> > Eric Biederman, do you have any objections to that?
>
> -serge

-Kees

--
Kees Cook
Chrome OS Security

2013-03-03 18:18:16

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Sat, Mar 2, 2013 at 8:12 PM, Eric W. Biederman <[email protected]> wrote:
> "Serge E. Hallyn" <[email protected]> writes:
>
>> Quoting Kees Cook ([email protected]):
>>> The rearranging done for user ns has resulted in allowing arbitrary
>>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>>> by what is assumed to be an unprivileged process.
>>>
>>> At present, it does look to require at least CAP_SETUID along the way
>>> to set up the uidmap (but things like the setuid helper newuidmap
>>> might soon start providing such a thing by default).
>
> CAP_SETUID is not needed.

Do you have an example? I wasn't able to gain capabilities within the
userns until I had a uid map set that allowed my uid to map to root.
(I needed CAP_SYS_ADMIN in the mnt ns to get past may_mount() before I
could touch get_fs_type().)

>>> It might be worth examining GRKERNSEC_MODHARDEN in grsecurity, which
>>> examines module symbols to verify that request_module() for a
>>> filesystem only loads a module that defines "register_filesystem"
>>> (among other things).
>>>
>>> -Kees
>>>
>>> [1] https://twitter.com/grsecurity/status/307473816672665600
>>
>> So the concern is root in a child user namespace doing
>>
>> mount -t randomfs <...>
>>
>> in which case do_new_mount() checks ns_capable(), not capable(),
>> before trying to load a module for randomfs.
>>
>> As well as (secondly) the fact that there is no enforcement on
>> the format of the module names (i.e. fs-*).
>>
>> Kees, from what I've seen the GRKERNSEC_MODHARDEN won't be acceptable.
>> At least Eric Paris is strongly against it.
>
> What is wrong with GRKERNSEC_MODHARDEN? I took a quick look and the
> code is far from clean. But that would not be a fundamental objection
> from keeping code like that out of the kernel.
>
> It is also entertaining to read security code that won't even build with
> CONFIG_UIDGID_STRICT_TYPE_CHECKS enabled.
>
>> But how about if we
>> add a check for 'current_user_ns() == &init_user_ns' at that place
>> instead?
>>
>> Eric Biederman, do you have any objections to that?
>
> The obvious solution here is to test for CAP_SYS_ADMIN rather than
> current_user_ns == &init_user_ns before we request the module here. As
> that is what was previously required on this path.
>
> Reading the comments the concerns are.
> - Non-root users are allowed to load obscure and possibly kernel
> modules.
> - get_fs_type can trigger the load of any kernel module.

Yes, though I think you meant "... possibly vulnerable kernel modules
..." There has been a history of weird stuff living in kernel modules
that gets built by distros. If we can raise the bar on being able to
force the kernel to load those things, it'd be nice. As mentioned in
my other email, this is basically one of the few remaining places where
arbitrary module names can get loaded.

> At a practical level I don't see adding a capable(CAP_SYS_ADMIN) check
> as having much effect for the functionality currently present in user
> namespaces today as the filesystems that are legal to mount in a user
> namespace (ramfs, tmpfs, mqueuefs, sysfs, proc, devpts) are so common
> most of them can not even be built as modules and even if they are
> modules the modules will already be loaded. So I will see about adding
> a capable(CAP_SYS_ADMIN) check to shore things up for the short term.

For the short-term, yes, this solves the "regression". But it
obviously isn't what is desired for userns in the long term.

> In the longer term I very much would like to get loopback devices
> and mounts of filesystems on those loopback devices working, and being
> able to mount filesystems from usb sticks that people commonly plug in,
> and remove the need for privileged daemons to do that work. At that
> point manually having to do something that was automatic before will
> either mean a regression in functionality or bugs as people manually
> load things.
>
>
> So I am wondering what a good policy should be. Should we trust
> kernel modules to not be buggy (especially if they were signed as part
> of the build process)? Do we add some defense in depth and add
> filesystem registration magic? Thinking...

If we can produce a mechanism that provides some defensive design, we
should do it. There will always be bugs, so we should always try to
make them harder to get to or harder to exploit. Reducing the
available attack surface to "just filesystems" would be a win here,
and it would be consistent with the _intent_ of existing kernel code.

> We can limit the request_module in get_fs_type to just filesystems
> fairly easily.
>
> In include/linux/fs.h:
>
> #define MODULE_ALIAS_FS(type) MODULE_ALIAS("fs-" __stringify(type))
>
> In fs/filesystems.c:
>
> if (request_module("fs-%.*s", len, name) == 0)
>
> Then just add the appropriate MODULE_ALIAS_FS lines in all of the
> filesystems. This also allows user space to say set the module loading
> policy for filesystems using the blacklist and the alias keywords
> in /etc/modprobe.d/*.conf.

This was the solution for netdev. The backward compat situation is this:

    if (no_module && capable(CAP_NET_ADMIN))
        no_module = request_module("netdev-%s", name);
    if (no_module && capable(CAP_SYS_MODULE)) {
        if (!request_module("%s", name))
            pr_warn("Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%s instead.\n",
                    name);
    }

Those aren't ns_capable, though, so right now userns can't trigger
loading new network drivers via ifconfig, but this seems like a
reasonable approach to take.
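
The fallback order in the netdev snippet can be modelled in userspace. The following is only a sketch under stated assumptions: `load_netdev_module`, `probe_fn` and `demo_probe` are hypothetical names, the two booleans stand in for capable(CAP_NET_ADMIN) and capable(CAP_SYS_MODULE), and the callback stands in for whatever modprobe would resolve. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for modprobe: returns true if the alias resolves. */
typedef bool (*probe_fn)(const char *alias);

enum load_result { LOAD_NONE, LOAD_PREFIXED, LOAD_LEGACY };

/*
 * Two-tier policy modelled on the netdev code quoted above:
 *   1. with CAP_NET_ADMIN, try only the "netdev-" prefixed alias;
 *   2. with CAP_SYS_MODULE, fall back to the bare (deprecated) name.
 * The two bools model capable(CAP_NET_ADMIN) and capable(CAP_SYS_MODULE).
 */
static enum load_result load_netdev_module(bool net_admin, bool sys_module,
                                           const char *name, probe_fn probe)
{
    char alias[64];
    int n;

    assert(name != NULL && probe != NULL);

    if (net_admin) {
        n = snprintf(alias, sizeof(alias), "netdev-%s", name);
        if (n > 0 && (size_t)n < sizeof(alias) && probe(alias))
            return LOAD_PREFIXED;
    }
    if (sys_module && probe(name))
        return LOAD_LEGACY; /* the kernel warns that this path is deprecated */
    return LOAD_NONE;
}

/* Toy alias database: one prefixed alias and one legacy bare alias. */
static bool demo_probe(const char *alias)
{
    return strcmp(alias, "netdev-dummy0") == 0 || strcmp(alias, "eth0") == 0;
}
```

With the toy database, a caller holding only the CAP_NET_ADMIN stand-in can reach "netdev-dummy0" but not the bare "eth0" alias; only the CAP_SYS_MODULE path can resolve an unprefixed name.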

> That seems a whole lot simpler, more powerful and more maintainable than
> what little I saw in GRKERNSEC_MODHARDEN to prevent loading of
> non-filesystem modules from get_fs_type.
>
> Eric
>
> p.s. This is the patch I am looking at pushing to Linus in the near
> future.
>
> diff --git a/fs/filesystems.c b/fs/filesystems.c
> index da165f6..5b0644d 100644
> --- a/fs/filesystems.c
> +++ b/fs/filesystems.c
> @@ -273,7 +273,8 @@ struct file_system_type *get_fs_type(const char *name)
> int len = dot ? dot - name : strlen(name);
>
> fs = __get_fs_type(name, len);
> - if (!fs && (request_module("%.*s", len, name) == 0))
> + if (!fs && capable(CAP_SYS_ADMIN) &&
> + (request_module("%.*s", len, name) == 0))
> fs = __get_fs_type(name, len);
>
> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {

Will this break other users of get_fs_type()? I think it's okay; it
looks like all the other users are expected to be privileged.

Should this look like netdev in the future? Something like this, after
making may_mount() available to fs/filesystems.c:

-       if (!fs && (request_module("%.*s", len, name) == 0))
+       if (!fs && ((may_mount() && request_module("fs-%.*s", len, name) == 0) ||
+                   (capable(CAP_SYS_ADMIN) && request_module("%.*s", len, name) == 0)))

and adding all the filesystem module aliases.

At some point maybe we can add a flag to kill the old unqualified
paths available to CAP_SYS_ADMIN for netdev and mount...

-Kees

--
Kees Cook
Chrome OS Security

2013-03-03 18:30:15

by Kees Cook

Subject: Re: [RFC][PATCH] fs: Limit sys_mount to only loading filesystem modules.

On Sun, Mar 3, 2013 at 2:14 AM, Eric W. Biederman <[email protected]> wrote:
>
> Modify the request_module to prefix the file system type with "fs-"
> and add aliases to all of the filesystems that can be built as modules
> to match.
>
> A common practice is to build all of the kernel code and leave code
> that is not commonly needed as modules, with the result that many
> users are exposed to any bug anywhere in the kernel.
>
> Looking for filesystems with a fs- prefix limits the pool of possible
> modules that can be loaded by mount to just filesystems trivially
> making things safer with no real cost.
>
> Using aliases means user space can control the policy of which
> filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
> with blacklist and alias directives. Allowing simple, safe,
> well understood work-arounds to known problematic software.
>
> This also addresses a rare but unfortunate problem where the filesystem
> name is not the same as it's module name and module auto-loading
> would not work. While writing this patch I saw a handful of such
> cases. The most significant being autofs that lives in the module
> autofs4.
>
> Signed-off-by: "Eric W. Biederman" <[email protected]>

Acked-by: Kees Cook <[email protected]>

--
Kees Cook
Chrome OS Security

2013-03-03 21:58:24

by Eric W. Biederman

Subject: Re: user ns: arbitrary module loading

Kees Cook <[email protected]> writes:

> On Sat, Mar 2, 2013 at 8:12 PM, Eric W. Biederman <[email protected]> wrote:
>> "Serge E. Hallyn" <[email protected]> writes:
>>
>>> Quoting Kees Cook ([email protected]):
>>>> The rearranging done for user ns has resulted in allowing arbitrary
>>>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>>>> by what is assumed to be an unprivileged process.
>>>>
>>>> At present, it does look to require at least CAP_SETUID along the way
>>>> to set up the uidmap (but things like the setuid helper newuidmap
>>>> might soon start providing such a thing by default).
>>
>> CAP_SETUID is not needed.
>
> Do you have an example? I wasn't able to gain capabilities within the
> userns until I had a uid map set that allowed my uid to map to root.
> (I needed CAP_SYS_ADMIN in the mnt ns to get past may_mount() before I
> could touch get_fs_type().)

Likely because exec drops capabilities if your uid isn't 0.

Without CAP_SETUID you can still map uid 0 to your current uid.

The following shell script is an easy basis for testing and playing
around with things. It does assume a recent version of unshare from util-linux.

#!/bin/sh
export IFIFO=/tmp/userns-test-$$-in
export OFIFO=/tmp/userns-test-$$-out
rm -f $IFIFO $OFIFO
mkfifo $IFIFO
mkfifo $OFIFO
unshare --user -- /bin/bash -s <<'EOF' &
echo waiting-for-uid-and-gid-maps > $OFIFO
read LINE < $IFIFO
exec unshare --mount --net -- /bin/sh -s <<'EOF2'
mount --bind $HOME /root/
mount --bind /dev/null /dev/log
# Start a shell to keep the namespace reference alive
$SHELL -i < /dev/tty > /dev/tty 2> /dev/tty
EOF2
EOF
child=$!
read LINE < $OFIFO
uid=$(id --user)
gid=$(id --group)
echo "0 $uid 1" > /proc/$child/uid_map
echo "0 $gid 1" > /proc/$child/gid_map
echo uid-and-gid-maps > $IFIFO
wait $child
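
The two echo lines near the end of the script each write a single map extent of the form "inside-id outside-id count". As a rough C sketch of the same formatting step (the helper name `format_id_map` is hypothetical, and actually writing the result to /proc/<pid>/uid_map is left out):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one extent of a uid_map/gid_map file: "<inside> <outside> <count>\n".
 * Returns out on success, NULL if the buffer is too small.
 */
static char *format_id_map(char *out, size_t out_size,
                           unsigned int inside, unsigned int outside,
                           unsigned int count)
{
    int n;

    assert(out != NULL && out_size > 0);

    n = snprintf(out, out_size, "%u %u %u\n", inside, outside, count);
    if (n < 0 || (size_t)n >= out_size)
        return NULL;
    return out;
}
```

For example, format_id_map(buf, sizeof(buf), 0, 1000, 1) yields the line that maps uid 0 inside the namespace to uid 1000 outside it, which is what the script writes for the invoking user.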


>> At a practical level I don't see adding a capable(CAP_SYS_ADMIN) check
>> as having much effect for the functionality currently present in user
>> namespaces today as the filesystems that are legal to mount in a user
>> namespace (ramfs, tmpfs, mqueuefs, sysfs, proc, devpts) are so common
>> most of them can not even be built as modules and even if they are
>> modules the modules will already be loaded. So I will see about adding
>> a capable(CAP_SYS_ADMIN) check to shore things up for the short term.
>
> For the short-term, yes, this solves the "regression". But it
> obviously isn't what is desired for userns in the long term.


>> So I am wondering what a good policy should be. Should we trust
>> kernel modules to not be buggy (especially if they were signed as part
>> of the build process)? Do we add some defense in depth and add
>> filesystem registration magic? Thinking...
>
> If we can produce a mechanism that provides some defensive design, we
> should do it. There will always be bugs, so we should always try to
> make them harder to get to or harder to exploit. Reducing the
> available attack surface to "just filesystems" would be a win here,
> and it would be consistent with the _intent_ of existing kernel code.

Agreed.

>> We can limit the request_module in get_fs_type to just filesystems
>> fairly easily.
>>
>> In include/linux/fs.h:
>>
>> #define MODULE_ALIAS_FS(type) MODULE_ALIAS("fs-" __stringify(type))
>>
>> In fs/filesystems.c:
>>
>> if (request_module("fs-%.*s", len, name) == 0)
>>
>> Then just add the appropriate MODULE_ALIAS_FS lines in all of the
>> filesystems. This also allows user space to say set the module loading
>> policy for filesystems using the blacklist and the alias keywords
>> in /etc/modprobe.d/*.conf.
>
> This was the solution for netdev. The backward compat situation is this:
>
>     if (no_module && capable(CAP_NET_ADMIN))
>         no_module = request_module("netdev-%s", name);
>     if (no_module && capable(CAP_SYS_MODULE)) {
>         if (!request_module("%s", name))
>             pr_warn("Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%s instead.\n",
>                     name);
>     }

If I have read it right, the backwards-compat case is there to stay
compatible with the configuration of people who had created specific
aliases like eth0 for specific network devices in
/etc/modprobe.d/*.conf, where the new netdev-* name would break userspace.

I don't believe anyone does anything like that for filesystems, and even
if they tried it wouldn't work because the registered filesystem type
would have a different name. So I don't see needing to worry about that
kind of backwards compatibility for filesystems.

> Those aren't ns_capable, though, so right now userns can't trigger
> loading new network drivers via ifconfig, but this seems like a
> reasonable approach to take.

Yes. I saw those and deliberately didn't make those ns_capable, to be
on the safe side when I made the rest of the networking ns_capable.
Besides, ns_capable is not meaningful as a check for loading modules.
The only interesting question is whether only the global root user can
trigger them or whether everyone can trigger module loading.

>> That seems a whole lot simpler, more powerful and more maintainable than
>> what little I saw in GRKERNSEC_MODHARDEN to prevent loading of
>> non-filesystem modules from get_fs_type.
>>
>> Eric
>>
>> p.s. This is the patch I am looking at pushing to Linus in the near
>> future.
>>
>> diff --git a/fs/filesystems.c b/fs/filesystems.c
>> index da165f6..5b0644d 100644
>> --- a/fs/filesystems.c
>> +++ b/fs/filesystems.c
>> @@ -273,7 +273,8 @@ struct file_system_type *get_fs_type(const char *name)
>> int len = dot ? dot - name : strlen(name);
>>
>> fs = __get_fs_type(name, len);
>> - if (!fs && (request_module("%.*s", len, name) == 0))
>> + if (!fs && capable(CAP_SYS_ADMIN) &&
>> + (request_module("%.*s", len, name) == 0))
>> fs = __get_fs_type(name, len);
>>
>> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
>
> Will this break other users of get_fs_type()? I think it's okay; it
> looks like all the other users are expected to be privileged.

I read through them all, and I didn't see problems. The most concerning
one is tomoyo's use of get_fs_type before capabilities are checked in
do_mount. But that is neither here nor there.

> Should this look like netdev in the future? Something like this, after
> making may_mount() available to fs/filesystems.c:
>
> -       if (!fs && (request_module("%.*s", len, name) == 0))
> +       if (!fs && ((may_mount() && request_module("fs-%.*s", len, name) == 0) ||
> +                   (capable(CAP_SYS_ADMIN) && request_module("%.*s", len, name) == 0)))
>
> and adding all the filesystem module aliases.
>
> At some point maybe we can add a flag to kill the old unqualified
> paths available to CAP_SYS_ADMIN for netdev and mount...

For netdev at least I don't see it as being particularly interesting.
The case for tunnels is already as unprivileged as you can reasonably
get with "request_module("rtnl-link-%s", kind);" in
net/core/rtnetlink.c:rtnl_newlink(). For real physical devices there is
both a greater chance of a buggy module and no real need, as udev will
load the module based on hardware auto-discovery.

This leads to the fundamental question: Should we require privilege to
request the loading of filesystem modules?

I have looked at GRKERNSEC_MODHARDEN to see if that could give me some
guidance. Unfortunately GRKERNSEC_MODHARDEN takes the position that fs
kernel modules are the only kernel modules that should ever auto-load
and only in very specific situations. So I can't see that being the
normal kernel policy, especially since there are the sysctls
/proc/sys/kernel/modprobe and /proc/sys/kernel/modules_disabled.

Overall the basic policy building blocks for controlling which modules
are loaded seem solid and in use. So I don't see any particular reason
why the kernel's default policy should not be to allow any user's actions
to request modules.

So I think I am going to scrap the change sitting in my development tree
to require capable(CAP_SYS_ADMIN) to load a filesystem module and just
go with my request_module("fs-%.*s",...); change. That is simple and
seems to match the rest of the kernel.
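
As a rough illustration of the request_module("fs-%.*s", ...) approach, the
following user-space sketch builds the alias string the same way
get_fs_type() computes the name length (everything before the first '.').
The function name fs_module_alias() and the buffer handling are assumptions
made for this example; it is not kernel code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the module alias that would be requested for a mount type string.
 * Only the part before the first '.' is used, mirroring the
 * "dot ? dot - name : strlen(name)" length computation in get_fs_type().
 * Returns 0 on success, -1 if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
int fs_module_alias(const char *name, char *buf, size_t buflen)
{
    const char *dot = strchr(name, '.');
    int len = dot ? (int)(dot - name) : (int)strlen(name);
    int n = snprintf(buf, buflen, "fs-%.*s", len, name);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this, a type string such as "ext4" yields "fs-ext4", and a type with a
subtype suffix such as "fuse.sshfs" (an illustrative name) yields "fs-fuse",
so a user-supplied string can only ever select something published under the
fs- prefix.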

Does anyone see a reason why we should need CAP_SYS_ADMIN or be in the
initial user namespace to trigger a request of filesystem modules?

Eric

2013-03-04 02:35:31

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Sun, Mar 3, 2013 at 1:58 PM, Eric W. Biederman <[email protected]> wrote:
> Kees Cook <[email protected]> writes:
>
>> On Sat, Mar 2, 2013 at 8:12 PM, Eric W. Biederman <[email protected]> wrote:
>>> "Serge E. Hallyn" <[email protected]> writes:
>>>
>>>> Quoting Kees Cook ([email protected]):
>>>>> The rearranging done for user ns has resulted in allowing arbitrary
>>>>> kernel module loading[1] (i.e. re-introducing a form of CVE-2011-1019)
>>>>> by what is assumed to be an unprivileged process.
>>>>>
>>>>> At present, it does look to require at least CAP_SETUID along the way
>>>>> to set up the uidmap (but things like the setuid helper newuidmap
>>>>> might soon start providing such a thing by default).
>>>
>>> CAP_SETUID is not needed.
>>
>> Do you have an example? I wasn't able to gain capabilities within the
>> userns until I had a uid map set that allowed my uid to map to root.
>> (I needed CAP_SYS_ADMIN in the mnt ns to get past may_mount() before I
>> could touch get_fs_type().)
>
> Likely because exec drops capabilities if your uid isn't 0.
>
> Without CAP_SETUID you can still map uid 0 to your current uid.
>
> The following shell script is an easy basis for testing and playing
> around with things. It does assume a recent version of unshare from util-linux.
>
> #!/bin/sh
> export IFIFO=/tmp/userns-test-$$-in
> export OFIFO=/tmp/userns-test-$$-out
> rm -f $IFIFO $OFIFO
> mkfifo $IFIFO
> mkfifo $OFIFO
> unshare --user -- /bin/bash -s <<'EOF' &
> echo waiting-for-uid-and-gid-maps > $OFIFO
> read LINE < $IFIFO
> exec unshare --mount --net -- /bin/sh -s <<'EOF2'
> mount --bind $HOME /root/
> mount --bind /dev/null /dev/log
> # Start a shell to keep the namespace reference alive
> $SHELL -i < /dev/tty > /dev/tty 2> /dev/tty
> EOF2
> EOF
> child=$!
> read LINE < $OFIFO
> uid=$(id --user)
> gid=$(id --group)
> echo "0 $uid 1" > /proc/$child/uid_map
> echo "0 $gid 1" > /proc/$child/gid_map
> echo uid-and-gid-maps > $IFIFO
> wait $child

Ah-ha, thanks! Yes, that worked great. I think map_write()'s
cap_valid/ns_capable calls confused me. :)
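
For reference, the two echo lines in the script above each write a
single-extent map of the form "<inside-id> <outside-id> <count>". A minimal
user-space sketch of formatting that line and the /proc path might look like
the following; the helper names are hypothetical and the sketch only builds
strings, it does not write to /proc.

```c
#include <stdio.h>

/*
 * Format one id-map line, e.g. "0 1000 1\n", as the script's echo does.
 * Returns the number of characters written, or -1 if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
int format_id_map_line(char *buf, size_t buflen,
                       unsigned inside, unsigned outside, unsigned count)
{
    int n = snprintf(buf, buflen, "%u %u %u\n", inside, outside, count);

    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/*
 * Build "/proc/<pid>/<which>", where which is "uid_map" or "gid_map".
 * Returns 0 on success, -1 if buf is too small.
 */
int id_map_path(char *buf, size_t buflen, long pid, const char *which)
{
    int n = snprintf(buf, buflen, "/proc/%ld/%s", pid, which);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Mapping inside id 0 to the caller's own uid with a count of 1 is the case
Eric describes as not needing CAP_SETUID.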

>>> At a practical level I don't see adding a capable(CAP_SYS_ADMIN) check
>>> as having much effect for the functionality currently present in user
>>> namespaces today as the filesystems that are legal to mount in a user
>>> namespace (ramfs, tmpfs, mqueuefs, sysfs, proc, devpts) are so common
>>> most of them can not even be built as modules and even if they are
>>> modules the modules will already be loaded. So I will see about adding
>>> a capable(CAP_SYS_ADMIN) check to shore things up for the short term.
>>
>> For the short-term, yes, this solves the "regression". But it
>> obviously isn't what is desired for userns in the long term.
>
>
>>> So I am wondering what a good policy should be. Should we trust
>>> kernel modules to not be buggy (especially if they were signed as part
>>> of the build process)? Do we add some defense in depth and add
>>> filesystem registration magic? Thinking...
>>
>> If we can produce a mechanism that provides some defensive design, we
>> should do it. There will always be bugs, so we should always try to
>> make them harder to get to or harder to exploit. Reducing the
>> available attack surface to "just filesystems" would be a win here,
>> and it would be consistent with the _intent_ of existing kernel code.
>
> Agreed.
>
>>> We can limit the request_module in get_fs_type to just filesystems
>>> fairly easily.
>>>
>>> In include/linux/fs.h:
>>>
>>> #define MODULE_ALIAS_FS(type) MODULE_ALIAS("fs-" __stringify(type))
>>>
>>> In fs/filesystems.c:
>>>
>>> if (request_module("fs-%.*s", len, name) == 0)
>>>
>>> Then just add the appropriate MODULE_ALIAS_FS lines in all of the
>>> filesystems. This also allows user space to say set the module loading
>>> policy for filesystems using the blacklist and the alias keywords
>>> in /etc/modprobe.d/*.conf.
>>
>> This was the solution for netdev. The backward compat situation is this:
>>
>>         if (no_module && capable(CAP_NET_ADMIN))
>>                 no_module = request_module("netdev-%s", name);
>>         if (no_module && capable(CAP_SYS_MODULE)) {
>>                 if (!request_module("%s", name))
>>                         pr_warn("Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%s instead.\n",
>>                                 name);
>>         }
>
> If I have read it right the backwards compat case is there to be
> compatible with the configuration of existing users who had created
> specific aliases like eth0 for specific network devices in
> /etc/modprobe.d/*.conf, where the new netdev-* would break userspace.
>
> I don't believe anyone does anything like that for filesystems, and even
> if they tried it wouldn't work because the registered filesystem type
> would have a different name. So I don't see needing to worry about that
> kind of backwards compatibility for filesystems.

Yeah, on re-reading the thread that introduced the backward compat
mode, I agree. There is no need for this with filesystems AFAICT.
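
To make the netdev fallback quoted above concrete, here is a small
user-space model of its two steps: try the prefixed alias when the caller
has CAP_NET_ADMIN, then fall back to the bare name only with CAP_SYS_MODULE.
All names prefixed with sketch_ are hypothetical stand-ins, and the
requester callback replaces request_module() purely for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for request_module(): returns 0 when the name can be loaded. */
typedef int (*sketch_request_fn)(const char *modname, const void *ctx);

struct sketch_load_result {
    int loaded;       /* 1 if some request succeeded */
    int used_legacy;  /* 1 if the unprefixed name was what worked */
};

struct sketch_load_result
sketch_load_netdev(const char *name, int has_net_admin, int has_sys_module,
                   sketch_request_fn request, const void *ctx)
{
    struct sketch_load_result res = { 0, 0 };
    char alias[64];
    int no_module = 1;

    /* Step 1: the namespaced alias, gated on CAP_NET_ADMIN. */
    if (has_net_admin) {
        snprintf(alias, sizeof(alias), "netdev-%s", name);
        no_module = request(alias, ctx) != 0;
    }
    /* Step 2: the legacy bare name, gated on CAP_SYS_MODULE. */
    if (no_module && has_sys_module) {
        if (request(name, ctx) == 0) {
            no_module = 0;
            res.used_legacy = 1; /* the kernel warns about deprecation here */
        }
    }
    res.loaded = !no_module;
    return res;
}

/* Toy requester: succeeds only for the single name passed as ctx. */
int sketch_request_only(const char *modname, const void *ctx)
{
    return strcmp(modname, (const char *)ctx) == 0 ? 0 : -1;
}
```

For filesystems the second step would be dropped, per the conclusion above
that no comparable legacy aliases exist.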

>> Those aren't ns_capable, though, so right now userns can't trigger
>> loading new network drivers via ifconfig, but this seems like a
>> reasonable approach to take.
>
> Yes. I saw those and deliberately didn't make those ns_capable, to be
> on the safe side when I made the rest of the networking ns_capable.
> That and ns_capable is not meaningful as a check for loading modules.
> The only interesting question is can the global root user trigger them
> or can everyone trigger module loading.
>
>>> That seems a whole lot simpler, more powerful and more maintainable than
>>> what little I saw in GRKERNSEC_MODHARDEN to prevent loading of
>>> non-filesystem modules from get_fs_type.
>>>
>>> Eric
>>>
>>> p.s. This is the patch I am looking at pushing to Linus in the near
>>> future.
>>>
>>> diff --git a/fs/filesystems.c b/fs/filesystems.c
>>> index da165f6..5b0644d 100644
>>> --- a/fs/filesystems.c
>>> +++ b/fs/filesystems.c
>>> @@ -273,7 +273,8 @@ struct file_system_type *get_fs_type(const char *name)
>>> int len = dot ? dot - name : strlen(name);
>>>
>>> fs = __get_fs_type(name, len);
>>> - if (!fs && (request_module("%.*s", len, name) == 0))
>>> + if (!fs && capable(CAP_SYS_ADMIN) &&
>>> + (request_module("%.*s", len, name) == 0))
>>> fs = __get_fs_type(name, len);
>>>
>>> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
>>
>> Will this break other users of get_fs_type()? I think it's okay; it
>> looks like all the other users are expected to be privileged.
>
> I read through them all, and I didn't see problems. The most concerning
> one is tomoyo's use of get_fs_type before capabilities are checked in
> do_mount. But that is neither here nor there.
>
>> Should this look like netdev in the future? Something like this, after
>> making may_mount() available to fs/filesystems.c:
>>
>> - if (!fs && (request_module("%.*s", len, name) == 0))
>> + if (!fs && ((may_mount() && request_module("fs-%.*s", len, name) == 0) ||
>> +     (capable(CAP_SYS_ADMIN) && request_module("%.*s", len, name) == 0)))
>>
>> and adding all the filesystem module aliases.
>>
>> At some point maybe we can add a flag to kill the old unqualified
>> paths available to CAP_SYS_ADMIN for netdev and mount...
>
> For netdev at least I don't see it as being particularly interesting.
> The case for tunnels is already as unprivileged as you can reasonably
> get with "request_module("rtnl-link-%s", kind);" in
> net/core/rtnetlink.c:rtnl_newlink(). For real physical devices there is
> both a greater chance of a buggy module and no real need as udev will
> load the module based on hardware auto-discovery.
>
> This leads to the fundamental question: Should we require privilege to
> request the loading of filesystem modules?

I think that the past dictated the need for privilege due to it being
tied to mounting. With userns, this is weakened, but it seems like the
privilege should at least attempt to segregate caps to certain classes
of modules.

> I have looked at GRKERNSEC_MODHARDEN to see if that could give me some
> guidance. Unfortunately GRKERNSEC_MODHARDEN takes the position that fs
> kernel modules are the only kernel modules that should ever auto-load
> and only in very specific situations. So I can't see that being the
> normal kernel policy, especially since there are the sysctls

Right -- modharden's basic goal is to block all non-root autoloading.
It had to work around some corner-cases (mount being setuid, etc).

> /proc/sys/kernel/modprobe and /proc/sys/kernel/modules_disabled.

Right -- this gives the granularity of "autoloading" and "loading"
respectively, but there is no concept of "only privileged autoloading"
in the current kernel.

It seems to me that unpriv users shouldn't be able to arbitrarily load
kernel modules, but if userns continues in this direction, there will
always be a path to doing autoloading for each different subsystem's
modules, ultimately leading to unpriv loading. Still, I think it's
worth creating obvious subsystem aliases so userspace can more easily
blacklist/whitelist things.

> Overall the basic policy building blocks for controlling which modules
> are loaded seem solid and in use. So I don't see any particular reason
> why the kernel's default policy should not be to allow any user's actions
> to request modules.
>
> So I think I am going to scrap the change sitting in my development tree
> to require capable(CAP_SYS_ADMIN) to load a filesystem module and just
> go with my request_module("fs-%.*s",...); change. That is simple and
> seems to match the rest of the kernel.

Agreed.

> Does anyone see a reason why we should need CAP_SYS_ADMIN or be in the
> initial user namespace to trigger a request of filesystem modules?
>
> Eric

-Kees

--
Kees Cook
Chrome OS Security

2013-03-04 03:55:00

by Eric W. Biederman

Subject: Re: user ns: arbitrary module loading

Kees Cook <[email protected]> writes:

> On Sun, Mar 3, 2013 at 1:58 PM, Eric W. Biederman <[email protected]> wrote:

> Ah-ha, thanks! Yes, that worked great. I think map_write()'s
> cap_valid/ns_capable calls confused me. :)

Yes permissions across user namespaces can be a little weird. But
mostly if you are their creator, you are their all powerful god and
they must watch out.

>> For netdev at least I don't see it as being particularly interesting.
>> The case for tunnels is already as unprivileged as you can reasonably
>> get with "request_module("rtnl-link-%s", kind);" in
>> net/core/rtnetlink.c:rtnl_newlink(). For real physical devices there is
>> both a greater chance of a buggy module and no real need as udev will
>> load the module based on hardware auto-discovery.
>>
>> This leads to the fundamental question: Should we require privilege to
>> request the loading of filesystem modules?
>
> I think that the past dictated the need for privilege due to it being
> tied to mounting. With userns, this is weakened, but it seems like the
> privilege should at least attempt to segregate caps to certain classes
> of modules.

With filesystems in particular the attack surface is practically
non-existent if the filesystem is not mounted, and FS_USERNS_MOUNT
prevents most filesystems from being mounted. So right now I do not see
any real danger in allowing filesystem modules to be auto-loaded.
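
The FS_USERNS_MOUNT gate mentioned above can be pictured with a small
user-space model. This is a sketch under stated assumptions: the struct,
the function name, and the numeric flag value are illustrative stand-ins,
and the caller is assumed to have already passed the usual mount capability
check in its own namespace.

```c
#include <stdbool.h>

/* Illustrative flag bit; stands in for the kernel's FS_USERNS_MOUNT. */
#define SKETCH_FS_USERNS_MOUNT 0x8

/* Minimal stand-in for struct file_system_type. */
struct sketch_fs_type {
    const char *name;
    int fs_flags;
};

/*
 * Decide whether a filesystem type may be mounted, given whether the
 * caller is in the initial user namespace. Outside the initial
 * namespace only types that opt in via the flag are allowed.
 */
bool sketch_may_mount_type(const struct sketch_fs_type *type,
                           bool in_init_user_ns)
{
    if (in_init_user_ns)
        return true;
    return (type->fs_flags & SKETCH_FS_USERNS_MOUNT) != 0;
}
```

Under this model a freshly auto-loaded filesystem module that does not set
the flag still cannot be mounted from a child user namespace, which is the
basis for the argument that loading it exposes little attack surface.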

>> I have looked at GRKERNSEC_MODHARDEN to see if that could give me some
>> guidance. Unfortunately GRKERNSEC_MODHARDEN takes the position that fs
>> kernel modules are the only kernel modules that should ever auto-load
>> and only in very specific situations. So I can't see that being the
>> normal kernel policy, especially since there are the sysctls
>
> Right -- modharden's basic goal is to block all non-root autoloading.
> It had to work around some corner-cases (mount being setuid, etc).

The main goal appears to be to disable module auto-loading and to log
a message.

>> /proc/sys/kernel/modprobe and /proc/sys/kernel/modules_disabled.
>
> Right -- this gives the granularity of "autoloading" and "loading"
> respectively, but there is no concept of "only privileged autoloading"
> in the current kernel.

As of the 3.8 patch when a module is loaded it does:

+#ifdef CONFIG_GRKERNSEC_MODHARDEN
+ {
+ char *p, *p2;
+
+ if (strstr(mod->args, "grsec_modharden_netdev")) {
+ printk(KERN_ALERT "grsec: denied auto-loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-%.64s instead.", mod->name);
+ err = -EPERM;
+ goto free_modinfo;
+ } else if ((p = strstr(mod->args, "grsec_modharden_normal"))) {
+ p += sizeof("grsec_modharden_normal") - 1;
+ p2 = strstr(p, "_");
+ if (p2) {
+ *p2 = '\0';
+ printk(KERN_ALERT "grsec: denied kernel module auto-load of %.64s by uid %.9s\n", mod->name, p);
+ *p2 = '_';
+ }
+ err = -EPERM;
+ goto free_modinfo;
+ }
+ }
+#endif

Which simplifies to
if (strstr(mod->args, "grsec_modharden_fs") != 0) {
err = -EPERM;
goto free_modinfo;
}

So perhaps it tries but there is not really a concept of privileged
processes being able to auto-load anything except fs modules.

In general what I see GRKERNSEC_MODHARDEN implementing is a policy of
logging a message instead of auto-loading any kernel modules.

> It seems to me that unpriv users shouldn't be able to arbitrarily load
> kernel modules, but if userns continues in this direction, there will
> always be a path to doing autoloading for each different subsystem's
> modules, ultimately leading to unpriv loading. Still, I think it's
> worth creating obvious subsystem aliases so userspace can more easily
> blacklist/whitelist things.

But I don't see any problem with unprivileged users being able to cause
the kernel to request kernel modules, as long as the request is clear
enough that modprobe can implement the desired policy.
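
As a toy picture of what "clear enough" buys, the sketch below resolves a
request against a small alias table and refuses anything that does not carry
the fs- prefix. The table contents and function name are assumptions for
illustration; this is not how modprobe is implemented, only a model of the
policy it can apply once requests are namespaced.

```c
#include <stddef.h>
#include <string.h>

struct sketch_fs_alias {
    const char *alias;   /* what the kernel would request */
    const char *module;  /* module that provides it */
};

/* Example entries; autofs is provided by the autofs4 module. */
static const struct sketch_fs_alias sketch_aliases[] = {
    { "fs-autofs", "autofs4" },
    { "fs-ext4",   "ext4"    },
    { "fs-9p",     "9p"      },
};

/*
 * Return the module name for a filesystem request, or NULL when the
 * request is not an fs- alias or no entry matches.
 */
const char *sketch_resolve_fs_alias(const char *request)
{
    size_t i;

    if (strncmp(request, "fs-", 3) != 0)
        return NULL; /* not a filesystem request at all */

    for (i = 0; i < sizeof(sketch_aliases) / sizeof(sketch_aliases[0]); i++) {
        if (strcmp(request, sketch_aliases[i].alias) == 0)
            return sketch_aliases[i].module;
    }
    return NULL;
}
```

A request for a bare module name can never match here, so a mount type
string cannot be used to pull in a non-filesystem module.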

>> Overall the basic policy building blocks for controlling which modules
>> are loaded seem solid and in use. So I don't see any particular reason
>> why the kernel's default policy should not be to allow any user's actions
>> to request modules.
>>
>> So I think I am going to scrap the change sitting in my development tree
>> to require capable(CAP_SYS_ADMIN) to load a filesystem module and just
>> go with my request_module("fs-%.*s",...); change. That is simple and
>> seems to match the rest of the kernel.
>
> Agreed.

Eric

2013-03-04 07:49:08

by Eric W. Biederman

Subject: [PATCH 0/2] userns bug fixes for v3.9-rc2 for review


Barring problems these are the changes I intend to put in linux-next and
then send to Linus for v3.9-rc2.

The first is a trivial oops fix.
The second reworks how mount -t triggers module loading to make it
harder to abuse.

Eric W. Biederman (2):
userns: Stop oopsing in key_change_session_keyring
fs: Limit sys_mount to only request filesystem modules.

arch/ia64/kernel/perfmon.c | 1 +
arch/powerpc/platforms/cell/spufs/inode.c | 1 +
arch/s390/hypfs/inode.c | 1 +
drivers/firmware/efivars.c | 1 +
drivers/infiniband/hw/ipath/ipath_fs.c | 1 +
drivers/infiniband/hw/qib/qib_fs.c | 1 +
drivers/misc/ibmasm/ibmasmfs.c | 1 +
drivers/mtd/mtdchar.c | 1 +
drivers/oprofile/oprofilefs.c | 1 +
drivers/staging/ccg/f_fs.c | 1 +
drivers/usb/gadget/f_fs.c | 1 +
drivers/usb/gadget/inode.c | 1 +
drivers/xen/xenfs/super.c | 1 +
fs/9p/vfs_super.c | 1 +
fs/adfs/super.c | 1 +
fs/affs/super.c | 1 +
fs/afs/super.c | 1 +
fs/autofs4/init.c | 1 +
fs/befs/linuxvfs.c | 1 +
fs/bfs/inode.c | 1 +
fs/binfmt_misc.c | 1 +
fs/btrfs/super.c | 1 +
fs/ceph/super.c | 1 +
fs/coda/inode.c | 1 +
fs/configfs/mount.c | 1 +
fs/cramfs/inode.c | 1 +
fs/debugfs/inode.c | 1 +
fs/devpts/inode.c | 1 +
fs/ecryptfs/main.c | 1 +
fs/efs/super.c | 1 +
fs/exofs/super.c | 1 +
fs/ext2/super.c | 1 +
fs/ext3/super.c | 1 +
fs/ext4/super.c | 5 +++--
fs/f2fs/super.c | 1 +
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
fs/filesystems.c | 2 +-
fs/freevxfs/vxfs_super.c | 2 +-
fs/fuse/control.c | 1 +
fs/fuse/inode.c | 2 ++
fs/gfs2/ops_fstype.c | 4 +++-
fs/hfs/super.c | 1 +
fs/hfsplus/super.c | 1 +
fs/hppfs/hppfs.c | 1 +
fs/hugetlbfs/inode.c | 1 +
fs/isofs/inode.c | 3 +--
fs/jffs2/super.c | 1 +
fs/jfs/super.c | 1 +
fs/logfs/super.c | 1 +
fs/minix/inode.c | 1 +
fs/ncpfs/inode.c | 1 +
fs/nfs/super.c | 3 ++-
fs/nfsd/nfsctl.c | 1 +
fs/nilfs2/super.c | 1 +
fs/ntfs/super.c | 1 +
fs/ocfs2/dlmfs/dlmfs.c | 1 +
fs/omfs/inode.c | 1 +
fs/openpromfs/inode.c | 1 +
fs/qnx4/inode.c | 1 +
fs/qnx6/inode.c | 1 +
fs/reiserfs/super.c | 1 +
fs/romfs/super.c | 1 +
fs/sysv/super.c | 3 ++-
fs/ubifs/super.c | 1 +
fs/ufs/super.c | 1 +
fs/xfs/xfs_super.c | 1 +
include/linux/fs.h | 2 ++
net/sunrpc/rpc_pipe.c | 4 +---
security/keys/process_keys.c | 2 +-
70 files changed, 78 insertions(+), 13 deletions(-)

2013-03-04 07:50:24

by Eric W. Biederman

Subject: [PATCH 1/2] userns: Stop oopsing in key_change_session_keyring


Dave Jones <[email protected]> writes:
> Just hit this on Linus' current tree.
>
> [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
> [ 89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
> [ 89.624901] Oops: 0000 [#1] PREEMPT SMP
> [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
> [ 89.637846] CPU 2
> [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> [ 89.639850] RIP: 0010:[<ffffffff810784b0>] [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207
> [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
> [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
> [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
> [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
> [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
> [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
> [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
> [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
> [ 89.654128] Stack:
> [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
> [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
> [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
> [ 89.658399] Call Trace:
> [ 89.658822] [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140
> [ 89.659845] [<ffffffff8106c665>] task_work_run+0xa5/0xd0
> [ 89.660698] [<ffffffff81002911>] do_notify_resume+0x71/0xb0
> [ 89.661581] [<ffffffff816c9a4a>] int_signal+0x12/0x17
> [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
> [ 89.667778] RIP [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.668733] RSP <ffff880115657eb8>
> [ 89.669301] CR2: 00000000000000c8
>
> My fastest trinity induced oops yet!
>
>
> Appears to be..
>
> if ((set_ns == subset_ns->parent) &&
> 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx
>
> from the inlined cred_cap_issubset

By historical accident we have been trying to set new->user_ns
from new->user_ns. Which is totally silly as new->user_ns is NULL (as
is every other field in new except session_keyring at that point).

The intent is clearly to copy all of the fields from old to new so copy
old->user_ns into new->user_ns.

Cc: [email protected]
Reported-by: Dave Jones <[email protected]>
Tested-by: Dave Jones <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
---
security/keys/process_keys.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index 58dfe08..a571fad 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -839,7 +839,7 @@ void key_change_session_keyring(struct callback_head *twork)
new-> sgid = old-> sgid;
new->fsgid = old->fsgid;
new->user = get_uid(old->user);
- new->user_ns = get_user_ns(new->user_ns);
+ new->user_ns = get_user_ns(old->user_ns);
new->group_info = get_group_info(old->group_info);

new->securebits = old->securebits;
--
1.7.5.4
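
The one-line fix above can be modelled in user space to show why the old
code left new->user_ns NULL. This is only a sketch: the sketch_ types and
functions are hypothetical stand-ins for struct cred, struct user_namespace
and get_user_ns(), and the reference count is a plain int.

```c
#include <stddef.h>

struct sketch_user_ns {
    int refcount;
};

struct sketch_cred {
    unsigned uid;
    struct sketch_user_ns *user_ns;
};

/* Take a reference if ns is non-NULL and hand the pointer back. */
static struct sketch_user_ns *sketch_get_user_ns(struct sketch_user_ns *ns)
{
    if (ns)
        ns->refcount++;
    return ns;
}

/* Buggy variant: reads the freshly allocated (still NULL) field of new. */
void sketch_copy_cred_buggy(struct sketch_cred *new_cred,
                            const struct sketch_cred *old_cred)
{
    new_cred->uid = old_cred->uid;
    new_cred->user_ns = sketch_get_user_ns(new_cred->user_ns);
}

/* Fixed variant: copies the namespace from old, taking a reference. */
void sketch_copy_cred(struct sketch_cred *new_cred,
                      const struct sketch_cred *old_cred)
{
    new_cred->uid = old_cred->uid;
    new_cred->user_ns = sketch_get_user_ns(old_cred->user_ns);
}
```

In the buggy variant the NULL simply propagates, which is consistent with
the oops showing up later in commit_creds() rather than at the copy itself.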

2013-03-04 07:51:13

by Eric W. Biederman

Subject: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.


Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.

A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.

Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems trivially
making things safer with no real cost.

Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives, allowing simple, safe,
well understood work-arounds to known problematic software.

This also addresses a rare but unfortunate problem where the filesystem
name is not the same as its module name and module auto-loading
would not work. While writing this patch I saw a handful of such
cases. The most significant being autofs that lives in the module
autofs4.

This is relevant to user namespaces because we can reach the
request_module() call in get_fs_type() without having any special
permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.

After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module. The common pattern in the kernel is to call request_module()
without regard to the user's permissions. In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep,
which means there is not much attack surface exposed by loading a
filesystem module unless the filesystem is mounted. In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.

Acked-by: Serge Hallyn <[email protected]>
Acked-by: Kees Cook <[email protected]>
Reported-by: Kees Cook <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
---
arch/ia64/kernel/perfmon.c | 1 +
arch/powerpc/platforms/cell/spufs/inode.c | 1 +
arch/s390/hypfs/inode.c | 1 +
drivers/firmware/efivars.c | 1 +
drivers/infiniband/hw/ipath/ipath_fs.c | 1 +
drivers/infiniband/hw/qib/qib_fs.c | 1 +
drivers/misc/ibmasm/ibmasmfs.c | 1 +
drivers/mtd/mtdchar.c | 1 +
drivers/oprofile/oprofilefs.c | 1 +
drivers/staging/ccg/f_fs.c | 1 +
drivers/usb/gadget/f_fs.c | 1 +
drivers/usb/gadget/inode.c | 1 +
drivers/xen/xenfs/super.c | 1 +
fs/9p/vfs_super.c | 1 +
fs/adfs/super.c | 1 +
fs/affs/super.c | 1 +
fs/afs/super.c | 1 +
fs/autofs4/init.c | 1 +
fs/befs/linuxvfs.c | 1 +
fs/bfs/inode.c | 1 +
fs/binfmt_misc.c | 1 +
fs/btrfs/super.c | 1 +
fs/ceph/super.c | 1 +
fs/coda/inode.c | 1 +
fs/configfs/mount.c | 1 +
fs/cramfs/inode.c | 1 +
fs/debugfs/inode.c | 1 +
fs/devpts/inode.c | 1 +
fs/ecryptfs/main.c | 1 +
fs/efs/super.c | 1 +
fs/exofs/super.c | 1 +
fs/ext2/super.c | 1 +
fs/ext3/super.c | 1 +
fs/ext4/super.c | 5 +++--
fs/f2fs/super.c | 1 +
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
fs/filesystems.c | 2 +-
fs/freevxfs/vxfs_super.c | 2 +-
fs/fuse/control.c | 1 +
fs/fuse/inode.c | 2 ++
fs/gfs2/ops_fstype.c | 4 +++-
fs/hfs/super.c | 1 +
fs/hfsplus/super.c | 1 +
fs/hppfs/hppfs.c | 1 +
fs/hugetlbfs/inode.c | 1 +
fs/isofs/inode.c | 3 +--
fs/jffs2/super.c | 1 +
fs/jfs/super.c | 1 +
fs/logfs/super.c | 1 +
fs/minix/inode.c | 1 +
fs/ncpfs/inode.c | 1 +
fs/nfs/super.c | 3 ++-
fs/nfsd/nfsctl.c | 1 +
fs/nilfs2/super.c | 1 +
fs/ntfs/super.c | 1 +
fs/ocfs2/dlmfs/dlmfs.c | 1 +
fs/omfs/inode.c | 1 +
fs/openpromfs/inode.c | 1 +
fs/qnx4/inode.c | 1 +
fs/qnx6/inode.c | 1 +
fs/reiserfs/super.c | 1 +
fs/romfs/super.c | 1 +
fs/sysv/super.c | 3 ++-
fs/ubifs/super.c | 1 +
fs/ufs/super.c | 1 +
fs/xfs/xfs_super.c | 1 +
include/linux/fs.h | 2 ++
net/sunrpc/rpc_pipe.c | 4 +---
69 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 433f5e8..2eda284 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -619,6 +619,7 @@ static struct file_system_type pfm_fs_type = {
.mount = pfmfs_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("pfmfs");

DEFINE_PER_CPU(unsigned long, pfm_syst_info);
DEFINE_PER_CPU(struct task_struct *, pmu_owner);
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 863184b..3f3bb4c 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -749,6 +749,7 @@ static struct file_system_type spufs_type = {
.mount = spufs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("spufs");

static int __init spufs_init(void)
{
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index 8538015..5f7d7ba 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -456,6 +456,7 @@ static struct file_system_type hypfs_type = {
.mount = hypfs_mount,
.kill_sb = hypfs_kill_super
};
+MODULE_ALIAS_FS("s390_hypfs");

static const struct super_operations hypfs_s_ops = {
.statfs = simple_statfs,
diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index 7320bf8..3edade0 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -1234,6 +1234,7 @@ static struct file_system_type efivarfs_type = {
.mount = efivarfs_mount,
.kill_sb = efivarfs_kill_sb,
};
+MODULE_ALIAS_FS("efivarfs");

/*
* Handle negative dentry.
diff --git a/drivers/infiniband/hw/ipath/ipath_fs.c b/drivers/infiniband/hw/ipath/ipath_fs.c
index a479375..e0c404b 100644
--- a/drivers/infiniband/hw/ipath/ipath_fs.c
+++ b/drivers/infiniband/hw/ipath/ipath_fs.c
@@ -410,6 +410,7 @@ static struct file_system_type ipathfs_fs_type = {
.mount = ipathfs_mount,
.kill_sb = ipathfs_kill_super,
};
+MODULE_ALIAS_FS("ipathfs");

int __init ipath_init_ipathfs(void)
{
diff --git a/drivers/infiniband/hw/qib/qib_fs.c b/drivers/infiniband/hw/qib/qib_fs.c
index 644bd6f..f247fc6e 100644
--- a/drivers/infiniband/hw/qib/qib_fs.c
+++ b/drivers/infiniband/hw/qib/qib_fs.c
@@ -604,6 +604,7 @@ static struct file_system_type qibfs_fs_type = {
.mount = qibfs_mount,
.kill_sb = qibfs_kill_super,
};
+MODULE_ALIAS_FS("ipathfs");

int __init qib_init_qibfs(void)
{
diff --git a/drivers/misc/ibmasm/ibmasmfs.c b/drivers/misc/ibmasm/ibmasmfs.c
index 6673e57..ce5b756 100644
--- a/drivers/misc/ibmasm/ibmasmfs.c
+++ b/drivers/misc/ibmasm/ibmasmfs.c
@@ -110,6 +110,7 @@ static struct file_system_type ibmasmfs_type = {
.mount = ibmasmfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("ibmasmfs");

static int ibmasmfs_fill_super (struct super_block *sb, void *data, int silent)
{
diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index 82c0616..92ab30a 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -1238,6 +1238,7 @@ static struct file_system_type mtd_inodefs_type = {
.mount = mtd_inodefs_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("mtd_inodefs");

static int __init init_mtdchar(void)
{
diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c
index 445ffda..7c12d9c 100644
--- a/drivers/oprofile/oprofilefs.c
+++ b/drivers/oprofile/oprofilefs.c
@@ -276,6 +276,7 @@ static struct file_system_type oprofilefs_type = {
.mount = oprofilefs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("oprofilefs");


int __init oprofilefs_register(void)
diff --git a/drivers/staging/ccg/f_fs.c b/drivers/staging/ccg/f_fs.c
index 8adc79d..f6373da 100644
--- a/drivers/staging/ccg/f_fs.c
+++ b/drivers/staging/ccg/f_fs.c
@@ -1223,6 +1223,7 @@ static struct file_system_type ffs_fs_type = {
.mount = ffs_fs_mount,
.kill_sb = ffs_fs_kill_sb,
};
+MODULE_ALIAS_FS("functionfs");


/* Driver's main init/cleanup functions *************************************/
diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
index 38388d7..c377ff8 100644
--- a/drivers/usb/gadget/f_fs.c
+++ b/drivers/usb/gadget/f_fs.c
@@ -1235,6 +1235,7 @@ static struct file_system_type ffs_fs_type = {
.mount = ffs_fs_mount,
.kill_sb = ffs_fs_kill_sb,
};
+MODULE_ALIAS_FS("functionfs");


/* Driver's main init/cleanup functions *************************************/
diff --git a/drivers/usb/gadget/inode.c b/drivers/usb/gadget/inode.c
index 8ac840f..e2b2e9c 100644
--- a/drivers/usb/gadget/inode.c
+++ b/drivers/usb/gadget/inode.c
@@ -2105,6 +2105,7 @@ static struct file_system_type gadgetfs_type = {
.mount = gadgetfs_mount,
.kill_sb = gadgetfs_kill_sb,
};
+MODULE_ALIAS_FS("gadgetfs");

/*----------------------------------------------------------------------*/

diff --git a/drivers/xen/xenfs/super.c b/drivers/xen/xenfs/super.c
index ec0abb6..7167987 100644
--- a/drivers/xen/xenfs/super.c
+++ b/drivers/xen/xenfs/super.c
@@ -75,6 +75,7 @@ static struct file_system_type xenfs_type = {
.mount = xenfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("xenfs");

static int __init xenfs_init(void)
{
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index 91dad63..2756dcd 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -365,3 +365,4 @@ struct file_system_type v9fs_fs_type = {
.owner = THIS_MODULE,
.fs_flags = FS_RENAME_DOES_D_MOVE,
};
+MODULE_ALIAS_FS("9p");
diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index d571229..0ff4bae 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -524,6 +524,7 @@ static struct file_system_type adfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("adfs");

static int __init init_adfs_fs(void)
{
diff --git a/fs/affs/super.c b/fs/affs/super.c
index b84dc73..45161a8 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -622,6 +622,7 @@ static struct file_system_type affs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("affs");

static int __init init_affs_fs(void)
{
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 7c31ec3..c486155 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -45,6 +45,7 @@ struct file_system_type afs_fs_type = {
.kill_sb = afs_kill_super,
.fs_flags = 0,
};
+MODULE_ALIAS_FS("afs");

static const struct super_operations afs_super_ops = {
.statfs = afs_statfs,
diff --git a/fs/autofs4/init.c b/fs/autofs4/init.c
index cddc74b..b3db517 100644
--- a/fs/autofs4/init.c
+++ b/fs/autofs4/init.c
@@ -26,6 +26,7 @@ static struct file_system_type autofs_fs_type = {
.mount = autofs_mount,
.kill_sb = autofs4_kill_sb,
};
+MODULE_ALIAS_FS("autofs");

static int __init init_autofs4_fs(void)
{
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index c8f4e25..8615ee8 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -951,6 +951,7 @@ static struct file_system_type befs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("befs");

static int __init
init_befs_fs(void)
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index 737aaa3..5e376bb 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -473,6 +473,7 @@ static struct file_system_type bfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("bfs");

static int __init init_bfs_fs(void)
{
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index fecbbf3..751df5e 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -720,6 +720,7 @@ static struct file_system_type bm_fs_type = {
.mount = bm_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("binfmt_misc");

static int __init init_misc_binfmt(void)
{
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 68a29a1..f6b8859 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1558,6 +1558,7 @@ static struct file_system_type btrfs_fs_type = {
.kill_sb = btrfs_kill_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("btrfs");

/*
* used by btrfsctl to scan devices when no FS is mounted
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 9fe17c6c..6ddc0bc 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -952,6 +952,7 @@ static struct file_system_type ceph_fs_type = {
.kill_sb = ceph_kill_sb,
.fs_flags = FS_RENAME_DOES_D_MOVE,
};
+MODULE_ALIAS_FS("ceph");

#define _STRINGIFY(x) #x
#define STRINGIFY(x) _STRINGIFY(x)
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index dada9d0..4dcc0d8 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -329,4 +329,5 @@ struct file_system_type coda_fs_type = {
.kill_sb = kill_anon_super,
.fs_flags = FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("coda");

diff --git a/fs/configfs/mount.c b/fs/configfs/mount.c
index aee0a7e..7f26c3c 100644
--- a/fs/configfs/mount.c
+++ b/fs/configfs/mount.c
@@ -114,6 +114,7 @@ static struct file_system_type configfs_fs_type = {
.mount = configfs_do_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("configfs");

struct dentry *configfs_pin_fs(void)
{
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 3ceb9ec..35b1c7b 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -573,6 +573,7 @@ static struct file_system_type cramfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("cramfs");

static int __init init_cramfs_fs(void)
{
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 0c4f80b..4888cb3 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -299,6 +299,7 @@ static struct file_system_type debug_fs_type = {
.mount = debug_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("debugfs");

static struct dentry *__create_file(const char *name, umode_t mode,
struct dentry *parent, void *data,
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 073d30b..79b6629 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -510,6 +510,7 @@ static struct file_system_type devpts_fs_type = {
.fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
#endif
};
+MODULE_ALIAS_FS("devpts");

/*
* The normal naming convention is simply /dev/pts/<number>; this conforms
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 4e0886c..e924cf4 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -629,6 +629,7 @@ static struct file_system_type ecryptfs_fs_type = {
.kill_sb = ecryptfs_kill_block_super,
.fs_flags = 0
};
+MODULE_ALIAS_FS("ecryptfs");

/**
* inode_info_init_once
diff --git a/fs/efs/super.c b/fs/efs/super.c
index 2002431..c6f57a7 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -33,6 +33,7 @@ static struct file_system_type efs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("efs");

static struct pt_types sgi_pt_types[] = {
{0x00, "SGI vh"},
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 5e59280..9d97633 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -1010,6 +1010,7 @@ static struct file_system_type exofs_type = {
.mount = exofs_mount,
.kill_sb = generic_shutdown_super,
};
+MODULE_ALIAS_FS("exofs");

static int __init init_exofs(void)
{
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 7f68c81..2885349 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -1536,6 +1536,7 @@ static struct file_system_type ext2_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext2");

static int __init init_ext2_fs(void)
{
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 5546ca2..1d6e2ed 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -3068,6 +3068,7 @@ static struct file_system_type ext3_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext3");

static int __init init_ext3_fs(void)
{
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5e6c878..34e8552 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -90,6 +90,7 @@ static struct file_system_type ext2_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext2");
#define IS_EXT2_SB(sb) ((sb)->s_bdev->bd_holder == &ext2_fs_type)
#else
#define IS_EXT2_SB(sb) (0)
@@ -104,6 +105,7 @@ static struct file_system_type ext3_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext3");
#define IS_EXT3_SB(sb) ((sb)->s_bdev->bd_holder == &ext3_fs_type)
#else
#define IS_EXT3_SB(sb) (0)
@@ -5152,7 +5154,6 @@ static inline int ext2_feature_set_ok(struct super_block *sb)
return 0;
return 1;
}
-MODULE_ALIAS("ext2");
#else
static inline void register_as_ext2(void) { }
static inline void unregister_as_ext2(void) { }
@@ -5185,7 +5186,6 @@ static inline int ext3_feature_set_ok(struct super_block *sb)
return 0;
return 1;
}
-MODULE_ALIAS("ext3");
#else
static inline void register_as_ext3(void) { }
static inline void unregister_as_ext3(void) { }
@@ -5199,6 +5199,7 @@ static struct file_system_type ext4_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ext4");

static int __init ext4_init_feat_adverts(void)
{
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 8c11764..fea6e58 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -687,6 +687,7 @@ static struct file_system_type f2fs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("f2fs");

static int __init init_inodecache(void)
{
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index e2cfda9..081b759 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -668,6 +668,7 @@ static struct file_system_type msdos_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("msdos");

static int __init init_msdos_fs(void)
{
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index ac959d6..2da9520 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -1073,6 +1073,7 @@ static struct file_system_type vfat_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("vfat");

static int __init init_vfat_fs(void)
{
diff --git a/fs/filesystems.c b/fs/filesystems.c
index da165f6..92567d9 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -273,7 +273,7 @@ struct file_system_type *get_fs_type(const char *name)
int len = dot ? dot - name : strlen(name);

fs = __get_fs_type(name, len);
- if (!fs && (request_module("%.*s", len, name) == 0))
+ if (!fs && (request_module("fs-%.*s", len, name) == 0))
fs = __get_fs_type(name, len);

if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
index fed2c8a..4550743 100644
--- a/fs/freevxfs/vxfs_super.c
+++ b/fs/freevxfs/vxfs_super.c
@@ -52,7 +52,6 @@ MODULE_AUTHOR("Christoph Hellwig");
MODULE_DESCRIPTION("Veritas Filesystem (VxFS) driver");
MODULE_LICENSE("Dual BSD/GPL");

-MODULE_ALIAS("vxfs"); /* makes mount -t vxfs autoload the module */


static void vxfs_put_super(struct super_block *);
@@ -258,6 +257,7 @@ static struct file_system_type vxfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("vxfs"); /* makes mount -t vxfs autoload the module */

static int __init
vxfs_init(void)
diff --git a/fs/fuse/control.c b/fs/fuse/control.c
index b7978b9f..a0b0855 100644
--- a/fs/fuse/control.c
+++ b/fs/fuse/control.c
@@ -341,6 +341,7 @@ static struct file_system_type fuse_ctl_fs_type = {
.mount = fuse_ctl_mount,
.kill_sb = fuse_ctl_kill_sb,
};
+MODULE_ALIAS_FS("fusectl");

int __init fuse_ctl_init(void)
{
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index df00993..137185c 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1117,6 +1117,7 @@ static struct file_system_type fuse_fs_type = {
.mount = fuse_mount,
.kill_sb = fuse_kill_sb_anon,
};
+MODULE_ALIAS_FS("fuse");

#ifdef CONFIG_BLOCK
static struct dentry *fuse_mount_blk(struct file_system_type *fs_type,
@@ -1146,6 +1147,7 @@ static struct file_system_type fuseblk_fs_type = {
.kill_sb = fuse_kill_sb_blk,
.fs_flags = FS_REQUIRES_DEV | FS_HAS_SUBTYPE,
};
+MODULE_ALIAS_FS("fuseblk");

static inline int register_fuseblk(void)
{
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 1b612be..60ede2a 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -20,6 +20,7 @@
#include <linux/gfs2_ondisk.h>
#include <linux/quotaops.h>
#include <linux/lockdep.h>
+#include <linux/module.h>

#include "gfs2.h"
#include "incore.h"
@@ -1425,6 +1426,7 @@ struct file_system_type gfs2_fs_type = {
.kill_sb = gfs2_kill_sb,
.owner = THIS_MODULE,
};
+MODULE_ALIAS_FS("gfs2");

struct file_system_type gfs2meta_fs_type = {
.name = "gfs2meta",
@@ -1432,4 +1434,4 @@ struct file_system_type gfs2meta_fs_type = {
.mount = gfs2_mount_meta,
.owner = THIS_MODULE,
};
-
+MODULE_ALIAS_FS("gfs2meta");
diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index e93ddaa..bbaaa8a 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -466,6 +466,7 @@ static struct file_system_type hfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("hfs");

static void hfs_init_once(void *p)
{
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index 974c26f..7b87284 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -654,6 +654,7 @@ static struct file_system_type hfsplus_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("hfsplus");

static void hfsplus_init_once(void *p)
{
diff --git a/fs/hppfs/hppfs.c b/fs/hppfs/hppfs.c
index 74f5570..126d3c2 100644
--- a/fs/hppfs/hppfs.c
+++ b/fs/hppfs/hppfs.c
@@ -748,6 +748,7 @@ static struct file_system_type hppfs_type = {
.kill_sb = kill_anon_super,
.fs_flags = 0,
};
+MODULE_ALIAS_FS("hppfs");

static int __init init_hppfs(void)
{
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 7f94e0c..84e3d85 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -896,6 +896,7 @@ static struct file_system_type hugetlbfs_fs_type = {
.mount = hugetlbfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("hugetlbfs");

static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE];

diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index 67ce525..a67f16e 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1556,6 +1556,7 @@ static struct file_system_type iso9660_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("iso9660");

static int __init init_iso9660_fs(void)
{
@@ -1593,5 +1594,3 @@ static void __exit exit_iso9660_fs(void)
module_init(init_iso9660_fs)
module_exit(exit_iso9660_fs)
MODULE_LICENSE("GPL");
-/* Actual filesystem name is iso9660, as requested in filesystems.c */
-MODULE_ALIAS("iso9660");
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index d3d8799..0defb1c 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -356,6 +356,7 @@ static struct file_system_type jffs2_fs_type = {
.mount = jffs2_mount,
.kill_sb = jffs2_kill_sb,
};
+MODULE_ALIAS_FS("jffs2");

static int __init init_jffs2_fs(void)
{
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 060ba63..2003e83 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -833,6 +833,7 @@ static struct file_system_type jfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("jfs");

static void init_once(void *foo)
{
diff --git a/fs/logfs/super.c b/fs/logfs/super.c
index 345c24b..5436029 100644
--- a/fs/logfs/super.c
+++ b/fs/logfs/super.c
@@ -608,6 +608,7 @@ static struct file_system_type logfs_fs_type = {
.fs_flags = FS_REQUIRES_DEV,

};
+MODULE_ALIAS_FS("logfs");

static int __init logfs_init(void)
{
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 99541cc..df12249 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -660,6 +660,7 @@ static struct file_system_type minix_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("minix");

static int __init init_minix_fs(void)
{
diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
index 7dafd6899..26910c8 100644
--- a/fs/ncpfs/inode.c
+++ b/fs/ncpfs/inode.c
@@ -1051,6 +1051,7 @@ static struct file_system_type ncp_fs_type = {
.kill_sb = kill_anon_super,
.fs_flags = FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("ncpfs");

static int __init init_ncp_fs(void)
{
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 17b32b7..95cdcb2 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -294,6 +294,7 @@ struct file_system_type nfs_fs_type = {
.kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("nfs");
EXPORT_SYMBOL_GPL(nfs_fs_type);

struct file_system_type nfs_xdev_fs_type = {
@@ -333,6 +334,7 @@ struct file_system_type nfs4_fs_type = {
.kill_sb = nfs_kill_super,
.fs_flags = FS_RENAME_DOES_D_MOVE|FS_BINARY_MOUNTDATA,
};
+MODULE_ALIAS_FS("nfs4");
EXPORT_SYMBOL_GPL(nfs4_fs_type);

static int __init register_nfs4_fs(void)
@@ -2717,6 +2719,5 @@ module_param(send_implementation_id, ushort, 0644);
MODULE_PARM_DESC(send_implementation_id,
"Send implementation ID with NFSv4.1 exchange_id");
MODULE_PARM_DESC(nfs4_unique_id, "nfs_client_id4 uniquifier string");
-MODULE_ALIAS("nfs4");

#endif /* CONFIG_NFS_V4 */
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 13a21c8..f33455b 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1090,6 +1090,7 @@ static struct file_system_type nfsd_fs_type = {
.mount = nfsd_mount,
.kill_sb = nfsd_umount,
};
+MODULE_ALIAS_FS("nfsd");

#ifdef CONFIG_PROC_FS
static int create_proc_exports_entry(void)
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index 3c991dc..c7d1f9f 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -1361,6 +1361,7 @@ struct file_system_type nilfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("nilfs2");

static void nilfs_inode_init_once(void *obj)
{
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index 4a8289f8..82650d5 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -3079,6 +3079,7 @@ static struct file_system_type ntfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ntfs");

/* Stable names for the slab caches. */
static const char ntfs_index_ctx_cache_name[] = "ntfs_index_ctx_cache";
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index 4c5fc8d..12bafb7 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -640,6 +640,7 @@ static struct file_system_type dlmfs_fs_type = {
.mount = dlmfs_mount,
.kill_sb = kill_litter_super,
};
+MODULE_ALIAS_FS("ocfs2_dlmfs");

static int __init init_dlmfs_fs(void)
{
diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c
index 25d715c..d8b0afd 100644
--- a/fs/omfs/inode.c
+++ b/fs/omfs/inode.c
@@ -572,6 +572,7 @@ static struct file_system_type omfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("omfs");

static int __init init_omfs_fs(void)
{
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index ae47fa7..75885ff 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -432,6 +432,7 @@ static struct file_system_type openprom_fs_type = {
.mount = openprom_mount,
.kill_sb = kill_anon_super,
};
+MODULE_ALIAS_FS("openpromfs");

static void op_inode_init_once(void *data)
{
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index 43098bb..2e8caa6 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -412,6 +412,7 @@ static struct file_system_type qnx4_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("qnx4");

static int __init init_qnx4_fs(void)
{
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 57199a5..8d941ed 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -672,6 +672,7 @@ static struct file_system_type qnx6_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("qnx6");

static int __init init_qnx6_fs(void)
{
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 418bdc3..194113b 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -2434,6 +2434,7 @@ struct file_system_type reiserfs_fs_type = {
.kill_sb = reiserfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("reiserfs");

MODULE_DESCRIPTION("ReiserFS journaled filesystem");
MODULE_AUTHOR("Hans Reiser <[email protected]>");
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index 7e8d3a8..15cbc41 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -599,6 +599,7 @@ static struct file_system_type romfs_fs_type = {
.kill_sb = romfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("romfs");

/*
* inode storage initialiser
diff --git a/fs/sysv/super.c b/fs/sysv/super.c
index a38e87b..a39938b 100644
--- a/fs/sysv/super.c
+++ b/fs/sysv/super.c
@@ -545,6 +545,7 @@ static struct file_system_type sysv_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("sysv");

static struct file_system_type v7_fs_type = {
.owner = THIS_MODULE,
@@ -553,6 +554,7 @@ static struct file_system_type v7_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("v7");

static int __init init_sysv_fs(void)
{
@@ -586,5 +588,4 @@ static void __exit exit_sysv_fs(void)

module_init(init_sysv_fs)
module_exit(exit_sysv_fs)
-MODULE_ALIAS("v7");
MODULE_LICENSE("GPL");
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index ddc0f6a..ac838b8 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2174,6 +2174,7 @@ static struct file_system_type ubifs_fs_type = {
.mount = ubifs_mount,
.kill_sb = kill_ubifs_super,
};
+MODULE_ALIAS_FS("ubifs");

/*
* Inode slab cache constructor.
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index dc8e3a8..329f2f5 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1500,6 +1500,7 @@ static struct file_system_type ufs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("ufs");

static int __init init_ufs_fs(void)
{
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c407121..ea341ce 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1561,6 +1561,7 @@ static struct file_system_type xfs_fs_type = {
.kill_sb = kill_block_super,
.fs_flags = FS_REQUIRES_DEV,
};
+MODULE_ALIAS_FS("xfs");

STATIC int __init
xfs_init_zones(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 74a907b..2c28271 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1825,6 +1825,8 @@ struct file_system_type {
struct lock_class_key i_mutex_dir_key;
};

+#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
+
extern struct dentry *mount_ns(struct file_system_type *fs_type, int flags,
void *data, int (*fill_super)(struct super_block *, void *, int));
extern struct dentry *mount_bdev(struct file_system_type *fs_type,
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 7b9b402..a0f48a5 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1174,6 +1174,7 @@ static struct file_system_type rpc_pipe_fs_type = {
.mount = rpc_mount,
.kill_sb = rpc_kill_sb,
};
+MODULE_ALIAS_FS("rpc_pipefs");

static void
init_once(void *foo)
@@ -1218,6 +1219,3 @@ void unregister_rpc_pipefs(void)
kmem_cache_destroy(rpc_inode_cachep);
unregister_filesystem(&rpc_pipe_fs_type);
}
-
-/* Make 'mount -t rpc_pipefs ...' autoload this module. */
-MODULE_ALIAS("rpc_pipefs");
--
1.7.5.4
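
For readers following the mechanics, here is a minimal userspace sketch of
the naming scheme in this patch. It is not kernel code. FS_ALIAS_STRING()
and fs_module_request() are hypothetical names: the first mirrors how
MODULE_ALIAS_FS() builds the alias by string-literal concatenation, the
second mirrors the "fs-%.*s" format used in get_fs_type(), including the
truncation at the first dot for subtypes such as "fuse.sshfs".

```c
#include <stdio.h>
#include <string.h>

/* Userspace stand-in: the kernel macro wraps this string in MODULE_ALIAS(). */
#define FS_ALIAS_STRING(NAME) "fs-" NAME

/*
 * Hypothetical helper: build the name that get_fs_type() would pass to
 * request_module() after the patch. Returns the snprintf() result, i.e.
 * the length of the formatted name.
 */
static int fs_module_request(char *buf, size_t size, const char *name)
{
	const char *dot = strchr(name, '.');
	int len = dot ? (int)(dot - name) : (int)strlen(name);

	return snprintf(buf, size, "fs-%.*s", len, name);
}
```

Under these assumptions, a request for "autofs" produces the same string as
FS_ALIAS_STRING("autofs"), which is why the alias added to the autofs4
module lets the lookup succeed even though the module name differs.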

2013-03-04 08:39:20

by Mathias Krause

Subject: Re: user ns: arbitrary module loading

On Sun, Mar 03, 2013 at 09:48:50AM -0800, Kees Cook wrote:
> Several subsystems already have an implicit subsystem restriction
> because they load with aliases. (e.g. binfmt-XXXX, net-pf=NNN,
> snd-card-NNN, FOO-iosched, etc). This isn't the case for filesystems
> and a few others, unfortunately:
>
> $ git grep 'request_module("%.*s"' | grep -vi prefix
> crypto/api.c: request_module("%s", name);
>
> [...]
>
> Several of these come from hardcoded values, though (e.g. crypto, chipreg).

Well, crypto does not. Try the code snippet below on a system with
CONFIG_CRYPTO_USER_API=y. It'll abuse the above request_module() call
to load any module the user requests -- regardless of being contained
in a user ns or not.

---8<---
/* Loading arbitrary modules using crypto api since v2.6.38
 *
 * - minipli
 */
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#ifndef AF_ALG
#define AF_ALG 38
#endif


int main(int argc, char **argv) {
	struct sockaddr_alg sa_alg = {
		.salg_family = AF_ALG,
		.salg_type = "hash",
	};
	int sock;

	if (argc != 2) {
		printf("usage: %s MODULE_NAME\n", argv[0]);
		exit(1);
	}

	sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (sock < 0) {
		perror("socket(AF_ALG)");
		exit(1);
	}

	/* salg_name starts zeroed; copying one byte less keeps it NUL-terminated */
	strncpy((char *) sa_alg.salg_name, argv[1], sizeof(sa_alg.salg_name) - 1);
	/* the algorithm lookup done for bind() is what reaches request_module() */
	bind(sock, (struct sockaddr *) &sa_alg, sizeof(sa_alg));
	close(sock);

	return 0;
}
--->8---

If people care about unprivileged users not being able to load arbitrary
modules, could someone please fix this in crypto API, then? Herbert?


Thanks,
Mathias

2013-03-04 16:46:04

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Mon, Mar 4, 2013 at 12:29 AM, Mathias Krause <[email protected]> wrote:
> On Sun, Mar 03, 2013 at 09:48:50AM -0800, Kees Cook wrote:
>> Several subsystems already have an implicit subsystem restriction
>> because they load with aliases. (e.g. binfmt-XXXX, net-pf=NNN,
>> snd-card-NNN, FOO-iosched, etc). This isn't the case for filesystems
>> and a few others, unfortunately:
>>
>> $ git grep 'request_module("%.*s"' | grep -vi prefix
>> crypto/api.c: request_module("%s", name);
>>
>> [...]
>>
>> Several of these come from hardcoded values, though (e.g. crypto, chipreg).
>
> Well, crypto does not. Try the code snippet below on a system with
> CONFIG_CRYPTO_USER_API=y. It'll abuse the above request_module() call
> to load any module the user requests -- regardless of being contained
> in a user ns or not.

Oh ew. Yeah, I must have missed the path through the user api. Arg.

> ---8<---
> /* Loading arbitrary modules using crypto api since v2.6.38
> *
> * - minipli
> */
> #include <linux/if_alg.h>
> #include <sys/socket.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
>
> #ifndef AF_ALG
> #define AF_ALG 38
> #endif
>
>
> int main(int argc, char **argv) {
> struct sockaddr_alg sa_alg = {
> .salg_family = AF_ALG,
> .salg_type = "hash",
> };
> int sock;
>
> if (argc != 2) {
> printf("usage: %s MODULE_NAME\n", argv[0]);
> exit(1);
> }
>
> sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
> if (sock < 0) {
> perror("socket(AF_ALG)");
> exit(1);
> }
>
> strncpy((char *) sa_alg.salg_name, argv[1], sizeof(sa_alg.salg_name));
> bind(sock, (struct sockaddr *) &sa_alg, sizeof(sa_alg));
> close(sock);
>
> return 0;
> }
> --->8---
>
> If people care about unprivileged users not being able to load arbitrary
> modules, could someone please fix this in crypto API, then? Herbert?

So, should this get a prefix too? Maybe we need to change the
request_module primitive to request_module(prefix, fmt, args) to stop
these request_module("%s", name) things from continuing to exist...
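
A minimal userspace sketch of what such a primitive might look like. This is
an assumption for illustration, not an existing kernel API:
build_prefixed_request() is a hypothetical name, the "-" separator is
borrowed from the fs- patch, and a real version would hand the resulting
name to the modprobe helper instead of just building it.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical: format "<prefix>-<fmt...>" into buf. Callers cannot omit
 * the subsystem prefix, so a bare user-controlled "%s" request is
 * impossible by construction.
 */
static int build_prefixed_request(char *buf, size_t size,
				  const char *prefix, const char *fmt, ...)
{
	va_list args;
	int used, ret;

	if (!prefix || !*prefix)
		return -EINVAL;	/* every caller must name its subsystem */

	used = snprintf(buf, size, "%s-", prefix);
	if (used < 0 || (size_t)used >= size)
		return -ENAMETOOLONG;

	va_start(args, fmt);
	ret = vsnprintf(buf + used, size - used, fmt, args);
	va_end(args);
	if (ret < 0 || (size_t)ret >= size - used)
		return -ENAMETOOLONG;

	return 0;
}
```

With this shape, the crypto case would become something like
build_prefixed_request(buf, sizeof(buf), "crypto", "%s", name), and only
modules carrying a matching alias could be requested through it.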

-Kees

--
Kees Cook
Chrome OS Security

2013-03-04 17:36:20

by Vasily Kulikov

Subject: Re: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.

(cc'ed kernel-hardening)

On Sun, Mar 03, 2013 at 23:51 -0800, Eric W. Biederman wrote:
> Modify the request_module to prefix the file system type with "fs-"
> and add aliases to all of the filesystems that can be built as modules
> to match.
>
> A common practice is to build all of the kernel code and leave code
> that is not commonly needed as modules, with the result that many
> users are exposed to any bug anywhere in the kernel.
>
> Looking for filesystems with a fs- prefix limits the pool of possible
> modules that can be loaded by mount to just filesystems trivially
> making things safer with no real cost.
>
> Using aliases means user space can control the policy of which
> filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
> with blacklist and alias directives. Allowing simple, safe,
> well understood work-arounds to known problematic software.
>
> This also addresses a rare but unfortunate problem where the filesystem
> name is not the same as its module name and module auto-loading
> would not work. While writing this patch I saw a handful of such
> cases. The most significant being autofs that lives in the module
> autofs4.
>
> This is relevant to user namespaces because we can reach the request
> module in get_fs_type() without having any special permissions, and
> people get uncomfortable when a user specified string (in this case
> the filesystem type) goes all of the way to request_module.
>
> After having looked at this issue I don't think there is any
> particular reason to perform any filtering or permission checks beyond
> making it clear in the module request that we want a filesystem
> module. The common pattern in the kernel is to call request_module()
> without regard to the user's permissions. In general all a filesystem
> module does once loaded is call register_filesystem() and go to sleep.
> Which means there is not much attack surface exposed by loading a
> filesystem module unless the filesystem is mounted. In a user
> namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
> which most filesystems do not set today.
>
> Acked-by: Serge Hallyn <[email protected]>
> Acked-by: Kees Cook <[email protected]>
> Reported-by: Kees Cook <[email protected]>
> Signed-off-by: "Eric W. Biederman" <[email protected]>
...
> diff --git a/fs/filesystems.c b/fs/filesystems.c
> index da165f6..92567d9 100644
> --- a/fs/filesystems.c
> +++ b/fs/filesystems.c
> @@ -273,7 +273,7 @@ struct file_system_type *get_fs_type(const char *name)
> int len = dot ? dot - name : strlen(name);
>
> fs = __get_fs_type(name, len);
> - if (!fs && (request_module("%.*s", len, name) == 0))
> + if (!fs && (request_module("fs-%.*s", len, name) == 0))
> fs = __get_fs_type(name, len);
>
> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {

Maybe we should split request_module() into several functions according
to the privileges expected of the caller?

- request_module() for CAP_SYS_MODULE in init_ns
- request_module_relaxed() for everybody

request_module_relaxed() would be used in get_fs_type(), dev_load() and all
places where the safety of module loading has been manually checked. All old,
not yet audited users of request_module() would then not be triggerable from
a user_ns. That's the same scheme as with capable() and ns_capable().
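
A rough userspace sketch of the proposed split, under stated assumptions:
struct caller_ctx and request_module_checked() are hypothetical names
standing in for the caller's credentials and for the strict variant, and
both functions return 0 where a real implementation would go on to invoke
the modprobe helper.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the calling task's credentials. */
struct caller_ctx {
	bool in_init_user_ns;		/* caller is in the initial user namespace */
	bool has_cap_sys_module;	/* caller holds CAP_SYS_MODULE */
};

/* Strict variant: only CAP_SYS_MODULE in the initial namespace may load. */
static int request_module_checked(const struct caller_ctx *c, const char *name)
{
	(void)name;
	if (!c->in_init_user_ns || !c->has_cap_sys_module)
		return -EPERM;
	return 0;
}

/* Relaxed variant: for call sites whose module names were audited by hand. */
static int request_module_relaxed(const struct caller_ctx *c, const char *name)
{
	(void)c;
	(void)name;
	return 0;
}
```

In this sketch a process that holds capabilities only inside a child user
namespace gets -EPERM from the strict variant, while an audited call site
such as get_fs_type() would use the relaxed one.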

Thanks,

--
Vasily Kulikov
http://www.openwall.com - bringing security into open computing environments

2013-03-04 18:21:46

by Eric W. Biederman

Subject: Re: user ns: arbitrary module loading

Kees Cook <[email protected]> writes:

> On Mon, Mar 4, 2013 at 12:29 AM, Mathias Krause <[email protected]> wrote:
>> On Sun, Mar 03, 2013 at 09:48:50AM -0800, Kees Cook wrote:
>>> Several subsystems already have an implicit subsystem restriction
>>> because they load with aliases. (e.g. binfmt-XXXX, net-pf=NNN,
>>> snd-card-NNN, FOO-iosched, etc). This isn't the case for filesystems
>>> and a few others, unfortunately:
>>>
>>> $ git grep 'request_module("%.*s"' | grep -vi prefix
>>> crypto/api.c: request_module("%s", name);
>>>
>>> [...]
>>>
>>> Several of these come from hardcoded values, though (e.g. crypto, chipreg).
>>
>> Well, crypto does not. Try the code snippet below on a system with
>> CONFIG_CRYPTO_USER_API=y. It'll abuse the above request_module() call
>> to load any module the user requests -- regardless of being contained
>> in a user ns or not.
>
> Oh ew. Yeah, I must have missed the path through the user api. Arg.

I will let someone else write the patch that adds the module aliases to
crypto.

It seems worth doing even outside of any security concerns, as it makes
the request to modprobe more meaningful and allows the existing
modprobe policy controls to work.

Whereas an ill-formed string just doesn't tell modprobe enough to really
act intelligently.

>> ---8<---
>> /* Loading arbitrary modules using crypto api since v2.6.38
>> *
>> * - minipli
>> */
>> #include <linux/if_alg.h>
>> #include <sys/socket.h>
>> #include <unistd.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <stdio.h>
>>
>> #ifndef AF_ALG
>> #define AF_ALG 38
>> #endif
>>
>>
>> int main(int argc, char **argv) {
>> struct sockaddr_alg sa_alg = {
>> .salg_family = AF_ALG,
>> .salg_type = "hash",
>> };
>> int sock;
>>
>> if (argc != 2) {
>> printf("usage: %s MODULE_NAME\n", argv[0]);
>> exit(1);
>> }
>>
>> sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
>> if (sock < 0) {
>> perror("socket(AF_ALG)");
>> exit(1);
>> }
>>
>> strncpy((char *) sa_alg.salg_name, argv[1], sizeof(sa_alg.salg_name));
>> bind(sock, (struct sockaddr *) &sa_alg, sizeof(sa_alg));
>> close(sock);
>>
>> return 0;
>> }
>> --->8---
>>
>> If people care about unprivileged users not being able to load arbitrary
>> modules, could someone please fix this in crypto API, then? Herbert?
>
> So, should this get a prefix too? Maybe we need to change the
> request_module primitive to request_module(prefix, fmt, args) to stop
> these request_module("%s", name) things from continuing to exist...

Something like the patch below?

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 56dd349..859aa3a 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -131,6 +131,10 @@ int __request_module(bool wait, const char *fmt, ...)
#define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
static int kmod_loop_msg;

+ /* Require that calls to request module have a little structure */
+ if (fmt[0] == '%')
+ return -EINVAL;
+
/*
* We don't allow synchronous module loading from async. Module
* init may invoke async_synchronize_full() which will end up

2013-03-04 18:36:15

by Eric W. Biederman

Subject: Re: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.

Vasily Kulikov <[email protected]> writes:

> (cc'ed kernel-hardening)
>
> On Sun, Mar 03, 2013 at 23:51 -0800, Eric W. Biederman wrote:
>> Modify the request_module to prefix the file system type with "fs-"
>> and add aliases to all of the filesystems that can be built as modules
>> to match.
>>
>> A common practice is to build all of the kernel code and leave code
>> that is not commonly needed as modules, with the result that many
>> users are exposed to any bug anywhere in the kernel.
>>
>> Looking for filesystems with a fs- prefix limits the pool of possible
>> modules that can be loaded by mount to just filesystems, trivially
>> making things safer with no real cost.
>>
>> Using aliases means user space can control the policy of which
>> filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
>> with blacklist and alias directives, allowing simple, safe,
>> well-understood workarounds for known problematic software.
>>
>> This also addresses a rare but unfortunate problem where the filesystem
>> name is not the same as its module name and module auto-loading
>> would not work. While writing this patch I saw a handful of such
>> cases. The most significant being autofs that lives in the module
>> autofs4.
>>
>> This is relevant to user namespaces because we can reach the request
>> module in get_fs_type() without having any special permissions, and
>> people get uncomfortable when a user specified string (in this case
>> the filesystem type) goes all of the way to request_module.
>>
>> After having looked at this issue I don't think there is any
>> particular reason to perform any filtering or permission checks beyond
>> making it clear in the module request that we want a filesystem
>> module. The common pattern in the kernel is to call request_module()
>> without regard to the user's permissions. In general all a filesystem
>> module does once loaded is call register_filesystem() and go to sleep.
>> Which means there is not much attack surface exposed by loading a
>> filesystem module unless the filesystem is mounted. In a user
>> namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
>> which most filesystems do not set today.
>>
>> Acked-by: Serge Hallyn <[email protected]>
>> Acked-by: Kees Cook <[email protected]>
>> Reported-by: Kees Cook <[email protected]>
>> Signed-off-by: "Eric W. Biederman" <[email protected]>
> ...
>> diff --git a/fs/filesystems.c b/fs/filesystems.c
>> index da165f6..92567d9 100644
>> --- a/fs/filesystems.c
>> +++ b/fs/filesystems.c
>> @@ -273,7 +273,7 @@ struct file_system_type *get_fs_type(const char *name)
>> int len = dot ? dot - name : strlen(name);
>>
>> fs = __get_fs_type(name, len);
>> - if (!fs && (request_module("%.*s", len, name) == 0))
>> + if (!fs && (request_module("fs-%.*s", len, name) == 0))
>> fs = __get_fs_type(name, len);
>>
>> if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
>
> Maybe we should divide request_module() into several functions according
> to the expected privileges of the caller?
>
> - request_module() for CAP_SYS_MODULE in init_ns
> - request_module_relaxed() for everybody
>
> request_module_relaxed() is used in get_fs_type(), dev_load() and all
> places where the safety of module loading is manually checked. All old,
> not-yet-checked users of request_module() will not be triggerable from a user ns.
> That's the same scheme as with capable() and ns_capable().

User namespaces in this discussion are pretty much a red herring. You
can already reach most request_module callers without having any
capabilities. And honestly that seems fine.

It never ever hurts to request a module.

Getting the module only hurts sometimes, when something else has already
gone wrong.

It makes sense to add a prefix when sending the module request to make
it clear what kind of module we are looking for. That makes it easy to
tell why we are requesting the module and makes it easy to implement
policy controls in userspace.

I don't see any reason to limit request_module to people with some
capability or other. The filesystem module request just happened to come
after a capable(CAP_SYS_ADMIN) check in this case, and that is the case
people noticed as a little fishy.

But if I have overlooked something I am happy to hear it.

Eric
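
The request string produced by the get_fs_type() hunk quoted above can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not kernel code: the helper name fs_module_request() and the caller-supplied buffer are hypothetical; only the "fs-%.*s" format and the truncation at the first '.' are taken from the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not a kernel function): format the
 * module request the way the patched get_fs_type() does, i.e. take the
 * filesystem type up to an optional '.' subtype separator and prefix
 * it with "fs-". Returns the snprintf() result.
 */
static int fs_module_request(char *buf, size_t size, const char *name)
{
    const char *dot = strchr(name, '.');
    int len = dot ? (int)(dot - name) : (int)strlen(name);

    return snprintf(buf, size, "fs-%.*s", len, name);
}
```

With this sketch a mount type of "fuse.sshfs" yields the request "fs-fuse", while a non-filesystem name such as "snd-hda-intel" yields "fs-snd-hda-intel", which matches nothing unless some module carries that name or alias.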

2013-03-04 18:41:10

by Kees Cook

Subject: Re: user ns: arbitrary module loading

On Mon, Mar 4, 2013 at 10:21 AM, Eric W. Biederman
<[email protected]> wrote:
> Kees Cook <[email protected]> writes:
>
>> On Mon, Mar 4, 2013 at 12:29 AM, Mathias Krause <[email protected]> wrote:
>>> On Sun, Mar 03, 2013 at 09:48:50AM -0800, Kees Cook wrote:
>>>> Several subsystems already have an implicit subsystem restriction
>>>> because they load with aliases. (e.g. binfmt-XXXX, net-pf-NNN,
>>>> snd-card-NNN, FOO-iosched, etc). This isn't the case for filesystems
>>>> and a few others, unfortunately:
>>>>
>>>> $ git grep 'request_module("%.*s"' | grep -vi prefix
>>>> crypto/api.c: request_module("%s", name);
>>>>
>>>> [...]
>>>>
>>>> Several of these come from hardcoded values, though (e.g. crypto, chipreg).
>>>
>>> Well, crypto does not. Try the code snippet below on a system with
>>> CONFIG_CRYPTO_USER_API=y. It'll abuse the above request_module() call
>>> to load any module the user requests -- regardless of being contained
>>> in a user ns or not.
>>
>> Oh ew. Yeah, I must have missed the path through the user api. Arg.
>
> I will let someone else write the patch that adds the module aliases to
> crypto.
>
> It seems worth doing even outside of any security concerns, as it just
> makes the request to modprobe make more sense and allows the existing
> modprobe policy controls to work.
>
> Whereas an ill-formed string just doesn't tell modprobe enough to really
> act intelligently.
>
>>> ---8<---
>>> /* Loading arbitrary modules using crypto api since v2.6.38
>>> *
>>> * - minipli
>>> */
>>> #include <linux/if_alg.h>
>>> #include <sys/socket.h>
>>> #include <unistd.h>
>>> #include <stdlib.h>
>>> #include <string.h>
>>> #include <stdio.h>
>>>
>>> #ifndef AF_ALG
>>> #define AF_ALG 38
>>> #endif
>>>
>>>
>>> int main(int argc, char **argv) {
>>>     struct sockaddr_alg sa_alg = {
>>>         .salg_family = AF_ALG,
>>>         .salg_type = "hash",
>>>     };
>>>     int sock;
>>>
>>>     if (argc != 2) {
>>>         printf("usage: %s MODULE_NAME\n", argv[0]);
>>>         exit(1);
>>>     }
>>>
>>>     sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
>>>     if (sock < 0) {
>>>         perror("socket(AF_ALG)");
>>>         exit(1);
>>>     }
>>>
>>>     strncpy((char *) sa_alg.salg_name, argv[1], sizeof(sa_alg.salg_name));
>>>     bind(sock, (struct sockaddr *) &sa_alg, sizeof(sa_alg));
>>>     close(sock);
>>>
>>>     return 0;
>>> }
>>> --->8---
>>>
>>> If people care about unprivileged users not being able to load arbitrary
>>> modules, could someone please fix this in crypto API, then? Herbert?
>>
>> So, should this get a prefix too? Maybe we need to change the
>> request_module primitive to request_module(prefix, fmt, args) to stop
>> these request_module("%s", name) things from continuing to exist...
>
> Something like the patch below?
>
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 56dd349..859aa3a 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -131,6 +131,10 @@ int __request_module(bool wait, const char *fmt, ...)
> #define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
> static int kmod_loop_msg;
>
> + /* Require that calls to request module have a little structure */
> + if (fmt[0] == '%')
> + return -EINVAL;
> +
> /*
> * We don't allow synchronous module loading from async. Module
> * init may invoke async_synchronize_full() which will end up

Something like that, but it will break callers that use formats like "%s-suffix".

-Kees

--
Kees Cook
Chrome OS Security
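
To make the objection concrete, the check from the quoted kmod.c patch can be modelled in userspace. This is a minimal sketch, assuming the check is exactly the `fmt[0] == '%'` test shown in the diff; the function name check_request_fmt() is hypothetical and nothing here calls into the kernel.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the check proposed for __request_module():
 * reject any format string that begins with a conversion specifier.
 * Returns 0 if the format would be accepted, -EINVAL otherwise.
 */
static int check_request_fmt(const char *fmt)
{
    if (fmt[0] == '%')
        return -EINVAL;
    return 0;
}
```

Under this model "%s" is rejected and "fs-%.*s" is accepted, but a format with the structure at the end, such as "%s-suffix", is rejected as well, even though it does constrain the requested name.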

2013-03-05 19:07:10

by Kay Sievers

Subject: Re: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.

On Mon, Mar 4, 2013 at 8:51 AM, Eric W. Biederman <[email protected]> wrote:
>
> Modify the request_module to prefix the file system type with "fs-"
> and add aliases to all of the filesystems that can be built as modules
> to match.
>
> A common practice is to build all of the kernel code and leave code
> that is not commonly needed as modules, with the result that many
> users are exposed to any bug anywhere in the kernel.
>
> Looking for filesystems with a fs- prefix limits the pool of possible
> modules that can be loaded by mount to just filesystems, trivially
> making things safer with no real cost.

'-' is a commonly used part of a module name, and does not mix well
with random user-provided names.

We usually use ':' as the prefix separator for modaliases, when
user-supplied strings are prefixed with the subsystem.

I think it would be nicer to change that, and I'm sure some creative
guy calls the next filesystem of the month fs-$something :)

Kay

2013-03-05 19:32:03

by Kees Cook

Subject: Re: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.

On Tue, Mar 5, 2013 at 11:06 AM, Kay Sievers <[email protected]> wrote:
> On Mon, Mar 4, 2013 at 8:51 AM, Eric W. Biederman <[email protected]> wrote:
>>
>> Modify the request_module to prefix the file system type with "fs-"
>> and add aliases to all of the filesystems that can be built as modules
>> to match.
>>
>> A common practice is to build all of the kernel code and leave code
>> that is not commonly needed as modules, with the result that many
>> users are exposed to any bug anywhere in the kernel.
>>
>> Looking for filesystems with a fs- prefix limits the pool of possible
>> modules that can be loaded by mount to just filesystems, trivially
>> making things safer with no real cost.
>
> '-' is a commonly used part of a module name, and does not mix well
> with random user-provided names.
>
> We usually use ':' as the prefix separator for modaliases, when
> user-supplied strings are prefixed with the subsystem.
>
> I think it would be nicer to change that, and I'm sure some creative
> guy calls the next filesystem of the month fs-$something :)

The precedent is "-": "netdev-", "net-pf-", etc. Naming something
fs-$something is fine as long as it's actually a filesystem. :)

-Kees

--
Kees Cook
Chrome OS Security

2013-03-05 23:24:22

by Eric W. Biederman

Subject: Re: [PATCH 2/2] fs: Limit sys_mount to only request filesystem modules.

Kay Sievers <[email protected]> writes:

> On Mon, Mar 4, 2013 at 8:51 AM, Eric W. Biederman <[email protected]> wrote:
>>
>> Modify the request_module to prefix the file system type with "fs-"
>> and add aliases to all of the filesystems that can be built as modules
>> to match.
>>
>> A common practice is to build all of the kernel code and leave code
>> that is not commonly needed as modules, with the result that many
>> users are exposed to any bug anywhere in the kernel.
>>
>> Looking for filesystems with a fs- prefix limits the pool of possible
>> modules that can be loaded by mount to just filesystems, trivially
>> making things safer with no real cost.
>
> '-' is a commonly used part of a module name, and does not mix well
> with random user-provided names.

The symbols '-' and '_' occur in 2382 out of 3968 modules from an
allmodconfig build, and modprobe ignores the difference between the two.
However, only three of those modules begin with fs, and none of them
begins with fs-.

Furthermore if it actually becomes a concern to ensure we are talking
about an alias rather than a real module name, the solution is to
change how we call modprobe. As long as we are in the same namespace
something can go wrong.

fs- seems sufficiently unique for the purpose.

> We usually use ':' as the prefix separator for modaliases, when
> user-supplied strings are prefixed with the subsystem.

There are at least two different conventions in use. For software
subsystems like the networking stack, '-' is commonly used to separate
the prefix. For hardware-specific subsystems, ':' is commonly used.
What I really don't want to load here are hardware modules, so using a
hardware-module-style convention does not seem like the right way to go.

> I think it would be nicer to change that, and I'm sure some creative
> guy calls the next filesystem of the month fs-$something :)

If it is a filesystem it simply does not matter. The goal is to
only load filesystems.

If it is not a filesystem, someone has chosen a confusing naming
convention.

If it turns out I am wrong it is a two line change.

Eric
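
The remark that modprobe ignores the difference between '-' and '_' is why an "fs-" alias and a module literally named "fs_something" share one namespace. A rough sketch of such a comparison, assuming simple character-by-character normalisation; modname_equal() is a hypothetical helper and not modprobe's actual code:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: compare two module names while treating '-'
 * and '_' as the same character, as described in the message above.
 */
static bool modname_equal(const char *a, const char *b)
{
    for (; *a && *b; a++, b++) {
        char ca = (*a == '-') ? '_' : *a;
        char cb = (*b == '-') ? '_' : *b;

        if (ca != cb)
            return false;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

In this model "fs-ext4" and "fs_ext4" compare equal, so a real module named fs_ext4 would collide with the alias fs-ext4.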