2011-02-18 16:17:48

by Steven Liu

Subject: [PATCH 1/2] add ->mount function introduction into Documentation/filesystems/porting

Hi Alexander Viro,

I have added an introduction to the fstype->mount method to
Documentation/filesystems/vfs.txt.

Can this patch be merged?


Add an introduction to the ->mount() method to Documentation/filesystems/vfs.txt
and note that the VFS will replace ->get_sb() with ->mount().


Signed-off-by: LiuQi <[email protected]>
---
Documentation/filesystems/vfs.txt | 91 +++++++++++++++++++++++++++++++------
1 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 94cf97b..ba850e3 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -96,9 +96,9 @@ functions:

The passed struct file_system_type describes your filesystem. When a
request is made to mount a device onto a directory in your filespace,
-the VFS will call the appropriate get_sb() method for the specific
-filesystem. The dentry for the mount point will then be updated to
-point to the root inode for the new filesystem.
+the VFS will call the appropriate get_sb() or mount() method for
+the specific filesystem. The dentry for the mount point will then
+be updated to point to the root inode for the new filesystem.

You can see all filesystems that are registered to the kernel in the
file /proc/filesystems.
@@ -107,20 +107,30 @@ file /proc/filesystems.
struct file_system_type
-----------------------

-This describes the filesystem. As of kernel 2.6.22, the following
+This describes the filesystem. As of kernel 2.6.38, the following
members are defined:

+
struct file_system_type {
- const char *name;
- int fs_flags;
- int (*get_sb) (struct file_system_type *, int,
- const char *, void *, struct vfsmount *);
- void (*kill_sb) (struct super_block *);
- struct module *owner;
- struct file_system_type * next;
- struct list_head fs_supers;
- struct lock_class_key s_lock_key;
- struct lock_class_key s_umount_key;
+ const char *name;
+ int fs_flags;
+ int (*get_sb) (struct file_system_type *, int,
+ const char *, void *, struct vfsmount *);
+ struct dentry *(*mount) (struct file_system_type *, int,
+ const char *, void *);
+ void (*kill_sb) (struct super_block *);
+ struct module *owner;
+ struct file_system_type * next;
+ struct list_head fs_supers;
+
+ struct lock_class_key s_lock_key;
+ struct lock_class_key s_umount_key;
+ struct lock_class_key s_vfs_rename_key;
+
+ struct lock_class_key i_lock_key;
+ struct lock_class_key i_mutex_key;
+ struct lock_class_key i_mutex_dir_key;
+ struct lock_class_key i_alloc_sem_key;
};

name: the name of the filesystem type, such as "ext2", "iso9660",
@@ -131,6 +141,9 @@ struct file_system_type {
get_sb: the method to call when a new instance of this
filesystem should be mounted

+ mount: the method to call when a new instance of this
+ filesystem should be mounted (*NOTE*: get_sb will be removed soon)
+
kill_sb: the method to call when an instance of this filesystem
should be unmounted

@@ -139,7 +152,8 @@ struct file_system_type {

next: for internal VFS use: you should initialize this to NULL

- s_lock_key, s_umount_key: lockdep-specific
+ s_lock_key, s_umount_key, s_vfs_rename_key, i_lock_key, i_mutex_key,
+ i_mutex_dir_key, i_alloc_sem_key: lockdep-specific

The get_sb() method has the following arguments:

@@ -187,6 +201,53 @@ A fill_super() method implementation has the following arguments:
int silent: whether or not to be silent on error


+
+The mount() method has the following arguments:
+
+ struct file_system_type *fs_type: describes the filesystem, partly initialized
+ by the specific filesystem code
+
+ int flags: mount flags
+
+ const char *dev_name: the device name we are mounting.
+
+ void *data: arbitrary mount options, usually comes as an ASCII
+ string (see "Mount Options" section)
+
+
+The mount() method must determine if the block device specified
+in the dev_name and fs_type contains a filesystem of the type the method
+supports. If it succeeds in opening the named block device, it initializes a
+struct super_block descriptor for the filesystem contained by the block device.
+On failure it returns an error.
+
+The most interesting member of the superblock structure that the
+mount() method fills in is the "s_op" field. This is a pointer to
+a "struct super_operations" which describes the next level of the
+filesystem implementation.
+
+Usually, a filesystem uses one of the generic mount() implementations
+and provides a fill_super() method instead. The generic methods are:
+
+ mount_bdev: mount a filesystem residing on a block device
+
+ mount_nodev: mount a filesystem that is not backed by a device
+
+ mount_single: mount a filesystem which shares the instance between
+ all mounts
+
+A fill_super() method implementation has the following arguments:
+
+ struct super_block *sb: the superblock structure. The method fill_super()
+ must initialize this properly.
+
+ void *data: arbitrary mount options, usually comes as an ASCII
+ string (see "Mount Options" section)
+
+ int silent: whether or not to be silent on error
+
+
+
The Superblock Object
=====================

--
1.7.2



Best Regards


Steven Liu


2011-02-18 18:21:55

by Al Viro

Subject: Re: [PATCH 1/2] add ->mount function introduction into Documentation/filesystems/porting

On Sat, Feb 19, 2011 at 12:17:44AM +0800, Steven Liu wrote:
> +the VFS will call the appropriate get_sb() or mount() method for
> +the specific filesystem. The dentry for the mount point will then
> +be updated to point to the root inode for the new filesystem.

What do you mean, "updated"? What happens is this:
* filesystem driver will get the arguments of mount(2) telling
what to mount and return you a reference to dentry in the root of
the (sub)tree you have asked for. Depending on the filesystem type
and argument it might be on an already existing fs or on a new one.
In the latter case filesystem driver will take care of creating a new fs
(getting superblock allocated, filled, etc.)
* VFS will create a new vfsmount referring to that subtree and
attach it to the mountpoint you have given.
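
The two steps above can be sketched in plain userspace C. This is only a toy model, not kernel code: every toy_* name is hypothetical, error handling is reduced to NULL checks, and the driver shown is a sysfs-like one that always hands back the same tree. It is meant to show that the driver returns a root dentry and that the vfsmount, with its reference on the superblock, is created by the caller.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for kernel objects; hypothetical names, not the kernel API. */
struct toy_super {
	int active;			/* references held by mounts */
};

struct toy_dentry {
	struct toy_super *d_sb;		/* superblock this dentry lives on */
};

struct toy_vfsmount {
	struct toy_dentry *mnt_root;	/* root of the attached (sub)tree */
	struct toy_super *mnt_sb;	/* pinned while the mount lives */
	const char *mountpoint;
};

struct toy_fs_type {
	const char *name;
	struct toy_dentry *(*mount)(struct toy_fs_type *, int,
				    const char *, void *);
};

/* A sysfs-like driver: every mount request gets the same tree. */
static struct toy_super toy_single_sb;
static struct toy_dentry toy_single_root = { &toy_single_sb };

struct toy_dentry *toy_single_mount(struct toy_fs_type *type, int flags,
				    const char *dev_name, void *data)
{
	(void)type; (void)flags; (void)dev_name; (void)data;
	return &toy_single_root;
}

/* The "VFS" side: ask the driver for a tree, then wrap it in a vfsmount. */
struct toy_vfsmount *toy_vfs_mount(struct toy_fs_type *type, int flags,
				   const char *dev_name, void *data,
				   const char *mountpoint)
{
	struct toy_dentry *root;
	struct toy_vfsmount *mnt;

	/* Step 1: the driver interprets the arguments and returns a root. */
	root = type->mount(type, flags, dev_name, data);
	if (!root)	/* simplification: the real ->mount() uses ERR_PTR() */
		return NULL;

	/* Step 2: a new vfsmount refers to that tree and pins its superblock. */
	mnt = malloc(sizeof(*mnt));
	if (!mnt)
		return NULL;
	root->d_sb->active++;
	mnt->mnt_root = root;
	mnt->mnt_sb = root->d_sb;
	mnt->mountpoint = mountpoint;
	return mnt;
}
```

Calling toy_vfs_mount() twice with this driver yields two distinct toy_vfsmount objects whose mnt_root and mnt_sb are identical, and the shared superblock ends up with two references.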

> +The mount() method must determine if the block device specified
> +in the dev_name and fs_type contains a filesystem of the type the method
> +supports. If it succeeds in opening the named block device, it initializes a
> +struct super_block descriptor for the filesystem contained by the block device.
> +On failure it returns an error.

No. Interpretation of ->mount() arguments is entirely up to ->mount().
Many filesystem types interpret dev_name as pathname of block device to be
opened, with the filesystem backed by that device. It is by no means
mandatory, _even_ _for_ _block-based_ _fs_. There are filesystems that
choose to do it differently. What you've described is mount_bdev().

->mount() is NOT a superblock constructor. It may have to create one,
but that's up to fs. For everybody outside its job is to give you
a dentry (sub)tree. If it's going to be on a new superblock, so be it,
but that's really not a concern of VFS. It _will_ grab a reference to
whatever superblock these dentries live on and that reference will be
kept as long as vfsmount lives, but that's just a "don't shut that
one down as long as you want that dentry tree around" thing.

Note that even mount_bdev() may return an old struct super_block. Mount
e.g. ext2 from the same device twice (with the same flags) and you'll get
two vfsmounts with identical ->mnt_root/->mnt_sb. Things like e.g. sysfs
*always* return the same superblock. Things like nfs are in between -
depending on what options you give you might end up with an old superblock
or with a new one. Note that depending on the options you may get a new
subtree backed by old superblock.

If you are looking for superblock constructor, it's sget(). It finds a
superblock of given type satisfying given condition or creates a new one.
In any case, a reference to superblock is returned to caller locked.
grep and you'll see how it's used; that'll give you the instances of
->mount/->get_sb and helpers used by such.
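
A rough userspace model of that find-or-create behaviour, for illustration only. The mini_* names are hypothetical, the "lock" is just a flag, and the list walk ignores all the concurrency the real sget() has to deal with. The test/set callback pair mirrors the shape of the real function's arguments.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; only the find-or-create logic is modelled. */
struct mini_super {
	struct mini_super *next;
	const char *dev;	/* what this instance is backed by */
	int active;		/* reference count */
	int locked;		/* stands in for "returned locked" */
};

struct mini_fs_type {
	const char *name;
	struct mini_super *supers;	/* all superblocks of this type */
};

struct mini_super *mini_sget(struct mini_fs_type *type,
			     int (*test)(struct mini_super *, void *),
			     int (*set)(struct mini_super *, void *),
			     void *data)
{
	struct mini_super *s;

	/* An existing superblock satisfying the condition is reused. */
	for (s = type->supers; s; s = s->next) {
		if (test(s, data)) {
			s->active++;
			s->locked = 1;
			return s;
		}
	}

	/* Otherwise a new one is created and added to the type. */
	s = calloc(1, sizeof(*s));
	if (!s)
		return NULL;
	if (set(s, data)) {
		free(s);
		return NULL;
	}
	s->active = 1;
	s->locked = 1;
	s->next = type->supers;
	type->supers = s;
	return s;
}

/* Example condition: "same backing device name". */
int mini_test_dev(struct mini_super *s, void *data)
{
	return strcmp(s->dev, (const char *)data) == 0;
}

int mini_set_dev(struct mini_super *s, void *data)
{
	s->dev = data;
	return 0;
}
```

With these callbacks, asking twice for the same device name returns the same mini_super with its reference count bumped, while a different name produces a fresh one. That is the toy equivalent of mounting the same block device twice.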

Again, as far as VFS is concerned, the main purpose of ->mount() is to give
it a tree to attach; creation of new superblock is a common side effect,
but that's really up to fs. Hell, it doesn't even have to be of the same
type - see e.g. what cpuset is doing. And yes, it's perfectly legitimate.

> +Usually, a filesystem uses one of the generic mount() implementations
> +and provides a fill_super() method instead.

s/method/callback/

> The generic methods are:
s/methods/helpers/
> + mount_bdev: mount a filesystem residing on a block device
> +
> + mount_nodev: mount a filesystem that is not backed by a device
> +
> + mount_single: mount a filesystem which shares the instance between
> + all mounts


FWIW, file_system_type is a bit odd. I'm not sure it's worth messing with,
but essentially what we have there is a mix of several things.
a) ->kill_sb() and "behaviour" flags are really properties of
individual superblock. ->kill_sb probably belongs in ->super_operations.
b) (name, ->mount, ->owner) triple is how VFS finds which ->mount()
to call when asked to mount fs of this type and what module to pin down
for the duration. It's what /proc/filesystems refers to and "is that a block
one?" flag is relevant only for that (the only effect is whether we slap
"nodev" in the corresponding line of /proc/filesystems or put spaces in there -
really)
c) type for the purposes of sget(). This is a dynamic object -
collection of all superblocks of the same type. ->name and ->owner are
relevant here as well. Note that it's *NOT* necessarily the same one we
talked to when we did mount(2) - see cpuset again, or weirder nfs ones.
The latter add superblocks into _usual_ nfs types, so you end up with
e.g. crossing server device boundary on nfs4, stepping into automount point
and calling vfs_kern_mount() on nfs4_xdev_fs_type. It calls nfs4_xdev_mount(),
which will do sget() in nfs4_fs_type. As a result, you get a superblock
of a different type - nfs4. As a matter of fact,

struct file_system_type nfs4_xdev_fs_type = {
.mount = nfs4_xdev_mount,
};

would suffice - the rest of its initializer is pure fluff. It's &nfs4_fs_type
that will be in s->s_type and it's nfs4_fs_type ->kill_sb() that will be
called, etc. Doesn't even need a name, since we never register it and refer
to the sucker directly...
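
The same arrangement can be shown in a small userspace sketch, assuming made-up demo_* types rather than the real nfs4 code: an unregistered, nameless type whose only member is ->mount, producing a superblock that belongs to a different, "usual" type.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_fs_type;

/* Hypothetical toy objects; not the kernel structures. */
struct demo_super {
	struct demo_fs_type *s_type;	/* the type this superblock belongs to */
};

struct demo_dentry {
	struct demo_super *d_sb;
};

struct demo_fs_type {
	const char *name;
	struct demo_dentry *(*mount)(struct demo_fs_type *, int,
				     const char *, void *);
	void (*kill_sb)(struct demo_super *);
};

void demo_kill_sb(struct demo_super *s)
{
	free(s);
}

/* The "usual" type: this is what ends up in s->s_type. */
struct demo_fs_type demo_fs_type = {
	.name = "demofs",
	.kill_sb = demo_kill_sb,
};

/* Called via the helper type, but creates a superblock of demo_fs_type. */
struct demo_dentry *demo_xdev_mount(struct demo_fs_type *fs_type, int flags,
				    const char *dev_name, void *data)
{
	struct demo_super *s = malloc(sizeof(*s));
	struct demo_dentry *root = malloc(sizeof(*root));

	(void)fs_type; (void)flags; (void)dev_name; (void)data;
	if (!s || !root) {
		free(s);
		free(root);
		return NULL;
	}
	s->s_type = &demo_fs_type;	/* not the type we were called through */
	root->d_sb = s;
	return root;
}

/* Never registered and referred to directly, so ->mount is all it needs. */
struct demo_fs_type demo_xdev_fs_type = {
	.mount = demo_xdev_mount,
};
```

Mounting through demo_xdev_fs_type returns a root whose superblock has s_type pointing at demo_fs_type, so it is demo_fs_type's kill_sb that would be used to shut it down.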

It's probably not worth the effort trying to separate these objects. We'd
need more boilerplate code for very little additional clarity in the area
that isn't particularly tricky to start with.