2008-06-18 17:09:37

by Benjamin Thery

[permalink] [raw]
Subject: [PATCH 01/11] sysfs: Support for preventing unmounts.

sysfs: Support for preventing unmounts.

To support mounting multiple instances of sysfs occassionally I
need to walk through all of the currently present sysfs super blocks.

To allow this iteration this patch adds sysfs_grab_supers
and sysfs_release_supers. While a piece of code is in
a section surrounded by these no more sysfs super blocks
will be either created or destroyed.

Signed-off-by: Eric W. Biederman <[email protected]>
Signed-off-by: Benjamin Thery <[email protected]>
---
fs/sysfs/mount.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++------
fs/sysfs/sysfs.h | 10 ++++++
2 files changed, 81 insertions(+), 8 deletions(-)

Index: linux-mm/fs/sysfs/mount.c
===================================================================
--- linux-mm.orig/fs/sysfs/mount.c
+++ linux-mm/fs/sysfs/mount.c
@@ -41,47 +41,110 @@ struct sysfs_dirent sysfs_root = {

static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
{
- struct inode *inode;
- struct dentry *root;
+ struct sysfs_super_info *info = NULL;
+ struct inode *inode = NULL;
+ struct dentry *root = NULL;
+ int error;

sb->s_blocksize = PAGE_CACHE_SIZE;
sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
sb->s_magic = SYSFS_MAGIC;
sb->s_op = &sysfs_ops;
sb->s_time_gran = 1;
- sysfs_sb = sb;
+ if (!sysfs_sb)
+ sysfs_sb = sb;
+
+ error = -ENOMEM;
+ info = kzalloc(sizeof(*info), GFP_KERNEL);
+ if (!info)
+ goto out_err;

/* get root inode, initialize and unlock it */
+ error = -ENOMEM;
inode = sysfs_get_inode(&sysfs_root);
if (!inode) {
pr_debug("sysfs: could not get root inode\n");
- return -ENOMEM;
+ goto out_err;
}

/* instantiate and link root dentry */
+ error = -ENOMEM;
root = d_alloc_root(inode);
if (!root) {
pr_debug("%s: could not get root dentry!\n",__func__);
- iput(inode);
- return -ENOMEM;
+ goto out_err;
}
root->d_fsdata = &sysfs_root;
sb->s_root = root;
+ sb->s_fs_info = info;
return 0;
+
+out_err:
+ dput(root);
+ iput(inode);
+ kfree(info);
+ if (sysfs_sb == sb)
+ sysfs_sb = NULL;
+ return error;
}

static int sysfs_get_sb(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data, struct vfsmount *mnt)
{
- return get_sb_single(fs_type, flags, data, sysfs_fill_super, mnt);
+ int rc;
+ mutex_lock(&sysfs_rename_mutex);
+ rc = get_sb_single(fs_type, flags, data, sysfs_fill_super, mnt);
+ mutex_unlock(&sysfs_rename_mutex);
+ return rc;
}

-static struct file_system_type sysfs_fs_type = {
+struct file_system_type sysfs_fs_type = {
.name = "sysfs",
.get_sb = sysfs_get_sb,
.kill_sb = kill_anon_super,
};

+void sysfs_grab_supers(void)
+{
+ /* must hold sysfs_rename_mutex */
+ struct super_block *sb;
+ /* Loop until I have taken s_umount on all sysfs superblocks */
+restart:
+ spin_lock(&sb_lock);
+ list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
+ if (sysfs_info(sb)->grabbed)
+ continue;
+ /* Wait for unmount activity to complete. */
+ if (sb->s_count < S_BIAS) {
+ sb->s_count += 1;
+ spin_unlock(&sb_lock);
+ down_read(&sb->s_umount);
+ drop_super(sb);
+ goto restart;
+ }
+ atomic_inc(&sb->s_active);
+ sysfs_info(sb)->grabbed = 1;
+ }
+ spin_unlock(&sb_lock);
+}
+
+void sysfs_release_supers(void)
+{
+ /* must hold sysfs_rename_mutex */
+ struct super_block *sb;
+restart:
+ spin_lock(&sb_lock);
+ list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
+ if (!sysfs_info(sb)->grabbed)
+ continue;
+ sysfs_info(sb)->grabbed = 0;
+ spin_unlock(&sb_lock);
+ deactivate_super(sb);
+ goto restart;
+ }
+ spin_unlock(&sb_lock);
+}
+
int __init sysfs_init(void)
{
int err = -ENOMEM;
Index: linux-mm/fs/sysfs/sysfs.h
===================================================================
--- linux-mm.orig/fs/sysfs/sysfs.h
+++ linux-mm/fs/sysfs/sysfs.h
@@ -85,6 +85,12 @@ struct sysfs_addrm_cxt {
int cnt;
};

+struct sysfs_super_info {
+ int grabbed;
+};
+
+#define sysfs_info(SB) ((struct sysfs_super_info *)(SB)->s_fs_info)
+
/*
* mount.c
*/
@@ -92,6 +98,10 @@ extern struct sysfs_dirent sysfs_root;
extern struct super_block *sysfs_sb;
extern struct kmem_cache *sysfs_dir_cachep;
extern struct vfsmount *sysfs_mount;
+extern struct file_system_type sysfs_fs_type;
+
+void sysfs_grab_supers(void);
+void sysfs_release_supers(void);

/*
* dir.c

--


2008-06-18 17:44:50

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 01/11] sysfs: Support for preventing unmounts.

On Wed, 2008-06-18 at 19:07 +0200, Benjamin Thery wrote:
> To support mounting multiple instances of sysfs occassionally I
> need to walk through all of the currently present sysfs super blocks.

I know you may have addressed this before, but I forgot and it didn't
make it into the changelogs.

Why are you doing this again? It seems like an awfully blunt
instrument.

-- Dave

2008-06-18 20:21:19

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 01/11] sysfs: Support for preventing unmounts.

Dave Hansen <[email protected]> writes:

> On Wed, 2008-06-18 at 19:07 +0200, Benjamin Thery wrote:
>> To support mounting multiple instances of sysfs occassionally I
>> need to walk through all of the currently present sysfs super blocks.
>
> I know you may have addressed this before, but I forgot and it didn't
> make it into the changelogs.
>
> Why are you doing this again? It seems like an awfully blunt
> instrument.

So the fundamentals.
- The data in sysfs fundamentally changes behind the back of the
VFS and we need to keep the VFS in sync. Essentially this is the
distributed filesystem problem.

- In particular for sysfs_rename and sysfs_move_dir we need to support finding
the dcache entries and calling d_move. So that the dcache does not
get into an inconsistent state. Timeouts and invalidates like NFS
uses are to be avoided if at all possible.

- Coming through the vfs we are guaranteed that the filesystem will
not be unmounted while we have a reference on a dentry, and with
multiple mounts we do not get that guarantee. Therefore to get that
guarantee for all of the superblocks we need the blunt instrument.

- Since mount/unmount are rare blocking them is no big deal.

I believe any distributed filesystem that is together enough to tell
us about renames (so we can update the dcache) instead of doing the
NFS timeout will need the ability to block mount/unmount while it is
executing d_move.

Currently sysfs does not need to block mounts only because we perform
an internal mount and then never unmount sysfs.

Eric

2008-06-19 08:59:52

by Benjamin Thery

[permalink] [raw]
Subject: Re: [PATCH 01/11] sysfs: Support for preventing unmounts.

Eric W. Biederman wrote:
> Dave Hansen <[email protected]> writes:
>
>> On Wed, 2008-06-18 at 19:07 +0200, Benjamin Thery wrote:
>>> To support mounting multiple instances of sysfs occassionally I
>>> need to walk through all of the currently present sysfs super blocks.
>> I know you may have addressed this before, but I forgot and it didn't
>> make it into the changelogs.
>>
>> Why are you doing this again? It seems like an awfully blunt
>> instrument.
>
> So the fundamentals.
> - The data in sysfs fundamentally changes behind the back of the
> VFS and we need to keep the VFS in sync. Essentially this is the
> distributed filesystem problem.
>
> - In particular for sysfs_rename and sysfs_move_dir we need to support finding
> the dcache entries and calling d_move. So that the dcache does not
> get into an inconsistent state. Timeouts and invalidates like NFS
> uses are to be avoided if at all possible.
>
> - Coming through the vfs we are guaranteed that the filesystem will
> not be unmounted while we have a reference on a dentry, and with
> multiple mounts we do not get that guarantee. Therefore to get that
> guarantee for all of the superblocks we need the blunt instrument.
>
> - Since mount/unmount are rare blocking them is no big deal.
>
> I believe any distributed filesystem that is together enough to tell
> us about renames (so we can update the dcache) instead of doing the
> NFS timeout will need the ability to block mount/unmount while it is
> executing d_move.
>
> Currently sysfs does not need to block mounts only because we perform
> an internal mount and then never unmount sysfs.

Thanks Eric for detailing this.
I think you explained it in much better way than I could do.
You're the author of the patch after all ;-)

Benjamin

>
> Eric
>
>


--
B e n j a m i n T h e r y - BULL/DT/Open Software R&D

http://www.bull.com

2008-06-19 16:32:33

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 01/11] sysfs: Support for preventing unmounts.

On Wed, 2008-06-18 at 13:12 -0700, Eric W. Biederman wrote:
>
> - The data in sysfs fundamentally changes behind the back of the
> VFS and we need to keep the VFS in sync. Essentially this is the
> distributed filesystem problem.
>
> - In particular for sysfs_rename and sysfs_move_dir we need to support
> finding
> the dcache entries and calling d_move. So that the dcache does not
> get into an inconsistent state. Timeouts and invalidates like NFS
> uses are to be avoided if at all possible.

Much clearer now, thanks!

Can we get this description into the changelog, pretty please?

-- Dave

2008-06-19 20:19:21

by Benjamin Thery

[permalink] [raw]
Subject: Re: [PATCH 01/11] sysfs: Support for preventing unmounts.

Quoting Dave Hansen <[email protected]>:

> On Wed, 2008-06-18 at 13:12 -0700, Eric W. Biederman wrote:
>>
>> - The data in sysfs fundamentally changes behind the back of the
>> VFS and we need to keep the VFS in sync. Essentially this is the
>> distributed filesystem problem.
>>
>> - In particular for sysfs_rename and sysfs_move_dir we need to support
>> finding
>> the dcache entries and calling d_move. So that the dcache does not
>> get into an inconsistent state. Timeouts and invalidates like NFS
>> uses are to be avoided if at all possible.
>
> Much clearer now, thanks!
>
> Can we get this description into the changelog, pretty please?

Yes sure.
I'll add it to the patch introduction.

Benjamin

>
> -- Dave
>
>
>



----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.