2005-03-30 20:10:48

by Russ Weight

[permalink] [raw]
Subject: [PATCH] Set MS_ACTIVE in isofs_fill_super()

This patch sets the MS_ACTIVE bit in isofs_fill_super() prior to calling
iget() or iput(). This eliminates a race condition between mount
(for isofs) and kswapd that results in a system panic.

Signed-off-by: Russ Weight <[email protected]>

--- linux-2.6.12-rc1/fs/isofs/inode.c 2005-03-17 17:34:36.000000000
-0800
+++ linux-2.6.12-rc1-isofsfix/fs/isofs/inode.c 2005-03-22
15:29:51.945607217 -0800
@@ -820,6 +820,7 @@
* the s_rock flag. Once we have the final s_rock value,
* we then decide whether to use the Joliet descriptor.
*/
+ s->s_flags |= MS_ACTIVE;
inode = isofs_iget(s, sbi->s_firstdatazone, 0);

/*
@@ -909,6 +910,7 @@
kfree(opt.iocharset);
kfree(sbi);
s->s_fs_info = NULL;
+ s->s_flags &= ~MS_ACTIVE;
return -EINVAL;
}



2005-03-30 20:39:24

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Set MS_ACTIVE in isofs_fill_super()

Russ Weight <[email protected]> wrote:
>
> This patch sets the MS_ACTIVE bit in isofs_fill_super() prior to calling
> iget() or iput(). This eliminates a race condition between mount
> (for isofs) and kswapd that results in a system panic.
>
> Signed-off-by: Russ Weight <[email protected]>
>
> --- linux-2.6.12-rc1/fs/isofs/inode.c 2005-03-17 17:34:36.000000000
> -0800
> +++ linux-2.6.12-rc1-isofsfix/fs/isofs/inode.c 2005-03-22
> 15:29:51.945607217 -0800
> @@ -820,6 +820,7 @@
> * the s_rock flag. Once we have the final s_rock value,
> * we then decide whether to use the Joliet descriptor.
> */
> + s->s_flags |= MS_ACTIVE;
> inode = isofs_iget(s, sbi->s_firstdatazone, 0);
>
> /*
> @@ -909,6 +910,7 @@
> kfree(opt.iocharset);
> kfree(sbi);
> s->s_fs_info = NULL;
> + s->s_flags &= ~MS_ACTIVE;
> return -EINVAL;
> }
>

The patch is obviously safe enough, but seems a bit kludgy.

The basic problem here appears to be that isofs is doing iget/iput in
->fill_super before MS_ACTIVE is set and the inode freeing code
(generic_forget_inode) doesn't expect that to happen, yes?

I wonder if it would make more sense for all the ->fill_super callers to
set MS_ACTIVE prior to calling ->fill_super(), and clear MS_ACTIVE if
fill_super() failed?

2005-03-30 22:20:35

by Jan Harkes

[permalink] [raw]
Subject: Re: [PATCH] Set MS_ACTIVE in isofs_fill_super()

On Wed, Mar 30, 2005 at 12:39:07PM -0800, Andrew Morton wrote:
> Russ Weight <[email protected]> wrote:
> > This patch sets the MS_ACTIVE bit in isofs_fill_super() prior to calling
> > iget() or iput(). This eliminates a race condition between mount
> > (for isofs) and kswapd that results in a system panic.
>
> The patch is obviously safe enough, but seems a bit kludgy.

Ehh, I think every filesystem must have that bug, since fill_super AFAIK
is expected to initialize the root dentry (sb->s_root), or at least that
is what the function used to do before it became 'fill_super'.

When I saw this bugreport I immediately knew that Coda did the same
thing "wrong". Let's see,

ext2/ext3, iget in fill_super
reiserfs, iget5_locked in fill_super
ramfs, new_inode in fill_super

There probably isn't a reason to keep looking further at this point...

I think logically if a filesystem needs to perform any setup before it
can handle VFS request that should probably be done in the registered
get_sb function before that function then calls back into the VFS to get
the actual superblock with get_sb_dev/nodev/single(..., fill_super).

Not sure how realistic this thought is, since we don't have an
initialized superblock at this time to hang things off of.

> I wonder if it would make more sense for all the ->fill_super callers to
> set MS_ACTIVE prior to calling ->fill_super(), and clear MS_ACTIVE if
> fill_super() failed?

This sounds like a better solution, although filesystems might have to
handle some operations earlier than they currently expect.

Jan

2005-03-30 22:27:22

by Russ Weight

[permalink] [raw]
Subject: Re: [PATCH] Set MS_ACTIVE in isofs_fill_super()

On Wed, 2005-03-30 at 12:39 -0800, Andrew Morton wrote:
> Russ Weight <[email protected]> wrote:
> >
> > This patch sets the MS_ACTIVE bit in isofs_fill_super() prior to calling
> > iget() or iput(). This eliminates a race condition between mount
> > (for isofs) and kswapd that results in a system panic.
> >
> > Signed-off-by: Russ Weight <[email protected]>
> >
> > --- linux-2.6.12-rc1/fs/isofs/inode.c 2005-03-17 17:34:36.000000000
> > -0800
> > +++ linux-2.6.12-rc1-isofsfix/fs/isofs/inode.c 2005-03-22
> > 15:29:51.945607217 -0800
> > @@ -820,6 +820,7 @@
> > * the s_rock flag. Once we have the final s_rock value,
> > * we then decide whether to use the Joliet descriptor.
> > */
> > + s->s_flags |= MS_ACTIVE;
> > inode = isofs_iget(s, sbi->s_firstdatazone, 0);
> >
> > /*
> > @@ -909,6 +910,7 @@
> > kfree(opt.iocharset);
> > kfree(sbi);
> > s->s_fs_info = NULL;
> > + s->s_flags &= ~MS_ACTIVE;
> > return -EINVAL;
> > }
> >
>
> The patch is obviously safe enough, but seems a bit kludgy.
>
> The basic problem here appears to be that isofs is doing iget/iput in
> ->fill_super before MS_ACTIVE is set and the inode freeing code
> (generic_forget_inode) doesn't expect that to happen, yes?

Yes.

> I wonder if it would make more sense for all the ->fill_super callers to
> set MS_ACTIVE prior to calling ->fill_super(), and clear MS_ACTIVE if
> fill_super() failed?

It does seem a bit kludgy. I was trying keep the changes in isofs
specific code, and this seemed the best way to do it.

When you say "all the ->fill_super callers", are you referring to the
fill_super functions for all filesystems? Or just callers to
isofs_fill_super()?

isofs_fill_super() is called by get_sb_bdev(), which is in the generic
filesystem code. get_sb_bdev() receives a pointer as a parameter when it
is called from isofs_get_sb(). To make a change in get_sb_bdev() would
be to affect many (all?) filesystems. This may be the right thing to do
- although it seems a little heavy-weight since the failure hasn't been
reported in other filesystems.



2005-03-30 23:18:50

by Jan Harkes

[permalink] [raw]
Subject: Re: [PATCH] Set MS_ACTIVE in isofs_fill_super()

On Wed, Mar 30, 2005 at 02:26:59PM -0800, Russ Weight wrote:
> isofs_fill_super() is called by get_sb_bdev(), which is in the generic
> filesystem code. get_sb_bdev() receives a pointer as a parameter when it
> is called from isofs_get_sb(). To make a change in get_sb_bdev() would
> be to affect many (all?) filesystems. This may be the right thing to do
> - although it seems a little heavy-weight since the failure hasn't been
> reported in other filesystems.

It probably hasn't been reported in other filesystems because they only
use iput() in the error path if they can't get the dentry for the root
inode, which is probably a fairly rare situation. From what I can see
only isofs seems to intentionally drop the inode to grab another one.

Jan

2005-04-01 15:15:02

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] Set MS_ACTIVE in isofs_fill_super()

> On Wed, Mar 30, 2005 at 12:39:07PM -0800, Andrew Morton wrote:
<snip>
> > I wonder if it would make more sense for all the ->fill_super callers to
> > set MS_ACTIVE prior to calling ->fill_super(), and clear MS_ACTIVE if
> > fill_super() failed?
>
> This sounds like a better solution, although filesystems might have to
> handle some operations earlier than they currently expect.
Actually I've just checked the code and MS_ACTIVE is almost unused -
the only place where it is checked is generic_forget_inode(). So
Andrew's suggestion should be safe (and a quick grep showed that quite
a few filesystems already set MS_ACTIVE in their code)... What really
protects the filesystem from early operations is the fact that its
directory tree is made visible to the rest of the world in
do_add_mount() which is quite after fill_super() has finished.

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs