2001-04-12 19:20:02

by Marcelo Tosatti

[permalink] [raw]
Subject: generic_osync_inode() broken?


Hi,

generic_osync_inode() (called by generic_file_write()) is not checking if
the inode being synced has the I_LOCK bit set before checking the I_DIRTY
bit.

AFAICS, the following problem can happen:

sync()
...
sync_one()
reset I_DIRTY, set I_LOCK
filemap_fdatasync() <-- #window
write_inode() <-- #window
filemap_fdatawait() <-- #window
unset I_LOCK

There is no guarantee that the inode is fully synced until sync_one()
cleans the inode I_LOCK bit.

If generic_osync_inode() checks the I_DIRTY bit (and sees it clean) during
"#window", an "O_SYNC write()" call may return to userspace without having
all the data actually synced.

If I'm not missing something here this patch should the problem.

Comments?

--- fs/inode.c~ Thu Mar 22 16:04:13 2001
+++ fs/inode.c Thu Apr 12 15:18:22 2001
@@ -347,6 +347,11 @@
#endif

spin_lock(&inode_lock);
+ while (inode->i_state & I_LOCK) {
+ spin_unlock(&inode_lock);
+ __wait_on_inode(inode);
+ spin_lock(&inode_lock);
+ }
if (!(inode->i_state & I_DIRTY))
goto out;
if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))


2001-04-12 19:32:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: generic_osync_inode() broken?



On Thu, 12 Apr 2001, Marcelo Tosatti wrote:
>
> Comments?
>
> --- fs/inode.c~ Thu Mar 22 16:04:13 2001
> +++ fs/inode.c Thu Apr 12 15:18:22 2001
> @@ -347,6 +347,11 @@
> #endif
>
> spin_lock(&inode_lock);
> + while (inode->i_state & I_LOCK) {
> + spin_unlock(&inode_lock);
> + __wait_on_inode(inode);
> + spin_lock(&inode_lock);
> + }
> if (!(inode->i_state & I_DIRTY))
> goto out;
> if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))

Ehh.

Why not just lock the inode around the thing?

The above looks rather ugly.

Linus

2001-04-12 19:38:50

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: generic_osync_inode() broken?


On Thu, 12 Apr 2001, Linus Torvalds wrote:

> On Thu, 12 Apr 2001, Marcelo Tosatti wrote:
> >
> > Comments?
> >
> > --- fs/inode.c~ Thu Mar 22 16:04:13 2001
> > +++ fs/inode.c Thu Apr 12 15:18:22 2001
> > @@ -347,6 +347,11 @@
> > #endif
> >
> > spin_lock(&inode_lock);
> > + while (inode->i_state & I_LOCK) {
> > + spin_unlock(&inode_lock);
> > + __wait_on_inode(inode);
> > + spin_lock(&inode_lock);
> > + }
> > if (!(inode->i_state & I_DIRTY))
> > goto out;
> > if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
>
> Ehh.
>
> Why not just lock the inode around the thing?
>
> The above looks rather ugly.

You mean writing a function called "lock_inode()" or whatever to basically
do what I did ?



2001-04-12 20:01:44

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: generic_osync_inode() broken?



On Thu, 12 Apr 2001, Marcelo Tosatti wrote:

>
> On Thu, 12 Apr 2001, Linus Torvalds wrote:
>
> > On Thu, 12 Apr 2001, Marcelo Tosatti wrote:
> > >
> > > Comments?
> > >
> > > --- fs/inode.c~ Thu Mar 22 16:04:13 2001
> > > +++ fs/inode.c Thu Apr 12 15:18:22 2001
> > > @@ -347,6 +347,11 @@
> > > #endif
> > >
> > > spin_lock(&inode_lock);
> > > + while (inode->i_state & I_LOCK) {
> > > + spin_unlock(&inode_lock);
> > > + __wait_on_inode(inode);
> > > + spin_lock(&inode_lock);
> > > + }
> > > if (!(inode->i_state & I_DIRTY))
> > > goto out;
> > > if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
> >
> > Ehh.
> >
> > Why not just lock the inode around the thing?
> >
> > The above looks rather ugly.
>
> You mean writing a function called "lock_inode()" or whatever to basically
> do what I did ?

Oh well, its still bad.

We drop the inode_lock before calling write_inode_now() (which is broken,
too :)), which means someone can set I_LOCK under us.

I'll send you a patch to fix that one (and other callers of
write_inode_now()) later.


2001-04-13 00:24:29

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: generic_osync_inode() broken?


On Thu, 12 Apr 2001, Linus Torvalds wrote:

>
>
> On Thu, 12 Apr 2001, Marcelo Tosatti wrote:
> >
> > Comments?
> >
> > --- fs/inode.c~ Thu Mar 22 16:04:13 2001
> > +++ fs/inode.c Thu Apr 12 15:18:22 2001
> > @@ -347,6 +347,11 @@
> > #endif
> >
> > spin_lock(&inode_lock);
> > + while (inode->i_state & I_LOCK) {
> > + spin_unlock(&inode_lock);
> > + __wait_on_inode(inode);
> > + spin_lock(&inode_lock);
> > + }
> > if (!(inode->i_state & I_DIRTY))
> > goto out;
> > if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
>
> Ehh.
>
> Why not just lock the inode around the thing?
>
> The above looks rather ugly.

Ok, me again.

The inode->i_state locking is rather nasty: there is no need to lock the
inode. We just have to wait for it to become unlocked, since its
guaranteed that who locked it wrote it to disk. (sync_one())

Aviro suggested the following, which is much cleaner than the previous
patch:


--- fs/inode.c~ Thu Apr 12 21:15:23 2001
+++ fs/inode.c Thu Apr 12 21:16:35 2001
@@ -301,6 +301,8 @@
while (inode->i_state & I_DIRTY)
sync_one(inode, sync);
spin_unlock(&inode_lock);
+ if (sync)
+ wait_on_inode(inode);
}
else
printk("write_inode_now: no super block\n");
@@ -357,6 +359,7 @@

out:
spin_unlock(&inode_lock);
+ wait_on_inode(inode);
return err;
}