2007-11-29 21:56:57

by David Chinner

Subject: Race between generic_forget_inode() and sync_sb_inodes()?


If we are in the process of dropping an inode and it is hashed,
generic_forget_inode() will mark it I_WILL_FREE and drop the
inode_lock before calling write_inode_now().

However, at this point the inode is still on the sb->s_dirty list,
so sync_sb_inodes() could see it and try to write it back,
i.e.:

generic_forget_inode            sync_sb_inodes

i_state |= I_WILL_FREE
spin_unlock(inode_lock)
write_inode_now()
                                spin_lock(inode_lock)
                                __iget(inode)
                                __writeback_single_inode()
                                spin_unlock(inode_lock)
spin_lock(inode_lock)
i_state &= ~I_WILL_FREE
(remove from lists)
i_state |= I_FREEING
spin_unlock(inode_lock)
i_state = I_CLEAR
                                spin_lock(inode_lock)
                                iput(inode)
                                  BUG_ON(i_state == I_CLEAR)
(inode gets freed)


I came across this because I've been making changes to XFS to avoid the
inode hash, and I've found that I need to remove the inode from the
dirty list when setting I_WILL_FREE to avoid this race. I can't see
how this race is avoided when inodes are hashed, so I'm wondering
if we've just been lucky or there's something that I'm missing that
means the above does not occur.
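
The interleaving above can be replayed step by step in a small userspace model. This is only a sketch: `struct model_inode`, the helper names, and the flag bit values are all hypothetical, and the locking is left out because the replay is sequential. It shows the sync path taking a reference while the inode is still on the dirty list, and the later put finding the state already set to I_CLEAR. Passing `unlink_early = true` models the workaround of removing the inode from the dirty list when I_WILL_FREE is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real I_* flags are defined in
 * include/linux/fs.h and vary between kernel versions. */
enum {
    I_DIRTY     = 1 << 0,
    I_WILL_FREE = 1 << 1,
    I_FREEING   = 1 << 2,
    I_CLEAR     = 1 << 3,
};

/* Hypothetical stand-in for struct inode. */
struct model_inode {
    unsigned long i_state;
    int i_count;
    bool on_dirty_list;
};

/* Forget path, first half: mark the inode, then drop the lock to write it. */
static void forget_begin(struct model_inode *inode, bool unlink_early)
{
    assert(inode->i_count == 0);    /* runs only after the last reference is gone */
    inode->i_state |= I_WILL_FREE;
    if (unlink_early)
        inode->on_dirty_list = false;
}

/* Sync path: anything still on the dirty list gets a reference (__iget). */
static bool sync_grab(struct model_inode *inode)
{
    if (!inode->on_dirty_list)
        return false;
    inode->i_count++;
    return true;
}

/* Forget path, second half: retake the lock and tear the inode down. */
static void forget_finish(struct model_inode *inode)
{
    inode->i_state &= ~I_WILL_FREE;
    inode->on_dirty_list = false;   /* (remove from lists) */
    inode->i_state |= I_FREEING;
    inode->i_state = I_CLEAR;
}

/* Sync path: drop the reference; true if BUG_ON(i_state == I_CLEAR) would fire. */
static bool sync_put_hits_bug(struct model_inode *inode)
{
    inode->i_count--;
    return inode->i_state == I_CLEAR;
}

/* Replays the interleaving; returns true if the BUG_ON would trigger. */
bool replay_race(bool unlink_early)
{
    struct model_inode inode = { I_DIRTY, 0, true };

    forget_begin(&inode, unlink_early);
    bool grabbed = sync_grab(&inode);
    forget_finish(&inode);
    return grabbed && sync_put_hits_bug(&inode);
}
```

In this model, `replay_race(false)` reports the BUG_ON and `replay_race(true)` does not, because the sync path never sees the inode once it has left the dirty list.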

If it really is a race, then sync_sb_inodes() should really check
I_WILL_FREE before doing an __iget(), I think. If I_WILL_FREE is set
we are already doing writeback so we don't need to do it from
sync_sb_inodes(), right?
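
A minimal sketch of what such a check could look like, written as a userspace model rather than as a patch against fs/fs-writeback.c. The struct, the helper names, and the flag bit values are assumptions made for illustration; only the idea of testing I_WILL_FREE before taking a reference comes from the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's definitions. */
enum {
    I_DIRTY     = 1 << 0,
    I_WILL_FREE = 1 << 1,
};

/* Hypothetical stand-in for struct inode. */
struct model_inode {
    unsigned long i_state;
    int i_count;
};

/* Hypothetical helper: the forget path is already writing this inode. */
bool sync_should_skip(const struct model_inode *inode)
{
    return (inode->i_state & I_WILL_FREE) != 0;
}

/* Model of __iget(): a reference must never be taken on a dying inode. */
static void model_iget(struct model_inode *inode)
{
    assert(!(inode->i_state & I_WILL_FREE));
    inode->i_count++;
}

/* Sync path: test the flag first and take a reference only if it is clear.
 * Returns true if a reference was taken. */
bool sync_try_grab(struct model_inode *inode)
{
    if (sync_should_skip(inode))
        return false;
    model_iget(inode);
    return true;
}
```

With the flag tested before the reference is taken, the sync path never holds a reference across the transition to I_CLEAR, so the later iput() has nothing to trip over.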

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group


2007-11-29 22:07:35

by NeilBrown

Subject: Re: Race between generic_forget_inode() and sync_sb_inodes()?


Hi David,

On Friday November 30, [email protected] wrote:
>
>
> I came across this because I've been making changes to XFS to avoid the
> inode hash, and I've found that I need to remove the inode from the
> dirty list when setting I_WILL_FREE to avoid this race. I can't see
> how this race is avoided when inodes are hashed, so I'm wondering
> if we've just been lucky or there's something that I'm missing that
> means the above does not occur.

Looking at inode.c in 2.6.23-mm1, in generic_forget_inode, I see code:

	if (!hlist_unhashed(&inode->i_hash)) {
		if (!(inode->i_state & (I_DIRTY|I_SYNC)))
			list_move(&inode->i_list, &inode_unused);

so it looks to me like:
If the inode is hashed and dirty, then move it (off the s_dirty
list) to inode_unused.

So it seems to me that generic_forget_inode also finds it needs to
remove the inode from the dirty list when setting I_WILL_FREE.

Maybe we are looking at different kernel versions? Maybe I
misunderstood your problem?

NeilBrown

2007-11-29 22:24:31

by David Chinner

Subject: Re: Race between generic_forget_inode() and sync_sb_inodes()?

On Fri, Nov 30, 2007 at 09:07:06AM +1100, Neil Brown wrote:
>
> Hi David,
>
> On Friday November 30, [email protected] wrote:
> >
> >
> > I came across this because I've been making changes to XFS to avoid the
> > inode hash, and I've found that I need to remove the inode from the
> > dirty list when setting I_WILL_FREE to avoid this race. I can't see
> > how this race is avoided when inodes are hashed, so I'm wondering
> > if we've just been lucky or there's something that I'm missing that
> > means the above does not occur.
>
> Looking at inode.c in 2.6.23-mm1, in generic_forget_inode, I see code:
>
> 	if (!hlist_unhashed(&inode->i_hash)) {
> 		if (!(inode->i_state & (I_DIRTY|I_SYNC)))
> 			list_move(&inode->i_list, &inode_unused);
>
> so it looks to me like:
> If the inode is hashed and dirty, then move it (off the s_dirty
> list) to inode_unused.

That check is for when the inode is _not_ dirty or being synced, right?
Or have I just not had enough coffee this morning?
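
To make the reading of that condition concrete, here is a small sketch that evaluates the same expression for each flag combination. The bit values are illustrative assumptions (in the kernel, I_DIRTY is a mask of several dirty bits), and `moves_to_unused()` is a hypothetical name for the quoted test.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's definitions. */
enum {
    I_DIRTY = 1 << 0,
    I_SYNC  = 1 << 1,
};

/* Mirrors the quoted test: true only when neither flag is set. */
bool moves_to_unused(unsigned long i_state)
{
    return !(i_state & (I_DIRTY | I_SYNC));
}

/* Walks the four combinations of the two flags. */
void check_unused_list_test(void)
{
    assert(moves_to_unused(0));                  /* clean and idle: moved */
    assert(!moves_to_unused(I_DIRTY));           /* dirty: left where it is */
    assert(!moves_to_unused(I_SYNC));            /* under sync: left where it is */
    assert(!moves_to_unused(I_DIRTY | I_SYNC));  /* both: left where it is */
}
```

So a dirty inode fails the test and is not moved to inode_unused by this branch.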

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

2007-11-29 23:04:29

by NeilBrown

Subject: Re: Race between generic_forget_inode() and sync_sb_inodes()?

On Friday November 30, [email protected] wrote:
> On Fri, Nov 30, 2007 at 09:07:06AM +1100, Neil Brown wrote:
> >
> > Hi David,
> >
> > On Friday November 30, [email protected] wrote:
> > >
> > >
> > > I came across this because I've been making changes to XFS to avoid the
> > > inode hash, and I've found that I need to remove the inode from the
> > > dirty list when setting I_WILL_FREE to avoid this race. I can't see
> > > how this race is avoided when inodes are hashed, so I'm wondering
> > > if we've just been lucky or there's something that I'm missing that
> > > means the above does not occur.
> >
> > Looking at inode.c in 2.6.23-mm1, in generic_forget_inode, I see code:
> >
> > 	if (!hlist_unhashed(&inode->i_hash)) {
> > 		if (!(inode->i_state & (I_DIRTY|I_SYNC)))
> > 			list_move(&inode->i_list, &inode_unused);
> >
> > so it looks to me like:
> > If the inode is hashed and dirty, then move it (off the s_dirty
> > list) to inode_unused.
>
> That check is for when the inode is _not_ dirty or being synced, right?
> Or have I just not had enough coffee this morning?

:-) And I cannot even blame the lack of coffee as I don't drink it.

My second guess is that we have been lucky.... which is hard to believe.

I wonder if iput (and even iget) should BUG on I_WILL_FREE as well...

Perplexed.

NeilBrown

2007-11-30 09:37:28

by Jarek Poplawski

Subject: Re: Race between generic_forget_inode() and sync_sb_inodes()?

On 30-11-2007 00:03, Neil Brown wrote:
> On Friday November 30, [email protected] wrote:
...
>> Or have I just not had enough coffee this morning?
>
> :-) And I cannot even blame the lack of coffee as I don't drink it.
>

Looks like a logical error...

(Or I haven't had enough coffee this morning either?)

Regards,
Jarek P.