Hello,
1. I found that generic_forget_inode() calls write_inode_now() dropping
inode_lock and destroys inode after that. The problem is that
write_inode_now() can sleep and during this sleep someone can find inode
in the hash, w/o I_FREEING state and with i_count = 0.
If such inode will be iget'ed, then it will be iput'ed once more later
messing with the current iput(). So the inode can be cleared and
destroyed twice.
2. Why there is no wake_up_inode() in generic_forget_inode() like in
generic_delete_inode()? Looks like it is missing...
is it bugs in generic_forget_inode()?
Kirill
Kirill Korotaev <[email protected]> wrote:
>
> Hello,
>
> 1. I found that generic_forget_inode() calls write_inode_now() dropping
> inode_lock and destroys inode after that. The problem is that
> write_inode_now() can sleep and during this sleep someone can find inode
> in the hash, w/o I_FREEING state and with i_count = 0.
The filesystem is in the process of being unmounted (!MS_ACTIVE). So the
question is: who is doing inode lookups against a soon-to-be-defunct
filesystem?
Andrew Morton wrote:
> Kirill Korotaev <[email protected]> wrote:
>
>>Hello,
>>
>>1. I found that generic_forget_inode() calls write_inode_now() dropping
>>inode_lock and destroys inode after that. The problem is that
>>write_inode_now() can sleep and during this sleep someone can find inode
>>in the hash, w/o I_FREEING state and with i_count = 0.
>
> The filesystem is in the process of being unmounted (!MS_ACTIVE). So the
> question is: who is doing inode lookups against a soon-to-be-defunct
> filesystem?
Yup, I'm studing this issue.
But while looking at code I found this interesting place:
__writeback_single_inode()
{
while (inode->i_state & I_LOCK) {
__iget(inode); <<<<<<
spin_unlock(&inode_lock);
__wait_on_inode(inode);
iput(inode); <<<<<<
spin_lock(&inode_lock);
}
return __sync_single_inode(inode, wbc); <<<<<<
}
the problem here is iget/iput.
There are 2 possible cases:
1. all callers of this function do hold reference on inode, then
iget/iput is unneeded.
2. if 1) is incorrect then it's a bug, since inode is used after iput.
This place looks really ugly, or I don't understand something here?
Kirill
Kirill Korotaev <[email protected]> wrote:
>
> Andrew Morton wrote:
> > Kirill Korotaev <[email protected]> wrote:
> >
> >>Hello,
> >>
> >>1. I found that generic_forget_inode() calls write_inode_now() dropping
> >>inode_lock and destroys inode after that. The problem is that
> >>write_inode_now() can sleep and during this sleep someone can find inode
> >>in the hash, w/o I_FREEING state and with i_count = 0.
> >
> > The filesystem is in the process of being unmounted (!MS_ACTIVE). So the
> > question is: who is doing inode lookups against a soon-to-be-defunct
> > filesystem?
>
> Yup, I'm studing this issue.
Do you mean to say that you are observing the above scenario with an
unmodified kernel.org filesystem? Which one?
I suggest you add a
BUG_ON(!(inode->i_sb->s_flags & MS_ACTIVE));
in the hash lookup code.
> But while looking at code I found this interesting place:
>
> __writeback_single_inode()
> {
> while (inode->i_state & I_LOCK) {
> __iget(inode); <<<<<<
> spin_unlock(&inode_lock);
> __wait_on_inode(inode);
> iput(inode); <<<<<<
> spin_lock(&inode_lock);
> }
> return __sync_single_inode(inode, wbc); <<<<<<
> }
>
> the problem here is iget/iput.
>
> There are 2 possible cases:
> 1. all callers of this function do hold reference on inode, then
> iget/iput is unneeded.
> 2. if 1) is incorrect then it's a bug, since inode is used after iput.
>
> This place looks really ugly, or I don't understand something here?
You're right - the iget/iput is not needed. The only caller here who does
not already have a ref on the inode is pdflush, and we already did an
__iget on that callpath.