2007-10-18 17:50:34

by Artem Bityutskiy

Subject: is the inode an orphan?

Hi,

I need some help from the VFS folks: when I'm in ->unlink(), is there a safe way
to tell whether ->delete_inode() is going to be called? IOW, I'd like to call
myfs_delete_inode() myself from ->unlink(), and not wait for the VFS to call
->delete_inode().

Or to put it differently, I'd like to know if the inode is an orphan or not in
->unlink()?

AFAICS, if (inode->i_nlink == 0 && atomic_read(&inode->i_count) == 2) then this
file is not going to be an orphan. As far as I can judge, it is safe to rely on
this, but I'm not sure, so I kindly ask for help.

Thanks.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)


2007-10-18 18:02:01

by Al Viro

Subject: Re: is the inode an orphan?

On Thu, Oct 18, 2007 at 08:49:34PM +0300, Artem Bityutskiy wrote:
> Hi,
>
> I need some help from the VFS folks: when I'm in ->unlink(), is there a safe
> way to tell whether ->delete_inode() is going to be called? IOW, I'd like to
> call myfs_delete_inode() myself from ->unlink(), and not wait for the VFS
> to call ->delete_inode().
>
> Or to put it differently, I'd like to know if the inode is an orphan or not
> in ->unlink()?
>
> AFAICS, if (inode->i_nlink == 0 && atomic_read(&inode->i_count) == 2) then
> this file is not going to be an orphan. As far as I can judge, it is safe to
> rely on this, but I'm not sure, so I kindly ask for help.

Define orphan. It might very well still be open after the only link
to it has been removed, and you will still get I/O on it.

2007-10-19 07:08:15

by Artem Bityutskiy

Subject: Re: is the inode an orphan?

Al Viro wrote:
> Define orphan. It might very well still be open after the only link
> to it has been removed, and you will still get I/O on it.

Well, in my mail I used "orphans" for files that are opened, have their last
link removed, and then still see some I/O. Let me briefly describe the problem
I'm trying to solve.

In our FS, when we're in ->unlink() and i_nlink becomes 0, we have to record
this inode in the table of orphans, and remove it from there in
->delete_inode(). This is needed so that orphans can be disposed of on the next
mount after an unclean reboot. AFAIK, ext3 has something similar. I just
figured that this could be optimized: in most cases ->delete_inode() is called
right after ->unlink(), and I wanted to avoid putting the inode into the orphan
table in those cases.
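
The orphan-table lifecycle described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the names (orphan_table, orphan_add, orphan_del, orphan_recover) and the fixed-size array are hypothetical and are not taken from UBIFS, ext3, or the VFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a fixed-size table of orphaned inode numbers. */
#define ORPHAN_TABLE_SIZE 16

struct orphan_table {
    unsigned long inums[ORPHAN_TABLE_SIZE]; /* inodes whose i_nlink reached 0 */
    size_t count;
};

/* Models the ->unlink() side: record the inode once its link count hits 0. */
static bool orphan_add(struct orphan_table *t, unsigned long inum)
{
    if (t->count == ORPHAN_TABLE_SIZE)
        return false; /* table full */
    t->inums[t->count++] = inum;
    return true;
}

/* Models the ->delete_inode() side: the inode is gone, drop its record. */
static bool orphan_del(struct orphan_table *t, unsigned long inum)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->inums[i] == inum) {
            t->inums[i] = t->inums[--t->count]; /* swap-remove */
            return true;
        }
    }
    return false; /* was never recorded */
}

/*
 * Models mount after an unclean reboot: every inode still in the table was
 * unlinked but never deleted. Copy the inode numbers to out[] so the caller
 * can dispose of them, then empty the table. Returns how many were found.
 */
static size_t orphan_recover(struct orphan_table *t, unsigned long *out,
                             size_t cap)
{
    size_t n = t->count;

    assert(cap >= n);
    for (size_t i = 0; i < n; i++)
        out[i] = t->inums[i];
    t->count = 0;
    return n;
}
```

In this model, the optimization under discussion amounts to skipping the orphan_add()/orphan_del() pair whenever the delete is known to follow the unlink immediately.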

I.e., if one just does "unlink file", then it is not going to be an orphan. And
most cases are like this. It is rather rare to open a file, unlink it, and keep
utilizing it.

So my question was: while I'm in ->unlink(), how do I figure out that this is
not an orphan? I was thinking about

if (atomic_read(&inode->i_count) == 2)

then this is not an orphan and ->delete_inode() will be called straight away
(i_nlink is assumed to be 0).

But I've now also figured that ->unlink() may race with write-back, and there
might be write-back I/O during ->unlink(), or between it and
->delete_inode(), even though user space does not have the file in question
open.

So, at the moment, AFAIU

if (atomic_read(&inode->i_count) == 2 && !(inode->i_state & I_DIRTY))

then there won't be any I/O on the inode between ->unlink() and
->delete_inode() (i_nlink is assumed to be 0). Is that right, and is it safe and
acceptable to use such checks in ->unlink() as an optimization?
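
For reference, the proposed check can be written out as a predicate over a user-space stand-in for the inode. This is a minimal sketch, not kernel code: struct model_inode, MODEL_I_DIRTY and model_needs_orphan_record() are hypothetical names, the flag value is arbitrary, and the reference count of 2 is simply the value proposed in this mail. Whether the check is actually safe is the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's dirty flag; the value is arbitrary. */
#define MODEL_I_DIRTY 0x1u

/* Hypothetical stand-in for the three struct inode fields the check reads. */
struct model_inode {
    unsigned int i_nlink; /* number of hard links */
    int i_count;          /* in-memory reference count */
    unsigned int i_state; /* state flags, including the dirty bit */
};

/*
 * Returns true if the inode should be recorded in the orphan table, i.e. the
 * proposed "delete follows immediately, with no I/O in between" test fails.
 * Only meaningful once the last link is gone.
 */
static bool model_needs_orphan_record(const struct model_inode *inode)
{
    assert(inode->i_nlink == 0);

    bool only_expected_refs = inode->i_count == 2; /* value from the mail */
    bool clean = !(inode->i_state & MODEL_I_DIRTY);

    return !(only_expected_refs && clean);
}
```

The model evaluates the fields at a single instant; it says nothing about what may change between the check and the eventual ->delete_inode() call.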

P.S. the code and a short description of the FS I refer to are here:
http://www.linux-mtd.infradead.org/doc/ubifs.html

Thanks!

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

2007-10-30 14:10:20

by Jan Kara

Subject: Re: is the inode an orphan?

> Al Viro wrote:
> >Define orphan. It might very well still be open after the only link
> >to it has been removed, and you will still get I/O on it.
>
> Well, in my mail I used "orphans" for files that are opened, have their
> last link removed, and then still see some I/O. Let me briefly describe the
> problem I'm trying to solve.
>
> In our FS, when we're in ->unlink() and i_nlink becomes 0, we have to record
> this inode in the table of orphans, and remove it from there in
> ->delete_inode(). This is needed so that orphans can be disposed of on the
> next mount after an unclean reboot. AFAIK, ext3 has something similar.
> I just figured that this could be optimized: in most cases
> ->delete_inode() is called right after ->unlink(), and I wanted to avoid
> putting the inode into the orphan table in those cases.
Yes, ext3 has something similar. But actually ext3 would have to insert
the inode into the orphan list anyway: in delete_inode we do a truncate, and
for that we also insert the inode into the orphan list, because a truncate
can be too large to fit into a single transaction.
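
To illustrate why a multi-transaction truncate wants an orphan record, here is a toy user-space model. It is a sketch under assumptions, not ext3 code: struct model_file, model_truncate(), the per-transaction block budget, and the simulated crash are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a file being truncated to zero length. */
struct model_file {
    unsigned long blocks; /* blocks still allocated */
    bool on_orphan_list;  /* recorded for recovery on the next mount */
};

/*
 * Frees at most `budget` blocks per transaction. If max_txns is non-zero,
 * the function stops ("crashes") once that many transactions have committed.
 * Returns the number of transactions committed by this call.
 */
static unsigned int model_truncate(struct model_file *f, unsigned long budget,
                                   unsigned int max_txns)
{
    unsigned int txns = 0;

    assert(budget > 0);
    f->on_orphan_list = true; /* recorded before any block is freed */

    while (f->blocks > 0) {
        if (max_txns != 0 && txns == max_txns)
            return txns; /* simulated crash: record is still in place */

        unsigned long n = f->blocks < budget ? f->blocks : budget;
        f->blocks -= n;
        txns++;
    }

    f->on_orphan_list = false; /* truncate finished, record no longer needed */
    return txns;
}
```

If the model stops partway, the file still has blocks allocated and is still on the orphan list, so calling model_truncate() again (standing in for recovery at the next mount) can finish the job.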

> I.e., if one just does "unlink file", then it is not going to be an orphan.
> And most cases are like this. It is rather rare to open a file, unlink it,
> and keep utilizing it.
>
> So my question was: while I'm in ->unlink(), how do I figure out that this
> is not an orphan? I was thinking about
>
> if (atomic_read(&inode->i_count) == 2)
>
> then this is not an orphan and ->delete_inode() will be called straight
> away (i_nlink is assumed to be 0).
>
> But I've now also figured that ->unlink() may race with write-back, and
> there might be write-back I/O during ->unlink(), or between it and
> ->delete_inode(), even though user space does not have the file in
> question open.
>
> So, at the moment, AFAIU
>
> if (atomic_read(&inode->i_count) == 2 && !(inode->i_state & I_DIRTY))
>
> then there won't be any I/O on the inode between ->unlink() and
> ->delete_inode() (i_nlink is assumed to be 0). Is that right, and is it safe
> and acceptable to use such checks in ->unlink() as an optimization?
>
> P.S. the code and a short description of the FS I refer to are here:
> http://www.linux-mtd.infradead.org/doc/ubifs.html
Hmm, I'm just not sure whether unlink cannot somehow race with open
(at least I don't see any lock that would prevent open while unlink is
in progress)...
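
The concern can be made concrete with a deterministic user-space model of the interleaving. This is purely hypothetical: it assumes such a race is possible, which is exactly what is uncertain here, and none of the names (race_inode, race_unlink_check, race_open, race_unlink_finish, race_unprotected) exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the unlink-time check depends on. */
struct race_inode {
    int i_count;             /* in-memory reference count */
    bool recorded_as_orphan; /* present in the orphan table */
};

/* Step 1 of unlink: decide whether the orphan record can be skipped. */
static bool race_unlink_check(const struct race_inode *inode)
{
    return inode->i_count == 2; /* value proposed earlier in the thread */
}

/* A concurrent open: takes one more reference on the inode. */
static void race_open(struct race_inode *inode)
{
    assert(inode->i_count > 0);
    inode->i_count++;
}

/* Step 2 of unlink: act on the decision taken in step 1. */
static void race_unlink_finish(struct race_inode *inode, bool skip_record)
{
    if (!skip_record)
        inode->recorded_as_orphan = true;
}

/* True if the inode is still referenced by someone but has no orphan record. */
static bool race_unprotected(const struct race_inode *inode)
{
    return inode->i_count > 2 && !inode->recorded_as_orphan;
}
```

If the open lands between race_unlink_check() and race_unlink_finish(), the decision is stale and the inode ends up unprotected; if the open lands before the check, the inode is recorded as usual.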

Honza

2007-11-19 15:48:17

by Artem Bityutskiy

Subject: Re: is the inode an orphan?

Hi,

Jan Kara wrote:
>> In our FS, when we're in ->unlink() and i_nlink becomes 0, we have to record
>> this inode in the table of orphans, and remove it from there in
>> ->delete_inode(). This is needed so that orphans can be disposed of on the
>> next mount after an unclean reboot. AFAIK, ext3 has something similar.
>> I just figured that this could be optimized: in most cases
>> ->delete_inode() is called right after ->unlink(), and I wanted to avoid
>> putting the inode into the orphan table in those cases.
> Yes, ext3 has something similar. But actually ext3 would have to insert
> the inode into the orphan list anyway: in delete_inode we do a truncate, and
> for that we also insert the inode into the orphan list, because a truncate
> can be too large to fit into a single transaction.

Ok, thanks for this point.

> Hmm, I'm just not sure whether unlink cannot somehow race with open
> (at least I don't see any lock that would prevent open while unlink is
> in progress)...

And this.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)