2004-10-13 01:39:37

by Charles Manning

Subject: Using ilookup?

Hi

I'm the maintainer for YAFFS, the NAND-specific file system used in many
mobile/embedded Linux devices over the last two years.

There is a problem that I'm unsure how to best fix. I thought perhaps using
ilookup might be a good idea, but I see a big fat warning that ilookup is
perhaps not the function to be using. I therefore seek help.

YAFFS allocates its own "objectId"s, which are used as inode numbers for most
purposes. When objects get deleted (== unlinked), the object numbers get
recycled. Sometimes, though, the Linux inode cache still holds an inode after
the object has been deleted. If that object ID is then recycled before the
cached inode is released, iget() returns the old inode instead of creating a
new one, and we end up with an inconsistency.

Someone has been able to force this by doing the following:
# cd /test; rm -rf /test

I think I need to do one of two things:

1) Somehow plug myself into the inode iput() chain so that I know when an
inode is removed from the cache. I can then make sure that I don't free up
the inode number for reuse until the inode is not in the cache. Any hints on
how to do that?

2) When creating inode numbers, I could first check whether there is a
corresponding inode in the cache and not recycle that number if there is. Is
ilookup the correct/safe thing to use for this?

2') A further issue here is that ilookup is not available in some older 2.4.x
versions. Is it OK to just patch the ilookup code from, say, 2.4.27 back into
earlier versions (say 2.4.18, which seems a popular vintage for embedded stuff
for some reason or other)?
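
As a rough userspace illustration of option 2 (not kernel code), the allocator
below refuses to hand out an object ID while a cached entry for it still
exists. cache_contains() is a hypothetical stand-in for an ilookup()-style
probe; every name and the tiny table size are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8               /* tiny ID space, for illustration only */

static bool id_in_use[MAX_IDS]; /* object currently exists on the medium */
static bool id_cached[MAX_IDS]; /* stand-in for "inode still in the icache" */

/* Hypothetical stand-in for an ilookup()-style probe of the inode cache. */
static bool cache_contains(int id)
{
    return id_cached[id];
}

/* Return the lowest free ID that has no cached inode, or -1 if none. */
static int alloc_object_id(void)
{
    for (int id = 0; id < MAX_IDS; id++) {
        if (!id_in_use[id] && !cache_contains(id)) {
            id_in_use[id] = true;
            id_cached[id] = true;   /* a new inode is instantiated for it */
            return id;
        }
    }
    return -1;
}

/* Unlink: the object goes away, but the cached inode may linger. */
static void delete_object(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    id_in_use[id] = false;
}

/* Called when the cache finally drops the inode. */
static void cache_evict(int id)
{
    assert(id >= 0 && id < MAX_IDS);
    id_cached[id] = false;
}
```

With this shape, an ID deleted while its inode is still cached is skipped by
alloc_object_id() until cache_evict() has run for it.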


Thanx

-- Charles



2004-10-13 05:51:11

by Andreas Dilger

Subject: Re: Using ilookup?

On Oct 13, 2004 14:42 +1300, Charles Manning wrote:
> YAFFS allocates its own "objectId"s which are used as inode numbers for most
> purposes. When objects get deleted (== unlinked), the object numbers get
> recycled. Sometimes though the Linux cache has an inode after the object
> has been deleted. Then if that object id gets recycled before the cached
> inode is released, a problem occurs since iget() gets the old inode instead
> of creating a new one. We then end up with an inconsistency.

You can use iget4() along with a filesystem-specific comparison function,
which allows you to distinguish inodes with the same number based on
some extra data (e.g. generation number, 64-bit inode numbers, etc). Is
there a reason to recycle the inode numbers, or could you just have a
32-bit counter?
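
A minimal userspace sketch of that idea, assuming a generation counter is the
extra data: a lookup matches only when both the number and the generation
agree. The types and names below are invented for illustration and only
loosely mirror the shape of a comparison callback; this is not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a cached inode; not the kernel's struct inode. */
struct sim_inode {
    unsigned long ino;
    unsigned int generation;   /* bumped each time the number is reused */
    int valid;
};

/* Comparison callback, modelled loosely on a filesystem-specific actor. */
typedef int (*sim_find_actor)(const struct sim_inode *inode,
                              unsigned long ino, void *opaque);

#define CACHE_SLOTS 4
static struct sim_inode cache[CACHE_SLOTS];

/* Accept an entry only if its generation matches the one asked for. */
static int match_generation(const struct sim_inode *inode,
                            unsigned long ino, void *opaque)
{
    (void)ino;
    return inode->generation == *(unsigned int *)opaque;
}

/* Find a cached entry with this number that the actor accepts, or NULL. */
static struct sim_inode *sim_lookup(unsigned long ino,
                                    sim_find_actor actor, void *opaque)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct sim_inode *inode = &cache[i];
        if (inode->valid && inode->ino == ino &&
            (actor == NULL || actor(inode, ino, opaque)))
            return inode;
    }
    return NULL;
}
```

A lookup by number alone would return the stale entry; with the callback, a
request for a newer generation misses and a fresh inode can be set up instead.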

> 1) Somehow plug myself into the inode iput() chain so that I know when
> an inode is removed from the cache. I can then make sure that I don't
> free up the inode number for reuse until the inode is not in the cache.
> Any hints on how to do that?

You can use the ->delete_inode method, which is a hook called
before/instead of the clear_inode() function in iput(), and is
the last action taken when the inode is being unlinked. There
is also the ->clear_inode method, which is called whenever an inode
is being put away, not only when it is being unlinked.
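
The ordering can be sketched in userspace as follows, under the assumption
that the ID is released only from the delete hook. sim_inode, sim_unlink(),
sim_iput() and sim_delete_inode() are hypothetical names that imitate the
roles described above; they are not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 4

/* Simplified stand-in for an in-core inode; not the kernel structure. */
struct sim_inode {
    int id;
    int nlink;      /* directory entries pointing at the object */
    int count;      /* active references (open files, cwd, ...) */
};

static bool id_free[MAX_IDS] = { true, true, true, true };

/* Plays the role of ->delete_inode: the only place an ID is released. */
static void sim_delete_inode(struct sim_inode *inode)
{
    id_free[inode->id] = true;
}

/* Unlink only drops the link count; it does not release the ID. */
static void sim_unlink(struct sim_inode *inode)
{
    assert(inode->nlink > 0);
    inode->nlink--;
}

/* Drop one reference; on the last one, delete if no links remain. */
static void sim_iput(struct sim_inode *inode)
{
    assert(inode->count > 0);
    if (--inode->count == 0 && inode->nlink == 0)
        sim_delete_inode(inode);
}
```

In the "cd /test; rm -rf /test" case, the shell's working directory would be
the reference that keeps count above zero after the unlink.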

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/



2004-10-13 08:50:49

by David Woodhouse

Subject: Re: Using ilookup?

On Wed, 2004-10-13 at 14:42 +1300, Charles Manning wrote:
> YAFFS allocates its own "objectId"s which are used as inode numbers for most
> purposes. When objects get deleted (== unlinked), the object numbers get
> recycled. Sometimes though the Linux cache has an inode after the object has
> been deleted.

Stop there... _why_ is the inode still in the icache after the object
has been deleted? It sounds like this is your real problem. Once the
last link to it goes, and the last file handle to it is closed, it
should be gone immediately.

It sounds like you're doing too much in your unlink method. The object
isn't necessarily dead at the time you unlink it -- it could still be
open. You just decrement its nlink and wait for the VFS to tell you it's
done with it.

> 1) Somehow plug myself into the inode iput() chain so that I know when an
> inode is removed from the cache. I can then make sure that I don't free up
> the inode number for reuse until the inode is not in the cache. Any hints on
> how to do that?

->clear_inode() or ->delete_inode() as Andreas said.

> 2) When creating inode numbers, I could test for if there is a corresponding
> inode in the cache first and just not recycle that number if there is. To do
> this is ilookup the correct/safe thing to do?

It's safe but it doesn't sound correct. It sounds like a workaround for
the real problem.

> 2') A further issue here is that ilookup is not available in some older 2.4.x
> versions. Is it Ok to just patch the ilookup code in, say, 2.4.27 back into
> earlier versions (say 2.4.18 which seems a popular vintage for embedded stuff
> for some reason or other).

No. If these people want new file systems and new features in core code,
why on earth are they still using 2.4.18? They should be on 2.6, or at
_least_ current 2.4 kernels. I could sort of understand if they've had a
lot of testing in the two and a half years since 2.4.18 was released and
they don't want to change _anything_.... but that obviously isn't the
case if they're adding new stuff like this.

--
dwmw2

2004-10-13 15:32:54

by Lee Revell

Subject: Re: Using ilookup?

On Wed, 2004-10-13 at 04:50, David Woodhouse wrote:
> > 2') A further issue here is that ilookup is not available in some older 2.4.x
> > versions. Is it Ok to just patch the ilookup code in, say, 2.4.27 back into
> > earlier versions (say 2.4.18 which seems a popular vintage for embedded stuff
> > for some reason or other).
>
> No. If these people want new file systems and new features in core code,
> why on earth are they still using 2.4.18? They should be on 2.6, or at
> _least_ current 2.4 kernels. I could sort of understand if they've had a
> lot of testing in the two and a half years since 2.4.18 was released and
> they don't want to change _anything_.... but that obviously isn't the
> case if they're adding new stuff like this.

2.4.18 is probably popular for embedded applications because that's
about where development on the preempt/low latency patches for 2.4
stopped.

Lee

2004-10-13 18:06:13

by Charles Manning

Subject: Re: Using ilookup?

On Wednesday 13 October 2004 18:50, Andreas Dilger wrote:
> On Oct 13, 2004 14:42 +1300, Charles Manning wrote:
> > YAFFS allocates its own "objectId"s which are used as inode numbers for
> > most purposes. When objects get deleted (== unlinked), the object numbers
> > get recycled. Sometimes though the Linux cache has an inode after the
> > object has been deleted. Then if that object id gets recycled before the
> > cached inode is released, a problem occurs since iget() gets the old
> > inode instead of creating a new one. We then end up with an
> > inconsistency.
>
> You can use iget4() along with a filesystem-specific comparison function,
> which allows you to distinguish inodes with the same number based on
> some extra data (e.g. generation number, 64-bit inode numbers, etc). Is
> there a reason to recycle the inode numbers, or could you just have a
> 32-bit counter?

The problem with iget4(), I believe, is that it will create a new inode if
one does not exist, which seems to be more running around than I really want
(especially since in most cases the inode will not exist).

The number space is 18 bits, but even with 32 bits, incrementing through the
list will not make the problem go away; it just reduces it to a very small
probability.
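
To make the wraparound concrete, here is a small sketch of a plain wrapping
counter over an 18-bit space (2^18 = 262144 values). The names are made up for
illustration, and the counter deliberately performs no check for objects that
are still alive, which is exactly the weakness in question.

```c
#include <assert.h>
#include <stdint.h>

#define OBJECT_ID_BITS 18
#define OBJECT_ID_MASK ((1u << OBJECT_ID_BITS) - 1)

static uint32_t next_id;

/* Hand out IDs from a wrapping counter with no check for live objects. */
static uint32_t next_object_id(void)
{
    uint32_t id = next_id;

    assert(id <= OBJECT_ID_MASK);
    next_id = (next_id + 1) & OBJECT_ID_MASK;
    return id;
}
```

After 262144 allocations the counter hands out its first value again, whether
or not the object that originally received it has gone away.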

>
> > 1) Somehow plug myself into the inode iput() chain so that I know when
> > an inode is removed from the cache. I can then make sure that I don't
> > free up the inode number for reuse until the inode is not in the cache.
> > Any hints on how to do that?
>
> You can use the ->delete_inode method which is a hook to be called
> before/instead of the clear_inode() function in iput(), and is
> the last action taken when the inode is being unlinked. There
> is also the ->clear_inode inode method, which is called when inodes
> are being put away but not only when being unlinked.

It seems to me that delete_inode() is the place to hook into. I already use
this for regular files, I just need to extend this to directories, pipes and
other specials.

I knew about the regular file case because you can do things like:

f = open("xx", ...);
unlink("xx");    // file no longer exists in directory, but still exists
read(f, ...);
write(f, ...);
close(f);        // file disappears from disk.
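
A self-contained version of that sequence, as a minimal sketch for a POSIX
system. The function name and the /tmp template are arbitrary choices made for
the example, and it writes before reading so that there is something to read
back.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700  /* request POSIX declarations such as mkstemp() */
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, unlink it while it is still open, and check
 * that the open descriptor keeps working. Returns 1 on success, 0 if any
 * step fails.
 */
static int unlinked_file_still_usable(void)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";
    const char msg[] = "still here";
    char buf[sizeof(msg)] = { 0 };
    struct stat st;
    int ok = 0;

    int fd = mkstemp(path);             /* creates and opens the file */
    if (fd < 0)
        return 0;

    if (unlink(path) == 0 &&            /* remove the directory entry */
        stat(path, &st) != 0 &&         /* the name really is gone */
        write(fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf) &&
        memcmp(buf, msg, sizeof(msg)) == 0)
        ok = 1;

    close(fd);                          /* last reference is dropped here */
    return ok;
}
```

The data written after unlink() is read back intact, so the object clearly
outlives its directory entry until the final close().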

I did not realise that you could essentially achieve the same thing with
directories, pipes and other specials.

Thanks for the help.

-- Charles