2002-08-31 17:55:39

by Anton Altaparmakov

[permalink] [raw]
Subject: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

Linus, Al,

The below ChangeSet against Linus' current BK tree adds a new function to
the VFS, fs/inode.c::ilookup().

This is needed in NTFS when writing out inode metadata pages via the VM
dirty page code paths as we need to know whether there is an active inode
in icache but we don't want to do an iget() because if the inode is not
active then there is no need to write it... - I can just skip onto the next
one instead... - If there is an active inode then I need to get the struct
inode in order to perform appropriate locking for the write out to happen.

If there is something you don't like about this patch please let me know
what it is, preferably with what you want instead so I can modify it...

Without such icache lookup functionality it is impossible to write inodes
via the VM page dirty code paths AFAICS. - The only alternative I can see
is to duplicate the whole icache private to NTFS so that I can perform the
lookup internally but I think that is silly considering the VFS already
keeps the inode cache...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/, http://www-stu.christs.cam.ac.uk/~aia21/

===================================================================

This will update the following files:

fs/inode.c | 33 +++++++++++++++++++++++++++++++++
include/linux/fs.h | 3 +++
2 files changed, 36 insertions(+)

through these ChangeSets:

<[email protected]> (02/08/31 1.572)
Implement an inode cache search function fs/inode.c::ilookup() and export it to modules.


diff -Nru a/fs/inode.c b/fs/inode.c
--- a/fs/inode.c Sat Aug 31 18:49:37 2002
+++ b/fs/inode.c Sat Aug 31 18:49:37 2002
@@ -696,6 +696,39 @@
return inode;
}

+/**
+ * ilookup - search for an inode in the inode cache
+ * @sb: super block of file system to search
+ * @hashval: hash value (usually inode number) to search for
+ * @test: callback used for comparisons between inodes
+ * @data: opaque data pointer to pass to @test
+ *
+ * ilookup() searches for an inode in the inode cache.
+ *
+ * If the inode is in the cache and is not (in the process of being) cleared,
+ * the reference count of the inode is increased and the inode is returned.
+ *
+ * Otherwise NULL is returned.
+ */
+struct inode *ilookup(struct super_block *sb, unsigned long hashval,
+ int (*test)(struct inode *, void *), void *data)
+{
+ struct list_head *head = inode_hashtable + hash(sb, hashval);
+ struct inode *inode;
+
+ spin_lock(&inode_lock);
+ inode = find_inode(sb, head, test, data);
+ if (inode && !(inode->i_state & (I_FREEING | I_CLEAR))) {
+ __iget(inode);
+ spin_unlock(&inode_lock);
+ wait_on_inode(inode);
+ return inode;
+ }
+ spin_unlock(&inode_lock);
+ return NULL;
+}
+EXPORT_SYMBOL(ilookup);
+
/*
* This is iget without the read_inode portion of get_new_inode
* the filesystem gets back a new locked and hashed inode and gets
diff -Nru a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h Sat Aug 31 18:49:37 2002
+++ b/include/linux/fs.h Sat Aug 31 18:49:37 2002
@@ -1199,6 +1199,9 @@
extern int inode_needs_sync(struct inode *inode);
extern void generic_delete_inode(struct inode *inode);

+extern struct inode *ilookup(struct super_block *sb, unsigned long hashval,
+ int (*test)(struct inode *, void *), void *data);
+
extern struct inode * iget5_locked(struct super_block *, unsigned long, int (*test)(struct inode *, void *), int (*set)(struct inode *, void *), void *);
extern struct inode * iget_locked(struct super_block *, unsigned long);
extern void unlock_new_inode(struct inode *);

===================================================================

This BitKeeper patch contains the following changesets:
+
## Wrapped with gzip_uu ##


begin 664 bkpatch106439
M'XL(`+$!<3T``\56^V_;-A#^6?PK;B@0V$XLBWI95N`B:9IUQK(F<!=@`P8(
ME$3;0F72(ZD\,/5_'RG)<1YM@P1[^"6*O/ONXW?'D]_`I:0BMDA!7(S>P$]<
MJMC*"%,DM1E5>FK.N9X:55*,I,A&9<&JFZ%K!\.BY/QSM4':YH*H;`575,C8
MPK9W-Z-N-S2VYJ<?+L^.YPA-IW"R(FQ)/U$%TRE27%R1,I='1*U*SFPE")-K
MJHB=\75]9UJ[CN/J=X#'GA.$-0X=?UQG.,>8^)CFCNM'H8_*/#W27YOGTN9B
M^=@_\C`>N]AWZ\#W\`2]!VP'8Q<<=^1$(P\#CF+/CS'>=W#L.-!(<K23`O9=
M&#KH'?RSK$]0!K/[email protected],9"L*DA*A95Q4+%,%9["0HV;9SN*X
MD[_7UTXYT)L-%PH*I>G!FN=52:6-?H;`#:((7>QT1\,7OA!RB(/>/K/M';/[
M^YYX4:WWB2=UZN#(B^ABXN1Y&FJU'\O[!*')E^<%DQI[?C!^ED'!LK+*:5N@
MHX6T5P^9:)PP\,9UZ-!)%.7CQ2(/4QH^)?(MH!TAQW<GN"GG'>GGZ_FE$B&I
MD50A2Y+*KP*,7==U_,@)ZV#BX;"IZ#!Z6-!>'$R^6=">]V]5]'&>PUV!FCIL
M<[email protected]'=?'1=7=Q3[Q55.0LG$7@>&@T&"`;;:#"\.S1<[(Y3P4"MZ/VS97R.
M9!I;EJPV5$!:\NPS\`4LBE(?O%NIZ-J<I1:ML5X1N=(ZQ989@!Y5%'J5K$A9
MWG;0K%JG5/1WCH9&XZQHVUG+,B4Z4"5IWE#4"F^(*"1G$E*JKBGM.,O&+2>*
MQ!;?D#]U,',#&UXPI0GK$!LBI;DVX-K\G@ZZ+;0$J'Q.";OSG"WNK11R:]EV
M(M-C]!SC"GK=PD;PC&H"6K.4%FS9AZS4(6E^8-",A:`+*BC+-`BO='?C3R)D
M@A*CA(%_L"2HJ@2C^9;<N5X5UX6D\/'R[.RQQ0A)):I,=?Z#K0C=;)/AI,WP
M0*8'4#%9++4OZ&I?0I?7`V196EKH#8R<_=Y#R`.XXD4.@_YV8)+11W\AJ[,K
M"ZF2%25ZJ?F=MIZ)0=<G3A?5?A.I9PAT(?N'=^Y;YN9RB/[0\YN")89R;Z\%
M,F/CT%KJWE.P/&EN6D@=]``,]8.F4!K3A<F6,=_;@Q_:X?!MD4A%E)Z#WBSY
M<7YZ.OOX`6J8)2=GI\?S?K\/>E=6DA1+JEH?@]42JMC7*%G7I%`)9QV=G4^;
M).AV97U!WT/IC$V"#]$7=/K;Q?G\U^33[[^\.S_K=2GM&VU,YWW:I9_OP*]\
M1*`T$^61XGQ=92M;A^!,<5L7^HJH[^)&+G9=SW=P'41!&#4-&@<OZM#_68-N
M'FF/^O/3?;VF3V/7P>`A>J.[%H/_Z:`V9;/]6ZH;6O995NMIYD78G^0.^AOH
'AE,_"`L`````
`
end


2002-09-01 15:13:47

by David Woodhouse

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function


[email protected] said:
> The below ChangeSet against Linus' current BK tree adds a new function
> to the VFS, fs/inode.c::ilookup().

> This is needed in NTFS when writing out inode metadata pages via the
> VM dirty page code paths as we need to know whether there is an active
> inode in icache but we don't want to do an iget() because if the inode
> is not active then there is no need to write it... - I can just skip
> onto the next one instead... - If there is an active inode then I need
> to get the struct inode in order to perform appropriate locking for
> the write out to happen.

Yum. I need similar functionality for JFFS2 garbage collection. When moving
a data node, we currently iget() the inode to which it belongs and update
its in-core extent lists accordingly. If the inode in question wasn't
already present, there's no real need to do that.

--
dwmw2


2002-09-02 09:58:01

by Nikita Danilov

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

David Woodhouse writes:
>
> [email protected] said:
> > The below ChangeSet against Linus' current BK tree adds a new function
> > to the VFS, fs/inode.c::ilookup().
>
> > This is needed in NTFS when writing out inode metadata pages via the
> > VM dirty page code paths as we need to know whether there is an active
> > inode in icache but we don't want to do an iget() because if the inode
> > is not active then there is no need to write it... - I can just skip
> > onto the next one instead... - If there is an active inode then I need
> > to get the struct inode in order to perform appropriate locking for
> > the write out to happen.
>
> Yum. I need similar functionality for JFFS2 garbage collection. When moving
> a data node, we currently iget() the inode to which it belongs and update
> its in-core extent lists accordingly. If the inode in question wasn't
> already present, there's no real need to do that.
>

Reiser4 also needs this.

> --
> dwmw2
>

Nikita.

>

2002-09-03 03:53:16

by Jan Harkes

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Sat, Aug 31, 2002 at 07:00:04PM +0100, Anton Altaparmakov wrote:
> Without such icache lookup functionality it is impossible to write inodes
> via the VM page dirty code paths AFAICS. - The only alternative I can see
> is to duplicate the whole icache private to NTFS so that I can perform the
> lookup internally but I think that is silly considering the VFS already
> keeps the inode cache...

Wouldn't you be able to use something like the following code to do ilookup?

Jan


static int dont_set(struct inode *inode, void *data)
{
return -1;
}

struct inode *ilookup(struct super_block *sb, struct list_head *head,
int (*test)(struct inode *, void *), void *data)
{
return iget5_locked(sb, head, test, dont_set, data);
}

2002-09-03 05:16:14

by Anton Altaparmakov

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Mon, 2 Sep 2002, Jan Harkes wrote:
> On Sat, Aug 31, 2002 at 07:00:04PM +0100, Anton Altaparmakov wrote:
> > Without such icache lookup functionality it is impossible to write inodes
> > via the VM page dirty code paths AFAICS. - The only alternative I can see
> > is to duplicate the whole icache private to NTFS so that I can perform the
> > lookup internally but I think that is silly considering the VFS already
> > keeps the inode cache...
>
> Wouldn't you be able to use something like the following code to do ilookup?
>
> Jan
>
>
> static int dont_set(struct inode *inode, void *data)
> {
> return -1;
> }
>
> struct inode *ilookup(struct super_block *sb, struct list_head *head,
> int (*test)(struct inode *, void *), void *data)
> {
> return iget5_locked(sb, head, test, dont_set, data);
> }

Sure! But that makes me want to throw up. Before I use that I would
implement my own ntfs inode cache alongside the VFS...

1) The iget5_locked + failing set() approach incurs an
alloc_inode()/destroy_inode() + a second find_inode() + second taking of
inode_lock, i.e. it is very expensive operation.

2) It will return inodes that are I_FREEING or I_CLEAR. I will have to
test for these in NTFS and then iput() to wash my hands clean of such
garbage. And if I am not mistaken, the iput() actually will BUG().

3) My proposed ilookup() is fast and small so I would prefer that. And it
would seem quite a few other FS would be only too happy to see something
along those lines appear.

4) If anything, as Christoph Hellwig suggested to me on #kernel,
iget{,5}_locked() should be reimplemented in terms of my ilookup()
implementation and not vice versa. (-:

I suspect that point 2) may actually be an existing race condition in the
VFS. If someone does iget() or iget_locked(), whatever, at just the right
time, they could end up with a very weird struct inode which would BUG()
on iput() if the code actually survives that long... - I haven't looked
into this closely so I may be missing something which doesn't allow these
BUG() scenarios to happen but from a cursory look around when I was
looking for a way to do the equivalent of ilookup() it would seem there
isn't...

Thanks for your comments. It is interesting to read everyone's
suggestions...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2002-09-03 15:17:58

by Jan Harkes

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Tue, Sep 03, 2002 at 06:20:44AM +0100, Anton Altaparmakov wrote:
> 2) It will return inodes that are I_FREEING or I_CLEAR. I will have to
> test for these in NTFS and then iput() to wash my hands clean of such
> garbage. And if I am not mistaken, the iput() actually will BUG().

If that is the case iget is broken. Perhaps it should test for these
states in find_inode (and find_inode_fast) and never return them. Are
those types of inodes still on the inode hash?

> 4) If anything, as Christoph Hellwig suggested to me on #kernel,
> iget{,5}_locked() should be reimplemented in terms of my ilookup()
> implementation and not vice versa. (-:

Well, considering that this function (modulo the I_FREEING|I_CLEAR test
is identical to the first 10 lines in iget5_locked, this could call that
function. Ofcourse iget_locked is using the 'fast' version of find_inode.

Jan

2002-09-03 15:27:03

by Anton Altaparmakov

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Tue, 3 Sep 2002, Jan Harkes wrote:
> On Tue, Sep 03, 2002 at 06:20:44AM +0100, Anton Altaparmakov wrote:
> > 2) It will return inodes that are I_FREEING or I_CLEAR. I will have to
> > test for these in NTFS and then iput() to wash my hands clean of such
> > garbage. And if I am not mistaken, the iput() actually will BUG().
>
> If that is the case iget is broken. Perhaps it should test for these
> states in find_inode (and find_inode_fast) and never return them. Are
> those types of inodes still on the inode hash?

That is of course the big question. As I said I haven't looked at this
very closely...

> > 4) If anything, as Christoph Hellwig suggested to me on #kernel,
> > iget{,5}_locked() should be reimplemented in terms of my ilookup()
> > implementation and not vice versa. (-:
>
> Well, considering that this function (modulo the I_FREEING|I_CLEAR test
> is identical to the first 10 lines in iget5_locked, this could call that
> function. Ofcourse iget_locked is using the 'fast' version of find_inode.

Yes, indeed, and yes. I recall that Al wanted the _fast() versions to
remain separate. This can be done easily enough here, too. For example
ilookup() and ilookup5() (to remain consistent with iget5() or if you
have a better idea for a name...).

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2002-09-03 16:03:26

by Jan Harkes

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Tue, Sep 03, 2002 at 04:31:32PM +0100, Anton Altaparmakov wrote:
> On Tue, 3 Sep 2002, Jan Harkes wrote:
> > On Tue, Sep 03, 2002 at 06:20:44AM +0100, Anton Altaparmakov wrote:
> > > 2) It will return inodes that are I_FREEING or I_CLEAR. I will have to
> > > test for these in NTFS and then iput() to wash my hands clean of such
> > > garbage. And if I am not mistaken, the iput() actually will BUG().
> >
> > If that is the case iget is broken. Perhaps it should test for these
> > states in find_inode (and find_inode_fast) and never return them. Are
> > those types of inodes still on the inode hash?
>
> That is of course the big question. As I said I haven't looked at this
> very closely...

<handwaving> There is no problem here </handwaving>

invalidate_list, prune_icache, generic_delete_inode, and
generic_forget_inode all move the inode off of the inode hash before
setting the I_FREEING flag. So inodes in I_FREEING state can not be
found by find_inode.

Clear inode bugs when the inode is not I_FREEING, it then clears out the
inode and sets I_CLEAR. As a result, inodes in I_CLEAR state can not be
found by find_inode either.

So you can safely remove that test in ilookup.

> > > 4) If anything, as Christoph Hellwig suggested to me on #kernel,
> > > iget{,5}_locked() should be reimplemented in terms of my ilookup()
> > > implementation and not vice versa. (-:
> >
> > Well, considering that this function (modulo the I_FREEING|I_CLEAR test
> > is identical to the first 10 lines in iget5_locked, this could call that
> > function. Ofcourse iget_locked is using the 'fast' version of find_inode.
>
> Yes, indeed, and yes. I recall that Al wanted the _fast() versions to
> remain separate. This can be done easily enough here, too. For example
> ilookup() and ilookup5() (to remain consistent with iget5() or if you
> have a better idea for a name...).

I have no problem with the name. And coda_fid_to_inode could use
ilookup5 as well, as currently uses iget with the 'set_fail' trick.

Jan

2002-09-03 18:07:31

by Daniel Phillips

[permalink] [raw]
Subject: Re: [BK-PATCH-2.5] Introduce new VFS inode cache lookup function

On Tuesday 03 September 2002 07:20, Anton Altaparmakov wrote:
> Thanks for your comments. It is interesting to read everyone's
> suggestions...

One more... 'ifind' would be more in keeping with existing terminology
than 'ilookup'. (See page cache operations in filemap.c)

--
Daniel