2005-02-22 11:54:54

by Jan Blunck

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

Quoting Alex Tomas <[email protected]>:

>
> 1) i_sem protects dcache too

Where? i_sem is the per-inode lock, and shouldn't be used else.

> 2) tmpfs has no "own" data, so we can use it this way (see 2nd patch)
> 3) I have pdirops patch for ext3, but it needs some cleaning ...

I think you didn't get my point.

1) Your approach is duplicating the locking effort for regular filesystem
(like ext2):
a) locking with s_pdirops_sems
b) locking the low-level filesystem data
It's cool that it speeds up tmpfs, but I don't think that this legatimate the
doubled locking for every other filesystem.
I'm not sure that it also increases performance for regular filesystems, if you
do the locking right.

2) In my opinion, a superblock-wide semaphore array which allows 1024
different (different names and different operations) accesses to ONE single
inode (which is the data it should protect) is not a good idea.

Jan


2005-02-22 12:06:03

by Alex Tomas

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

>>>>> Jan Blunck (JB) writes:

>> 1) i_sem protects dcache too

JB> Where? i_sem is the per-inode lock, and shouldn't be used else.

read comments in fs/namei.c:read_lookup()

>> 2) tmpfs has no "own" data, so we can use it this way (see 2nd patch)
>> 3) I have pdirops patch for ext3, but it needs some cleaning ...

JB> I think you didn't get my point.

JB> 1) Your approach is duplicating the locking effort for regular filesystem
JB> (like ext2):
JB> a) locking with s_pdirops_sems
JB> b) locking the low-level filesystem data
JB> It's cool that it speeds up tmpfs, but I don't think that this legatimate the
JB> doubled locking for every other filesystem.
JB> I'm not sure that it also increases performance for regular filesystems, if you
JB> do the locking right.

we've already done this for ext3. it works.
it speeds some loads up significantly.
especially on big directories.
and you can control this via mount option,
so almost zero cost for fs that doesn't support pdirops.

JB> 2) In my opinion, a superblock-wide semaphore array which allows 1024
JB> different (different names and different operations) accesses to ONE single
JB> inode (which is the data it should protect) is not a good idea.

yes, it has some weakness, i'm reworking vfs patch to avoid inter-inode
collisions.

thanks, Alex

2005-02-22 13:00:37

by Jan Blunck

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

Quoting Alex Tomas <[email protected]>:

> >>>>> Jan Blunck (JB) writes:
>
> >> 1) i_sem protects dcache too
>
> JB> Where? i_sem is the per-inode lock, and shouldn't be used else.
>
> read comments in fs/namei.c:read_lookup()
>

i_sem does NOT protect the dcache. Also not in real_lookup(). The lock must be
acquired for ->lookup() and because we might sleep on i_sem, we have to get it
early and check for repopulation of the dcache.

> we've already done this for ext3. it works.
> it speeds some loads up significantly.
> especially on big directories.
> and you can control this via mount option,
> so almost zero cost for fs that doesn't support pdirops.

Ok.

Jan

2005-02-22 13:24:59

by Alex Tomas

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

>>>>> Jan Blunck (JB) writes:

JB> i_sem does NOT protect the dcache. Also not in real_lookup(). The lock must be
JB> acquired for ->lookup() and because we might sleep on i_sem, we have to get it
JB> early and check for repopulation of the dcache.

dentry is part of dcache, right? i_sem protects dentry from being
returned with incomplete data inside, right?


thanks, Alex

2005-02-22 13:42:10

by Jan Blunck

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

Quoting Alex Tomas <[email protected]>:

> >>>>> Jan Blunck (JB) writes:
>
> JB> i_sem does NOT protect the dcache. Also not in real_lookup(). The lock
> must be
> JB> acquired for ->lookup() and because we might sleep on i_sem, we have to
> get it
> JB> early and check for repopulation of the dcache.
>
> dentry is part of dcache, right? i_sem protects dentry from being
> returned with incomplete data inside, right?
>

Nope, d_alloc() is setting d_flags to DCACHE_UNHASHED. Therefore it is not found
by __d_lookup() until it is rehashed which is implicit done by ->lookup().

Jan

2005-02-23 13:57:08

by Alex Tomas

[permalink] [raw]
Subject: Re: [RFC] pdirops: vfs patch

>>>>> Jan Blunck (JB) writes:

JB> Nope, d_alloc() is setting d_flags to DCACHE_UNHASHED. Therefore it is not found
JB> by __d_lookup() until it is rehashed which is implicit done by ->lookup().

that means we can have two processes allocated dentry for
same name. they'll call ->lookup() each against own dentry,
fill them and hash. so, we'll get two equal dentries.
is that OK?

thanks, Alex