2001-02-08 07:15:58

by dean gaudet

[permalink] [raw]
Subject: dentry cache order 7 is broken

this looks to be a problem going back all the way to at least 2.2.

if you've got 512Mb of RAM you end up with a dentry cache of order 7 --
65536 entries.

this results in a D_HASHBITS of 16. if you look at d_hash it contains
this code:

hash = hash ^ (hash >> D_HASHBITS) ^ (hash >> D_HASHBITS*2);

D_HASHBITS*2 = 32, which on an x86 results in a shift of 0. (ia32 is
defined to mask a variable shift down to the low 5 bits and use the low 5
bits.)

this results in a hash with very little perturbance at all, since the 1st
and 3rd terms of the xor are equal and cancel... and the resulting
performance is pretty abysmal. i'm getting hash chains with well over 100
entries in them just doing a find on the kernel source tree.

if i hard code to order 8 then i get nice well distributed hash chains, 5
entries max.

also, for order > 7, was the real intention to use a shift of
(order*2)&31?

-dean


2001-02-08 08:12:02

by David Miller

[permalink] [raw]
Subject: Re: dentry cache order 7 is broken


dean gaudet writes:
> also, for order > 7, was the real intention to use a shift of
> (order*2)&31?

No, the whole thing is buggered. How stupid, my fault.
It was the 64-bit platform fascist at work :-)

How does this work for you (against 2.4.x)?

--- fs/dcache.c.~1~ Tue Feb 6 23:00:58 2001
+++ fs/dcache.c Thu Feb 8 00:09:10 2001
@@ -696,7 +696,8 @@
static inline struct list_head * d_hash(struct dentry * parent, unsigned long hash)
{
hash += (unsigned long) parent / L1_CACHE_BYTES;
- hash = hash ^ (hash >> D_HASHBITS) ^ (hash >> D_HASHBITS*2);
+ hash = hash ^ (hash >> D_HASHBITS) ^
+ (hash >> (D_HASHBITS+(D_HASHBITS/2)));
return dentry_hashtable + (hash & D_HASHMASK);
}

2001-02-08 08:22:43

by Mitchell Blank Jr

[permalink] [raw]
Subject: Re: dentry cache order 7 is broken

David S. Miller wrote:
> - hash = hash ^ (hash >> D_HASHBITS) ^ (hash >> D_HASHBITS*2);
> + hash = hash ^ (hash >> D_HASHBITS) ^
> + (hash >> (D_HASHBITS+(D_HASHBITS/2)));

Two comments:
1. The inode-cache has the exact same problem, but it'll require a lot
of RAM to run into it. The buffer and page caches don't have the
same problem.
2. Given that D_HASHBITS is not a constant I wonder if there isn't
a more efficient hash to be found. But I guess I'll leave that
to the hashing experts.

-Mitch

2001-02-08 09:24:10

by David Miller

[permalink] [raw]
Subject: Re: dentry cache order 7 is broken


Mitchell Blank Jr writes:
> 1. The inode-cache has the exact same problem, but it'll require a lot
> of RAM to run into it. The buffer and page caches don't have the
> same problem.

Yep, fix attached. You just need 1GB ram to hit that case.

> 2. Given that D_HASHBITS is not a constant I wonder if there isn't
> a more efficient hash to be found. But I guess I'll leave that
> to the hashing experts.

For the moment anything is better than when you hit this
bug :-)

--- fs/inode.c.~1~ Sun Feb 4 20:45:36 2001
+++ fs/inode.c Thu Feb 8 01:21:07 2001
@@ -729,7 +729,8 @@
static inline unsigned long hash(struct super_block *sb, unsigned long i_ino)
{
unsigned long tmp = i_ino + ((unsigned long) sb / L1_CACHE_BYTES);
- tmp = tmp + (tmp >> I_HASHBITS) + (tmp >> I_HASHBITS*2);
+ tmp = tmp + (tmp >> I_HASHBITS) +
+ (tmp >> (I_HASHBITS+(I_HASHBITS/2)));
return tmp & I_HASHMASK;
}

2001-02-09 00:36:37

by dean gaudet

[permalink] [raw]
Subject: Re: dentry cache order 7 is broken

that appears to do it :)

-dean

On Thu, 8 Feb 2001, David S. Miller wrote:

>
> dean gaudet writes:
> > also, for order > 7, was the real intention to use a shift of
> > (order*2)&31?
>
> No, the whole thing is buggered. How stupid, my fault.
> It was the 64-bit platform fascist at work :-)
>
> How does this work for you (against 2.4.x)?
>
> --- fs/dcache.c.~1~ Tue Feb 6 23:00:58 2001
> +++ fs/dcache.c Thu Feb 8 00:09:10 2001
> @@ -696,7 +696,8 @@
> static inline struct list_head * d_hash(struct dentry * parent, unsigned long hash)
> {
> hash += (unsigned long) parent / L1_CACHE_BYTES;
> - hash = hash ^ (hash >> D_HASHBITS) ^ (hash >> D_HASHBITS*2);
> + hash = hash ^ (hash >> D_HASHBITS) ^
> + (hash >> (D_HASHBITS+(D_HASHBITS/2)));
> return dentry_hashtable + (hash & D_HASHMASK);
> }
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/
>

2001-02-09 01:21:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: dentry cache order 7 is broken

In article <[email protected]>,
dean gaudet <[email protected]> wrote:
>
>if you've got 512Mb of RAM you end up with a dentry cache of order 7 --
>65536 entries.
>
>this results in a D_HASHBITS of 16. if you look at d_hash it contains
>this code:
>
> hash = hash ^ (hash >> D_HASHBITS) ^ (hash >> D_HASHBITS*2);

Cool.

Just remove the third part altogether. The "hash" is only 32 bits
anyway (even on 64-bit platforms, as we don't want to blow up the dentry
size unnecessarily). And with most machines with reasonable memory, you
already get fine distribution with just two terms (ie you don't lose any
bits), and it speeds it up and avoids the problem you see.

Thanks..

Linus