2003-03-10 17:35:02

by Martin J. Bligh

[permalink] [raw]
Subject: Dcache hash distrubition patches

Thanks to Andi for the distribution count patches. I did some quick
numbers with this, just by running "find / | xargs ls -l > /dev/null"
to populate the cache, then dumped the numbers out.

larry:~# grep dentry /proc/slabinfo
dentry_cache 880531 884880 160 36870 36870 1 : 248 124

count length
444569 0
381295 1
163442 2
46937 3
10042 4
1747 5
253 6
30 7
5 8

Conclusion: the hash distribution (for this simple test) looks fine
to me.

M.


2003-03-10 17:41:53

by Andi Kleen

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches


> Conclusion: the hash distribution (for this simple test) looks fine
> to me.

Yes, because of the overkill size of the hash table. With a 100K + entry
table you can make near every hash function look good ;)

Try to reduce it to a smaller number of buckets and see what happens.

-Andi



2003-03-11 07:30:30

by Martin J. Bligh

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches

>> Conclusion: the hash distribution (for this simple test) looks fine
>> to me.
>
> Yes, because of the overkill size of the hash table. With a 100K + entry
> table you can make near every hash function look good ;)
>
> Try to reduce it to a smaller number of buckets and see what happens.

OK, after I've stopped being an idiot, and misreading the code, I have
some numbers. They still look pretty good to me. I shrunk us from
1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.



5 3
9 4
44 5
113 6
243 7
519 8
1059 9
1613 10
2458 11
3506 12
4515 13
5349 14
6071 15
6328 16
6369 17
5862 18
5228 19
4305 20
3546 21
2613 22
1981 23
1382 24
928 25
602 26
368 27
230 28
115 29
75 30
45 31
16 32
14 33
3 34
4 35
2 36

It's not perfect, but not bad either. Some mathematician can go calculate
just how imperfect it is over random distribution, but it looks OK to me ;-)

M

2003-03-11 07:37:46

by William Lee Irwin III

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches

On Mon, Mar 10, 2003 at 11:41:07PM -0800, Martin J. Bligh wrote:
> It's not perfect, but not bad either. Some mathematician can go calculate
> just how imperfect it is over random distribution, but it looks OK to me ;-)

Okay, the standards are a bit more stringent for hash functions. You
basically want the total number of collisions to be minimized.

That said, standard statistical things are useful for first-pass
discrimination.

This looks basically okay, but if it's really just even with respect
to cache pressure the need for a different data structure in the
eventual future is clear. I'd need a baseline (pre-hlist) dcache
snapshot to compare against to get a better idea.


-- wli

2003-03-11 15:13:01

by Andi Kleen

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches

On Tue, Mar 11, 2003 at 08:41:07AM +0100, Martin J. Bligh wrote:
> some numbers. They still look pretty good to me. I shrunk us from
> 1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.

Interesting would be to find the sweet spot with the smallest hash table
size that still performs well. Not sure if find / is a good workload
for that though.

Also same for inode hash (but I don't have statistics for that right now)

-Andi

2003-03-11 15:20:47

by Martin J. Bligh

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches

>> some numbers. They still look pretty good to me. I shrunk us from
>> 1,048,576 buckets to 65536, and loaded 1,150,000 entries in there.
>
> Interesting would be to find the sweet spot with the smallest hash table
> size that still performs well. Not sure if find / is a good workload
> for that though.

Difficult, as it depends how many files are in the working set of the
machine, really. Right now it eats 4Mb of lowmem ... 1M entries seems
to be about 150Mb of slab for dentries, which is probably as much as
anyone wants ... but it's nice to keep those hash chains short ;-)

I can try 1Mb or something I suppose ... what's the purpose here,
to keep the cachelines of the bucket heads warm? Not sure it's worth
the tradeoff, as we have to touch another line for each element we
walk?

I take it you're happy enough with the current hash function distribution?

> Also same for inode hash (but I don't have statistics for that right now)

I could hack something up ... but 1 machine ain't going to cut it. I
suspect I'd have a much smaller inode hash, as I tend to have masses
of kernel trees, mostly hardlinked to each other.

M.

2003-03-11 16:17:17

by Andi Kleen

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches

On Tue, Mar 11, 2003 at 04:31:23PM +0100, Martin J. Bligh wrote:
> I can try 1Mb or something I suppose ... what's the purpose here,
> to keep the cachelines of the bucket heads warm? Not sure it's worth
> the tradeoff, as we have to touch another line for each element we
> walk?

Use less cache for the hash table overall.

Use less lowmem.

Ideally there would be no tradeoff if you can still get reasonable
hash chain length with smaller tables (= overall win)

>
> I take it you're happy enough with the current hash function distribution?

At least I don't know how to improve it.



> > Also same for inode hash (but I don't have statistics for that right now)
>
> I could hack something up ... but 1 machine ain't going to cut it. I
> suspect I'd have a much smaller inode hash, as I tend to have masses
> of kernel trees, mostly hardlinked to each other.

I doubt inode cache is very critical, except perhaps in NFS server loads
(but even the nfs server has an own frontend cache)
Normally the dcache should bear most of the load.

-Andi

2003-03-11 17:21:04

by Anton Blanchard

[permalink] [raw]
Subject: Re: Dcache hash distrubition patches


> I doubt inode cache is very critical, except perhaps in NFS server loads
> (but even the nfs server has an own frontend cache)
> Normally the dcache should bear most of the load.

I did some analysis of the hashes ages ago. From memory the regular
grouping of inodes caused this unbalanced distribution:

http://samba.org/~anton/linux/hashes/1/icache.png

Anton