1998-08-28 09:54:22

by Patrik Hagglund

[permalink] [raw]
Subject: "fuzzy hashing" = skiplists in a different shape

I saw your "fuzzy hashing" implementation on Linux Weekly News
yesterday, and to me it looks much like randomized skip list. The
neigbour_next list is the first level pointer chain and hash_next is
the second level. But, there is a notable exception. Your code
contians 16 second-level lists, that is, 15 redundant ones.

> it will be as fast or cheaper _even_ in the common case than what we
> have now

? What is this common case. I can't see how your implementation would
be faster than a good implementation of a balanced search tree.

Regards,
Patrik H?gglund,
intrested in data structures and algorithms,
but not a kernel hacker (yet)


1998-08-28 21:53:31

by David S. Miller

[permalink] [raw]
Subject: Re: "fuzzy hashing" = skiplists in a different shape

Date: Fri, 28 Aug 1998 11:06:33 +0200
From: Patrik Hagglund <[email protected]>

? What is this common case. I can't see how your implementation would
be faster than a good implementation of a balanced search tree.

You said the answer, I don't have to balance anything, so
insert/delete don't cost so much like trees requiring balancing do.

I would suggest that people run through some examples, ie. take
/proc/{pid}/maps for some large process with many attachments, like
emacs or something, have a little test program setup the data
structures as if vma_insert was called in sequence for each vma, and
then pick arbitrary addresses and see what find_vma() does and how
quickly it finds the answer.

Or just print little ascii visualiziations from your test program and
try to figure out for yourself with the picture it outputs why it is
so impressive an algorithm and why it kicks the shit out of any tree
based solution for this class of problems.

Later,
David S. Miller
[email protected]