MIME-Version: 1.0
In-Reply-To: <CA+55aFzMtEVc_5QS8EUvdKYfNMmJ9fpgcM4iEhdwaY1Oe=JYmw@mail.gmail.com>
References: <20160428161742.363543816@linutronix.de>
	<20160428163525.495514517@linutronix.de>
	<CA+55aFzMtEVc_5QS8EUvdKYfNMmJ9fpgcM4iEhdwaY1Oe=JYmw@mail.gmail.com>
Date: Fri, 29 Apr 2016 14:10:27 -0700
Message-ID: <CA+55aFyPCN4w6dpaf0p3iQnR=kqJpZ_-=noMVU328-_yDQq9WQ@mail.gmail.com>
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>, Peter Zijlstra <peterz@infradead.org>,
        Ingo Molnar <mingo@kernel.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
        Darren Hart <darren@dvhart.com>,
        Michael Kerrisk <mtk.manpages@googlemail.com>,
        Davidlohr Bueso <dave@stgolabs.net>, Chris Mason <clm@fb.com>,
        "Carlos O'Donell" <carlos@redhat.com>,
        Torvald Riegel <triegel@redhat.com>,
        Eric Dumazet <edumazet@google.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3347
Lines: 73

On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> hash_long/ptr really shouldn't care about the low bits being zero at
> all, because it should mix in all the bits (using a prime multiplier
> and taking the high bits).

Looking at this assertion, it doesn't actually pan out very well at all.

The 32-bit hash looks to be ok. Certainly not perfect, but not horrible.

The 64-bit hash seems to be quite horribly bad with lots of values. I
wrote a test-harness to check it out (some simple values just spread
out at a fixed stride), and the end results are *so* bad that I'm kind
of worried that I screwed up the test harness. But it gives quite
reasonable values for hash_32() and for the plain modulo case.

Now, the way my test-harness works (literally testing a lot of
equal-stride cases), the "modulo prime number" approach will
automatically look perfect. So my test-harness is pretty unfair in
that respect.

But the hash_32() function looks good when hashing into 16 bits, for
example. In that case it does a very good job of spreading things out.
When hashing into 17 bits, hash_32 still looks good, except it does
very badly for strides of 32. It starts doing worse for bigger hash
buckets and bigger strides.

But out hash_64() seems to do very badly on pretty much *any* pattern.
To the point where I started to doubt my test-program. But it really
looks like that multiplication constant is almost pessimally chosen.

For example, that _long_ range of bits set ("7fffffffc" in the middle)
is effectively just one bit set with a subtraction. And it's *right*
in that bit area that is supposed to shuffle bits 14-40 to the high
bits (which is what we actually *use*. So it effectively shuffles none
of those bits around at all, and if you have a stride of 4096, your'e
pretty much done for.

The 32-bit value is better in this regard, largely thanks to having
that low bit set. Thanks to that, the information in bits around 12-18
will stay in bits 12-18, and because we then only have 32 bits, if the
hash table is large enough, they will still be part of the bits when
we take the high bits. For the 64-bit case, bits 12-18 will never even
be relevant.

So I think that what happens here is that hash_32() is actually
somewhat reasonable. But hash_64() sucks donkey balls when there's a
lot of information in around bits 10-20 (which is exactly where a lot
of pointer bits have the *most* information thanks to alignment
issues.

Picking a new value almost at random (I say "almost", because I just
started with that 32-bit multiplicand value that mostly works and
shifted it up by 32 bits and then randomly added a few more bits to
avoid long ranges of ones and zeroes), I picked

  #define GOLDEN_RATIO_PRIME_64 0x9e3700310c100d01UL

and it is *much* better in my test harness.

Of course, things like that depend on what patterns you test, But I
did have a "range of strides and hash sizes" I tried. So just for fun:
try changing GOLDEN_RATIO_PRIME_64 to that value, and see if the
absolutely _horrid_ page-aligned case goes away for you?

It really looks like those multiplication numbers were very very badly picked.

Still, that number doesn't do very well if the hash is small (say, 8
bits). But for slightly larger hash tables it seems to be doing much
better.

                  Linus