Date: 28 Apr 2016 22:57:51 -0400
Message-ID: <20160429025751.8368.qmail@ns.horizon.com>
From: "George Spelvin" <linux@horizon.com>
To: tglx@linutronix.de
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Cc: linux@horizon.com, linux-kernel@vger.kernel.org,
        torvalds@linux-foundation.org
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1869
Lines: 60

Thomas Gleixner wrote:
> I'm not a hashing wizard and I completely failed to understand why
> hash_long/ptr are so horrible for the various test cases I ran.

It's very simple: the constants chosen are bit-sparse, *particularly*
in the least significant bits, and only 32/64 bits of the product are
kept.  Using the high-word of a double-width multiply is even better,
but some machines (*cough* SPARCv9 *cough*) don't have hardware
support for that.

So what you get is:

  (0x9e370001 * (x << 12)) & 0xffffffff
= (0x9e370001 * x & 0xfffff) << 12
= (0x70001 * x & 0xfffff) << 12

*Now* does it make sense?

64 bits is just as bad...  0x9e37fffffffc0001 becomes
0x7fffffffc0001, which is 2^51 - 2^18 + 1.


The challenge is the !CONFIG_ARCH_HAS_FAST_MULTIPLIER case,
when it has to be done with shifts and adds/subtracts.

Now, what's odd is that it's only relevant for 64-bit platforms, and
currently only x86 and POWER7+ have it.

SPARCv9, MIPS64, ARM64, SH64, PPC64, and IA64 all have it turned off.

Is this a bug that should be fixed?

In fact, do *any* 64-bit platforms need multiply emulation?

How many 32-bit platforms nead a multiplier that's easy for GCC to
evaluate via shifts and adds?

Generlly, by the time you've got a machine grunty enough to
need 64 bits, a multiplier is quite affordable.


Anyway, assuming there exists at least one platform that needs the
shift-and-add sequence, it's quite easy to get a higher hamming weight,
you just have to use a few more registers to save some intermediate
results.

E.g.

	u64 x = val, t = val, u;
	x <<= 2;
	u = x += t;	/* val * 5 */
	x <<= 4;	/* val * 80 */
	x -= u;		/* val * 75 = 0b1001011 */

Shall I try to come up with something?


Footnote: useful web pages on shift-and-add/subtract mutliplciation
http://www.vinc17.org/research/mulbyconst/index.en.html
http://www.spiral.net/hardware/multless.html