Date: 29 Apr 2016 19:31:15 -0400
Message-ID: <20160429233115.8864.qmail@ns.horizon.com>
From: "George Spelvin"
To: torvalds@linux-foundation.org
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Cc: linux-kernel@vger.kernel.org, linux@horizon.com, tglx@linutronix.de
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 29, 2016 at 9:32 PM, Linus Torvalds wrote:
> For example, that _long_ range of bits set ("7fffffffc" in the middle)
> is effectively just one bit set with a subtraction. And it's *right*
> in that bit area that is supposed to shuffle bits 14-40 to the high bits
> (which is what we actually *use*. So it effectively shuffles none of those
> bits around at all, and if you have a stride of 4096, your'e pretty much
> done for.

Gee, I recall saying something a lot like that.

> 64 bits is just as bad... 0x9e37fffffffc0001 becomes
> 0x7fffffffc0001, which is 2^51 - 2^18 + 1.

After researching it, I think that the "high bits of a multiply" is
in fact a decent way to do such a hash.  Interestingly, for a randomly
chosen odd multiplier A, the high k bits of the w-bit product A*x is a
universal hash function in the cryptographic sense.  See section 2.3 of
http://arxiv.org/abs/1504.06804

One thing I note is that the advice in the comments to choose a prime
number is misquoting Knuth!  Knuth says (vol. 3 section 6.4) the number
should be *relatively* prime to the word size, which for binary computers
simply means odd.

When we have a hardware multiplier, keeping the Hamming weight low is
a waste of time.
When we don't, clever organization can do better than the very naive
addition/subtraction chain in the current hash_64().

Multiplying by the 32-bit constant 1640531527 = 0x61c88647 (which is
the negative, modulo 2^32, of the golden ratio constant 0x9e3779b9, so
has identical distribution properties) can be done in 6 shifts + adds,
with a critical path length of 7 operations (3 shifts + 4 adds).

#define GOLDEN_RATIO_32 0x61c88647	/* phi^2 = 1-phi */
/* Returns x * GOLDEN_RATIO_32 without a hardware multiplier */
unsigned hash_32(unsigned x)
{
	unsigned y, z;
				/* Path length */
	y = (x << 19) + x;	/* 1 shift + 1 add */
	z = (x << 9) + y;	/* 1 shift + 2 add */
	x = (x << 23) + z;	/* 1 shift + 3 add */
	z = (z << 8) + y;	/* 2 shift + 3 add */
	x = (x << 6) - x;	/* 2 shift + 4 add */
	return (z << 3) + x;	/* 3 shift + 4 add */
}

Finding a similarly efficient chain for the 64-bit golden ratio
0x9E3779B97F4A7C15 = 11400714819323198485
or
0x61C8864680B583EB = 7046029254386353131
is a bit of a challenge, but algorithms are known.
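
For anyone who wants to check the 32-bit chain above in userspace, here
is a minimal sketch.  It assumes 32-bit wrapping arithmetic via uint32_t;
the names hash_32_mul(), hash_32_shift(), hash_32_bits() and check_chain()
are made up for this illustration and are not the kernel's API.  It
compares the shift-and-add chain against a plain multiply and shows the
"high k bits of the product" step discussed earlier.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61c88647u

/* Reference version: one multiply, wrapping modulo 2^32. */
static uint32_t hash_32_mul(uint32_t x)
{
	return x * GOLDEN_RATIO_32;
}

/* The shift-and-add chain from the message, using fixed-width types. */
static uint32_t hash_32_shift(uint32_t x)
{
	uint32_t y, z;

	y = (x << 19) + x;	/* x * (2^19 + 1) */
	z = (x << 9) + y;	/* x * (2^19 + 2^9 + 1) */
	x = (x << 23) + z;	/* x * (2^23 + 2^19 + 2^9 + 1) */
	z = (z << 8) + y;	/* x * (2^27 + 2^19 + 2^17 + 2^8 + 1) */
	x = (x << 6) - x;	/* previous x * 63 */
	return (z << 3) + x;
}

/*
 * Hypothetical helper: keep the high "bits" bits of the product.
 * Assumes 1 <= bits <= 32.
 */
static uint32_t hash_32_bits(uint32_t x, unsigned bits)
{
	return hash_32_mul(x) >> (32 - bits);
}

/* Spot-check the chain against the multiply on a spread of inputs. */
static void check_chain(void)
{
	uint32_t x = 1;
	int i;

	assert(hash_32_shift(0) == hash_32_mul(0));
	assert(hash_32_shift(0xffffffffu) == hash_32_mul(0xffffffffu));
	for (i = 0; i < 100000; i++) {
		assert(hash_32_shift(x) == hash_32_mul(x));
		x = x * 1664525u + 1013904223u;	/* simple LCG for test inputs */
	}
}
```

Calling check_chain() exercises the comparison, and hash_32_bits(x, 8)
would pick a bucket in a 256-entry table.  This is a spot check, not a
proof; an exhaustive loop over all 2^32 inputs is also feasible.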