Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752195AbcD3RMk (ORCPT ); Sat, 30 Apr 2016 13:12:40 -0400 Received: from mail-ig0-f194.google.com ([209.85.213.194]:36835 "EHLO mail-ig0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750973AbcD3RMj (ORCPT ); Sat, 30 Apr 2016 13:12:39 -0400 MIME-Version: 1.0 In-Reply-To: <1462034711.5535.169.camel@edumazet-glaptop3.roam.corp.google.com> References: <20160428161742.363543816@linutronix.de> <20160428163525.495514517@linutronix.de> <1462034711.5535.169.camel@edumazet-glaptop3.roam.corp.google.com> Date: Sat, 30 Apr 2016 10:12:38 -0700 X-Google-Sender-Auth: oWQ5FmzbxBx4PUwajWCn8H-dV6Y Message-ID: Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism From: Linus Torvalds To: Eric Dumazet Cc: Thomas Gleixner , LKML , Peter Zijlstra , Ingo Molnar , Andrew Morton , Sebastian Andrzej Siewior , Darren Hart , Michael Kerrisk , Davidlohr Bueso , Chris Mason , "Carlos O'Donell" , Torvald Riegel , Eric Dumazet Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2306 Lines: 56 On Sat, Apr 30, 2016 at 9:45 AM, Eric Dumazet wrote: > > I use hash_32() in net/sched/sch_fq.c, for all packets sent by Google > servers. (Note that I did _not_ use hash_ptr()) > > That's gazillions of packets per second, and the current multiply worked > just fine in term of hash spreading. So hash_32() really is much better than hash_64(). I think we'll tweak it a bit, but largely leave it alone. The 64-bit case needs to be tweaked a _lot_. For the 32-bit case, I like the one that George Spelvin suggested: #define GOLDEN_RATIO_32 0x61c88647 /* phi^2 = 1-phi */ because of his slow multiplier fallback version that we could also use: /* Returns x * GOLDEN_RATIO_32 without a hardware multiplier */ unsigned hash_32(unsigned x) { unsigned y, z; /* Path length */ y = (x << 19) + x; /* 1 shift + 1 add */ z = (x << 9) + y; /* 1 shift + 2 add */ x = (x << 23) + z; /* 1 shift + 3 add */ z = (z << 8) + y; /* 2 shift + 3 add */ x = (x << 6) - x; /* 2 shift + 4 add */ return (z << 3) + x; /* 3 shift + 4 add */ } and I don't think that we really need the several big constants with the fancy "full cascade" function. If you have a test-case for that sch_fq.c case, it might be a good idea to test the above GOLDEN_RATIO_32 value, but quite frankly, I don't see any way it would be materially different from the one we use now. It does avoid that long series of zeroes in the low bits, but that's actually not a huge problem for the 32-bit hash to begin with. It's not nearly as long a series (or in the wrong bit positions) as the 64-bit hash multiplier value had. Also, I suspect that since you hash the kernel "struct sock" pointers, you actually never get the kinds of really bad patterns that Thomas had. But maybe you use hash_32() on a pointer because you noticed that hash_long() or hash_ptr() (which use hash_64 on 64-bit architectures, and would have been more natural) gave worse performance? Maybe you thought that it was the bigger multiply that caused the performance problems? If you did performance work, I suspect it really could have been that hash_64() did a bad job for you. Linus