Received: by 2002:a05:7208:9594:b0:7e:5202:c8b4 with SMTP id gs20csp689590rbb; Sat, 24 Feb 2024 21:51:52 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVD0DUIsAH/zQ84EDp5mQfNevTGDA7/sYNEBCnWW+ouC6eVhT1+2S00SC8Q9BTNUrazYPKYduGeUR8RdHyluQjMq+2JpseQlc2iphm+KA== X-Google-Smtp-Source: AGHT+IHGsGzgSX/FIw9Qwc8oNlblMsdEC+4W1sxFPzJefkz7/BdqJjuD2fRfMYa+hn95KBg7DVrT X-Received: by 2002:a17:906:b118:b0:a3e:c3b0:e346 with SMTP id u24-20020a170906b11800b00a3ec3b0e346mr2463684ejy.16.1708840312516; Sat, 24 Feb 2024 21:51:52 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708840312; cv=pass; d=google.com; s=arc-20160816; b=ZUJw1KklNfGznOnI3GmxdaxJBXMh6ug9kiA1a+JIMYWmDOcJ5OKPXjptnV8bSTBnuC zsVFTfaAfthtaAFgaTfypi5oR+yRa0b62H5UKYe45sxa94D0zDn4/HN7yLqK5Oo60o5F sbHeU401HGX9nj53kdjczbjj4i5n9PuUXfd8imLLa8UbXF61dHtu2NI9sirVikFP8ei1 e5GAoUi1YvXa4A6SQuaEupbiRwZ5qMUNZmfPUOHJ2bf+ZqOa10hg4K9yMM8a9NMFZUnb ixnWzswZexqdvFz7JldJ94sbfZx0xXOF/5W8kVlZUyUx09h/buSzZLcIPQUOsTRrVxmT NLMA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:dkim-signature:date; bh=I+gOsAge3PE5P+r0j0YPahgmyjokIkbHG2RX+N12yBs=; fh=gbPUs1PgA6u+cFkqKSonAETXXLFvPkCRsS8/6kSqEl8=; b=Hfto00NoYUbN+wjbBY34xIN0xWMP0xtG1XHFgwRM3tgh296LpGirI178iXUSneAEza zoPqUCV4M3p9oErxQyfG+cQRXIFVElVqM+carnevrfXmyskQd6nghH66FXBrQgr4Lw/7 ic9mfay7GM+EL3UoIkAH9m0+jbiFTCBLiNlJklejmVVqZdr7gBvtzprkSeMMv8n1Ojnq 8GuEdaXlNAP1AmX2HnWOid0iqj+2CPDxCO2BNOyQnwxD94OwGuTeBOOQmGoQqU6Nz4hn HXQitQChWI47wQuEOqtRGK9xntA8f4gSCrq7mbt8H04305cbyjvj5+d8JaT0GIVrVTVa hvsA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b="iI/ja1D9"; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-79983-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-79983-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id dk23-20020a170906f0d700b00a3c1e5ce8b5si972864ejb.785.2024.02.24.21.51.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 24 Feb 2024 21:51:52 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-79983-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b="iI/ja1D9"; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-79983-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-79983-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 4B66E1F21A16 for ; Sun, 25 Feb 2024 05:51:52 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C5959D268; Sun, 25 Feb 2024 05:51:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="iI/ja1D9" Received: from out-176.mta0.migadu.com (out-176.mta0.migadu.com [91.218.175.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D989D8F70 for ; Sun, 25 Feb 2024 05:51:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.176 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708840296; cv=none; b=Qot4zip956EHfStJplsHnN0HHXHH0WEeJXIH8rZdFWEw1+FydeB38xo96Iy2iAFdCMLmMaIvfW0k+qV4cAp3dChIjuENbcYUJS5ABkCK4mr4bKhf1DBEFPIXovDJPx3lWK9Ww8IJCQtb8DHsFhmubr7zWxdljCq9LRAd9u/dPdQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708840296; c=relaxed/simple; bh=kOcPJP7HmKte6IzOK94doH3AAanK9Mu1jw8WbI8Dx6c=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=lSq9S7wYeCPSCjONYPOkVAbji2Bm7CrD6/quEbV0T0v+8ZJNN2FAwQM4qNNDf8SeOLLRIgQ0l0FJ/qJvNJRMKkgtZvs8a0aTvgF2AYH+5rI3ft+ujUfeh+ByhEk4xJziylyPhYtA6r8kClFrWPUNMFddthw+3HkUPTjaufESU48= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=iI/ja1D9; arc=none smtp.client-ip=91.218.175.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Date: Sun, 25 Feb 2024 00:51:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1708840291; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=I+gOsAge3PE5P+r0j0YPahgmyjokIkbHG2RX+N12yBs=; b=iI/ja1D9rkXm8/vVy/c21mbVr2w8iKfXX0vF2Y7pPRmqaxwR+e+Q3JtDOP5m4SPMjDNdEW 62eyjzZAZx2RInOOwZb2FiJVW8JgKStz3oh9oTSIVhRJGBlVi7aKLGQTJFTB/K00duhpgm B7rclZm5oCmQcq+OfGMdm5IMMxkNJG0= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Kent Overstreet To: Matthew Wilcox Cc: David Laight , 'Herbert Xu' , "linux-kernel@vger.kernel.org" , Thomas Graf , "netdev@vger.kernel.org" , "linux-fsdevel@vger.kernel.org" , "maple-tree@lists.infradead.org" , "rcu@vger.kernel.org" Subject: Re: [PATCH 0/1] Rosebush, a new hash table Message-ID: <5p5sypt3y643rr7kp66lhmgksgtuvdgijrryh53mqiiqkrgyty@d4zcnya22owg> References: <20240222203726.1101861-1-willy@infradead.org> <4a1416fcb3c547eb9612ce07da6a77ed@AcuMS.aculab.com> <2s73sed5n6kxg42xqceenjtcwxys4j2r5dc5x4fdtwkmhkw3go@7viy7qli43wd> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Migadu-Flow: FLOW_OUT On Sun, Feb 25, 2024 at 05:01:19AM +0000, Matthew Wilcox wrote: > On Sat, Feb 24, 2024 at 10:18:31PM -0500, Kent Overstreet wrote: > > On Sat, Feb 24, 2024 at 10:10:27PM +0000, David Laight wrote: > > > I remember playing around with the elf symbol table for a browser > > > and all its shared libraries. > > > While the hash function is pretty trivial, it really didn't matter > > > whether you divided 2^n, 2^n-1 or 'the prime below 2^n' some hash > > > chains were always long. > > > > that's a pretty bad hash, even golden ratio hash would be better, but > > still bad; you really should be using at least jhash. > > There's a "fun" effect; essentially the "biased observer" effect which > leads students to erroneously conclude that the majority of classes are > oversubscribed. As somebody observed in this thread, for some usecases > you only look up hashes which actually exist. > > Task a trivial example where you have four entries unevenly distributed > between two buckets, three in one bucket and one in the other. Now 3/4 > of your lookups hit in one bucket and 1/4 in the other bucket. > Obviously it's not as pronounced if you have 1000 buckets with 1000 > entries randomly distributed between the buckets. But that distribution > is not nearly as even as you might expect: > > $ ./distrib > 0: 362 > 1: 371 > 2: 193 > 3: 57 > 4: 13 > 5: 4 > > That's using lrand48() to decide which bucket to use, so not even a > "quality of hash" problem, just a "your mathematical intuition may not > be right here". well, golden ratio hash - hash_32(i, 32) 0: 368 1: 264 2: 368 3: 0 but your distribution actually is accurate in general, golden ratio hash is relly nice for sequential integers. the actual problem with your test is that you're testing 100% occupancy - no one does that. 75% occupancy, siphash: 0: 933 1: 60 2: 6 3: 1 4: 0 that looks about right to me.