Received: by 2002:a05:7208:9594:b0:7e:5202:c8b4 with SMTP id gs20csp988640rbb; Sun, 25 Feb 2024 13:48:45 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXBRtcrap5Rq5HxUpSBivrjUat6Qd4yMINOOQez1v9WKMjmZq+WZk0K/crmN/qbuHbtR94/XuMNngOm78CjNGPfxaCfnfEteh3HUQ7DOw== X-Google-Smtp-Source: AGHT+IHu+nzBZBr1xoDwFomSG7OMrCIq+r3mGQNdPBlp1pJswFC8+8cB7e2/6DfJgM+PJymrszxt X-Received: by 2002:a05:6830:44a3:b0:6e2:de47:d5f7 with SMTP id r35-20020a05683044a300b006e2de47d5f7mr7778301otv.15.1708897725080; Sun, 25 Feb 2024 13:48:45 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708897725; cv=pass; d=google.com; s=arc-20160816; b=DAzh5jdd3bdXUNh7B4tXIkmazIl7MVYEx3phviz04y5DjuuOaCZnj2aGRyfaehyiW/ BtWop8963wiLuJKNQf93BZLtL53cgOHM6eO0s98FhbNYlT2ezgeSK4Xkmvox0IJkkxag 2GkYDnN0UnwWv9LKJh2aggRHDwriAJ94q0RlPJAvIpOauVNHCWVjhsuZJZNLwNO/N/bl y+pj8HNGz/qEYbgo/X2Ykb9cbwxfhf56/f3YYcENwVxcS4pc3Z3TMyuqUsSQPxELDC8e YtO50Tn8GLsTxyfYdLxFNlqGsWVVToxYKznuJUyRnVh7+FupbglkHsU8nXwZfPgJVKF/ rhUA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:dkim-signature:date; bh=hZLPGWILyQb0uie5XWZz0sri12gIiwEuygESDSHUCVk=; fh=yEeywzTE8hb4iMypOMCi/jp2FrqqJmS7fjgkDg2QdQs=; b=ocqRIK13I2dt61Te90OjLzOWQwuDPF2o+2epAEGNsWhMDqB6dckK5nQNALu4x4GKVa FfXC0DvtI88rHCh1YBXubhfpIoBxyK82ykkb/FNnSMEDqUJMIG+fk6dYgf30xCZQT+/v /KH9GJ8hruxJ8TPPV/xwZlcxyWK/fKCROE1nOHr5LocCwf8hqgrAP0YmwPyXe8u5HHA1 I1cEX0bUS4kFw19l6KUARRgwKoiqMc4SngaeQRuq+n4zoLoFIb4HjtT+aRXXg8DeV5HF S6XWWOlXZwFFaGi/npLQ28zqoE0rqWDQjWcu+pJSh/G6cgPWCxFDCjnFZntebJuBCIm3 EVxg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=Mjt0ivCt; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-80315-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80315-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id a4-20020aa78e84000000b006e4e93f4b51si2660575pfr.140.2024.02.25.13.48.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 25 Feb 2024 13:48:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-80315-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=Mjt0ivCt; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-80315-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-80315-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id BBDCA28231B for ; Sun, 25 Feb 2024 21:48:44 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 682D31BC30; Sun, 25 Feb 2024 21:48:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Mjt0ivCt" Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [91.218.175.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B4A801B950 for ; Sun, 25 Feb 2024 21:48:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708897713; cv=none; b=Xq1q3GFEQwKNzBlBrwMoQRuDusxii9Db42WdzJaiDj1JU99OjX/WCVflf2MbgzrTxWs+wOrU+gM61tKalFp4PNjCsJziVV66uPKOPmdroecwA8h0reJJzwWK7Xxp2CYrCmieZ9GcdLj997OcObvbGonSEThYurIK1Wu9g7GFxPI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708897713; c=relaxed/simple; bh=dvUDecwBMflNx4CehdCSDahY5V9l4jB2UPQiq5FVREc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=PrGO3lKtdX8cNFIsxe3qkcJpl2fAp1yoZey4wcbRZpsCZb9YsekUILfwa8Ku/zyHxMKtJz02+jKGgURr9rF6K5YR5q0ZUj9A0FY+qkaHztyAXw3X8Rky9V8nnMljwYVejAAM4KC9+u8NTHK3Z8o61sD3yJ/qouZpX33gSEa8sto= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Mjt0ivCt; arc=none smtp.client-ip=91.218.175.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Date: Sun, 25 Feb 2024 16:48:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1708897709; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=hZLPGWILyQb0uie5XWZz0sri12gIiwEuygESDSHUCVk=; b=Mjt0ivCt7EdkAJ/tpZ4vBCFUaqud0EetWM2nBfj0PDARbuDEYfLiwPCmO+8FQyUf1FXdfK FVg/OoP8xi3mQ8sb+oDtygwMJ6YTysy3H6B6s8jKttLHwjCDctJ7K1/6COxMO8MIW170zO y1pGGRjGU3Mg12gJagPvuo4QwPQgzCU= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Kent Overstreet To: David Laight Cc: 'Herbert Xu' , "Matthew Wilcox (Oracle)" , "linux-kernel@vger.kernel.org" , Thomas Graf , "netdev@vger.kernel.org" , "linux-fsdevel@vger.kernel.org" , "maple-tree@lists.infradead.org" , "rcu@vger.kernel.org" Subject: Re: [PATCH 0/1] Rosebush, a new hash table Message-ID: <22nvcwzzfq3ae2eva6m43wolfo3b3pit7xsauuux267hxciigi@52zuwg5yorg5> References: <20240222203726.1101861-1-willy@infradead.org> <4a1416fcb3c547eb9612ce07da6a77ed@AcuMS.aculab.com> <2s73sed5n6kxg42xqceenjtcwxys4j2r5dc5x4fdtwkmhkw3go@7viy7qli43wd> <2a6001442b354c2fb5b881c2a9d75895@AcuMS.aculab.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2a6001442b354c2fb5b881c2a9d75895@AcuMS.aculab.com> X-Migadu-Flow: FLOW_OUT On Sun, Feb 25, 2024 at 02:47:45PM +0000, David Laight wrote: > From: Kent Overstreet > > Sent: 25 February 2024 03:19 > .. > > when I implemented cuckoo (which is more obviously sensitive to a weak > > hash function), I had to go with siphash, even jhash wasn't giving me > > great reslts. and looking at the code it's not hard to see why, it's all > > adds, and the rotates are byte aligned... you want mixed adds and xors > > and the rotates to be more prime-ish. > > > > right idea, just old... > > > > what would be ideal is something more like siphash, but with fewer > > rounds, so same number of instructions as jhash. xxhash might fit the > > bill, I haven't looked at the code yet... > > There is likely to be a point where scanning a list of values > for the right hash value is faster than executing a hash function > that is good enough to separate them to separate buckets. Executing a bunch of alu ops that parallelize nicely is _fast_ (it's all xors/adds/rotates, chacha20 is a good example of how the modern stuff works). Also, for 2-way cuckoo there's an xor trick so you don't need to compute a second hash. But the key thing about cuckoo is that the loads on lookup _aren't dependent_ - they can run in parallel. Every cache miss that goes all the way to DRAM is stupidly expensive, remember - hundreds of instructions, had you been able to keep the pipeline fed. > You don't want to scan a linked list because they have horrid > cache footprints. > The locking is equally horrid - especially for remove. > Arrays of pointers ar ethe way forward :-) Well, maybe; I'm waiting to see fill factor numbers and benchmarks. Fill factor was my concern when I was playing around with the concept; I used it in a place where the hash table didn't need to be that big, and the point was to avoid having to separately allocate the entries (and it avoids the hassle of tombstone entries with linear/quadratic probing). The fact that it requires a second dependent load, because buckets are separately allocated, also looks like a big negative to me. I still think a good lockless cuckoo hashing implementation ought to beat it.