Date: Sat, 2 Jun 2018 13:03:22 +0800
From: Herbert Xu
To: NeilBrown
Cc: Thomas Graf, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Eric Dumazet, "David S. Miller"
Subject: Re: [PATCH 15/18] rhashtable: use bit_spin_locks to protect hash bucket.
Message-ID: <20180602050322.liesw324q5kawcue@gondor.apana.org.au>
References: <152782754287.30340.4395718227884933670.stgit@noble> <152782824984.30340.1634082820568216846.stgit@noble>
In-Reply-To: <152782824984.30340.1634082820568216846.stgit@noble>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 01, 2018 at 02:44:09PM +1000, NeilBrown wrote:
> This patch changes rhashtables to use a bit_spin_lock (BIT(1)) in
> the bucket pointer to lock the hash chain for that bucket.
>
> The benefits of a bit spin_lock are:
>  - no need to allocate a separate array of locks.
>  - no need to have a configuration option to guide the
>    choice of the size of this array
>  - locking cost is often a single test-and-set in a cache line
>    that will have to be loaded anyway.  When inserting at, or removing
>    from, the head of the chain, the unlock is free - writing the new
>    address in the bucket head implicitly clears the lock bit.
>  - even when locking costs 2 updates (lock and unlock), they are
>    in a cacheline that needs to be read anyway.
>
> The cost of using a bit spin_lock is a little bit of code complexity,
> which I think is quite manageable.
>
> Bit spin_locks are sometimes inappropriate because they are not fair -
> if multiple CPUs repeatedly contend for the same lock, one CPU can
> easily be starved.  This is not a credible situation with rhashtable.
> Multiple CPUs may want to repeatedly add or remove objects, but they
> will typically do so at different buckets, so they will attempt to
> acquire different locks.
>
> As we have more bit-locks than we previously had spinlocks (by at
> least a factor of two) we can expect slightly less contention to
> go with the slightly better cache behavior and reduced memory
> consumption.
>
> Signed-off-by: NeilBrown

...

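To make the scheme in the description above concrete, here is a minimal user-space sketch in C11 atomics. It is not the code from the patch: the names bucket_t, bucket_lock(), bucket_unlock(), bucket_head() and bucket_insert_head() are hypothetical, and the sketch ignores RCU, the nulls marker, nested tables and bottom-half disabling. It only shows the lock bit living in BIT(1) of the head pointer, and the "free" unlock when a new head is stored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* BIT(1) of the bucket word is the lock; assumed zero in any real pointer. */
#define BUCKET_LOCK_BIT ((uintptr_t)1 << 1)

struct node {
	struct node *next;
	int key;
};

/* Hypothetical bucket type: head pointer and lock bit share one word. */
typedef _Atomic uintptr_t bucket_t;

/* Acquire: atomic test-and-set of the lock bit, spinning while it is held. */
static void bucket_lock(bucket_t *b)
{
	while (atomic_fetch_or_explicit(b, BUCKET_LOCK_BIT,
					memory_order_acquire) & BUCKET_LOCK_BIT)
		;
}

/* Explicit release: clear the lock bit, leaving the head pointer alone. */
static void bucket_unlock(bucket_t *b)
{
	atomic_fetch_and_explicit(b, ~BUCKET_LOCK_BIT, memory_order_release);
}

static bool bucket_is_locked(bucket_t *b)
{
	return atomic_load_explicit(b, memory_order_relaxed) & BUCKET_LOCK_BIT;
}

/* Read the head pointer with the lock bit masked off. */
static struct node *bucket_head(bucket_t *b)
{
	uintptr_t v = atomic_load_explicit(b, memory_order_acquire);

	return (struct node *)(v & ~BUCKET_LOCK_BIT);
}

/*
 * Insert at the head.  Storing the new, properly aligned pointer has the
 * lock bit clear, so the store both publishes the node and drops the lock.
 */
static void bucket_insert_head(bucket_t *b, struct node *n)
{
	bucket_lock(b);
	n->next = bucket_head(b);
	assert(((uintptr_t)n & BUCKET_LOCK_BIT) == 0);
	atomic_store_explicit(b, (uintptr_t)n, memory_order_release);
}
```

An operation that changes something further down the chain would instead call bucket_unlock() once done, which is the "2 updates" case mentioned in the description. In the kernel the acquire side would presumably go through bit_spin_lock() rather than open-coded atomics.
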
> @@ -74,6 +71,61 @@ struct bucket_table {
>  	struct rhash_head __rcu *buckets[] ____cacheline_aligned_in_smp;
>  };
>
> +/*
> + * We lock a bucket by setting BIT(1) in the pointer - this is always
> + * zero in real pointers and in the nulls marker.
> + * bit_spin_locks do not handle contention well, but the whole point
> + * of the hashtable design is to achieve minimum per-bucket contention.
> + * A nested hash table might not have a bucket pointer.  In that case
> + * we cannot get a lock.  For remove and replace the bucket cannot be
> + * interesting and doesn't need locking.
> + * For insert we allocate the bucket if this is the last bucket_table,
> + * and then take the lock.
> + * Sometimes we unlock a bucket by writing a new pointer there.  In that
> + * case we don't need to unlock, but we do need to reset state such as
> + * local_bh.  For that we have rht_unlocked().  This doesn't include
> + * the memory barrier that bit_spin_unlock() provides, but rcu_assign_pointer()
> + * will have provided that.
> + */

Yes the concept looks good to me.  But I would like to hear from
Eric/Dave as to whether this would be acceptable for existing
network hash tables such as the ones in inet.

Thanks,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt