Date: Thu, 16 Apr 2009 14:02:42 -0700 (PDT)
From: Linus Torvalds
To: Stephen Hemminger
cc: Eric Dumazet, paulmck@linux.vnet.ibm.com, Patrick McHardy,
    David Miller, jeff.chua.linux@gmail.com, paulus@samba.org,
    mingo@elte.hu, laijs@cn.fujitsu.com, jengelh@medozas.de,
    r000n@r000n.net, linux-kernel@vger.kernel.org,
    netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
    benh@kernel.crashing.org
Subject: Re: [PATCH[] netfilter: use per-cpu reader-writer lock (v0.7)
In-Reply-To: <20090416134956.6c1f0087@nehalam>
References: <20090415135526.2afc4d18@nehalam> <49E64C91.5020708@cosmosbay.com>
    <20090415.164811.19905145.davem@davemloft.net>
    <20090415170111.6e1ca264@nehalam> <20090415174551.529d241c@nehalam>
    <49E6BBA9.2030701@cosmosbay.com> <49E7384B.5020707@trash.net>
    <20090416144748.GB6924@linux.vnet.ibm.com> <49E75876.10509@cosmosbay.com>
    <20090416175850.GH6924@linux.vnet.ibm.com> <49E77BF6.1080206@cosmosbay.com>
    <20090416134956.6c1f0087@nehalam>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 16 Apr 2009, Stephen Hemminger wrote:
>
> This version of x_tables (ip/ip6/arp) locking uses a per-cpu
> rwlock that can be nested. It is sort of like earlier brwlock
> (fast reader, slow writer).
> The locking is isolated so future improvements
> can concentrate on measuring/optimizing xt_table_info_lock. I tried
> other versions based on recursive spin locks and sequence counters and
> for me, the risk of inventing own locking primitives not worth it at this time.

This is still scary.

Do we guarantee that read-locks nest in the presence of a waiting writer
on another CPU? Now, I know we used to (ie readers always nested happily
with readers even if there were pending writers), and then we broke it. I
don't know that we ever unbroke it.

IOW, at least at some point we deadlocked on this (due to trying to be
fair, and not letting in readers while earlier writers were waiting):

	CPU#1			CPU#2

	read_lock

				write_lock
				.. spins with write bit set, waiting for
				   readers to go away ..

	recursive read_lock
	.. spins due to the write bit being set. BOOM: deadlock ..

Now, I _think_ we avoid this, but somebody should double-check.

Also, I have still yet to hear the answer to why we care about stale
counters of dead rules so much that we couldn't just free them later with
RCU.

		Linus
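
For anyone who wants to step through the interleaving above, here is a
minimal sketch of it as a single-threaded toy model in C. Everything in it
(struct toy_rwlock, toy_read_trylock, toy_write_trylock,
nested_read_deadlocks) is hypothetical and invented for illustration. It is
not the kernel's rwlock_t and says nothing about what the real
implementation does. It only shows why a "fair" policy that holds new
readers back behind a waiting writer cannot safely be nested, while a
reader-preferring policy can.

```c
#include <stdbool.h>

/* Hypothetical toy model of an rwlock; NOT the kernel's rwlock_t. */
struct toy_rwlock {
	int readers;		/* read holds currently granted */
	bool writer_waiting;	/* a writer has set its "write bit" */
	bool fair;		/* true: new readers yield to a waiting writer */
};

/* Returns true if a read_lock() would be granted now, false if it would spin. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->fair && l->writer_waiting)
		return false;	/* held back behind the pending writer */
	l->readers++;
	return true;
}

/* The writer announces itself, then needs every reader to go away. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	l->writer_waiting = true;
	return l->readers == 0;
}

/*
 * Replays the three steps in order and reports whether both CPUs end up
 * waiting on each other.
 */
static bool nested_read_deadlocks(bool fair)
{
	struct toy_rwlock l = { .readers = 0, .writer_waiting = false, .fair = fair };

	bool outer  = toy_read_trylock(&l);	/* CPU#1: read_lock */
	bool writer = toy_write_trylock(&l);	/* CPU#2: write_lock, spins */
	bool inner  = toy_read_trylock(&l);	/* CPU#1: recursive read_lock */

	/*
	 * Deadlock: the writer waits for CPU#1's outer hold to be dropped,
	 * and CPU#1 cannot get there because its inner read waits for the
	 * writer.
	 */
	return outer && !writer && !inner;
}
```

In this model nested_read_deadlocks(true) reports the deadlock and
nested_read_deadlocks(false) does not, because the inner read is granted
and CPU#1 can go on to drop both holds. Whether the real read_lock behaves
like the second case is exactly the thing that needs double-checking.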