Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758242AbZD0Rpq (ORCPT ); Mon, 27 Apr 2009 13:45:46 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754525AbZD0Rph (ORCPT ); Mon, 27 Apr 2009 13:45:37 -0400
Received: from bombadil.infradead.org ([18.85.46.34]:42229 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753948AbZD0Rpg (ORCPT );
	Mon, 27 Apr 2009 13:45:36 -0400
Subject: Re: [PATCH] netfilter: use per-CPU recursive lock {XV}
From: Peter Zijlstra
To: Stephen Hemminger
Cc: Mathieu Desnoyers, Eric Dumazet, David Miller, Jarek Poplawski,
	Linus Torvalds, Ingo Molnar, Paul Mackerras,
	paulmck@linux.vnet.ibm.com, Evgeniy Polyakov, kaber@trash.net,
	jeff.chua.linux@gmail.com, laijs@cn.fujitsu.com, jengelh@medozas.de,
	r000n@r000n.net, linux-kernel@vger.kernel.org,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	benh@kernel.crashing.org
In-Reply-To: <20090426145746.1184aeba@nehalam>
References: <20090421111541.228e977a@nehalam>
	<20090421193924.GA24404@elte.hu>
	<20090421143927.52d7d89d@nehalam>
	<20090423210938.1501507b@nehalam>
	<49F146FF.5050200@cosmosbay.com>
	<20090424091839.6e13ebec@nehalam>
	<49F22465.80305@gmail.com>
	<20090425133052.4cb711f5@nehalam>
	<49F4A6E3.7080102@cosmosbay.com>
	<20090426185646.GB29238@Krystal>
	<20090426145746.1184aeba@nehalam>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Date: Mon, 27 Apr 2009 19:44:57 +0200
Message-Id: <1240854297.7620.65.camel@twins>
Mime-Version: 1.0
X-Mailer: Evolution 2.26.1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3878
Lines: 102

On Sun, 2009-04-26 at 14:57 -0700, Stephen Hemminger wrote:
> On Sun, 26 Apr 2009 14:56:46 -0400
> Mathieu Desnoyers wrote:
>
> > * Eric Dumazet (dada1@cosmosbay.com) wrote:
> > > From: Stephen Hemminger
> > >
> > > > Epilogue due to master Jarek.
> > > > Lockdep carest not about the locking
> > > > doth bestowed. Therefore no keys are needed.
> > > >
> > > > Signed-off-by: Stephen Hemminger
> > >
> > > So far, so good, should be ready for inclusion now, nobody complained :)
> > >
> > > I include the final patch, merge of your last two patches.
> > >
> > > David, could you please review it once again and apply it if it's OK ?
> > >
> > > Thanks to all for your help and patience
> > >
> > > [PATCH] netfilter: use per-CPU recursive lock {XV}
> >
> > Hi Eric,
> >
> > Suitable name would probably be :
> >
>
> But Linus is trying to delude himself.
>
> This usage is recursive even if he doesn't like the terminology.
> The same CPU has to be able to reacquire the read lock without deadlocking.
> If reader/writer locks were implemented in a pure writer gets priority
> method, then this code would break! So yes read locks can be used recursively
> now in Linux, but if the were implemented differently then this code
> would break. For example, the -rt kernel turns all read/write locks into
> mutexs, so the -rt kernel developers will have to address this.

A recursive lock has the property:

lock()
{
	if (lock->owner == current) {
		lock->depth++;
		return;
	}

	/* regular lock stuff; then owner = current, depth = 1 */
}

unlock()
{
	if (!--lock->depth)
		/* regular unlock */
}

None of the Linux kernel locking primitives have this -- with the
possible exception of the cpu-hotplug lock.

What rwlock_t has is reader bias to the point where you can utterly
starve writers, with the side effect that you can obtain multiple read
ownerships without causing a deadlock.

This is not what is called a recursive lock. A recursive lock would have
each owner only once; this rwlock_t thing is simply so unfair that it
can have unlimited owners, including multiple copies of the same one.
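As a rough user-space illustration of the owner/depth property in the
pseudocode above (not kernel code, and not how any kernel primitive is
implemented), a recursive lock could be sketched with C11 atomics. The
names rec_lock, rec_lock_acquire and rec_lock_release, and the
caller-supplied integer id standing in for `current`, are assumptions
made for this sketch:

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER 0	/* hypothetical id meaning "lock is free" */

struct rec_lock {
	atomic_int owner;	/* id of the holder, NO_OWNER when free */
	int depth;		/* nesting count, touched only by the holder */
};

void rec_lock_init(struct rec_lock *l)
{
	atomic_init(&l->owner, NO_OWNER);
	l->depth = 0;
}

/* `self` plays the role of `current`; it must be unique per thread. */
void rec_lock_acquire(struct rec_lock *l, int self)
{
	int expected;

	assert(self != NO_OWNER);

	/* Only the holder can observe owner == self, so this is race free. */
	if (atomic_load(&l->owner) == self) {
		l->depth++;
		return;
	}

	/* "Regular lock stuff": spin until the lock is free, then claim it. */
	do {
		expected = NO_OWNER;
	} while (!atomic_compare_exchange_weak(&l->owner, &expected, self));

	l->depth = 1;
}

void rec_lock_release(struct rec_lock *l, int self)
{
	assert(atomic_load(&l->owner) == self && l->depth > 0);

	/* Only the outermost release actually frees the lock. */
	if (--l->depth == 0)
		atomic_store(&l->owner, NO_OWNER);
}
```

Note that each owner appears exactly once here, with a depth, which is
the distinction drawn above: nested acquisition by the holder never
touches the contended path, whereas a second read_lock() on rwlock_t
takes a second, independent ownership.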
rwsem has FIFO fairness, and therefore can deadlock in this scenario:
suppose thread A does a read, thread B tries a write and blocks, then
thread A recurses and tries to obtain another read ownership. That is a
deadlock, as the FIFO fairness will demand that the second read
ownership wait on the pending writer, which will wait on the outstanding
read owner.

Now if rwsem were a FIFO-fair recursive lock, the above would not
deadlock, since it would detect that the task already had (read)
ownership and simply increment the depth, instead of trying to acquire a
second ownership.

This is all common and well understood terminology, not something Linus
invented just to harass you with.

Generally speaking we do not condone recursive locking strategies -- and
afaik reiserfs (as per the BKL) and the network code (as per abusing
rwlock_t unfairness) are the only offenders.

Like Linus stated, recursive locking is generally poor taste and
indicates you basically gave up on trying to find a proper locking
scheme. We should very much work towards getting rid of these
aberrations instead of adding new ones.

Linus is very much right in what he said, and you calling him delusional
only highlights your ignorance on the issue.

[ PS. -rt implements rwlock_t as a proper recursive lock (either a mutex
  or a full multi-owner reader-writer lock with PI fairness) so if
  anybody abuses rwlock_t unfairness in a way that is not strictly
  owner-recursive we have a problem. ]
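The rwsem scenario can be checked with a tiny single-threaded model. It
does not implement a lock; it only answers whether a new read request
would have to wait in the state described above (readers hold the lock,
a writer may be queued, no writer holds it). The policy names and the
rw_model structure are hypothetical, chosen for this sketch:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TASKS 4

enum policy {
	READER_BIASED,		/* rwlock_t as described: readers never queue */
	FIFO_FAIR,		/* rwsem as described */
	FIFO_FAIR_RECURSIVE,	/* hypothetical: FIFO plus owner/depth tracking */
};

struct rw_model {
	int read_depth[MAX_TASKS];	/* read ownerships held by each task */
	bool writer_waiting;		/* a writer is queued behind the readers */
};

/* Would a new read request from `task` have to wait? */
bool read_would_block(const struct rw_model *m, enum policy p, int task)
{
	assert(task >= 0 && task < MAX_TASKS);

	if (p == READER_BIASED)
		return false;	/* pending writers are simply starved */
	if (p == FIFO_FAIR_RECURSIVE && m->read_depth[task] > 0)
		return false;	/* already an owner: only the depth changes */
	return m->writer_waiting;	/* FIFO: queue behind the writer */
}

/*
 * A read request deadlocks when the task must wait behind a writer that
 * is itself waiting for that same task's outstanding read ownership.
 */
bool read_recursion_deadlocks(const struct rw_model *m, enum policy p, int task)
{
	return m->read_depth[task] > 0 && read_would_block(m, p, task);
}
```

With task 0 as thread A holding one read ownership and writer_waiting
set for thread B, the model reports a deadlock only for FIFO_FAIR; a
task that holds no read ownership merely waits its turn.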