Subject: Re: [PATCH 0/2] convert mmap_sem to a scalable rw_mutex
From: Peter Zijlstra
To: Nick Piggin
Cc: Eric Dumazet, Ingo Molnar, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oleg Nesterov, Andrew Morton, Thomas Gleixner
In-Reply-To: <20070514120737.GE31234@wotan.suse.de>
References: <20070511131541.992688403@chello.nl> <20070511155621.GA13150@elte.hu> <46449F61.2060004@cosmosbay.com> <1178903913.2781.20.camel@lappy> <20070514120737.GE31234@wotan.suse.de>
Date: Mon, 14 May 2007 14:57:28 +0200
Message-Id: <1179147448.6810.79.camel@twins>

On Mon, 2007-05-14 at 14:07 +0200, Nick Piggin wrote:
> On Fri, May 11, 2007 at 07:18:33PM +0200, Peter Zijlstra wrote:
> > On Fri, 2007-05-11 at 18:52 +0200, Eric Dumazet wrote:
> > >
> > > But I personally find this new rw_mutex not scalable at all if you have some
> > > writers around.
> > >
> > > percpu_counter_sum is just a L1 cache eater, and O(NR_CPUS)
> >
> > Yeah, that is true; there are two occurrences, the one in
> > rw_mutex_read_unlock() is not strictly needed for correctness.
> >
> > Write locks are indeed quite expensive. But given the ratio of
> > reader:writer locks on mmap_sem (I'm not all that familiar with other
> > rwsem users) this trade-off seems workable.
>
> I guess the problem with that logic is assuming the mmap_sem read side
> always needs to be scalable. Given the ratio of threaded:unthreaded
> apps, maybe the trade-off swings away from favour?

Could be; I've been bashing my head against the wall trying to find a
scalable write side solution. But so far only got a massive dent in my
brain from the effort.

Perhaps I can do a similar optimistic locking for my rcu-btree as I did
for the radix tree. That way most of the trouble would be endowed upon
the vmas instead of the mm itself. And then it would be up to user-space
to ensure it has in the order of nr_cpu_ids arenas to work in.

Also, as Hugh pointed out in an earlier thread; mmap_sem's write side
also protects the page tables, so we'd need to fix that up too;
assumedly the write side equivalent of the vma lock would then protect
all underlying page tables....

/me drifting away, rambling incoherently,..
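
For readers following the percpu_counter_sum cost argument quoted above, here is a rough userspace sketch of the trade-off. It is not the kernel's percpu_counter code and not the rw_mutex patch; NR_SLOTS, CACHELINE, struct sketch_counter, sketch_add() and sketch_sum() are all hypothetical names made up for illustration, and the 64-byte cache line is an assumption. The point is only that the add path touches one slot while the sum path has to walk every slot, which is where the O(NR_CPUS) cost and the cache traffic come from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS  8   /* hypothetical stand-in for the number of CPUs */
#define CACHELINE 64  /* assumed cache line size */

/* One counter per slot, each on its own cache line to avoid false sharing. */
struct slot {
	_Alignas(CACHELINE) atomic_long count;
};

/* Hypothetical counter; zero-initialise it (e.g. static storage) before use. */
struct sketch_counter {
	struct slot slots[NR_SLOTS];
};

/* Cheap path: only the caller's own slot is touched. */
static void sketch_add(struct sketch_counter *c, size_t cpu, long delta)
{
	assert(cpu < NR_SLOTS);
	atomic_fetch_add_explicit(&c->slots[cpu].count, delta,
				  memory_order_relaxed);
}

/* Expensive path: reads every slot, so the cost grows with NR_SLOTS. */
static long sketch_sum(struct sketch_counter *c)
{
	long total = 0;

	for (size_t i = 0; i < NR_SLOTS; i++)
		total += atomic_load_explicit(&c->slots[i].count,
					      memory_order_relaxed);
	return total;
}
```

With concurrent adders the value returned by sketch_sum() is not an atomic snapshot, so a real lock built on such a counter would need extra synchronisation on top; the sketch leaves that out on purpose.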