Date: Tue, 5 Jan 2010 11:08:46 -0800 (PST)
From: Linus Torvalds
To: "Paul E. McKenney"
cc: Christoph Lameter, Andi Kleen, KAMEZAWA Hiroyuki, Minchan Kim, Peter Zijlstra, Peter Zijlstra, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "hugh.dickins", Nick Piggin, Ingo Molnar
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()
In-Reply-To: <20100105185542.GH6714@linux.vnet.ibm.com>

On Tue, 5 Jan 2010, Paul E. McKenney wrote:
>
> But on many systems, it does take some time for the idle reads to make
> their way to the CPU that just acquired the lock.

Yes. But the point is that there are lots of them.

So think of it this way: every time _one_ CPU acquires a lock (and then
releases it), _all_ CPUs will read the new value. Imagine the cross-socket
traffic.

In contrast, with just a single xadd (which replaces the whole
"spin_lock+non-atomics+spin_unlock"), every time _one_ CPU acquires a lock,
that's it.
The other CPUs aren't all waiting in line for the lock to be released,
and reading the cacheline to see if it's their turn.

Sure, after they got the lock they'll all eventually end up reading from
that cacheline that contains 'struct mm_struct', but that's something we
could even think about trying to minimize by putting the mmap_sem as far
away from the other fields as possible.

Now, it's very possible that if you have a broadcast model of cache
coherency, none of this matters much and you end up with almost all the
same bus traffic anyway. But I do think it could matter a lot.

			Linus
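
For reference, the two patterns contrasted above can be sketched in
userspace C11. This is only an illustration under stated assumptions, not
the code from the patch under discussion: the struct and function names
are hypothetical, and a simple test-and-set spinlock stands in for the
kernel's spinlock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical counter guarded by a spinlock (not the kernel's mmap_sem). */
struct locked_counter {
	atomic_flag lock;	/* test-and-set spinlock */
	long count;		/* plain field, only touched with the lock held */
};

/* Hypothetical counter updated with one atomic read-modify-write. */
struct xadd_counter {
	atomic_long count;
};

static void locked_counter_init(struct locked_counter *c)
{
	assert(c != NULL);
	atomic_flag_clear(&c->lock);
	c->count = 0;
}

static void xadd_counter_init(struct xadd_counter *c)
{
	assert(c != NULL);
	atomic_init(&c->count, 0);
}

/* The "spin_lock+non-atomics+spin_unlock" pattern; returns the old value. */
static long locked_add(struct locked_counter *c, long delta)
{
	long old;

	/* Every waiter keeps hitting the lock's cacheline until release. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
	old = c->count;
	c->count = old + delta;
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
	return old;
}

/*
 * The single fetch-and-add; returns the old value. On x86 this is
 * typically emitted as a "lock xadd" instruction.
 */
static long xadd_add(struct xadd_counter *c, long delta)
{
	return atomic_fetch_add_explicit(&c->count, delta, memory_order_acq_rel);
}
```

Both helpers return the value seen before the update, so a caller can
swap one for the other. The difference is in what other CPUs do in the
meantime: with locked_add() they spin on the lock, with xadd_add() there
is nothing to wait for.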