Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758745Ab1FQSCR (ORCPT ); Fri, 17 Jun 2011 14:02:17 -0400 Received: from smtp1.linux-foundation.org ([140.211.169.13]:42611 "EHLO smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755133Ab1FQSCO (ORCPT ); Fri, 17 Jun 2011 14:02:14 -0400 MIME-Version: 1.0 In-Reply-To: References: <1308097798.17300.142.camel@schen9-DESK> <1308101214.15392.151.camel@sli10-conroe> <1308138750.15315.62.camel@twins> <20110615161827.GA11769@tassilo.jf.intel.com> <1308156337.2171.23.camel@laptop> <1308163398.17300.147.camel@schen9-DESK> <1308169937.15315.88.camel@twins> <4DF91CB9.5080504@linux.intel.com> <1308172336.17300.177.camel@schen9-DESK> <1308173849.15315.91.camel@twins> <1308255972.17300.450.camel@schen9-DESK> <1308310080.2355.19.camel@twins> From: Linus Torvalds Date: Fri, 17 Jun 2011 11:01:08 -0700 Message-ID: Subject: Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex To: Hugh Dickins Cc: Peter Zijlstra , Tim Chen , Andi Kleen , Shaohua Li , Andrew Morton , KOSAKI Motohiro , Benjamin Herrenschmidt , David Miller , Martin Schwidefsky , Russell King , Paul Mundt , Jeff Dike , Richard Weinberger , "Luck, Tony" , KAMEZAWA Hiroyuki , Mel Gorman , Nick Piggin , Namhyung Kim , "Shi, Alex" , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "Rafael J. Wysocki" Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2073 Lines: 45 On Fri, Jun 17, 2011 at 10:41 AM, Hugh Dickins wrote: > > Applying load with those two patches applied (combined patch shown at > the bottom, in case you can tell me I misunderstood what to apply, > and have got the wrong combination on), lockdep very soon protested. Combined patch looks good, it's just the one without the NULL ptr tests removed. And yup, that makes sense. Since we now hold the anon_vma lock over an allocation, the memory allocation might want to start to free things. > I've not given it _any_ thought, and won't be able to come back to > it for a couple of hours: chucked over the wall for your delectation. It's a mis-feature of "page_referenced()": we can call it without holding any page locks ("is_locked=0"), and that function will then do a try_lock() on the page, and just consider it referenced if it failed. HOWEVER, the code to then do the same thing for the anon_vma lock doesn't do the same trylock model, because it used to be a spinlock: so there was "no need" (since you can never do a non-atomic allocation from within a spinlock). So this is arguably a bug in memory reclaim, but sicne the mutex->spinlock conversion had been pretty mindless, nobody noticed until the mutex region grew. So I do think that "page_referenced_anon()" should do a trylock, and return "referenced" if the trylock fails. Comments? That said, we have a few other mutexes that are just not allowed to be held over an allocation. page_referenced_file() has that mapping->i_mmap_mutex lock, for example. So maybe the rule just has to be "you cannot hold anon_vma lock over an allocation". Which would be sad: one of the whole _points_ of turning it from a spinlock to a mutex would be that it relaxes the locking rules a lot (and not just the preemptibility) Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/