Subject: Re: Performance regression from switching lock to rw-sem for anon-vma tree
From: Tim Chen
To: Ingo Molnar
Cc: Ingo Molnar, Andrea Arcangeli, Mel Gorman, "Shi, Alex", Andi Kleen, Andrew Morton, Michel Lespinasse, Davidlohr Bueso, "Wilcox, Matthew R", Dave Hansen, Peter Zijlstra, Rik van Riel, linux-kernel@vger.kernel.org, linux-mm
In-Reply-To: <20130626095108.GB29181@gmail.com>
References: <1371165992.27102.573.camel@schen9-DESK> <20130619131611.GC24957@gmail.com> <1371660831.27102.663.camel@schen9-DESK> <1372205996.22432.119.camel@schen9-DESK> <20130626095108.GB29181@gmail.com>
Date: Wed, 26 Jun 2013 14:36:00 -0700
Message-ID: <1372282560.22432.139.camel@schen9-DESK>

On Wed, 2013-06-26 at 11:51 +0200, Ingo Molnar wrote:
> * Tim Chen wrote:
>
> > On Wed, 2013-06-19 at 09:53 -0700, Tim Chen wrote:
> > > On Wed, 2013-06-19 at 15:16 +0200, Ingo Molnar wrote:
> > >
> > > > > vmstat for mutex implementation:
> > > > > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> > > > >  r  b   swpd      free   buff  cache   si   so    bi    bo     in     cs us sy id wa st
> > > > > 38  0      0 130957920  47860 199956    0    0     0    56 236342 476975 14 72 14  0  0
> > > > > 41  0      0 130938560  47860 219900    0    0     0     0 236816 479676 14 72 14  0  0
> > > > >
> > > > > vmstat for rw-sem implementation (3.10-rc4)
> > > > > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> > > > >  r  b   swpd      free   buff  cache   si   so    bi    bo     in     cs us sy id wa st
> > > > > 40  0      0 130933984  43232 202584    0    0     0     0 321817 690741 13 71 16  0  0
> > > > > 39  0      0 130913904  43232 224812    0    0     0     0 322193 692949 13 71 16  0  0
> > > >
> > > > It appears the main difference is that the rwsem variant context-switches
> > > > about 36% more than the mutex version, right?
> > > >
> > > > I'm wondering how that's possible - the lock is mostly write-locked,
> > > > correct? So the lock-stealing from Davidlohr Bueso and Michel Lespinasse
> > > > ought to have brought roughly the same lock-stealing behavior as mutexes
> > > > do, right?
> > > >
> > > > So the next analytical step would be to figure out why rwsem lock-stealing
> > > > is not behaving in an equivalent fashion on this workload. Do readers come
> > > > in frequently enough to disrupt write-lock-stealing perhaps?
> >
> > Ingo,
> >
> > I did some instrumentation on the write lock failure path. I found that
> > for the exim workload, there are no readers blocking for the rwsem when
> > write locking failed. The lock stealing is successful for 9.1% of the
> > time and the rest of the write lock failure caused the writer to go to
> > sleep. About 1.4% of the writers sleep more than once. Majority of the
> > writers sleep once.
> >
> > It is weird that lock stealing is not successful more often.
>
> For this to be comparable to the mutex scalability numbers you'd have to
> compare wlock-stealing _and_ adaptive spinning for failed-wlock rwsems.
>
> Are both techniques applied in the kernel you are running your tests on?
>

Ingo,

The previous experiment was done on a kernel without spinning.
I've redone the testing on two kernels for a 15-second stretch of
the workload run: one with adaptive (or optimistic) spinning and
the other without. Both have the patches from Alex to avoid
cmpxchg-induced cache bouncing.
With spinning, writers sleep much less often during lock acquisition
(18.60% vs 91.58%). However, the number of write lock acquisitions
that get blocked roughly doubles (7359040 vs 3448946). That offsets
the gain from spinning, which may be why I didn't see an improvement
for this particular workload.

                                               No Opt Spin    Opt Spin
Writer acquisition blocked count                   3448946     7359040
Blocked by reader                                    0.00%       0.55%
Lock acquired first attempt (lock stealing)          8.42%      16.92%
Lock acquired second attempt (1 sleep)              90.26%      17.60%
Lock acquired after more than 1 sleep                1.32%       1.00%
Lock acquired with optimistic spin                     N/A      64.48%

Tim
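
As a rough cross-check of the figures above, the sketch below converts the table's percentages into approximate absolute counts. It assumes that each percentage is a fraction of the "Writer acquisition blocked count" in the same column, which the post does not state explicitly. The dictionary keys and function names are illustrative only and do not come from the kernel or from the instrumentation patch.

```python
# Figures copied from the table in the post.
BLOCKED = {"no_opt_spin": 3448946, "opt_spin": 7359040}

# Percentage of blocked write-lock acquisitions that needed one sleep,
# and that needed more than one sleep, per kernel.
SLEPT_ONCE_PCT = {"no_opt_spin": 90.26, "opt_spin": 17.60}
SLEPT_MORE_PCT = {"no_opt_spin": 1.32, "opt_spin": 1.00}


def sleeping_share(kernel: str) -> float:
    """Percentage of blocked acquisitions that slept at least once."""
    return round(SLEPT_ONCE_PCT[kernel] + SLEPT_MORE_PCT[kernel], 2)


def sleeping_count(kernel: str) -> int:
    """Approximate number of acquisitions that slept.

    ASSUMPTION: the percentages are relative to the blocked count of
    the same column; the post does not say so explicitly.
    """
    return round(BLOCKED[kernel] * sleeping_share(kernel) / 100)


if __name__ == "__main__":
    for kernel in ("no_opt_spin", "opt_spin"):
        print(kernel, sleeping_share(kernel), sleeping_count(kernel))
    ratio = BLOCKED["opt_spin"] / BLOCKED["no_opt_spin"]
    print(f"blocked-count ratio: {ratio:.2f}")
```

Under that assumption, the sleeping shares reproduce the 91.58% and 18.60% quoted in the text, and the blocked count with spinning is about 2.13 times the count without it. The sketch says nothing about how much CPU time the spinning itself consumes.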