Date: Tue, 23 Jun 2015 19:50:12 +0200
From: Peter Zijlstra
To: Daniel Wagner
Cc: oleg@redhat.com, paulmck@linux.vnet.ibm.com, tj@kernel.org,
	mingo@redhat.com, linux-kernel@vger.kernel.org, der.herr@hofr.at,
	dave@stgolabs.net, riel@redhat.com, viro@ZenIV.linux.org.uk,
	torvalds@linux-foundation.org, jlayton@poochiereds.net
Subject: Re: [RFC][PATCH 00/13] percpu rwsem -v2
Message-ID: <20150623175012.GD3644@twins.programming.kicks-ass.net>
In-Reply-To: <558973A7.6010407@bmw-carit.de>

On Tue, Jun 23, 2015 at 04:56:39PM +0200, Daniel Wagner wrote:
> flock02
>                          mean   variance      sigma        max        min
>                tip-1  11.8994     0.5874     0.7664    13.2022     8.6324
>                tip-2  11.7394     0.5252     0.7247    13.2540     9.7513
>                tip-3  11.8155     0.5288     0.7272    13.2700     9.9480
>   tip+percpu-rswem-1  15.3601     0.8981     0.9477    16.8116    12.6910
>   tip+percpu-rswem-2  15.2558     0.8442     0.9188    17.0199    12.9586
>   tip+percpu-rswem-3  15.5297     0.6386     0.7991    17.4392    12.7992

I did indeed manage to get flock02 down to a usable level and found:

    3.20 : ffffffff811ecbdf:  incl   %gs:0x7ee1de72(%rip)   # aa58 <__preempt_count>
    0.27 : ffffffff811ecbe6:  mov    0xa98553(%rip),%rax    # ffffffff81c85140
   10.78 : ffffffff811ecbed:  incl   %gs:(%rax)
    0.19 : ffffffff811ecbf0:  mov    0xa9855a(%rip),%edx    # ffffffff81c85150
    0.00 : ffffffff811ecbf6:  test   %edx,%edx
    0.00 : ffffffff811ecbf8:  jne    ffffffff811ecdd1
    3.47 : ffffffff811ecbfe:  decl   %gs:0x7ee1de53(%rip)   # aa58 <__preempt_count>
    0.00 : ffffffff811ecc05:  je     ffffffff811eccec

Which is percpu_down_read(). Now aside from the fact that I run a
PREEMPT=y kernel, it looks like that sem->refcount increment stalls
because of the dependent load.

Manually hoisting the load very slightly improves things:

    0.24 : ffffffff811ecbdf:  mov    0xa9855a(%rip),%rax    # ffffffff81c85140
    5.88 : ffffffff811ecbe6:  incl   %gs:0x7ee1de6b(%rip)   # aa58 <__preempt_count>
    7.94 : ffffffff811ecbed:  incl   %gs:(%rax)
    0.30 : ffffffff811ecbf0:  mov    0xa9855a(%rip),%edx    # ffffffff81c85150
    0.00 : ffffffff811ecbf6:  test   %edx,%edx
    0.00 : ffffffff811ecbf8:  jne    ffffffff811ecdd1
    3.70 : ffffffff811ecbfe:  decl   %gs:0x7ee1de53(%rip)   # aa58 <__preempt_count>
    0.00 : ffffffff811ecc05:  je     ffffffff811eccec

But it's not much :/

Using DEFINE_STATIC_PERCPU_RWSEM(file_rwsem) would allow GCC to omit the
sem->refcount load entirely, but it's not smart enough to see that it can
(tested 4.9 and 5.1).

---
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -35,6 +35,8 @@ extern void __percpu_up_read(struct perc
 
 static inline void _percpu_down_read(struct percpu_rw_semaphore *sem)
 {
+	unsigned int __percpu *refcount = sem->refcount;
+
 	might_sleep();
 
 	preempt_disable();
@@ -47,7 +49,7 @@ static inline void _percpu_down_read(str
 	 * writer will see anything we did within this RCU-sched read-side
 	 * critical section.
 	 */
-	__this_cpu_inc(*sem->refcount);
+	__this_cpu_inc(*refcount);
 	if (unlikely(!rcu_sync_is_idle(&sem->rss)))
 		__percpu_down_read(sem); /* Unconditional memory barrier. */
 	preempt_enable();
@@ -81,6 +83,8 @@ static inline bool percpu_down_read_tryl
 
 static inline void percpu_up_read(struct percpu_rw_semaphore *sem)
 {
+	unsigned int __percpu *refcount = sem->refcount;
+
 	/*
 	 * The barrier() in preempt_disable() prevents the compiler from
 	 * bleeding the critical section out.
@@ -90,7 +94,7 @@ static inline void percpu_up_read(struct
 	 * Same as in percpu_down_read().
 	 */
 	if (likely(rcu_sync_is_idle(&sem->rss)))
-		__this_cpu_dec(*sem->refcount);
+		__this_cpu_dec(*refcount);
 	else
 		__percpu_up_read(sem); /* Unconditional memory barrier. */
 	preempt_enable();
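
For readers without a kernel tree at hand, the hoisting in the patch can be approximated in userspace. This is only a sketch under stated assumptions, not the kernel code: struct sketch_sem, sketch_preempt_count and both sketch_down_read*() functions are hypothetical names, a plain pointer stands in for the __percpu refcount, and barrier() is a GCC/Clang-style compiler barrier modelled on the one that preempt_disable() implies. The point it tries to show is that the compiler may not move the sem->refcount load above the barrier by itself, so the load ends up directly in front of the increment that depends on it unless it is copied into a local first.

```c
/* Compiler-only barrier, in the spirit of the kernel's barrier().
 * Assumes a GCC- or Clang-compatible compiler. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for struct percpu_rw_semaphore: the counter is
 * reached through a pointer, so touching it costs one dependent load. */
struct sketch_sem {
	unsigned int *refcount;
};

/* Hypothetical stand-in for __preempt_count. */
static unsigned int sketch_preempt_count;

/* Original shape: the pointer is loaded after the barrier, right before
 * the increment that needs it. */
static void sketch_down_read(struct sketch_sem *sem)
{
	sketch_preempt_count++;
	barrier();
	(*sem->refcount)++;
	barrier();
	sketch_preempt_count--;
}

/* Hoisted shape: the pointer is read into a local before the barrier,
 * so the load can already be in flight when the increment is issued. */
static void sketch_down_read_hoisted(struct sketch_sem *sem)
{
	unsigned int *refcount = sem->refcount;

	sketch_preempt_count++;
	barrier();
	(*refcount)++;
	barrier();
	sketch_preempt_count--;
}
```

Both functions have the same observable effect; any difference would only show up in the generated instruction order, which can be inspected with something like gcc -O2 -S. Whether that reordering helps at all depends on the CPU, and the numbers above suggest the gain is small.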