Date: Fri, 10 Apr 2015 11:05:17 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Ingo Molnar
Cc: Linus Torvalds, Jason Low, Peter Zijlstra, Davidlohr Bueso, Tim Chen,
    Aswin Chandramouleeswaran, LKML
Subject: Re: [PATCH] mutex: Speed up mutex_spin_on_owner() by not taking the RCU lock
Message-ID: <20150410180517.GI6464@linux.vnet.ibm.com>
In-Reply-To: <20150410174400.GA6563@gmail.com>
References: <1428561611.3506.78.camel@j-VirtualBox>
    <20150409075311.GA4645@gmail.com>
    <20150409175652.GI6464@linux.vnet.ibm.com>
    <20150409183926.GM6464@linux.vnet.ibm.com>
    <20150410090051.GA28549@gmail.com>
    <20150410142024.GY6464@linux.vnet.ibm.com>
    <20150410174400.GA6563@gmail.com>

On Fri, Apr 10, 2015 at 07:44:00PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney wrote:
>
> > > No RCU overhead, and this is the access to owner->on_cpu:
> > >
> > >   69:   49 8b 81 10 c0 ff ff    mov    -0x3ff0(%r9),%rax
> > >
> > > Totally untested and all that, I only built the mutex.o.
> > >
> > > What do you think? Am I missing anything?
> >
> > I suspect it is good, but let's take a look at Linus' summary of the code:
> >
> >         rcu_read_lock();
> >         while (sem->owner == owner) {
> >                 if (!owner->on_cpu || need_resched())
> >                         break;
> >                 cpu_relax_lowlatency();
> >         }
> >         rcu_read_unlock();
>
> Note that I patched the mutex case as a prototype, which is more
> commonly used than rwsem-xadd. But the rwsem case is similar as well.
>
> > The cpu_relax_lowlatency() looks to have barrier() semantics, so the
> > sem->owner should get reloaded every time through the loop. This is
> > needed, because otherwise the task structure could get freed and
> > reallocated as something else that happened to have the field at the
> > ->on_cpu offset always zero, resulting in an infinite loop.
>
> So at least with the get_kernel(..., &owner->on_cpu) approach, the
> get_kernel() copy has barrier semantics as well (it's in assembly), so
> it will be reloaded in every iteration in a natural fashion.

Good point, even better!

							Thanx, Paul
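
For readers following the reload argument, here is a minimal user-space sketch of the spin loop quoted above. It is not the kernel code: `struct task`, `struct lock`, `cpu_relax_sketch()`, `spin_on_owner()`, and the `max_spins` budget (standing in for need_resched()) are all hypothetical names made up for illustration, and the barrier() macro assumes a GCC- or Clang-compatible compiler. The sketch only shows how a compiler barrier in the loop body forces `l->owner` and `owner->on_cpu` to be re-read on each pass; it does not model RCU or the task-structure lifetime question discussed in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Compiler-only barrier in the spirit of the kernel's barrier() macro:
 * the "memory" clobber tells the compiler that memory may have changed,
 * so values cached in registers must be reloaded afterwards.
 * Assumes GCC/Clang inline-asm syntax.
 */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct task {
	int on_cpu;		/* non-zero while the task is running on a CPU */
};

struct lock {
	struct task *owner;	/* current lock holder, or NULL */
};

/* Hypothetical stand-in for cpu_relax_lowlatency(): just a barrier. */
static inline void cpu_relax_sketch(void)
{
	barrier();
}

/*
 * Spin while 'owner' still holds the lock and is still on a CPU.
 * Returns true if the owner changed, false if we gave up because the
 * owner went off-CPU or the spin budget ran out (the budget is an
 * assumption replacing need_resched()).
 */
static bool spin_on_owner(struct lock *l, struct task *owner, long max_spins)
{
	while (l->owner == owner) {
		if (!owner->on_cpu || max_spins-- <= 0)
			return false;
		/* Barrier: l->owner and owner->on_cpu are re-read next pass. */
		cpu_relax_sketch();
	}
	return true;
}
```

Without the barrier in the loop body, a compiler would be free to load `l->owner` once and hoist the comparison out of the loop, which is the situation the reload discussion is about.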