Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757031AbbGGLZE (ORCPT ); Tue, 7 Jul 2015 07:25:04 -0400
Received: from casper.infradead.org ([85.118.1.10]:49837 "EHLO
        casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1756737AbbGGLYz (ORCPT );
        Tue, 7 Jul 2015 07:24:55 -0400
Date: Tue, 7 Jul 2015 13:24:49 +0200
From: Peter Zijlstra
To: Waiman Long
Cc: Ingo Molnar , Arnd Bergmann , Thomas Gleixner ,
        linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
        Will Deacon , Scott J Norton , Douglas Hatch
Subject: Re: [PATCH 4/4] locking/qrwlock: Use direct MCS lock/unlock in slowpath
Message-ID: <20150707112449.GR3644@twins.programming.kicks-ass.net>
References: <1436197386-58635-1-git-send-email-Waiman.Long@hp.com>
        <1436197386-58635-5-git-send-email-Waiman.Long@hp.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1436197386-58635-5-git-send-email-Waiman.Long@hp.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1435
Lines: 36

On Mon, Jul 06, 2015 at 11:43:06AM -0400, Waiman Long wrote:
> Lock waiting in the qrwlock uses the spinlock (qspinlock for x86)
> as the waiting queue. This is slower than using MCS lock directly
> because of the extra level of indirection causing more atomics to
> be used as well as 2 waiting threads spinning on the lock cacheline
> instead of only one.

This needs a better explanation. Didn't we find with the qspinlock thing
that the pending spinner improved performance on light loads?

Taking it out seems counterintuitive; we would very much like these two
to be the same.

> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c

> +static DEFINE_PER_CPU_ALIGNED(struct mcs_spinlock, _mcs_qnodes[4]);

> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -81,8 +81,9 @@
>   * Exactly fits one 64-byte cacheline on a 64-bit architecture.
>   *
>   * PV doubles the storage and uses the second cacheline for PV state.
> + * The MCS nodes are also shared with qrwlock.
>   */
> -static DEFINE_PER_CPU_ALIGNED(struct mcs_spinlock, mcs_nodes[MAX_NODES]);
> +DEFINE_PER_CPU_ALIGNED(struct mcs_spinlock, _mcs_qnodes[MAX_NODES]);

Except you don't...
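
For reference, the quoted changelog contrasts queueing through a spinlock
with using an MCS lock directly. What follows is only a minimal userspace
sketch of a plain MCS queue lock in C11, to show why each waiter spins on
its own node rather than on the shared lock cacheline. It is not the
kernel's struct mcs_spinlock code and has no pending-spinner optimisation;
the names mcs_node, mcs_lock, mcs_acquire and mcs_release are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's mcs_spinlock. */
struct mcs_node {
	_Atomic(struct mcs_node *) next; /* successor in the wait queue */
	atomic_bool locked;              /* set by the predecessor on handoff */
};

struct mcs_lock {
	_Atomic(struct mcs_node *) tail; /* NULL when the lock is free */
};

static void mcs_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
	struct mcs_node *prev;

	assert(lock != NULL && node != NULL);
	atomic_store_explicit(&node->next, (struct mcs_node *)NULL,
			      memory_order_relaxed);
	atomic_store_explicit(&node->locked, false, memory_order_relaxed);

	/* A single atomic exchange on the shared word joins the queue. */
	prev = atomic_exchange_explicit(&lock->tail, node,
					memory_order_acq_rel);
	if (prev == NULL)
		return; /* queue was empty, so the lock is ours */

	/* Link behind the predecessor, then spin on our own node only. */
	atomic_store_explicit(&prev->next, node, memory_order_release);
	while (!atomic_load_explicit(&node->locked, memory_order_acquire))
		;
}

static void mcs_release(struct mcs_lock *lock, struct mcs_node *node)
{
	struct mcs_node *next =
		atomic_load_explicit(&node->next, memory_order_acquire);

	if (next == NULL) {
		struct mcs_node *expected = node;

		/* No visible successor: try to mark the lock free. */
		if (atomic_compare_exchange_strong_explicit(
			    &lock->tail, &expected, (struct mcs_node *)NULL,
			    memory_order_release, memory_order_relaxed))
			return;

		/* A waiter swapped the tail but has not linked in yet. */
		while ((next = atomic_load_explicit(&node->next,
					memory_order_acquire)) == NULL)
			;
	}

	/* Hand the lock directly to the successor. */
	atomic_store_explicit(&next->locked, true, memory_order_release);
}
```

In this sketch the only contended write to the shared word is the exchange
in mcs_acquire(); everything after that touches per-waiter nodes. Whether
that beats the qspinlock pending-spinner path under light load is exactly
the question raised above, and the sketch does not answer it.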