Date: Thu, 16 Jul 2015 10:51:50 -0400
From: Waiman Long
To: Peter Zijlstra
Cc: Ingo Molnar, Thomas Gleixner, "H. Peter Anvin", x86@kernel.org,
    linux-kernel@vger.kernel.org, Scott J Norton, Douglas Hatch,
    Davidlohr Bueso
Subject: Re: [PATCH v2 4/6] locking/pvqspinlock: Allow vCPUs kick-ahead
Message-ID: <55A7C506.9030309@hp.com>
In-Reply-To: <20150716054626.GV19282@twins.programming.kicks-ass.net>

On 07/16/2015 01:46 AM, Peter Zijlstra wrote:
> On Wed, Jul 15, 2015 at 10:01:02PM -0400, Waiman Long wrote:
>> On 07/15/2015 05:39 AM, Peter Zijlstra wrote:
>>> On Tue, Jul 14, 2015 at 10:13:35PM -0400, Waiman Long wrote:
>>>> Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthen
>>>> the critical section and block forward progress. This patch
>>>> implements a kick-ahead mechanism where the unlocker kicks the
>>>> queue-head vCPU as well as up to four additional vCPUs next to
>>>> the queue head if they were halted. The kicks are done after
>>>> exiting the critical section to improve parallelism.
>>>>
>>>> The amount of kick-ahead allowed depends on the number of vCPUs
>>>> in the VM guest. This patch, by itself, won't do much as most of
>>>> the kicks are currently done at lock time. Coupled with the next
>>>> patch, which defers lock-time kicking to unlock time, it should
>>>> improve overall system performance in a busy overcommitted guest.
>>>>
>>>> Linux kernel builds were run in a KVM guest on an 8-socket,
>>>> 4 cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
>>>> Haswell-EX system. Both systems were configured to have 32
>>>> physical CPUs. The kernel build times before and after the patch
>>>> were:
>>>>
>>>>                        Westmere                Haswell
>>>>   Patch            32 vCPUs  48 vCPUs     32 vCPUs  48 vCPUs
>>>>   -----            --------  --------     --------  --------
>>>>   Before patch      3m25.0s  10m34.1s      2m02.0s  15m35.9s
>>>>   After patch       3m27.4s  10m32.0s      2m00.8s  14m52.5s
>>>>
>>>> There wasn't too much difference before and after the patch.
>>>
>>> That means either the patch isn't worth it, or, as you seem to
>>> imply, it's in the wrong place in this series.
>>
>> It needs to be coupled with the next patch to be effective, as most
>> of the kicks happen on the lock side rather than the unlock side.
>> If you look at the sample pvqspinlock stats in patch 3:
>>
>>   lock_kick_count=755354
>>   unlock_kick_count=87
>>
>> the number of unlock kicks is negligible compared with the lock
>> kicks. Patch 5 does have a dependency on patch 4 unless we make it
>> unconditionally defer kicking to the unlock call, which was what I
>> had done in the v1 patch. The reason I changed this in v2 is that I
>> found a very slight performance degradation in doing so.
>
> This way we cannot see the gains of the proposed complexity. So put
> it in a place where you can.

OK, I will see what I can do to make the performance change more
visible on a patch-by-patch basis.
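To make the discussion concrete, the unlock-side kick-ahead is
conceptually something like the sketch below. This is not the literal
patch code: pv_kick() and the vcpu_halted state do exist in
kernel/locking/qspinlock_paravirt.h, but the function name, the
->next linkage, and the pv_kick_ahead variable as written here are
illustrative.

	/* Minimal illustrative types; the real pv_node differs. */
	struct pv_node {
		struct pv_node	*next;	/* next queued vCPU, if any */
		int		cpu;	/* vCPU number passed to pv_kick() */
		u8		state;	/* vcpu_running or vcpu_halted */
	};

	static int pv_kick_ahead;	/* extra kicks allowed, <= 4 */

	static void pv_unlock_kick_ahead(struct pv_node *head)
	{
		struct pv_node *node = head;
		int i;

		/* Kick the queue head if its vCPU has halted (vmexit). */
		if (READ_ONCE(node->state) == vcpu_halted)
			pv_kick(node->cpu);

		/*
		 * Kick up to pv_kick_ahead additional halted vCPUs queued
		 * behind the head. This runs after the lock has already
		 * been released, so the vmenter latency of the kicks is
		 * not added to the critical section.
		 */
		for (i = 0; i < pv_kick_ahead && node->next; i++) {
			node = node->next;
			if (READ_ONCE(node->state) == vcpu_halted)
				pv_kick(node->cpu);
		}
	}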
>>> You also do not offer any support for any of the magic numbers..
>>
>> I chose 4 for PV_KICK_AHEAD_MAX as I didn't see much performance
>> difference when I did a kick-ahead of 5. Also, it may be too unfair
>> to the vCPU doing the kicking if the number is too big. The other
>> magic number is pv_kick_ahead itself. That one is somewhat
>> arbitrary: right now I take a log2 of the number of vCPUs, but it
>> could be divided by 4 (rshift 2) as well.
>
> So what was the difference between 1-2-3-4? I would be thinking one
> extra kick is the biggest help, no?

I was seeing diminishing returns with more kicks. I can add a table on
that in the next patch.

Cheers,
Longman
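P.S. In essence, the sizing heuristic under discussion is the sketch
below. Only the log2-capped-at-4 shape is what was described above;
the function name and exact code in the patch may differ.

	#define PV_KICK_AHEAD_MAX	4

	static int pv_kick_ahead __read_mostly;

	static void __init pv_init_kick_ahead(void)
	{
		/*
		 * Scale the kick-ahead limit with the size of the guest:
		 * log2 of the number of vCPUs, capped at 4. Dividing by
		 * 4 instead (num_possible_cpus() >> 2) would also work.
		 */
		pv_kick_ahead = min(ilog2(num_possible_cpus()),
				    PV_KICK_AHEAD_MAX);
	}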