Date: Thu, 29 Nov 2012 14:16:36 +0200
From: Gleb Natapov
To: Raghavendra K T
Cc: Marcelo Tosatti, Peter Zijlstra, "H. Peter Anvin", Avi Kivity,
 Ingo Molnar, Rik van Riel, Srikar, "Nikunj A. Dadhania", KVM,
 Jiannan Ouyang, Chegu Vinod, "Andrew M. Theurer", LKML,
 Srivatsa Vaddagiri, Andrew Jones
Subject: Re: [PATCH V3 RFC 2/2] kvm: Handle yield_to failure return code for potential undercommit case
Message-ID: <20121129121636.GB9711@redhat.com>
References: <20121126120740.2595.33651.sendpatchset@codeblue>
 <20121126120804.2595.20280.sendpatchset@codeblue>
 <20121128011228.GH8295@amt.cnet>
 <50B59CE0.70305@linux.vnet.ibm.com>
In-Reply-To: <50B59CE0.70305@linux.vnet.ibm.com>

On Wed, Nov 28, 2012 at 10:40:56AM +0530, Raghavendra K T wrote:
> On 11/28/2012 06:42 AM, Marcelo Tosatti wrote:
> >
> >Don't understand the reasoning behind why 3 is a good choice.
>
> Here is where I came from. (explaining from scratch for
> completeness, forgive me :))
> In moderate overcommits, we can falsely exit from the PLE handler even
> when we have a preempted task of the same VM waiting on other cpus. To
> reduce this problem, we try a few times before exiting.
> The problem boils down to:
> what is the probability that we exit the PLE handler even when we have
> more than 1 task in other cpus. Theoretical worst case should be around
> 1.5x overcommit (as also pointed out by Andrew Theurer).
> [But the practical worst case may be around 2x, 3x overcommits, as
> indicated by the results for the patch series]
>
> So if p is the probability of finding rq length one on a particular cpu,
> and if we do n tries, then the probability of exiting the PLE handler is:
>
>  p^(n+1) [ because we would have come across one source with rq length
>  1 and n target cpu rqs with length 1 ]
>
> so
> num tries:  probability of aborting PLE handler (1.5x overcommit)
>  1          1/4
>  2          1/8
>  3          1/16
>
> We can reduce this probability with more tries, but the problem is
> the overhead.

IIRC Avi (again) had an idea to track vcpu preemption. When a vcpu thread
is preempted we do kvm->preempted_vcpus++; when it runs again we do
kvm->preempted_vcpus--. The PLE handler can try harder if
kvm->preempted_vcpus is big, or not try at all if it is zero.

> Also, if we have tried three times, that means we would have iterated
> over 3 good eligible vcpus along with many non-eligible candidates. In
> the worst case, if we iterate over all the vcpus, we reduce 1x
> performance and overcommit performance gets hit [as in the results].
>
> I have tried num_tries = 1,2,3 and n already (not 4 yet). So I
> concluded 3 is enough.
>
> In fact I have also run kernbench and hackbench, which are giving 5-20%
> improvement.
>
> [ As a side note, I also thought about having num_tries = f(n) =
> ceil ( log(num_online_cpus)/2 ). But I thought the calculation is too
> much overhead, and also there is probably no point in making it
> dependent on online cpus ]
>
> Please let me know if you are happy with this rationale, or correct me
> if you foresee some problem. (In fact Avi's and Rik's concern about
> false exiting made me arrive at the 'try' logic, which I did not have
> earlier).
>
> I am currently trying out the 1.5x overcommit case and will post the
> result.
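
A rough sketch of the counter idea mentioned above, written as a userspace
model in C11 rather than kernel code. Only kvm->preempted_vcpus comes from
the text; the struct, the function names, the cap of 3 and the
"tries = min(counter, cap)" policy are assumptions made for illustration,
and nothing here is taken from an actual patch.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct kvm; only the counter is modelled. */
struct vm_sketch {
	atomic_int preempted_vcpus;	/* vcpu threads currently preempted */
};

/* Assumed upper bound on yield_to attempts per PLE exit. */
#define PLE_MAX_TRIES_SKETCH 3

/* Would run when a vcpu thread is scheduled out involuntarily. */
static void vcpu_preempted(struct vm_sketch *vm)
{
	atomic_fetch_add(&vm->preempted_vcpus, 1);
}

/* Would run when that vcpu thread is scheduled back in. */
static void vcpu_resumed(struct vm_sketch *vm)
{
	atomic_fetch_sub(&vm->preempted_vcpus, 1);
}

/*
 * Assumed policy: no attempts when nothing is preempted, otherwise
 * one attempt per preempted vcpu, up to the cap.
 */
static int ple_tries(struct vm_sketch *vm)
{
	int n = atomic_load(&vm->preempted_vcpus);

	if (n <= 0)
		return 0;
	return n < PLE_MAX_TRIES_SKETCH ? n : PLE_MAX_TRIES_SKETCH;
}
```

With the counter at zero the handler would skip the candidate scan
entirely, which is the undercommit case the fixed try count is trying to
detect indirectly.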
>
> >
> >On Mon, Nov 26, 2012 at 05:38:04PM +0530, Raghavendra K T wrote:
> >>From: Raghavendra K T
> >>
> >>yield_to returns -ESRCH when the run queue length of both the source
> >>and the target of yield_to is one. When we see three successive
> >>failures of yield_to we assume we are in a potential undercommit case
> >>and abort from the PLE handler.
> >>The assumption is backed by the low probability of a wrong decision
> >>even for worst case scenarios such as average runqueue length
> >>between 1 and 2.
> >>
> >>Note that we do not update the last boosted vcpu in failure cases.
> >>Thanks to Avi for raising the question of aborting after the first
> >>failure of yield_to.
> >>
> >>Reviewed-by: Srikar Dronamraju
> >>Signed-off-by: Raghavendra K T
> [...]

--
			Gleb.
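
For readers following the thread, the abort logic described in the quoted
changelog can be modelled roughly as below. This is a userspace sketch, not
the patch itself: the function name, the results array standing in for
calls to yield_to, and the return convention are all assumptions. Only the
-ESRCH return code and the try count of 3 come from the thread.

```c
#include <errno.h>
#include <stddef.h>

/* Try count discussed in the thread. */
#define YIELD_TRIES_SKETCH 3

/*
 * results[i] models what yield_to would return for candidate vcpu i:
 *   > 0      the yield succeeded
 *   -ESRCH   source and target run queues both have length one
 *   0        any other failure
 *
 * Returns the index of the vcpu that was boosted, or -1 if the scan
 * ended without a successful yield. A caller would update its "last
 * boosted vcpu" only for a non-negative return.
 */
static int ple_scan_sketch(const int *results, size_t n)
{
	int tries = YIELD_TRIES_SKETCH;
	size_t i;

	for (i = 0; i < n; i++) {
		if (results[i] > 0)
			return (int)i;
		/* Each -ESRCH hints at undercommit; give up after three. */
		if (results[i] == -ESRCH && --tries == 0)
			break;
	}
	return -1;
}
```

In this model only -ESRCH failures consume a try, so failures for other
reasons do not push the scan toward an early abort.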