From: Wanpeng Li <wanpeng.li@hotmail.com>
Subject: Re: [PATCH v2 2/3] KVM: dynamic halt_poll_ns adjustment
To: David Matlack
Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
Date: Fri, 28 Aug 2015 13:30:49 +0800

On 8/28/15 12:25 AM, David Matlack wrote:
> On Thu, Aug 27, 2015 at 2:59 AM, Wanpeng Li wrote:
>> Hi David,
>>
>> On 8/26/15 1:19 AM, David Matlack wrote:
>>> Thanks for writing v2, Wanpeng.
>>>
>>> On Mon, Aug 24, 2015 at 11:35 PM, Wanpeng Li wrote:
>>>> There is a downside to halt_poll_ns: polling still happens for an
>>>> idle VCPU, which can waste CPU cycles. This patch adds the ability
>>>> to adjust halt_poll_ns dynamically.
>>> What testing have you done with these patches? Do you know if this
>>> removes the overhead of polling in idle VCPUs? Do we lose any of the
>>> performance from always polling?
>>>
>>>> There are two new kernel parameters for changing the halt_poll_ns:
>>>> halt_poll_ns_grow and halt_poll_ns_shrink. A third new parameter,
>>>> halt_poll_ns_max, controls the maximum halt_poll_ns; it is
>>>> internally rounded down to the closest multiple of
>>>> halt_poll_ns_grow. The shrink/grow matrix is suggested by David:
>>>>
>>>> if (poll successfully for interrupt): stay the same
>>>> else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): shrink
>>>> else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow
>>> The way you implemented this wasn't what I expected. I thought you
>>> would time the whole function (kvm_vcpu_block). But I like your
>>> approach better. It's simpler and [by inspection] does what we want.
>>
>> I see even more idle-vCPU overhead with this method than with always
>> halt-polling, so I brought back growing vcpu->halt_poll_ns when an
>> interrupt arrives and shrinking it when an idle VCPU is detected. The
>> performance looks good in v4.
> Why did this patch have a worse idle overhead than always poll?

I'm not sure. I made a mistake when I reported the kernel-build test;
the performance is also worse than always-poll with your method. I
think your method doesn't grow halt_poll_ns according to whether an
interrupt arrives.

Regards,
Wanpeng Li
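For reference, the shrink/grow policy quoted above maps onto roughly
the following shape (a sketch only; the helper name and the block_ns
and poll_successful parameters are illustrative, not taken from the
patch):

	/*
	 * Sketch of the commit-message policy: keep the current poll
	 * window on a successful poll, shrink it after a long block,
	 * grow it after a short one.
	 */
	static void adjust_halt_poll_ns(struct kvm_vcpu *vcpu, u64 block_ns,
					bool poll_successful)
	{
		if (poll_successful)
			return;				/* stay the same */

		if (block_ns > halt_poll_ns_max)
			shrink_halt_poll_ns(vcpu);	/* blocked too long */
		else
			grow_halt_poll_ns(vcpu);	/* short block: poll longer */
	}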
>
>> Regards,
>> Wanpeng Li
>>
>>
>>>> halt_poll_ns_shrink/ |                      |
>>>> halt_poll_ns_grow    | grow halt_poll_ns    | shrink halt_poll_ns
>>>> ---------------------+----------------------+-----------------------
>>>> < 1                  | = halt_poll_ns       | = 0
>>>> < halt_poll_ns       | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
>>>> otherwise            | += halt_poll_ns_grow | -= halt_poll_ns_shrink
>>> I was curious why you went with this approach rather than just the
>>> middle row, or just the last row. Do you think we'll want the extra
>>> flexibility?
>>>
>>>> Signed-off-by: Wanpeng Li
>>>> ---
>>>>   virt/kvm/kvm_main.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>   1 file changed, 64 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> index 93db833..2a4962b 100644
>>>> --- a/virt/kvm/kvm_main.c
>>>> +++ b/virt/kvm/kvm_main.c
>>>> @@ -66,9 +66,26 @@
>>>>   MODULE_AUTHOR("Qumranet");
>>>>   MODULE_LICENSE("GPL");
>>>>
>>>> -static unsigned int halt_poll_ns;
>>>> +#define KVM_HALT_POLL_NS 500000
>>>> +#define KVM_HALT_POLL_NS_GROW 2
>>>> +#define KVM_HALT_POLL_NS_SHRINK 0
>>>> +#define KVM_HALT_POLL_NS_MAX 2000000
>>> The macros are not necessary. Also, hard-coding the numbers in the
>>> param definitions will make reading the comments above them easier.
>>>
>>>> +
>>>> +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
>>>>   module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
>>>>
>>>> +/* Default doubles per-vcpu halt_poll_ns. */
>>>> +static unsigned int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
>>>> +module_param(halt_poll_ns_grow, int, S_IRUGO);
>>>> +
>>>> +/* Default resets per-vcpu halt_poll_ns. */
>>>> +static unsigned int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
>>>> +module_param(halt_poll_ns_shrink, int, S_IRUGO);
>>>> +
>>>> +/* halt polling only reduces halt latency by 10-15 us, 2ms is enough */
>>> Ah, I misspoke before. I was thinking about round-trip latency. The
>>> latency of a single halt is reduced by about 5-7 us.
>>>
>>>> +static unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
>>>> +module_param(halt_poll_ns_max, int, S_IRUGO);
>>> We can remove halt_poll_ns_max. vcpu->halt_poll_ns can always start
>>> at zero and grow from there. Then we just need one module param to
>>> keep vcpu->halt_poll_ns from growing too large.
>>>
>>> [ It would make more sense to remove halt_poll_ns and keep
>>> halt_poll_ns_max, but since halt_poll_ns already exists in upstream
>>> kernels, we probably can't remove it. ]
>>>
>>>> +
>>>>   /*
>>>>    * Ordering of locks:
>>>>    *
>>>> @@ -1907,6 +1924,48 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
>>>>
>>>> +static unsigned int __grow_halt_poll_ns(unsigned int val)
>>>> +{
>>>> +	if (halt_poll_ns_grow < 1)
>>>> +		return halt_poll_ns;
>>>> +
>>>> +	val = min(val, halt_poll_ns_max);
>>>> +
>>>> +	if (val == 0)
>>>> +		return halt_poll_ns;
>>>> +
>>>> +	if (halt_poll_ns_grow < halt_poll_ns)
>>>> +		val *= halt_poll_ns_grow;
>>>> +	else
>>>> +		val += halt_poll_ns_grow;
>>>> +
>>>> +	return val;
>>>> +}
>>>> +
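To illustrate David's suggestion above (drop halt_poll_ns_max, start
vcpu->halt_poll_ns at zero, and let the existing halt_poll_ns parameter
act as the cap), the grow path could look roughly like this; the
10000 ns starting value is an assumption made up for this sketch, not
something from the thread:

	/* Sketch only, not the patch: grow from zero, capped by the
	 * existing halt_poll_ns module parameter. */
	static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
	{
		unsigned int val = vcpu->halt_poll_ns;

		if (val == 0)
			val = 10000;	/* assumed starting point */
		else
			val *= halt_poll_ns_grow;

		vcpu->halt_poll_ns = min(val, halt_poll_ns);
	}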
>>>> +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum)
>>> minimum never gets used.
>>>
>>>> +{
>>>> +	if (modifier < 1)
>>>> +		return 0;
>>>> +
>>>> +	if (modifier < halt_poll_ns)
>>>> +		val /= modifier;
>>>> +	else
>>>> +		val -= modifier;
>>>> +
>>>> +	return val;
>>>> +}
>>>> +
>>>> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
>>> These wrappers aren't necessary.
>>>
>>>> +{
>>>> +	vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
>>>> +}
>>>> +
>>>> +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
>>>> +			halt_poll_ns_shrink, halt_poll_ns);
>>>> +}
>>>> +
>>>>   static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>>>>   {
>>>>   	if (kvm_arch_vcpu_runnable(vcpu)) {
>>>> @@ -1954,6 +2013,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>>   			break;
>>>>
>>>>   		waited = true;
>>>> +		if (vcpu->halt_poll_ns > halt_poll_ns_max)
>>>> +			shrink_halt_poll_ns(vcpu);
>>>> +		else
>>>> +			grow_halt_poll_ns(vcpu);
>>> Shouldn't this go after the loop, and before "out:", in case we
>>> schedule more than once? You can gate it on "if (waited)" so it only
>>> runs if we actually scheduled.
>>>
>>>>   		schedule();
>>>>   	}
>>>>
>>>> --
>>>> 1.9.1
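On David's last point, moving the adjustment after the wait loop and
gating it on "waited" would look roughly like this (a sketch against
the quoted hunk; the surrounding kvm_vcpu_block() structure, including
prepare_to_wait()/finish_wait() and the out: label, is assumed from
context rather than shown in the quoted diff):

	for (;;) {
		prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);

		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		waited = true;
		schedule();
	}
	finish_wait(&vcpu->wq, &wait);

	/* Adjust once per kvm_vcpu_block() call, and only if the vCPU
	 * actually blocked, no matter how many times it re-slept. */
	if (waited) {
		if (vcpu->halt_poll_ns > halt_poll_ns_max)
			shrink_halt_poll_ns(vcpu);
		else
			grow_halt_poll_ns(vcpu);
	}
out:
	...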