Message-ID: <BLU436-SMTP20728BC58B3359978FD0DE2806F0@phx.gbl>
Subject: Re: [PATCH v2 2/3] KVM: dynamic halt_poll_ns adjustment
To: David Matlack <dmatlack@google.com>
References: <1440484519-2709-1-git-send-email-wanpeng.li@hotmail.com>
 <BLU436-SMTP2428E3874423D6FB88D208C80610@phx.gbl>
 <CALzav=fpBrohrKyGbjDBhAtC7uwm11eqBsWjCqP9rsaX4QGXNw@mail.gmail.com>
CC: Paolo Bonzini <pbonzini@redhat.com>, kvm list <kvm@vger.kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
From: Wanpeng Li <wanpeng.li@hotmail.com>
Date: Thu, 27 Aug 2015 17:59:19 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0)
 Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <CALzav=fpBrohrKyGbjDBhAtC7uwm11eqBsWjCqP9rsaX4QGXNw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 6322
Lines: 172

Hi David,
On 8/26/15 1:19 AM, David Matlack wrote:
> Thanks for writing v2, Wanpeng.
>
> On Mon, Aug 24, 2015 at 11:35 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>> There is a downside of halt_poll_ns since poll is still happen for idle
>> VCPU which can waste cpu usage. This patch adds the ability to adjust
>> halt_poll_ns dynamically.
> What testing have you done with these patches? Do you know if this removes
> the overhead of polling in idle VCPUs? Do we lose any of the performance
> from always polling?
>
>> There are two new kernel parameters for changing the halt_poll_ns:
>> halt_poll_ns_grow and halt_poll_ns_shrink. A third new parameter,
>> halt_poll_ns_max, controls the maximal halt_poll_ns; it is internally
>> rounded down to a closest multiple of halt_poll_ns_grow. The shrink/grow
>> matrix is suggested by David:
>>
>> if (poll successfully for interrupt): stay the same
>>    else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): shrink
>>    else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow
> The way you implemented this wasn't what I expected. I thought you would time
> the whole function (kvm_vcpu_block). But I like your approach better. It's
> simpler and [by inspection] does what we want.

I see there is more idle vCPUs overhead w/ this method even more than 
always halt-poll, so I bring back grow vcpu->halt_poll_ns when interrupt 
arrives and shrinks when idle VCPU is detected. The perfomance looks 
good in v4.

Regards,
Wanpeng Li

>
>>    halt_poll_ns_shrink/ |
>>    halt_poll_ns_grow    | grow halt_poll_ns    | shrink halt_poll_ns
>>    ---------------------+----------------------+-------------------
>>    < 1                  |  = halt_poll_ns      |  = 0
>>    < halt_poll_ns       | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
>>    otherwise            | += halt_poll_ns_grow | -= halt_poll_ns_shrink
> I was curious why you went with this approach rather than just the
> middle row, or just the last row. Do you think we'll want the extra
> flexibility?
>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>   virt/kvm/kvm_main.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 64 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 93db833..2a4962b 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -66,9 +66,26 @@
>>   MODULE_AUTHOR("Qumranet");
>>   MODULE_LICENSE("GPL");
>>
>> -static unsigned int halt_poll_ns;
>> +#define KVM_HALT_POLL_NS  500000
>> +#define KVM_HALT_POLL_NS_GROW   2
>> +#define KVM_HALT_POLL_NS_SHRINK 0
>> +#define KVM_HALT_POLL_NS_MAX 2000000
> The macros are not necessary. Also, hard coding the numbers in the param
> definitions will make reading the comments above them easier.
>
>> +
>> +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
>>   module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
>>
>> +/* Default doubles per-vcpu halt_poll_ns. */
>> +static unsigned int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
>> +module_param(halt_poll_ns_grow, int, S_IRUGO);
>> +
>> +/* Default resets per-vcpu halt_poll_ns . */
>> +static unsigned int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
>> +module_param(halt_poll_ns_shrink, int, S_IRUGO);
>> +
>> +/* halt polling only reduces halt latency by 10-15 us, 2ms is enough */
> Ah, I misspoke before. I was thinking about round-trip latency. The latency
> of a single halt is reduced by about 5-7 us.
>
>> +static unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
>> +module_param(halt_poll_ns_max, int, S_IRUGO);
> We can remove halt_poll_ns_max. vcpu->halt_poll_ns can always start at zero
> and grow from there. Then we just need one module param to keep
> vcpu->halt_poll_ns from growing too large.
>
> [ It would make more sense to remove halt_poll_ns and keep halt_poll_ns_max,
>    but since halt_poll_ns already exists in upstream kernels, we probably can't
>    remove it. ]
>
>> +
>>   /*
>>    * Ordering of locks:
>>    *
>> @@ -1907,6 +1924,48 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
>>
>> +static unsigned int __grow_halt_poll_ns(unsigned int val)
>> +{
>> +       if (halt_poll_ns_grow < 1)
>> +               return halt_poll_ns;
>> +
>> +       val = min(val, halt_poll_ns_max);
>> +
>> +       if (val == 0)
>> +               return halt_poll_ns;
>> +
>> +       if (halt_poll_ns_grow < halt_poll_ns)
>> +               val *= halt_poll_ns_grow;
>> +       else
>> +               val += halt_poll_ns_grow;
>> +
>> +       return val;
>> +}
>> +
>> +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum)
> minimum never gets used.
>
>> +{
>> +       if (modifier < 1)
>> +               return 0;
>> +
>> +       if (modifier < halt_poll_ns)
>> +               val /= modifier;
>> +       else
>> +               val -= modifier;
>> +
>> +       return val;
>> +}
>> +
>> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
> These wrappers aren't necessary.
>
>> +{
>> +       vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
>> +}
>> +
>> +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
>> +{
>> +       vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
>> +                       halt_poll_ns_shrink, halt_poll_ns);
>> +}
>> +
>>   static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>>   {
>>          if (kvm_arch_vcpu_runnable(vcpu)) {
>> @@ -1954,6 +2013,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>                          break;
>>
>>                  waited = true;
>> +               if (vcpu->halt_poll_ns > halt_poll_ns_max)
>> +                       shrink_halt_poll_ns(vcpu);
>> +               else
>> +                       grow_halt_poll_ns(vcpu);
> Shouldn't this go after the loop, and before "out:", in case we schedule
> more than once? You can gate it on "if (waited)" so it only runs if we
> actually scheduled.
>
>>                  schedule();
>>          }
>>
>> --
>> 1.9.1
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/