Subject: Re: [RFC PATCH v2 0/7] x86/idle: add halt poll support
From: Yang Zhang
To: Alexander Graf, linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, wanpeng.li@hotmail.com, mst@redhat.com,
    pbonzini@redhat.com, tglx@linutronix.de, rkrcmar@redhat.com,
    dmatlack@google.com, peterz@infradead.org, linux-doc@vger.kernel.org
Date: Fri, 1 Sep 2017 14:21:53 +0800
Message-ID: <2c048ae6-329e-1351-2711-e72e31c8554e@gmail.com>
In-Reply-To: <6ba7f198-4403-c9d1-f0be-7069cc8cd421@suse.de>

On 2017/8/29 19:58, Alexander Graf wrote:
> On 08/29/2017 01:46 PM, Yang Zhang wrote:
>> Some latency-intensive workloads see an obvious performance drop
>> when running inside a VM, because the overhead is amplified there.
>> Most of the cost I have seen is in the idle path.
>>
>> This series introduces a new mechanism that polls for a while before
>> entering the idle state. If a reschedule becomes pending during the
>> poll, we avoid going through the heavy overhead path entirely.
>>
>> Here is the data we get when running the context-switch benchmark to
>> measure latency (lower is better):
>>
>>    1. w/o patch:
>>       2493.14 ns/ctxsw -- 200.3 %CPU
>>    2. w/ patch:
>>       halt_poll_threshold=10000  -- 1485.96 ns/ctxsw -- 201.0 %CPU
>>       halt_poll_threshold=20000  -- 1391.26 ns/ctxsw -- 200.7 %CPU
>>       halt_poll_threshold=30000  -- 1488.55 ns/ctxsw -- 200.1 %CPU
>>       halt_poll_threshold=500000 -- 1159.14 ns/ctxsw -- 201.5 %CPU
>>    3. kvm dynamic poll:
>>       halt_poll_ns=10000  -- 2296.11 ns/ctxsw -- 201.2 %CPU
>>       halt_poll_ns=20000  -- 2599.7 ns/ctxsw  -- 201.7 %CPU
>>       halt_poll_ns=30000  -- 2588.68 ns/ctxsw -- 211.6 %CPU
>>       halt_poll_ns=500000 -- 2423.20 ns/ctxsw -- 229.2 %CPU
>>    4. idle=poll:
>>       2050.1 ns/ctxsw -- 1003 %CPU
>>    5. idle=mwait:
>>       2188.06 ns/ctxsw -- 206.3 %CPU
>
> Could you please try to create another metric for guest-initiated,
> host-aborted mwait?
>
> For a quick benchmark, reserve 4 registers for a magic value and set
> them to that value before you enter MWAIT in the guest. Then allow
> native MWAIT execution on the host. If you see the guest wants to enter

I guess you mean allowing native MWAIT execution in the guest, not on
the host?

> with the 4 registers containing the magic contents and no events are
> pending, directly go into the vcpu block function on the host.

Hmm, this is not entirely clear to me. If the guest executes MWAIT
without a vmexit, how can the host check the registers?
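Or is the point that the check only happens once the vCPU has already
left guest mode for some other reason (host timer interrupt, remote
wakeup IPI, preemption)? If so, I could imagine something like the
sketch below on such an exit path. To be clear, this is only my guess
at your proposal: the magic value, the helper names and the hook point
are all invented for illustration, only kvm_register_read(),
kvm_arch_vcpu_runnable() and kvm_vcpu_block() are the existing KVM
helpers.

/* Sketch only -- MWAIT_PARK_MAGIC and the hook point are made up. */
#define MWAIT_PARK_MAGIC	0x4d57414954504152ULL	/* arbitrary marker */

static bool guest_parked_in_mwait(struct kvm_vcpu *vcpu)
{
	/*
	 * The guest would set four callee-saved GPRs to the magic
	 * value immediately before executing MWAIT in its idle loop.
	 */
	return kvm_register_read(vcpu, VCPU_REGS_R12) == MWAIT_PARK_MAGIC &&
	       kvm_register_read(vcpu, VCPU_REGS_R13) == MWAIT_PARK_MAGIC &&
	       kvm_register_read(vcpu, VCPU_REGS_R14) == MWAIT_PARK_MAGIC &&
	       kvm_register_read(vcpu, VCPU_REGS_R15) == MWAIT_PARK_MAGIC;
}

/*
 * Hypothetical hook, called wherever the vCPU has already been kicked
 * out of guest mode, e.g. by a host interrupt.
 */
static void park_vcpu_if_guest_is_idle(struct kvm_vcpu *vcpu)
{
	if (guest_parked_in_mwait(vcpu) && !kvm_arch_vcpu_runnable(vcpu))
		kvm_vcpu_block(vcpu);	/* sleep until an event arrives */
}

That would at least avoid trapping MWAIT itself while still letting the
host put a parked vCPU properly to sleep.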
> That way any time a guest gets naturally aborted while in mwait, it
> will only reenter mwait when an event actually occurred. While the
> guest is normally running (and nobody else wants to run on the host),
> we just stay in guest context, but with a sleeping CPU.
>
> Overall, that might give us even better performance, as it allows for
> turbo boost and HT to work properly.

In our testing we have enough cores (32 cores) for only 10 vCPUs, so in
the best case we may see the same performance as plain polling.

-- 
Yang
Alibaba Cloud Computing
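P.S. For anyone new to the thread, the guest-side change in this series
boils down to roughly the sketch below. It is greatly simplified and
the names are approximate, not the literal patch:

/* Greatly simplified sketch of the poll-before-halt idle path. */
static unsigned int halt_poll_threshold = 20000;	/* ns, tunable */

static void poll_then_halt(void)
{
	u64 start = ktime_get_ns();

	/*
	 * Busy-poll briefly: if work shows up inside this window we
	 * return without paying the halt/vmexit/wakeup cost at all.
	 */
	while (ktime_get_ns() - start < halt_poll_threshold) {
		if (need_resched())
			return;
		cpu_relax();
	}

	safe_halt();	/* nothing came in; take the normal halt path */
}

halt_poll_threshold trades a bounded amount of busy-waiting against the
full halt + vmexit + wakeup cost, which is where the latency win in the
numbers above comes from.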