Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751726AbdINJUD (ORCPT ); Thu, 14 Sep 2017 05:20:03 -0400 Received: from mail-io0-f194.google.com ([209.85.223.194]:38131 "EHLO mail-io0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751661AbdINJUA (ORCPT ); Thu, 14 Sep 2017 05:20:00 -0400 X-Google-Smtp-Source: AOwi7QATGsU3d/zDEDfSbDIbAeD8/p/DlwopNViDcjNwV3Lg9kML5lklQxVq5Czf5lIB5KFgCFyv71ROrUgIPG8anQU= MIME-Version: 1.0 In-Reply-To: <25e566b9-a8f4-2d90-0ba3-725f1a215c1f@gmail.com> References: <1504007201-12904-1-git-send-email-yang.zhang.wz@gmail.com> <20170829174147-mutt-send-email-mst@kernel.org> <259c95bc-3641-965b-4054-a233a6ee785c@gmail.com> <25e566b9-a8f4-2d90-0ba3-725f1a215c1f@gmail.com> From: Wanpeng Li Date: Thu, 14 Sep 2017 17:19:59 +0800 Message-ID: Subject: Re: [RFC PATCH v2 0/7] x86/idle: add halt poll support To: Quan Xu Cc: Yang Zhang , "Michael S. Tsirkin" , "linux-kernel@vger.kernel.org" , kvm , Wanpeng Li , Paolo Bonzini , Thomas Gleixner , Radim Krcmar , David Matlack , Alexander Graf , Peter Zijlstra , linux-doc@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3397 Lines: 113 2017-09-14 16:36 GMT+08:00 Quan Xu : > > > on 2017/9/13 19:56, Yang Zhang wrote: >> >> On 2017/8/29 22:56, Michael S. Tsirkin wrote: >>> >>> On Tue, Aug 29, 2017 at 11:46:34AM +0000, Yang Zhang wrote: >>>> >>>> Some latency-intensive workload will see obviously performance >>>> drop when running inside VM. >>> >>> >>> But are we trading a lot of CPU for a bit of lower latency? >>> >>>> The main reason is that the overhead >>>> is amplified when running inside VM. The most cost i have seen is >>>> inside idle path. >>>> >>>> This patch introduces a new mechanism to poll for a while before >>>> entering idle state. If schedule is needed during poll, then we >>>> don't need to goes through the heavy overhead path. >>> >>> >>> Isn't it the job of an idle driver to find the best way to >>> halt the CPU? >>> >>> It looks like just by adding a cstate we can make it >>> halt at higher latencies only. And at lower latencies, >>> if it's doing a good job we can hopefully use mwait to >>> stop the CPU. >>> >>> In fact I have been experimenting with exactly that. >>> Some initial results are encouraging but I could use help >>> with testing and especially tuning. If you can help >>> pls let me know! >> >> >> Quan, Can you help to test it and give result? Thanks. >> > > Hi, MST > > I have tested the patch "intel_idle: add pv cstates when running on kvm" on > a recent host that allows guests > to execute mwait without an exit. also I have tested our patch "[RFC PATCH > v2 0/7] x86/idle: add halt poll support", > upstream linux, and idle=poll. > > the following is the result (which seems better than ever berfore, as I ran > test case on a more powerful machine): > > for __netperf__, the first column is trans. rate per sec, the second column > is CPU utilzation. > > 1. upstream linux This "upstream linux" means that disables the kvm adaptive halt-polling after confirm with Xu Quan. Regards, Wanpeng Li > > 28371.7 bits/s -- 76.6 %CPU > > 2. idle=poll > > 34372 bit/s -- 999.3 %CPU > > 3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different > values of parameter 'halt_poll_threshold': > > 28362.7 bits/s -- 74.7 %CPU (halt_poll_threshold=10000) > 32949.5 bits/s -- 82.5 %CPU (halt_poll_threshold=20000) > 39717.9 bits/s -- 104.1 %CPU (halt_poll_threshold=30000) > 40137.9 bits/s -- 104.4 %CPU (halt_poll_threshold=40000) > 40079.8 bits/s -- 105.6 %CPU (halt_poll_threshold=50000) > > > 4. "intel_idle: add pv cstates when running on kvm" > > 33041.8 bits/s -- 999.4 %CPU > > > > > > for __ctxsw__, the first column is the time per process context switches, > the second column is CPU utilzation.. > > 1. upstream linux > > 3624.19 ns/ctxsw -- 191.9 %CPU > > 2. idle=poll > > 3419.66 ns/ctxsw -- 999.2 %CPU > > 3. "[RFC PATCH v2 0/7] x86/idle: add halt poll support", with different > values of parameter 'halt_poll_threshold': > > 1123.40 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=10000) > 1127.38 ns/ctxsw -- 199.7 %CPU (halt_poll_threshold=20000) > 1113.58 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=30000) > 1117.12 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=40000) > 1121.62 ns/ctxsw -- 199.6 %CPU (halt_poll_threshold=50000) > > 4. "intel_idle: add pv cstates when running on kvm" > > 3427.59 ns/ctxsw -- 999.4 %CPU > > -Quan