Date: Tue, 27 Jun 2017 15:40:44 +0200
From: Radim Krčmář
To: Paolo Bonzini
Cc: Wanpeng Li, Yang Zhang, Thomas Gleixner, Ingo Molnar,
    "H. Peter Anvin", the arch/x86 maintainers, Jonathan Corbet,
    tony.luck@intel.com, Borislav Petkov, Peter Zijlstra,
    mchehab@kernel.org, Andrew Morton, krzk@kernel.org,
    jpoimboe@redhat.com, Andy Lutomirski, Christian Borntraeger,
    Thomas Garnier, Robert Gerst, Mathias Krause,
    douly.fnst@cn.fujitsu.com, Nicolai Stange, Frederic Weisbecker,
    dvlasenk@redhat.com, Daniel Bristot de Oliveira,
    yamada.masahiro@socionext.com, mika.westerberg@linux.intel.com,
    Chen Yu, aaron.lu@intel.com, Steven Rostedt, Kyle Huey, Len Brown,
    Prarit Bhargava, hidehiro.kawai.ez@hitachi.com,
    fengtiantian@huawei.com, pmladek@suse.com, jeyu@redhat.com,
    Larry.Finger@lwfinger.net, zijun_hu@htc.com, luisbg@osg.samsung.com,
    johannes.berg@intel.com, niklas.soderlund+renesas@ragnatech.se,
    zlpnobody@gmail.com, Alexey Dobriyan,
    fgao@48lvckh6395k16k5.yundunddos.com, ebiederm@xmission.com,
    Subash Abhinov Kasiviswanathan, Arnd Bergmann, Matt Fleming,
    Mel Gorman, "linux-kernel@vger.kernel.org",
    linux-doc@vger.kernel.org, linux-edac@vger.kernel.org, kvm
Subject: Re: [PATCH 2/2] x86/idle: use dynamic halt poll
Message-ID: <20170627134043.GA1487@potion>
In-Reply-To: <05ec7efc-fb9c-ae24-5770-66fc472545a4@redhat.com>

2017-06-27 14:28+0200, Paolo Bonzini:
> On 27/06/2017 14:23, Wanpeng Li wrote:
>>>>> I have considered single_task_running() before. But since there is no
>>>>> such paravirtual interface currently and i am not sure whether it is a
>>>>> information leak from host if introducing such interface, so i didn't do
>>>>> it. Do you mean vcpu_is_preempted can do the same thing? I check the
>>>>> code and seems it only tells whether the VCPU is scheduled out or not
>>>>> which cannot satisfy the needs.
>>>> Can you help to answer my confusion? I have double checked the code, but
>>>> still not get your point. Do you think it is necessary to introduce an
>>>> paravirtual interface to expose single_task_running() to guest?
>>
>> I think vcpu_is_preempted is a good enough replacement.
>> For example, vcpu->arch.st.steal.preempted is 0 when the vCPU is sched
>> in and vmentry, then several tasks are enqueued on the same pCPU and
>> waiting on cfs red-black tree, the guest should avoid to poll in this
>> scenario, however, vcpu_is_preempted returns false and guest decides
>> to poll.
>
> ... which is not necessarily _wrong_. It's just a different heuristic.

Right, it's just harder to use than the host's single_task_running() --
the VCPU calling vcpu_is_preempted() is never preempted, so we have to
look at other VCPUs that are not halted, but still preempted.

If we see some ratio of preempted VCPUs (> 0?), then we stop polling and
yield to the host, working under the assumption that there is work for
this PCPU if other VCPUs have stuff to do.

The downside is that it misses information about the host's topology, so
it would be hard to make it work well.
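
To make the idea concrete, here is a rough user-space model of that check.
It is only a sketch, not kernel code: struct vcpu_state,
should_stop_polling() and the percentage threshold are hypothetical names
made up for illustration, and the "preempted" field merely stands in for
whatever vcpu_is_preempted() would report for that VCPU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VCPU snapshot; not an existing kernel structure. */
struct vcpu_state {
	bool halted;    /* VCPU is idle, has nothing to run */
	bool preempted; /* stand-in for vcpu_is_preempted(cpu) */
};

/*
 * Decide whether the polling VCPU `self` should stop polling and yield
 * to the host.  Only *other* VCPUs are inspected, because the caller is
 * by definition running and therefore never preempted.  A VCPU counts
 * against polling if it is not halted (it has work) but is preempted.
 *
 * threshold_pct == 0 corresponds to the "> 0?" ratio: a single
 * runnable-but-preempted VCPU is enough to stop polling.
 */
bool should_stop_polling(const struct vcpu_state *vcpus, size_t nr,
			 size_t self, size_t threshold_pct)
{
	size_t others = 0, runnable_preempted = 0;

	for (size_t i = 0; i < nr; i++) {
		if (i == self)
			continue;
		others++;
		if (!vcpus[i].halted && vcpus[i].preempted)
			runnable_preempted++;
	}

	/* Single-VCPU guest: nothing to infer, keep polling. */
	if (others == 0)
		return false;

	/* ratio > threshold, done in integers to avoid floating point */
	return runnable_preempted * 100 > threshold_pct * others;
}
```

Note that the sketch treats all VCPUs as equal candidates for this PCPU,
which is exactly the missing-topology problem: a preempted VCPU pinned to
a different PCPU says nothing about whether polling here is harmful.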