Date: Mon, 13 Mar 2017 18:08:58 +0200
From: "Michael S. Tsirkin"
To: Radim Krčmář
Cc: linux-kernel@vger.kernel.org, Paolo Bonzini, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org,
	kvm@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH] kvm: better MWAIT emulation for guests
Message-ID: <20170313180046-mutt-send-email-mst@kernel.org>
References: <1489098555-23856-1-git-send-email-mst@redhat.com>
	<20170313154618.GA4547@potion>
In-Reply-To: <20170313154618.GA4547@potion>

On Mon, Mar 13, 2017 at 04:46:20PM +0100, Radim Krčmář wrote:
> 2017-03-10 00:29+0200, Michael S. Tsirkin:
> > Some guests call mwait without checking the cpu flags.  We currently
> > emulate that as a NOP but on VMX we can do better: let guest stop the
> > CPU until timer or IPI.  CPU will be busy but that isn't any worse than
> > a NOP emulation.
> >
> > Note that mwait within guests is not the same as on real hardware
> > because you must halt if you want to go deep into sleep.
>
> SDM (25.3 CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION)
> says that "MWAIT operates normally".  What is the reason why MWAIT
> inside VMX cannot reach the same states as MWAIT outside VMX?

If you are going into a deep sleep state with huge latency, you are
better off exiting and paying an extra microsecond of latency, since
the chance that some other task will want to be scheduled seems higher.

> > Thus it isn't
> > a good idea to use the regular MWAIT flag in CPUID for that.  Add a flag
> > in the hypervisor leaf instead.
> >
> > Signed-off-by: Michael S. Tsirkin
> > ---
> [...]
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > @@ -594,6 +594,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> > +		if (this_cpu_has(X86_FEATURE_MWAIT))
> > +			entry->eax = (1 << KVM_FEATURE_MWAIT);
>
> I'd rather not add it as a paravirt feature:
>
>  - MWAIT requires the software to provide a target state, but we're not
>    doing anything to expose those states.

Current Linux guests simply discover these states based on the CPU
model, so we do expose enough info.

>    The feature would need very constrained setup, which is hard to
>    support

Why would it? It works without any tweaking on several boxes I own.

>  - we've had requests to support MWAIT emulation for Linux and fully
>    emulating MWAIT would be best.
>    MWAIT is not going to enabled by default, of course; it would be
>    targeted at LPAR-like uses of KVM.

Yes, I think this limited emulation is safe to enable by default.
Pretending that mwait is equivalent to halt may not be.

> What about keeping just the last hunk to improve OS X, for now?
>
> Thanks.

IMHO, if we have new functionality, we are better off creating some way
for guests to discover that it is there.

Do we really have to argue about a single bit in the HV leaf?
What harm does it do?

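For illustration, a guest-side check for such a bit could look roughly
like the sketch below. It is not taken from the patch: the struct, the
function names and the bit number (EXAMPLE_KVM_FEATURE_MWAIT) are
assumptions made up for the example. Only the leaf numbers and the
"KVMKVMKVM" signature come from KVM's existing paravirtual CPUID
interface. A real guest would fill the registers by executing CPUID on
the two leaves; here they are passed in so the logic can be read and
tested in isolation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Leaf numbers of KVM's paravirtual CPUID interface. */
#define KVM_CPUID_SIGNATURE 0x40000000u
#define KVM_CPUID_FEATURES  0x40000001u

/* ASSUMPTION: placeholder bit number, chosen only for this example. */
#define EXAMPLE_KVM_FEATURE_MWAIT 8

/* Register values returned by one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* True if EBX:ECX:EDX of the signature leaf spell "KVMKVMKVM\0\0\0". */
static bool is_kvm_signature(const struct cpuid_regs *sig)
{
	char name[12];

	memcpy(name + 0, &sig->ebx, 4);
	memcpy(name + 4, &sig->ecx, 4);
	memcpy(name + 8, &sig->edx, 4);
	return memcmp(name, "KVMKVMKVM\0\0\0", 12) == 0;
}

/*
 * True if the hypervisor leaf advertises the (hypothetical) MWAIT bit.
 * "sig" holds the result of leaf KVM_CPUID_SIGNATURE, "features" the
 * result of leaf KVM_CPUID_FEATURES.
 */
static bool guest_may_use_mwait(const struct cpuid_regs *sig,
				const struct cpuid_regs *features)
{
	if (!is_kvm_signature(sig))
		return false;
	return (features->eax & (1u << EXAMPLE_KVM_FEATURE_MWAIT)) != 0;
}
```

The point of the sketch is only that discovery costs the guest one
extra bit test after the signature check it already has to do.
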
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > @@ -3547,13 +3547,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
> >  	      CPU_BASED_USE_IO_BITMAPS |
> >  	      CPU_BASED_MOV_DR_EXITING |
> >  	      CPU_BASED_USE_TSC_OFFSETING |
> > -	      CPU_BASED_MWAIT_EXITING |
> > -	      CPU_BASED_MONITOR_EXITING |
> >  	      CPU_BASED_INVLPG_EXITING |
> >  	      CPU_BASED_RDPMC_EXITING;
> >
> > -	printk(KERN_ERR "cleared CPU_BASED_MWAIT_EXITING + CPU_BASED_MONITOR_EXITING\n");
> > -
> >  	opt = CPU_BASED_TPR_SHADOW |
> >  	      CPU_BASED_USE_MSR_BITMAPS |
> >  	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
> > --
> > MST
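
To make the effect of the quoted vmx.c hunk concrete, here is a small
user-space model of how a "required" control mask turns into the final
execution controls. It is a simplified sketch, not the kernel code: the
function names and the must_be_one / may_be_one parameters are
assumptions standing in for the two halves of the VMX capability MSR.
The three bit values are the ones the SDM assigns to these primary
processor-based controls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary processor-based VM-execution controls (bit values per SDM). */
#define CPU_BASED_HLT_EXITING     0x00000080u
#define CPU_BASED_MWAIT_EXITING   0x00000400u
#define CPU_BASED_MONITOR_EXITING 0x20000000u

/*
 * Simplified model of combining required (min) and optional (opt)
 * controls with what the CPU allows:
 *   must_be_one - bits the CPU does not allow to be cleared
 *   may_be_one  - bits the CPU allows to be set
 * Returns false if a required control cannot be enabled.
 */
static bool adjust_controls(uint32_t min, uint32_t opt,
			    uint32_t must_be_one, uint32_t may_be_one,
			    uint32_t *result)
{
	uint32_t ctl = min | opt;

	ctl &= may_be_one;	/* drop what the CPU cannot set */
	ctl |= must_be_one;	/* add what the CPU cannot clear */
	if (min & ~ctl)
		return false;
	*result = ctl;
	return true;
}

/* True if MONITOR or MWAIT in the guest would cause a VM exit. */
static bool monitor_or_mwait_exits(uint32_t ctl)
{
	return (ctl & (CPU_BASED_MWAIT_EXITING |
		       CPU_BASED_MONITOR_EXITING)) != 0;
}
```

With the two bits in the required mask, the resulting controls always
trap MONITOR/MWAIT; with them removed, the guest runs the instructions
natively unless the CPU itself forces the bits to one.
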