Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932909AbaFSM2l (ORCPT ); Thu, 19 Jun 2014 08:28:41 -0400
Received: from mail-wg0-f49.google.com ([74.125.82.49]:56425
        "EHLO mail-wg0-f49.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S932300AbaFSM2j (ORCPT );
        Thu, 19 Jun 2014 08:28:39 -0400
Message-ID: <53A2D771.3000700@gmail.com>
Date: Thu, 19 Jun 2014 15:28:33 +0300
From: Nadav Amit
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0)
        Gecko/20100101 Thunderbird/24.6.0
MIME-Version: 1.0
To: "Michael S. Tsirkin"
CC: Gleb Natapov, "Gabriel L. Somlo", Eric Northup, Nadav Amit,
        Paolo Bonzini, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
        the arch/x86 maintainers, Linux Kernel Mailing List, KVM,
        joro@8bytes.org, agraf@suse.de
Subject: Re: [PATCH 3/3] KVM: x86: correct mwait and monitor emulation
References: <20140618184601.GE1695@ERROL.INI.CMU.EDU>
        <20140619101811.GA5777@redhat.com>
        <1B06E887-9D07-4E85-AE06-75B01787C488@gmail.com>
        <20140619112356.GB429@minantech.com>
        <53A2CEF4.3050902@gmail.com>
        <20140619120739.GA7289@minantech.com>
        <53A2D32D.8020305@gmail.com>
        <20140619121756.GA28523@redhat.com>
In-Reply-To: <20140619121756.GA28523@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/19/14, 3:17 PM, Michael S. Tsirkin wrote:
> On Thu, Jun 19, 2014 at 03:10:21PM +0300, Nadav Amit wrote:
>> On 6/19/14, 3:07 PM, Gleb Natapov wrote:
>>> On Thu, Jun 19, 2014 at 02:52:20PM +0300, Nadav Amit wrote:
>>>> On 6/19/14, 2:23 PM, Gleb Natapov wrote:
>>>>> On Thu, Jun 19, 2014 at 01:53:36PM +0300, Nadav Amit wrote:
>>>>>>
>>>>>> On Jun 19, 2014, at 1:18 PM, Michael S. Tsirkin wrote:
>>>>>>
>>>>>>> On Wed, Jun 18, 2014 at 02:46:01PM -0400, Gabriel L.
Somlo wrote:
>>>>>>>> On Wed, Jun 18, 2014 at 10:59:14AM -0700, Eric Northup wrote:
>>>>>>>>> On Wed, Jun 18, 2014 at 7:19 AM, Nadav Amit wrote:
>>>>>>>>>> mwait and monitor are currently handled as nop. Considering this behavior, they
>>>>>>>>>> should still be handled correctly, i.e., check execution conditions and generate
>>>>>>>>>> exceptions when required. mwait and monitor may also be executed in real-mode
>>>>>>>>>> and are not handled in that case. This patch performs the emulation of
>>>>>>>>>> monitor-mwait according to Intel SDM (other than checking whether interrupt can
>>>>>>>>>> be used as a break event).
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Nadav Amit
>>>>>>>>
>>>>>>>> How about this instead (details in the commit log below) ? Please let
>>>>>>>> me know what you think, and if you'd prefer me to send it out as a
>>>>>>>> separate patch rather than a reply to this thread.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> --Gabriel
>>>>>>>
>>>>>>> If there's an easy workaround, I'm inclined to agree.
>>>>>>> We can always go back to Gabriel's patch (and then we'll need
>>>>>>> Nadav's one too) but if we release a kernel with this
>>>>>>> support it becomes an ABI and we can't go back.
>>>>>>>
>>>>>>> So let's be careful here, and revert the hack for 3.16.
>>>>>>>
>>>>>>>
>>>>>>> Acked-by: Michael S. Tsirkin
>>>>>>>
>>>>>> Personally, I got a custom guest which requires mwait for executing correctly.
>>>>> Can you elaborate on this guest a little bit. With nop implementation
>>>>> for mwait the guest will hog a host cpu. Do you consider this to be
>>>>> "executing correctly?"
>>>>>
>>>>> --
>>>>
>>>> mwait is not as "clean" as it may appear. It encounters false wake-ups due
>>>> to a variety of reasons, and any code need to recheck the wake-up condition
>>>> afterwards. Actually, some CPUs had bugs that caused excessive wake-ups that
>>>> degraded performance considerably (Nehalem, if I am not mistaken).
>>>> Therefore, handling mwait as nop is logically correct (although it may
>>>> degrade performance).
>>>>
>>>> For the reference, if you look at the SDM 8.10.4, you'll see:
>>>> "Multiple events other than a write to the triggering address range can
>>>> cause a processor that executed MWAIT to wake up. These include events that
>>>> would lead to voluntary or involuntary context switches, such as..."
>>>>
>>>> Note the words "include" in the sentence "These include events". Software
>>>> has no way of controlling whether it gets false wake-ups and cannot rely on
>>>> the wake-up as indication to anything.
>>>>
>>> That's all well and good and I didn't say that nop is not a valid
>>> mwait implementation, it is, though there is a big difference between
>>> "encounters false wake-ups" and never sleeps. What I asked is do you
>>> consider your guest hogging host cpu to be "executing correctly?". What
>>> this guest is doing that such behaviour is tolerated and shouldn't it
>>> be better to just poll for a condition you are waiting for instead of
>>> executing expensive vmexits. This will also hog 100% host cpu, but will
>>> be actually faster.
>>>
>> You are correct, but unfortunately I have no control over the guest
>> workload. In this specific workload I do not care about performance but only
>> about correctness.
>>
>> Nadav
>
> No one prevents you from patching your kernel to run this workload. But
> is this of use to anyone else? If yes why?
>
I am not saying it should be the default behavior, and I can try to push
a setting to qemu that enables it on demand. Anyhow, I believe there are
cases where you may want mwait support: either an OS X guest which was not
modified to run without mwait, or debugging the monitor-mwait flow of a
guest OS.

I am not going to argue too much. Since I was under the impression that
there are needs for mwait other than mine, I thought a better
implementation would make all of our lives easier.

Nadav
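
The argument in the thread that a nop is a logically valid mwait, because any waiter has to recheck its condition after every wake-up, can be illustrated with a small simulation. This is only a sketch in Python and is not code from the patch or from KVM: `wait_for_flag`, `mwait_as_nop` and `make_spurious_mwait` are hypothetical names, and the "monitored address" is modeled as a plain dictionary entry.

```python
def mwait_as_nop():
    """Stand-in for an MWAIT emulated as a NOP: it returns immediately."""
    return


def make_spurious_mwait(flag, false_wakeups):
    """Build a hypothetical mwait() that returns `false_wakeups` times
    without the condition being met, then models the monitored write
    arriving by setting flag["value"] on the next call."""
    remaining = [false_wakeups]

    def mwait():
        if remaining[0] == 0:
            flag["value"] = True   # the awaited write finally happens
        else:
            remaining[0] -= 1      # a false wake-up: nothing changed

    return mwait


def wait_for_flag(flag, mwait=mwait_as_nop):
    """Wait until flag["value"] is true and return how many times
    mwait() returned along the way.

    Every return from mwait() is treated as a possibly false wake-up,
    so the condition is rechecked on each iteration. The loop is
    therefore correct even when mwait() never sleeps; it just spins.
    """
    wakeups = 0
    while not flag["value"]:
        mwait()
        wakeups += 1
    return wakeups


if __name__ == "__main__":
    flag = {"value": False}
    count = wait_for_flag(flag, make_spurious_mwait(flag, 3))
    print("condition observed after", count, "wake-ups")
```

With the default `mwait_as_nop` the loop degenerates into the busy polling Gleb describes: the result is still correct, but the waiter consumes CPU for the whole wait, which is the performance objection raised in the thread.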