Message-ID: <5538D68E.4010702@siemens.com>
Date: Thu, 23 Apr 2015 13:25:02 +0200
From: Jan Kiszka
To: Paolo Bonzini, Liang Li, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
CC: gleb@kernel.org, Marcelo Tosatti, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, joro@8bytes.org, yang.z.zhang@intel.com, Xudong Hao
Subject: Re: [v6] kvm/fpu: Enable fully eager restore kvm FPU
References: <1429823583-3226-1-git-send-email-liang.z.li@intel.com> <5538CC15.4010005@redhat.com>
In-Reply-To: <5538CC15.4010005@redhat.com>

On 2015-04-23 12:40, Paolo Bonzini wrote:
>
>
> On 23/04/2015 23:13, Liang Li wrote:
>> Remove lazy FPU logic and use eager FPU entirely. Eager FPU does
>> not have performance regression, and it can simplify the code.
>>
>> When compiling kernel on westmere, the performance of eager FPU
>> is about 0.4% faster than lazy FPU.
>>
>> Signed-off-by: Liang Li
>> Signed-off-by: Xudong Hao
>
> A patch like this requires much more benchmarking than what you have done.
>
> First, what guest did you use? A modern Linux guest will hardly ever exit
> to userspace: the scheduler uses the TSC deadline timer, which is handled
> in the kernel; the clocksource uses the TSC; virtio-blk devices are kicked
> via ioeventfd.
>
> What happens if you time a Windows guest (without any Hyper-V enlightenments),
> or if you use clocksource=acpi_pm?
>
> Second, "0.4%" by itself may not be statistically significant. How did
> you gather the result? How many times did you run the benchmark? Did
> the guest report any stolen time?
>
>
> And finally, even if the patch was indeed a performance improvement,
> there is much more that you can remove. fpu_active is always 1,
> vmx_fpu_activate only has one call site that can be simplified just to
>
> 	vcpu->arch.cr0_guest_owned_bits = X86_CR0_TS;
> 	vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
>
> and so on.

It would also be good to know what the benchmarks look like on CPUs other
than the chosen Intel model, including older ones.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
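
The question of whether "0.4%" is statistically significant can be made
concrete once each configuration (lazy and eager) has been run several times.
The following is only a minimal sketch in C and is not taken from the patch or
from KVM. The function names (sample_mean, sample_variance, welch_t_squared)
are hypothetical, and it assumes the inputs are per-run wall-clock times from
independent runs. It computes the square of Welch's t statistic, which
compares the difference of the two means against the run-to-run noise;
returning the square is just a way to keep the sketch free of libm.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (hypothetical helper, n must be > 0). */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance (divides by n - 1), so it needs n >= 2. */
double sample_variance(const double *x, size_t n)
{
    double m, ss = 0.0;

    assert(n >= 2);
    m = sample_mean(x, n);
    for (size_t i = 0; i < n; i++)
        ss += (x[i] - m) * (x[i] - m);
    return ss / (double)(n - 1);
}

/*
 * Square of Welch's t statistic for two independent samples whose
 * variances may differ:
 *
 *   t^2 = (mean_a - mean_b)^2 / (var_a / na + var_b / nb)
 *
 * The caller must make sure the samples are not all identical,
 * otherwise the denominator is zero.
 */
double welch_t_squared(const double *a, size_t na,
                       const double *b, size_t nb)
{
    double diff = sample_mean(a, na) - sample_mean(b, nb);
    double se2 = sample_variance(a, na) / (double)na +
                 sample_variance(b, nb) / (double)nb;

    return diff * diff / se2;
}
```

The result would be compared against the square of the critical t value for
the chosen confidence level. That value depends on the degrees of freedom, so
with only a handful of runs per configuration it is noticeably larger than
for a long series, and a 0.4% difference can easily fall below it.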