Date: Sun, 17 May 2015 07:30:36 +0200
From: Ingo Molnar
To: Juergen Gross
Cc: Jeremy Fitzhardinge, linux-kernel@vger.kernel.org, x86@kernel.org,
    hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
    xen-devel@lists.xensource.com, konrad.wilk@oracle.com,
    david.vrabel@citrix.com, boris.ostrovsky@oracle.com,
    chrisw@sous-sol.org, akataria@vmware.com, rusty@rustcorp.com.au,
    virtualization@lists.linux-foundation.org, gleb@kernel.org,
    pbonzini@redhat.com, kvm@vger.kernel.org
Subject: Re: [PATCH 0/6] x86: reduce paravirtualized spinlock overhead
Message-ID: <20150517053036.GB16607@gmail.com>
References: <1430391243-7112-1-git-send-email-jgross@suse.com>
    <55425ADA.4060105@goop.org> <554709BB.7090400@suse.com>
    <5548FC1A.7000806@goop.org> <554A0132.3070802@suse.com>
In-Reply-To: <554A0132.3070802@suse.com>

* Juergen Gross wrote:

> On 05/05/2015 07:21 PM, Jeremy Fitzhardinge wrote:
> >On 05/03/2015 10:55 PM, Juergen Gross wrote:
> >>I did a small measurement of the pure locking functions on bare metal
> >>without and with my patches.
> >>
> >>spin_lock() for the first time (lock and code not in cache) dropped from
> >>about 600 to 500 cycles.
> >>
> >>spin_unlock() for the first time dropped from 145 to 87 cycles.
> >>
> >>spin_lock() in a loop dropped from 48 to 45 cycles.
> >>
> >>spin_unlock() in the same loop dropped from 24 to 22 cycles.
> >
> >Did you isolate icache hot/cold from dcache hot/cold? It seems to me the
> >main difference will be whether the branch predictor is warmed up rather
> >than whether the lock itself is in dcache, but it's much more likely that
> >the lock code is in icache if the code is lock intensive, making the cold
> >case moot. But that's pure speculation.
> >
> >Could you see any differences in workloads beyond microbenchmarks?
> >
> >Not that it's my call at all, but I think we'd need to see some concrete
> >improvements in real workloads before adding the complexity of more pvops.
>
> I did another test on a larger machine:
>
> 25 kernel builds (time make -j 32) on a 32-core machine. Before each
> build "make clean" was called, and the first result after boot was omitted
> to avoid disk cache warmup effects.
>
> System time without my patches: 861.5664 +/- 3.3665 s
>             with my patches:    852.2269 +/- 3.6629 s

So what does the profile look like in the guest, before/after the PV
spinlock patches? I'm a bit surprised to see so much spinlock overhead.

Thanks,

	Ingo
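
The cycle figures quoted above come from timing the kernel's own spin_lock()
and spin_unlock() on bare metal. As a rough illustration of that kind of
measurement only, here is a minimal user-space sketch that uses the x86 TSC,
with pthread spinlocks standing in for the kernel lock; the harness, the
iteration count and the use of __rdtsc() are assumptions, not the setup
Juergen actually used:

	/* Hypothetical sketch of a cold vs. warm lock/unlock cycle measurement.
	 * pthread spinlocks stand in for the kernel's spin_lock(); the numbers
	 * in the thread were measured inside the kernel, not like this. */
	#include <stdio.h>
	#include <pthread.h>
	#include <x86intrin.h>   /* __rdtsc() */

	#define ITERATIONS 1000000

	int main(void)
	{
		pthread_spinlock_t lock;
		pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);

		/* Cold case: first lock/unlock after init, code and lock
		 * cache line not yet warm. */
		unsigned long long t0 = __rdtsc();
		pthread_spin_lock(&lock);
		unsigned long long t1 = __rdtsc();
		pthread_spin_unlock(&lock);
		unsigned long long t2 = __rdtsc();
		printf("cold lock:   %llu cycles\n", t1 - t0);
		printf("cold unlock: %llu cycles\n", t2 - t1);

		/* Warm case: average over a loop, caches and branch
		 * predictor warmed up. */
		unsigned long long start = __rdtsc();
		for (int i = 0; i < ITERATIONS; i++) {
			pthread_spin_lock(&lock);
			pthread_spin_unlock(&lock);
		}
		unsigned long long end = __rdtsc();
		printf("warm lock+unlock: %llu cycles/iteration\n",
		       (end - start) / ITERATIONS);

		pthread_spin_destroy(&lock);
		return 0;
	}

Built with something like "gcc -O2 -pthread", this prints cold and warm
cycle counts; the absolute numbers will of course differ from the in-kernel
figures discussed in the thread.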
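
The build-time results (e.g. 861.5664 +/- 3.3665 s) read as a mean plus/minus
a standard deviation over the retained runs. A minimal sketch of that
computation, using placeholder sample values (the actual run was 25 builds
with the first post-boot result discarded), might be:

	/* Hypothetical sketch: mean and sample standard deviation of the
	 * retained "time make -j 32" system-time samples, reported as
	 * mean +/- sd. The sample values below are placeholders, not the
	 * measured data. */
	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
		double samples[] = { 858.1, 860.4, 862.9, 861.7, 859.8, 864.2 };
		int n = sizeof(samples) / sizeof(samples[0]);

		double sum = 0.0;
		for (int i = 0; i < n; i++)
			sum += samples[i];
		double mean = sum / n;

		double sq = 0.0;
		for (int i = 0; i < n; i++)
			sq += (samples[i] - mean) * (samples[i] - mean);
		double sd = sqrt(sq / (n - 1));   /* sample standard deviation */

		printf("system time: %.4f +/- %.4f s\n", mean, sd);
		return 0;
	}

Compile with "-lm" for sqrt(); whether Juergen used the sample or population
standard deviation is not stated in the thread.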