Date: Tue, 21 Apr 2015 17:22:32 +0200
From: Ingo Molnar
To: Borislav Petkov
Cc: Andy Lutomirski, Andrew Cooper, Xen-devel, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", x86@kernel.org, linux-kernel@vger.kernel.org, Konrad Rzeszutek Wilk, Boris Ostrovsky, David Vrabel, Rusty Russell, lguest@lists.ozlabs.org, Denys Vlasenko, Linus Torvalds
Subject: Re: [RFC PATCH] x86/asm/irq: Don't use POPF but STI
Message-ID: <20150421152232.GA22536@gmail.com>


* Borislav Petkov wrote:

> On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote:
> > From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001
> > From: Ingo Molnar
> > Date: Tue, 21 Apr 2015 13:32:13 +0200
> > Subject: [PATCH] x86/asm/irq: Don't use POPF but STI
> >
> > So because the POPF instruction is slow and STI is faster on
> > essentially all x86 CPUs that matter, instead of:
> >
> >   ffffffff81891848:       9d                      popfq
> >
> > we can do:
> >
> >   ffffffff81661a2e:       41 f7 c4 00 02 00 00    test   $0x200,%r12d
> >   ffffffff81661a35:       74 01                   je     ffffffff81661a38
> >   ffffffff81661a37:       fb                      sti
> >   ffffffff81661a38:
> >
> > This bloats the kernel a bit, by about 1K on the 64-bit defconfig:
> >
> >       text       data      bss       dec     hex  filename
> >   12258634    1812120  1085440  15156194  e743e2  vmlinux.before
> >   12259582    1812120  1085440  15157142  e74796  vmlinux.after
> >
> > the other cost is the extra branching, adding extra pressure to the
> > branch prediction hardware and also potential branch misses.
>
> Do we care? [...] Only if it makes stuff faster.
> [...] After we enable interrupts, we'll most likely go somewhere
> cache "cold" anyway, so the branch misses will happen anyway.
>
> The question is, would the cost drop from POPF -> STI cover the
> increase in branch misses overhead?
>
> Hmm, interesting.

So there are a few places where the POPF is a STI in 100% of the
cases. It's probably a win there.

But my main worry would be sites that are 'multi use', such as locking
APIs - for example spin_unlock_irqrestore(): those tend to be called
from different code paths, and each one has a different IRQ flags
state.

For example scheduler wakeups done from irqs-off codepaths (it's very
common), or from irqs-on codepaths (that's very common as well). In
the former case we won't have a STI, in the latter case we will - and
both would hit the same POPF site at the end of the critical section,
which with this patch becomes one shared conditional branch. The
probability of a branch prediction miss is high in this case.

So the question is: is the POPF/STI performance difference higher than
the average cost of branch misses? If yes, then the change is probably
a win. If not, then it's probably a loss.

My gut feeling is that we should let the hardware do it, i.e. we
should continue to use POPF - but I can be convinced ...

Thanks,

	Ingo
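
The break-even question above can be written down as a small cost model. This is only a rough sketch: the struct, the function names and every cycle count fed into them are hypothetical placeholders, not measurements of any real CPU, and the model ignores second-order effects such as code size and cache behaviour.

```c
#include <assert.h>

/*
 * Hypothetical per-site costs, in cycles. None of these are measured
 * values; plug in numbers from your own benchmarks.
 */
struct irq_restore_costs {
	double popf_cycles;	/* unconditional POPF */
	double sti_cycles;	/* STI, executed only when IF is to be set */
	double test_jcc_cycles;	/* correctly predicted TEST + JE */
	double miss_penalty;	/* extra cycles per branch mispredict */
};

/*
 * Expected cost of the TEST/JE/STI sequence at one call site.
 * p_irqs_on: fraction of calls that restore an irqs-on flags state.
 * miss_rate: fraction of calls where the JE is mispredicted.
 */
static double cond_sti_cost(const struct irq_restore_costs *c,
			    double p_irqs_on, double miss_rate)
{
	return c->test_jcc_cycles + p_irqs_on * c->sti_cycles +
	       miss_rate * c->miss_penalty;
}

/*
 * Mispredict rate at which the TEST/JE/STI sequence costs exactly as
 * much as POPF. Below it the conditional STI wins, above it POPF wins.
 * A result <= 0 means POPF always wins; >= 1 means STI always wins.
 */
static double break_even_miss_rate(const struct irq_restore_costs *c,
				   double p_irqs_on)
{
	assert(c->miss_penalty > 0.0);
	return (c->popf_cycles - c->test_jcc_cycles -
		p_irqs_on * c->sti_cycles) / c->miss_penalty;
}
```

With made-up costs of 10 (POPF), 2 (STI), 1 (TEST+JE) and 16 (miss penalty), and an even mix of irqs-on and irqs-off callers, break_even_miss_rate() returns 0.5. Under those assumed numbers a 'multi use' site would have to mispredict on more than half of its calls before POPF comes out ahead; real numbers would have to come from measurement.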