Date: Tue, 21 Apr 2015 17:22:32 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>,
        Andrew Cooper <andrew.cooper3@citrix.com>,
        Xen-devel <xen-devel@lists.xen.org>,
        Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>,
        "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org,
        linux-kernel@vger.kernel.org,
        Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
        Boris Ostrovsky <boris.ostrovsky@oracle.com>,
        David Vrabel <david.vrabel@citrix.com>,
        Rusty Russell <rusty@rustcorp.com.au>, lguest@lists.ozlabs.org,
        Denys Vlasenko <vda.linux@googlemail.com>,
        Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC PATCH] x86/asm/irq: Don't use POPF but STI
Message-ID: <20150421152232.GA22536@gmail.com>
References: <1429549782-12962-1-git-send-email-andrew.cooper3@citrix.com>
 <55359B57.3070008@kernel.org>
 <20150421124558.GA3483@gmail.com>
 <20150421130916.GC28895@pd.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150421130916.GC28895@pd.tnic>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2784
Lines: 71


* Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote:
> > From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001
> > From: Ingo Molnar <mingo@kernel.org>
> > Date: Tue, 21 Apr 2015 13:32:13 +0200
> > Subject: [PATCH] x86/asm/irq: Don't use POPF but STI
> > 
> > So because the POPF instruction is slow and STI is faster on 
> > essentially all x86 CPUs that matter, instead of:
> > 
> >   ffffffff81891848:       9d                      popfq
> > 
> > we can do:
> > 
> >   ffffffff81661a2e:       41 f7 c4 00 02 00 00    test   $0x200,%r12d
> >   ffffffff81661a35:       74 01                   je     ffffffff81661a38 <snd_pcm_stream_unlock_irqrestore+0x28>
> >   ffffffff81661a37:       fb                      sti
> >   ffffffff81661a38:
> > 
> > This bloats the kernel a bit, by about 1K on the 64-bit defconfig:
> > 
> >    text    data     bss     dec     hex filename
> >    12258634        1812120 1085440 15156194         e743e2 vmlinux.before
> >    12259582        1812120 1085440 15157142         e74796 vmlinux.after
> > 
> > the other cost is the extra branching, adding extra pressure to the
> > branch prediction hardware and also potential branch misses.
> 
> Do we care? [...]

Only if it makes stuff faster.

> [...] After we enable interrupts, we'll most likely go somewhere 
> cache "cold" anyway, so the branch misses will happen anyway.
> 
> The question is, would the cost drop from POPF -> STI cover the 
> increase in branch misses overhead?
> 
> Hmm, interesting.

So there are a few places where the POPF acts as an STI in 100% of
the cases. It's probably a win there.

But my main worry would be sites that are 'multi use', such as locking
APIs - for example spin_unlock_irqrestore(): those tend to be called
from different code paths, and each path can arrive with a different
IRQ flags state.

For example, scheduler wakeups are done both from irqs-off codepaths
(very common) and from irqs-on codepaths (very common as well). In
the former case we won't execute an STI, in the latter case we will -
and both go through the same restore site (today's POPF) at the end
of the critical section. The probability of a branch prediction miss
is high in this case.
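
To make the multi-use case concrete, here is a rough user-space model
of the two restore variants. It is only a sketch: the model_*() names
and the model_irqs_on variable are made up for illustration and stand
in for the CPU's IF bit; nothing here is the actual kernel code. The
only real constant is X86_EFLAGS_IF (0x200), the bit the TEST in the
quoted disassembly checks.

```c
#include <assert.h>
#include <stdbool.h>

#define X86_EFLAGS_IF 0x200UL	/* interrupt-enable bit in EFLAGS */

/* Hypothetical stand-in for the CPU's IF bit (user-space model only). */
static bool model_irqs_on;

/* Model of an irq-save: remember the old state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_on ? X86_EFLAGS_IF : 0;

	model_irqs_on = false;
	return flags;
}

/* POPF-style restore: unconditionally write back the saved state. */
static void model_irq_restore_popf(unsigned long flags)
{
	model_irqs_on = (flags & X86_EFLAGS_IF) != 0;
}

/* TEST+Jcc+STI-style restore: re-enable only if the caller had irqs on. */
static void model_irq_restore_sti(unsigned long flags)
{
	assert(!model_irqs_on);		/* only equivalent while irqs are off */
	if (flags & X86_EFLAGS_IF)	/* direction depends on the caller */
		model_irqs_on = true;
}
```

Both variants leave the model in the same state when called with irqs
off. The difference is that the second one has a branch whose
direction is decided by whichever caller saved the flags, which is
what makes shared sites hard to predict.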

So the question is: is the POPF/STI performance difference higher than
the average cost of the branch misses? If yes, then the change is
probably a win. If not, then it's probably a loss.
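
The break-even condition above can be written down as a small
expected-cost calculation. This is a sketch under assumptions: the
struct, the function names and the simple additive cost model are
mine, and any cycle counts fed into it are placeholders that would
have to come from measurements on real hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* All costs are in cycles; they are inputs, not measured values. */
struct restore_costs {
	double popf;		/* unconditional POPF */
	double test_jcc;	/* TEST + correctly predicted branch */
	double sti;		/* STI, executed only on the irqs-on path */
	double branch_miss;	/* penalty per mispredicted branch */
};

/*
 * Expected cost of the TEST+Jcc+STI sequence at one call site:
 * the branch is always paid, STI only when irqs were on, and the
 * miss penalty with probability p_miss.
 */
static double cond_sti_expected_cost(const struct restore_costs *c,
				     double p_irqs_on, double p_miss)
{
	assert(p_irqs_on >= 0.0 && p_irqs_on <= 1.0);
	assert(p_miss >= 0.0 && p_miss <= 1.0);

	return c->test_jcc + p_irqs_on * c->sti + p_miss * c->branch_miss;
}

/* The change is a win at this site if it is cheaper than POPF. */
static bool cond_sti_is_win(const struct restore_costs *c,
			    double p_irqs_on, double p_miss)
{
	return cond_sti_expected_cost(c, p_irqs_on, p_miss) < c->popf;
}
```

For a site that always restores to irqs-on, p_miss is close to zero
and the comparison reduces to POPF versus TEST+Jcc+STI. For a shared
site the result depends mostly on p_miss times the miss penalty.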

My gut feeling is that we should let the hardware do it, i.e. we 
should continue to use POPF - but I can be convinced ...

Thanks,

	Ingo