Date: Wed, 1 Apr 2015 15:21:03 +0200
From: Ingo Molnar
To: Denys Vlasenko
Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, "H. Peter Anvin", Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/9] x86/asm/entry/32: Use PUSH instructions to build pt_regs on stack
Message-ID: <20150401132103.GB13492@gmail.com>
References: <1427821211-25099-1-git-send-email-dvlasenk@redhat.com> <1427821211-25099-2-git-send-email-dvlasenk@redhat.com> <20150401085140.GC23916@gmail.com> <551BEED2.2080805@redhat.com>
In-Reply-To: <551BEED2.2080805@redhat.com>

* Denys Vlasenko wrote:

> On 04/01/2015 10:51 AM, Ingo Molnar wrote:
> >
> > * Denys Vlasenko wrote:
> >
> >> This mimics the recent similar 64-bit change.
> >> Saves ~110 bytes of code.
> >>
> >> Patch was run-tested on 32 and 64 bits, Intel and AMD CPU.
> >> I also looked at the diff of entry_64.o disassembly, to have
> >> a different view of the changes.
> >
> > The other important question would be: what performance difference (if
> > any) did you observe before/after the change?
>
> I did not measure it then.
>
> At the moment I don't have AMD CPUs here, can't benchmark
> 32-bit syscall-based codepath.
>
> On a Sandy Bridge CPU (IOW: sysenter codepath) -
>
> Before: 78.57 ns per getpid
> After:  76.90 ns per getpid
>
> It's better than I thought it would be.
> Probably because this load:
>
>     movl ASM_THREAD_INFO(TI_sysenter_return, %rsp, 0), %r10d
>
> has been moved up by the patch (happens sooner).

There's also less I$ used, and in straight, continuous spots, which
should result in fewer cache misses in the very common "the kernel's
code is cache cold" situation that syscall entry operates under - and
that's not captured by your benchmark.

So it's a good change.

Thanks,

	Ingo
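
For reference, a "ns per getpid" figure like the ones quoted above can be obtained with a tight loop around the system call. The program actually used for those numbers is not shown in this thread, so the following is only a minimal sketch of that kind of measurement: the helper names now_ns() and ns_per_getpid() and the iteration count are assumptions made for illustration. It calls syscall(SYS_getpid) rather than getpid() so that a libc-side PID cache (present in some older glibc versions) cannot skip the kernel entry. To exercise the 32-bit entry path discussed here it would have to be built as a 32-bit binary (for example with -m32), and which entry instruction is then used depends on the libc and vDSO.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: average wall-clock cost of one getpid system
 * call, in nanoseconds, over 'iterations' back-to-back calls.
 *
 * Because the loop is hot, this measures the cache-warm case only;
 * it says nothing about the cache-cold behaviour mentioned above.
 */
double ns_per_getpid(long iterations)
{
	uint64_t start, end;
	long i;

	if (iterations <= 0)
		return 0.0;

	start = now_ns();
	for (i = 0; i < iterations; i++)
		syscall(SYS_getpid);	/* always enters the kernel */
	end = now_ns();

	return (double)(end - start) / (double)iterations;
}
```

Calling ns_per_getpid(10000000) from main() and printing the result with "%.2f ns per getpid" gives output in the same form as the figures above. For scale, the quoted before/after difference is 78.57 - 76.90 = 1.67 ns, roughly 2.1% of the original cost.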