Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932504AbbHYImk (ORCPT ); Tue, 25 Aug 2015 04:42:40 -0400
Received: from mail-wi0-f173.google.com ([209.85.212.173]:35197 "EHLO
        mail-wi0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751763AbbHYImF (ORCPT );
        Tue, 25 Aug 2015 04:42:05 -0400
Date: Tue, 25 Aug 2015 10:42:01 +0200
From: Ingo Molnar
To: Andy Lutomirski
Cc: X86 ML, Denys Vlasenko, Brian Gerst, Borislav Petkov, Linus Torvalds,
        "linux-kernel@vger.kernel.org", Jan Beulich
Subject: Re: Proposal for finishing the 64-bit x86 syscall cleanup
Message-ID: <20150825084201.GA21589@gmail.com>
References: <20150825081841.GA19412@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150825081841.GA19412@gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3904
Lines: 72

* Ingo Molnar wrote:

>
> * Andy Lutomirski wrote:
>
> > Hi all-
> >
> > I want to (try to) mostly or fully get rid of the messy bits (as
> > opposed to the hardware-bs-forced bits) of the 64-bit syscall asm.
> > There are two major conceptual things that are in the way.
> >
> > Thing 1: partial pt_regs
> >
> > 64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
> > r12-r15 are uninitialized. Some syscalls require them to be
> > initialized, and they have special awful stubs to do it. The entry
> > and exit tracing code (except for phase1 tracing) also need them
> > initialized, and they have their own messy initialization. Compat
> > syscalls are their own private little mess here.
> >
> > This gets in the way of all kinds of cleanups, because C code can't
> > switch between the full and partial pt_regs states.
> >
> > I can see two ways out.
> > We could remove the optimization entirely,
> > which consists of pushing and popping six more registers and adds
> > about ten cycles to fast path syscalls on Sandy Bridge. It also
> > simplifies and presumably speeds up the slow paths.
>
> So out of hundreds of regular system calls there's only a handful of such system
> calls:
>
> triton:~/tip> git grep stub arch/x86/entry/syscalls/
> arch/x86/entry/syscalls/syscall_32.tbl:2 i386 fork sys_fork stub32_fork
> arch/x86/entry/syscalls/syscall_32.tbl:11 i386 execve sys_execve stub32_execve
> arch/x86/entry/syscalls/syscall_32.tbl:119 i386 sigreturn sys_sigreturn stub32_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:120 i386 clone sys_clone stub32_clone
> arch/x86/entry/syscalls/syscall_32.tbl:173 i386 rt_sigreturn sys_rt_sigreturn stub32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:190 i386 vfork sys_vfork stub32_vfork
> arch/x86/entry/syscalls/syscall_32.tbl:358 i386 execveat sys_execveat stub32_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:15 64 rt_sigreturn stub_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:56 common clone stub_clone
> arch/x86/entry/syscalls/syscall_64.tbl:57 common fork stub_fork
> arch/x86/entry/syscalls/syscall_64.tbl:58 common vfork stub_vfork
> arch/x86/entry/syscalls/syscall_64.tbl:59 64 execve stub_execve
> arch/x86/entry/syscalls/syscall_64.tbl:322 64 execveat stub_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:513 x32 rt_sigreturn stub_x32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:520 x32 execve stub_x32_execve
> arch/x86/entry/syscalls/syscall_64.tbl:545 x32 execveat stub_x32_execveat
>
> and none of them are super performance critical system calls, so no way would I
> go for unconditionally saving/restoring all of ptregs, just to make it a bit
> simpler for these syscalls.

Let me qualify that: no way in the long run.

In the short run we can drop the optimization and reintroduce it later, to lower
all the risks that the C conversion brings with it. (That would also make it
easier to re-analyze the cost/benefit ratio of the optimization.)

So feel free to introduce a simple ptregs save/restore pattern for now.

Thanks,

        Ingo
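
For readers following the thread, the difference between the partial and the full
pt_regs states can be modelled in userspace C. This is only a sketch under stated
assumptions: struct model_regs, MODEL_POISON, save_partial(), save_full() and
needs_full_frame() are hypothetical names made up for illustration, not the
kernel's entry code, and the poison value merely stands in for "uninitialized".
The list of syscalls is taken from the git grep output quoted in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a saved register frame (not kernel code). */
struct model_regs {
    /* Callee-saved in the System V AMD64 ABI; the fast path skips these six. */
    unsigned long r15, r14, r13, r12, bp, bx;
    /* Remaining general-purpose registers; saved on every syscall entry. */
    unsigned long r11, r10, r9, r8, ax, cx, dx, si, di;
};

/* Stand-in for "uninitialized": a recognizable garbage value. */
#define MODEL_POISON 0xdeadbeefUL

/* Fast-path model: the six callee-saved slots are left as garbage. */
static void save_partial(struct model_regs *frame, const struct model_regs *cpu)
{
    assert(frame != NULL && cpu != NULL);
    *frame = *cpu;
    frame->r15 = frame->r14 = frame->r13 = MODEL_POISON;
    frame->r12 = frame->bp = frame->bx = MODEL_POISON;
}

/* Simple pattern suggested for the short run: always save everything. */
static void save_full(struct model_regs *frame, const struct model_regs *cpu)
{
    assert(frame != NULL && cpu != NULL);
    *frame = *cpu;
}

/* Syscalls that show up with a stub in the quoted git grep output. */
static int needs_full_frame(const char *name)
{
    static const char *const stubbed[] = {
        "fork", "vfork", "clone", "execve", "execveat",
        "sigreturn", "rt_sigreturn",
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(stubbed) / sizeof(stubbed[0]); i++) {
        if (strcmp(name, stubbed[i]) == 0)
            return 1;
    }
    return 0;
}
```

In this model, C code that reads frame->bx after save_partial() sees garbage,
which is why the stubbed syscalls need extra work today; with save_full() on every
entry the question never arises, at the cost of six more register saves.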