From: Denys Vlasenko
Date: Wed, 25 Feb 2015 13:48:24 +0100
Subject: Re: [PATCH 2/4] x86: get rid of KERNEL_STACK_OFFSET
To: Ingo Molnar
Cc: Denys Vlasenko, Andy Lutomirski, Linus Torvalds, Steven Rostedt,
    Borislav Petkov, "H. Peter Anvin", Oleg Nesterov, Frederic Weisbecker,
    Alexei Starovoitov, Will Drewry, Kees Cook, X86 ML,
    Linux Kernel Mailing List

On Wed, Feb 25, 2015 at 9:53 AM, Ingo Molnar wrote:
>
> * Denys Vlasenko wrote:
>
>> PER_CPU_VAR(kernel_stack) was set up in a way where it
>> points five stack slots below the top of stack.
>>
>> Presumably, it was done to avoid one "sub $5*8,%rsp" in
>> syscall/sysenter code paths, where iret frame needs to be
>> created by hand.
>>
>> Ironically, none of them benefit from this optimization,
>> since all of them need to allocate additional data on
>> stack (struct pt_regs), so they still have to perform
>> subtraction.
>
> Well, the original idea of percpu::kernel_stack was that of
> an optimization of the 64-bit system_call() path: to set up
> RSP as it has to be before we call into system calls.
>
> This optimization has bitrotted away: because these days
> the first SAVE_ARGS in the 64-bit entry path modifies RSP
> as well, undoing the optimization.

Yes, I figured this is how it was supposed to work.

> But the fix should be to not touch RSP in SAVE_ARGS, to
> keep percpu::kernel_stack as an optimized entry point -
> with KERNEL_STACK_OFFSET pointing to.
>
> So NAK - this should be fixed for real.

IOW, the proposal is to set KERNEL_STACK_OFFSET
to SIZEOF_PTREGS. I can do that.

However.

There is an orthogonal idea we were discussing: to save
registers and construct the iret frame using PUSH insns, not MOVs.
IIRC Andy and Linus liked it. I am ambivalent: the code will be
smaller, but might get slower (at least on some CPUs).
If we go that way, we will require KERNEL_STACK_OFFSET = 0
(IOW: the current patch).

The decision on how exactly we should fix KERNEL_STACK_OFFSET
(set it to SIZEOF_PTREGS or to zero) depends on whether
we switch to using PUSHes, or not. What do you think?

-- 
vda
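
To make the three candidate offsets concrete, here is a minimal C sketch of
the arithmetic only. It is not kernel code. The helper names
(kernel_stack_value, extra_sub_needed) and the constant names are
hypothetical, and the sizes are assumptions for x86-64: 8-byte stack slots,
a 5-slot iret frame, and a 21-slot struct pt_regs. It also assumes the entry
path wants RSP at the base of a full pt_regs, which simplifies what the real
entry code does.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes (assumptions, not taken from kernel headers). */
enum {
    WORD            = 8,          /* bytes per stack slot on x86-64 */
    IRET_FRAME_SIZE = 5 * WORD,   /* ss, rsp, rflags, cs, rip */
    PTREGS_SIZE     = 21 * WORD,  /* assumed sizeof(struct pt_regs) */
};

/*
 * Value the per-CPU kernel_stack variable would hold for a given
 * KERNEL_STACK_OFFSET: the top of the stack minus that offset.
 */
static inline uint64_t kernel_stack_value(uint64_t stack_top, uint64_t offset)
{
    assert(offset % WORD == 0 && offset <= stack_top);
    return stack_top - offset;
}

/*
 * Bytes the entry path still has to subtract from the loaded RSP
 * (explicitly with SUB, or implicitly with PUSHes) before RSP points
 * at the base of a full pt_regs.
 */
static inline uint64_t extra_sub_needed(uint64_t offset)
{
    assert(offset <= PTREGS_SIZE);
    return PTREGS_SIZE - offset;
}
```

Under these assumptions, an offset of IRET_FRAME_SIZE (the five-slot scheme)
still leaves 128 bytes to subtract, an offset of PTREGS_SIZE leaves nothing,
and an offset of 0 leaves the whole 168 bytes, which is what a PUSH-based
entry path would consume one slot at a time.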