Message-ID: <54ECD8D2.9080301@redhat.com>
Date: Tue, 24 Feb 2015 21:02:26 +0100
From: Denys Vlasenko
To: Steven Rostedt
CC: Andy Lutomirski, Linus Torvalds, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Oleg Nesterov, Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] x86: get rid of KERNEL_STACK_OFFSET
References: <1424803895-4420-1-git-send-email-dvlasenk@redhat.com> <1424803895-4420-2-git-send-email-dvlasenk@redhat.com> <20150224143004.7bbc8cf2@gandalf.local.home>
In-Reply-To: <20150224143004.7bbc8cf2@gandalf.local.home>

On 02/24/2015 08:30 PM, Steven Rostedt wrote:
> On Tue, 24 Feb 2015 19:51:33 +0100
> Denys Vlasenko wrote:
>
>> PER_CPU_VAR(kernel_stack) was set up in a way where it points
>> five stack slots below the top of stack.
>>
>> Presumably, it was done to avoid one "sub $5*8,%rsp"
>> in syscall/sysenter code paths, where iret frame needs to be
>> created by hand.
>>
>> Ironically, none of them benefit from this optimization,
>> since all of them need to allocate additional data on stack
>> (struct pt_regs), so they still have to perform subtraction.
>> And ia32_sysenter_target even needs to *undo* this optimization:
>> it constructs iret stack with pushes instead of movs,
>> so it needs to start right at the top.
>>
>> This patch eliminates KERNEL_STACK_OFFSET.
>> PER_CPU_VAR(kernel_stack) now points directly to top of stack.
>> pt_regs allocations are adjusted to allocate iret frame as well.
>>
>
> I always thought the KERNEL_STACK_OFFSET wasn't an optimization, but a
> buffer from the real top of stack, in case we had any off by one bugs,
> it wouldn't crash the system.

I was thinking about it, but it looks unlikely. Reasons:

(1) ia32_sysenter_target does "addq $(KERNEL_STACK_OFFSET),%rsp"
on entry before saving registers with PUSHes; this returns %rsp
to the very top of kernel stack. If that were a problem
(say, an NMI at this point would do bad things), it would have been
noticed by now.

(2) Even the ordinary 64-bit syscall path uses IRET return at times.
For one, on every execve and signal return
(because they need to load a modified %rsp).
With the current layout, the return frame for IRET lies exactly there,
in those 5 stack slots "reserved" via the KERNEL_STACK_OFFSET thingy.

(3) There are no comments anywhere about KERNEL_STACK_OFFSET
being a safety measure.
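
To make the arithmetic in (1) and (2) concrete, here is a small userspace C sketch of the two layouts. It is an illustration only, not kernel code: the struct and every helper name below are hypothetical, and the one thing taken from the thread is that the old offset is five 8-byte slots. It shows that a five-slot IRET frame ending at the top of stack starts exactly where the old kernel_stack value pointed, and that adding the offset back lands on the very top.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the 64-bit IRET frame: five 8-byte slots,
 * listed from the lowest address (RIP) to the highest (SS). */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Assumption taken from the thread: the old offset is five stack slots. */
#define SLOT_SIZE               8u
#define OLD_KERNEL_STACK_OFFSET (5u * SLOT_SIZE)

_Static_assert(sizeof(struct iret_frame) == OLD_KERNEL_STACK_OFFSET,
               "the IRET frame fills exactly the five 'reserved' slots");

/* Old layout: the per-CPU value points five slots below the top. */
static inline uint64_t old_kernel_stack(uint64_t top)
{
    return top - OLD_KERNEL_STACK_OFFSET;
}

/* New layout after the patch: the per-CPU value is the top itself. */
static inline uint64_t new_kernel_stack(uint64_t top)
{
    return top;
}

/* Lowest address of an IRET frame whose last slot ends at the top. */
static inline uint64_t iret_frame_base(uint64_t top)
{
    return top - sizeof(struct iret_frame);
}

/* Models the effect of "addq $(KERNEL_STACK_OFFSET),%rsp" on the
 * old per-CPU value: the stack pointer ends up at the very top. */
static inline uint64_t sysenter_rsp_after_addq(uint64_t top)
{
    return old_kernel_stack(top) + OLD_KERNEL_STACK_OFFSET;
}

/* Checks both observations for one (arbitrary) top-of-stack address. */
static inline void check_layout(uint64_t top)
{
    assert(iret_frame_base(top) == old_kernel_stack(top));
    assert(sysenter_rsp_after_addq(top) == new_kernel_stack(top));
}
```

Calling check_layout() with any 8-byte-aligned address of at least 40 from a main() passes both assertions, which is just point (2) restated: the "reserved" slots are already occupied by the IRET frame, so they cannot double as a guard area.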