From: Raymond Jennings
To: Andy Lutomirski
Cc: Ingo Molnar, Andy Lutomirski, "linux-kernel@vger.kernel.org", X86 ML, Linus Torvalds, Jan Kara, Borislav Petkov, Denys Vlasenko
Subject: Re: [PATCH] x86: Fix detection of GCC -mpreferred-stack-boundary support
Date: Mon, 06 Jul 2015 21:01:12 -0700
Message-ID: <1436241672.5554.2.camel@gmail.com>
References: <20150706134423.GA8094@gmail.com> <20150706174011.GB30566@gmail.com>

On Mon, 2015-07-06 at 10:59 -0700, Andy Lutomirski wrote:
> On Mon, Jul 6, 2015 at 10:40 AM, Ingo Molnar wrote:
> >
> > * Andy Lutomirski wrote:
> >
> >> > My reasoning: on modern uarchs there's no penalty for 32-bit misalignment of
> >> > 64-bit variables, only if they cross 64-byte cache lines, which should be rare
> >> > with a chance of 1:16. This small penalty (of at most +1 cycle in some
> >> > circumstances IIRC) should be more than counterbalanced by the compression of
> >> > the stack by 5% on average.
> >>
> >> I'll counter with: what's the benefit? There are no operations that will
> >> naturally change RSP by anything that isn't a multiple of 8 (there's no pushl in
> >> 64-bit mode, or at least not on AMD chips -- the Intel manual is a bit vague on
> >> this point), so we'll end up with RSP being a multiple of 8 regardless.
> >> Even if we somehow shaved 4 bytes off in asm, that still wouldn't buy us
> >> anything, as a dangling 4 bytes at the bottom of the stack isn't useful for
> >> anything.
> >
> > Yeah, so it might be utilized in frame-pointer-less builds (which we might be
> > able to utilize in the future if sane DWARF code comes around), which do not
> > use push/pop to manage the stack but often have patterns like:
> >
> >   ffffffff8102aa90 :
> >   ffffffff8102aa90:   48 83 ec 18   sub $0x18,%rsp
> >
> > and use MOVs to manage the stack. Those kinds of stack frames could be
> > 4-byte granular as well.
> >
> > But yeah ... it's pretty marginal.
>
> To get even that, we'd need an additional ABI-changing GCC flag to
> change GCC's idea of the alignment of long from 8 to 4. (I just
> checked: g++ thinks that alignof(long) == 8. I was too lazy to look
> up how to ask the equivalent question in C.)

I just want to point out that long itself is 8 bytes wide on 64-bit x86
but only 4 bytes on 32-bit x86. Perhaps we should keep sizeof(long) in
mind, not just alignof(long)?

My opinion, btw, is that if long is 8 bytes wide, it should also be
8-byte aligned.

> --Andy
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/