Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755931AbdCGShP (ORCPT ); Tue, 7 Mar 2017 13:37:15 -0500 Received: from mx1.redhat.com ([209.132.183.28]:49330 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754592AbdCGShN (ORCPT ); Tue, 7 Mar 2017 13:37:13 -0500 Date: Tue, 7 Mar 2017 12:28:55 -0600 From: Josh Poimboeuf To: Andy Lutomirski Cc: Linus Torvalds , Pavel Machek , kernel list , Ingo Molnar , Andrew Lutomirski , Borislav Petkov , Brian Gerst , Denys Vlasenko , Peter Anvin , Peter Zijlstra , Thomas Gleixner Subject: Re: v4.10: kernel stack frame pointer .. has bad value (null) Message-ID: <20170307182855.262ezbon2pm67qfd@treble> References: <20170222224755.GA4310@amd> <20170222225614.4z4z24uz6l2iz6qm@treble> <20170222231808.hmr6ulbvfnrg2at7@treble> <20170223201039.GB5177@amd> <20170225050439.7dplheb6nyne4nkm@treble> <20170302234514.3qcqdozibcltkdai@treble> <20170306163807.GA20689@amd> <20170307173821.yknj5htr7plgdwxv@treble> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 07 Mar 2017 18:28:58 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2623 Lines: 70 On Tue, Mar 07, 2017 at 09:59:44AM -0800, Andy Lutomirski wrote: > On Tue, Mar 7, 2017 at 9:52 AM, Linus Torvalds > wrote: > > On Tue, Mar 7, 2017 at 9:38 AM, Josh Poimboeuf wrote: > >> > >> So I'm thinking we should have -maccumulate-outgoing-args always enabled > >> on x86_32 just like we already do on x86_64. > > > > Ugh. I realize we have workarounds for bugs, but I think > > -maccumulate-outgoing-args is nasty. It just generates worse code by > > avoiding the much nicer push/pop sequences, afaik. Yes, maybe the pushes/pops around a function call are a little easier to read than movs. But the -maccumulate-outgoing-args realignment prologue is a *lot* worse for readability, IMO. Also, the gcc documentation says -maccumulate-outgoing-args is "generally beneficial for performance and size." Not to mention the fact that -maccumulate-outgoing-args seems to already be enabled in most cases anyway. Having it uniformly enabled everywhere makes it less confusing overall when the rare divergences are encountered. From looking at some of the changes related to ADD_ACCUMULATE_OUTGOING_ARGS in arch/x86/Makefile_32.cpu, I can tell that several others before me have stumbled into this prologue issue. > > On x86-64 it's not such a big deal, because we pass the first six > > arguments in registers anyway, so the arguments on the stack is a > > fairly unusual special case. > > > > But on x86-32, we only have three argument registers, so this > > braindamage is potentially worse. > > > > I guess we already do this in most situations due to the gcc bugs, but > > I do think it's sad that we would do it for our _own_ bugs too. > > > > Is it our bug or a gcc bug? I would have thought > -fno-omit-frame-pointer meant that the call-frame-to-return-address > offset should be constant and -fomit-frame-pointer meant "do > whatever". I don't think it's a gcc bug because it doesn't seem to violate frame pointer conventions: pushl -0x4(%edi) # copy return address push %ebp The frame pointer and return address are still stored adjacently. And it normally allows unwinds to work fine. The problem is the kernel unwinder's assumption that the last frame pointer is at a certain address. That assumption breaks with the DRAP prologue. > Also, maybe I'm missing something, but does gcc's code even allow the > function to return sensibly? It could do it by a nasty calculation > involving backing out the old esp from edi, but that seems quite > overcomplicated. That's what it does: lea -0x8(%edi),%esp pop %edi ret -- Josh