Date: Mon, 6 Mar 2017 17:38:07 +0100
From: Pavel Machek
To: Josh Poimboeuf
Cc: kernel list, mingo@kernel.org, luto@kernel.org, bp@alien8.de,
    brgerst@gmail.com, dvlasenk@redhat.com, hpa@zytor.com,
    torvalds@linux-foundation.org, peterz@infradead.org, tglx@linutronix.de
Subject: Re: v4.10: kernel stack frame pointer .. has bad value (null)
Message-ID: <20170306163807.GA20689@amd>
References: <20170221221418.GA6918@amd> <20170221231216.y56gb62vkn5ewgea@treble>
    <20170222210548.GC8467@amd> <20170222212103.tigzbw5sfrwd7uwh@treble>
    <20170222224755.GA4310@amd> <20170222225614.4z4z24uz6l2iz6qm@treble>
    <20170222231808.hmr6ulbvfnrg2at7@treble> <20170223201039.GB5177@amd>
    <20170225050439.7dplheb6nyne4nkm@treble> <20170302234514.3qcqdozibcltkdai@treble>
In-Reply-To: <20170302234514.3qcqdozibcltkdai@treble>

On Thu 2017-03-02 17:45:14, Josh Poimboeuf wrote:
> On Fri, Feb 24, 2017 at 11:04:39PM -0600, Josh Poimboeuf wrote:
> > On Thu, Feb 23, 2017 at 09:10:39PM +0100, Pavel Machek wrote:
> > > Hi!
> > >
> > >
> > > > > > > Somehow, startup_32_smp() is on the stack twice.  The stack unwind led
> > > > > > > to the startup_32_smp() frame at 0xf50cdf9c rather than the one at
> > > > > > > 0xf50cdfa8 (which is where it should normally be).  So the question is
> > > > > > > how startup_32_smp() got executed the second time, with the wrong stack
> > > > > > > offset.
> > > > > >
> > > > > > Not much idea... but this is stack dump, right? Just because some
> > > > > > value is on the stack does not mean it is a return address, no?
> > > > >
> > > > > Right, but the one at 0xf50cdfa8 is where the startup_32_smp() is
> > > > > *supposed* to be.  If the unwinder had unwinded to that one, it wouldn't
> > > > > have complained.  So it looks to me like the CPU somehow booted twice:
> > > > > the first time at the right stack address, and the second time it
> > > > > somehow ended up with a different stack address.
> > > > >
> > > > > > And .... startup_32_smp is kind of "interesting" function. Take a
> > > > > > look...
> > > > >
> > > > > Yes, it's used in bringing up the CPU.
> > > >
> > > > Can you share your .config?
> > >
> > > Here you go...
> >
> > What version of gcc are you using?
> >
> > Can you post a disassembly of the first 10 instructions of
> > start_secondary()?
>
> Pavel, ping?  I'd like to try to get to the bottom of this issue soon.
>
> I asked for the gcc version and the disassembly of start_secondary()
> because I suspect gcc may have done a funky stack alignment prologue
> which copies the return address on the stack a second time after
> aligning it.

Sorry for the delay. This is on v4.11-rc1, but that should be similar.

pavel@duo:~$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2

And here's the disassembly:

c402d200 :
c402d200:	57                   	push   %edi
c402d201:	8d 7c 24 08          	lea    0x8(%esp),%edi
c402d205:	83 e4 f8             	and    $0xfffffff8,%esp
c402d208:	ff 77 fc             	pushl  -0x4(%edi)
c402d20b:	55                   	push   %ebp
c402d20c:	89 e5                	mov    %esp,%ebp
c402d20e:	57                   	push   %edi
c402d20f:	56                   	push   %esi
c402d210:	83 ec 10             	sub    $0x10,%esp
c402d213:	e8 78 78 ff ff       	call   c4024a90
c402d218:	ff 15 d0 d7 0c c5    	call   *0xc50cd7d0
c402d21e:	8b 15 00 53 05 c5    	mov    0xc5055300,%edx
c402d224:	8d 75 e8             	lea    -0x18(%ebp),%esi
c402d227:	64 a1 f4 c0 1d c5    	mov    %fs:0xc51dc0f4,%eax
c402d22d:	89 45 e8             	mov    %eax,-0x18(%ebp)
c402d230:	b8 20 00 00 00       	mov    $0x20,%eax
c402d235:	ff 52 78             	call   *0x78(%edx)
c402d238:	8b 15 00 53 05 c5    	mov    0xc5055300,%edx
c402d23e:	ff 52 4c             	call   *0x4c(%edx)
c402d241:	e8 ea 2c 00 00       	call   c402ff30
c402d246:	8b 45 e8             	mov    -0x18(%ebp),%eax
c402d249:	e8 42 fb ff ff       	call   c402cd90
c402d24e:	e8 5d 37 fd ff       	call   c40009b0
c402d253:	8b 55 e8             	mov    -0x18(%ebp),%edx
c402d256:	b8 00 c0 1d c5       	mov    $0xc51dc000,%eax
c402d25b:	8b 0d 88 d6 0b c5    	mov    0xc50bd688,%ecx
c402d261:	f6 05 fa fc 13 c5 04 	testb  $0x4,0xc513fcfa
c402d268:	8b 14 95 20 52 05 c5 	mov    -0x3afaade0(,%edx,4),%edx
c402d26f:	89 8c 10 c4 00 00 00 	mov    %ecx,0xc4(%eax,%edx,1)
c402d276:	0f 85 24 01 00 00    	jne    c402d3a0
c402d27c:	64 a1 f4 c0 1d c5    	mov    %fs:0xc51dc0f4,%eax
c402d282:	e8 49 fb ff ff       	call   c402cdd0

Let me know if I should go back to v4.10 and retry.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
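
The first six instructions of the listing above look like the stack alignment prologue Josh suspected. As a rough check, here is a minimal Python sketch that replays them on a toy 32-bit stack. It assumes start_secondary() is entered with %esp = 0xf50cdfa8 pointing at a return address into startup_32_smp(); that entry state is inferred from the addresses quoted in the thread, not confirmed by it. The function name run_prologue and the values used for RET, %edi and %ebp are hypothetical placeholders.

```python
MASK32 = 0xFFFFFFFF


def run_prologue(entry_esp, return_address, edi=0x11111111, ebp=0):
    """Replay the first six instructions of the listing on a toy stack.

    entry_esp is assumed to point at the return address on entry.
    Returns (memory, esp, ebp) after 'mov %esp,%ebp'.
    """
    mem = {entry_esp: return_address}  # slot assumed to hold the return address
    esp = entry_esp

    def push(value):
        nonlocal esp
        esp = (esp - 4) & MASK32
        mem[esp] = value

    push(edi)                      # push   %edi
    edi = (esp + 8) & MASK32       # lea    0x8(%esp),%edi
    esp &= 0xFFFFFFF8              # and    $0xfffffff8,%esp
    push(mem[(edi - 4) & MASK32])  # pushl  -0x4(%edi)  (copies the return address)
    push(ebp)                      # push   %ebp
    ebp = esp                      # mov    %esp,%ebp
    return mem, esp, ebp


RET = 0xC1000123  # hypothetical address inside startup_32_smp()
mem, esp, ebp = run_prologue(0xF50CDFA8, RET)

# Every stack slot that now holds the return address.
copies = sorted(addr for addr, value in mem.items() if value == RET)
print([hex(a) for a in copies])  # ['0xf50cdf9c', '0xf50cdfa8']
print(hex(ebp + 4))              # 0xf50cdf9c, the slot a frame-pointer unwinder reads
```

Under that assumption the return address ends up at both 0xf50cdfa8 and 0xf50cdf9c, which are the two addresses discussed earlier in the thread, and a frame-pointer unwinder following %ebp would find the copy at 0xf50cdf9c. That is consistent with the prologue duplicating the return address rather than the CPU booting twice, though the sketch alone does not prove it.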