Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935645AbcLMR1g (ORCPT ); Tue, 13 Dec 2016 12:27:36 -0500 Received: from mx1.redhat.com ([209.132.183.28]:33754 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934837AbcLMR11 (ORCPT ); Tue, 13 Dec 2016 12:27:27 -0500 Date: Tue, 13 Dec 2016 11:26:57 -0600 From: Josh Poimboeuf To: Andy Lutomirski Cc: Borislav Petkov , x86-ml , lkml , Andy Lutomirski Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null) Message-ID: <20161213172657.e4rhehdmmup3amii@treble> References: <20161212154542.ul6cnzjtkbvooluh@treble> <20161212175023.455mlwhmnqfew2nu@pd.tnic> <20161212181025.uz3gk3jasuskqfmf@treble> <20161212211627.yuodrw35xwq3hmn7@treble> <20161212213446.imd3hpt3nudomu7r@pd.tnic> <20161212221147.twp2nlcolokzq4a4@treble> <20161212223354.qpkut2kc254dfmxi@pd.tnic> <20161212230511.22dusz7bwrvmtjd3@treble> <20161213143455.kj5siigmad4f3xim@treble> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0.1 (2016-04-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 13 Dec 2016 17:26:59 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3950 Lines: 92 On Tue, Dec 13, 2016 at 08:55:53AM -0800, Andy Lutomirski wrote: > On Tue, Dec 13, 2016 at 6:34 AM, Josh Poimboeuf wrote: > > On Mon, Dec 12, 2016 at 05:05:11PM -0600, Josh Poimboeuf wrote: > >> On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote: > >> > On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote: > >> > > Yes, please. > >> > > >> > Attached. > >> > >> Thanks, I was able to recreate. Will take a look tomorrow. > > > > Figured it out. Your config has CONFIG_PARAVIRT=n, which convinces gcc > > to create the following preamble for x86_64_start_kernel(): > > > > 0000000000000124 : > > 124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10 > > 129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > > 12d: 41 ff 72 f8 pushq -0x8(%r10) > > 131: 55 push %rbp > > 132: 48 89 e5 mov %rsp,%rbp > > > > It's an unusual pattern which aligns rsp (though in this case it's > > already aligned) and saves the start_cpu() return address again on the > > stack before storing the frame pointer. > > Um, what? I can reproduce it -- I get: > > 0000000000000124 : > 124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10 > 129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > 12d: 41 ff 72 f8 pushq -0x8(%r10) > 131: 55 push %rbp > 132: 48 89 e5 mov %rsp,%rbp > 135: 41 57 push %r15 > 137: 41 56 push %r14 > 139: 41 55 push %r13 > 13b: 41 54 push %r12 > 13d: 41 52 push %r10 > 13f: 53 push %rbx > 140: 48 83 ec 10 sub $0x10,%rsp > > ... > > and the epilog looks like: > > 29c: 58 pop %rax > 29d: 5a pop %rdx > 29e: 5b pop %rbx > 29f: 41 5a pop %r10 > 2a1: 41 5c pop %r12 > 2a3: 41 5d pop %r13 > 2a5: 41 5e pop %r14 > 2a7: 41 5f pop %r15 > 2a9: 5d pop %rbp > 2aa: 49 8d 62 f8 lea -0x8(%r10),%rsp > 2ae: c3 retq > > This is, I think, *terrible* code. It makes it entirely impossible > for the CPU to look through the retq until the instruction right > before it finishes because there's no way the CPU can tell what rsp is > until the instruction right before the retq. And it's saving and > restoring an entire extra register (r10) instead of just using rbp for > this purpose. *And* the extra copy of the return address seems > totally useless except for unwinding. > > This does indeed depend on CONFIG_PARAVIRT, but I'm not seeing what > changes. Presumably something related to what happens in the > function? > > I want to file a GCC bug, though. This code sucks. Yeah, I can't figure out why it's doing that. I've seen it align the stack before, but it was always *after* setting up the frame pointer and pushing the registers on the stack. I have no idea what triggered it to do it this way, but it would interesting to know if we can turn it off somehow. > > The unwinder assumes the last stack frame header is at a certain offset, > > but the above code breaks that assumption. I still need to think about > > the best way to fix it. > > Have a dummy written-in-asm top-of-the-stack function? Or recognize > the end by the final saved RBP? Assuming this issue could theoretically show up in *any* function called by entry or head code, I think the least pervasive change would be to just adjust the "last frame" check in the unwinder, aka is_last_task_frame()), to check at the "aligned" off-by-one-word offset in addition to the normal offset. -- Josh