Date: Fri, 29 Dec 2017 21:49:47 -0600
From: Josh Poimboeuf
To: Andy Lutomirski
Cc: Linus Torvalds, Toralf Förster, Alexander Tsoy, stable, Linux Kernel, the arch/x86 maintainers
Subject: Re: 4.14.9 doesn't boot (regression)
Message-ID: <20171230034947.6jgk5t7c7jrl6dwg@treble>
References: <33249a35-7d6a-f0f3-5a98-e6474f9366e3@gmx.de> <7A0A9B37-20FF-4B17-B4F5-D8B999269FC4@amacapital.net>
In-Reply-To: <7A0A9B37-20FF-4B17-B4F5-D8B999269FC4@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski wrote:
> (Also, Josh, the oops code should have printed the contents of the
> struct pt_regs at the top of the DF stack. Any idea why it didn't?)

Looking at one of the dumps:

[ 392.774879] NMI backtrace for cpu 0
[ 392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo #1
[ 392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 392.774882] task: ffff8802368b8000 task.stack: ffffc9000000c000
[ 392.774885] RIP: 0010:double_fault+0x0/0x30
[ 392.774886] RSP: 0000:ffffffffff527fd0 EFLAGS: 00000086
[ 392.774887] RAX: 000000003fc00000 RBX: 0000000000000001 RCX: 00000000c0000101
[ 392.774887] RDX: 00000000ffff8802 RSI: 0000000000000000 RDI: ffffffffff527f58
[ 392.774887] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 392.774888] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff816ae726
[ 392.774888] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 392.774889] FS: 0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
[ 392.774889] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 392.774890] CR2: ffffffffff526f08 CR3: 0000000235b48002 CR4: 00000000001606f0
[ 392.774892] Call Trace:
[ 392.774894] <#DF>
[ 392.774897] do_double_fault+0xb/0x140
[ 392.774898]

It should have at least printed the #DF iret frame registers, which I
recently added support for in "x86/unwinder: Handle stack overflows more
gracefully", which is in both 4.14.9 and 4.15-rc5.

I think the missing iret regs are due to a bug in show_trace_log_lvl(),
where if the unwind starts with two regs frames in a row, the second
regs don't get printed.

Alexander, would you mind reproducing again with the below patch? It
should still fail, but this time it should hopefully show another
RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 36b17e0febe8..39a320d077aa 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -103,6 +103,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 
 	unwind_start(&state, task, regs, stack);
 	stack = stack ? : get_stack_pointer(task, regs);
+	regs = unwind_get_entry_regs(&state);
 
 	/*
 	 * Iterate through the stacks, starting with the current stack pointer.
@@ -120,7 +121,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 	 * - hardirq stack
 	 * - entry stack
 	 */
-	for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+	for ( ; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
 		const char *stack_name;
 
 		if (get_stack_info(stack, task, &stack_info, &visit_mask)) {
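
For anyone reading along, here is a rough userspace sketch of why the patch touches both places: the new assignment before the loop would be dead if the loop's init clause still reset regs to NULL. This is only a toy model, not the kernel code. The names toy_state, toy_entry_regs() and toy_walk() are made up for illustration, and it does not try to reproduce the real stack scan or say which regs frame goes missing in the actual unwinder. It only shows that whatever regs the unwinder already holds when the loop starts are never seen if the init clause clears them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the unwinder state: one optional regs label per frame. */
struct toy_state {
	const char *const *regs;	/* regs label for each frame, NULL if none */
	size_t nr;			/* number of frames */
	size_t pos;			/* current frame */
};

/* Toy analogue of "give me the regs of the current frame, if any". */
static const char *toy_entry_regs(const struct toy_state *s)
{
	return s->pos < s->nr ? s->regs[s->pos] : NULL;
}

/*
 * Walk all frames and print every regs label the loop gets to see.
 * reset_in_loop_init != 0 models the old "for (regs = NULL; ...)".
 * Returns the number of regs labels printed.
 */
static int toy_walk(struct toy_state *s, int reset_in_loop_init)
{
	const char *regs;
	int printed = 0;

	assert(s != NULL);

	regs = toy_entry_regs(s);	/* models the line the patch adds */
	if (reset_in_loop_init)
		regs = NULL;		/* models the old loop init clause */

	while (s->pos < s->nr) {
		if (regs) {
			printf("  regs: %s\n", regs);
			printed++;
		}
		s->pos++;			/* advance to the next frame */
		regs = toy_entry_regs(s);	/* pick up its regs, if any */
	}
	return printed;
}
```

Calling toy_walk() on a fresh state whose first frame carries regs, once with reset_in_loop_init set to 0 and once set to 1, shows the difference: the second call never prints the regs that were already available before the first iteration.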