2016-12-10 16:18:05

by Borislav Petkov

[permalink] [raw]
Subject: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

Hey,

I see the line in $Subject on rc8+tip/master.

Fixed already?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


2016-12-10 17:04:46

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Sat, Dec 10, 2016 at 05:17:49PM +0100, Borislav Petkov wrote:
> Hey,
>
> I see the line in $Subject on rc8+tip/master.
>
> Fixed already?

That's a new one. Was there anything else printed? Were you doing
anything special when it happened? Do you see it reliably? I should
probably figure out a way to dump more data for that warning.

--
Josh

2016-12-10 17:28:17

by Borislav Petkov

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Sat, Dec 10, 2016 at 11:04:44AM -0600, Josh Poimboeuf wrote:
> That's a new one. Was there anything else printed?

It is the first line that appears in dmesg when I boot:

[ 0.000000] WARNING: kernel stack frame pointer at ffffffffb5e03f40 in swapper:0 has bad value (null)
[ 0.000000] Linux version 4.9.0-rc8+ (boris@gondor) (gcc version 6.2.0 20161109 (Debian 6.2.0-13) ) #1 SMP PREEMPT Sat Dec 10 13:25:29 CET 2016
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.9.0-rc8+ root=/dev/sda7 ro earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0 root=/dev/sda7 log_buf_len=10M resume=/dev/sda5 no_console_suspend ignore_loglevel
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
...

> Were you doing anything special when it happened?

Not really - just booting ;-)

> Do you see it reliably?

2 of 2 boots.

> I should probably figure out a way to dump more data for that warning.

Sure, I can test patches.

Thanks.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-12-12 15:45:45

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Sat, Dec 10, 2016 at 06:28:02PM +0100, Borislav Petkov wrote:
> On Sat, Dec 10, 2016 at 11:04:44AM -0600, Josh Poimboeuf wrote:
> > That's a new one. Was there anything else printed?
>
> It is the first line that appears in dmesg when I boot:
>
> [ 0.000000] WARNING: kernel stack frame pointer at ffffffffb5e03f40 in swapper:0 has bad value (null)
> [ 0.000000] Linux version 4.9.0-rc8+ (boris@gondor) (gcc version 6.2.0 20161109 (Debian 6.2.0-13) ) #1 SMP PREEMPT Sat Dec 10 13:25:29 CET 2016
> [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.9.0-rc8+ root=/dev/sda7 ro earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0 root=/dev/sda7 log_buf_len=10M resume=/dev/sda5 no_console_suspend ignore_loglevel
> [ 0.000000] KERNEL supported cpus:
> [ 0.000000] Intel GenuineIntel
> [ 0.000000] AMD AuthenticAMD
> [ 0.000000] Centaur CentaurHauls
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> ...
>
> > Were you doing anything special when it happened?
>
> Not really - just booting ;-)
>
> > Do you see it reliably?
>
> 2 of 2 boots.
>
> > I should probably figure out a way to dump more data for that warning.
>
> Sure, I can test patches.

Can you try with this?


diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index c5a7f3a..6fa75b1 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -12,7 +12,7 @@ struct unwind_state {
struct task_struct *task;
int graph_idx;
#ifdef CONFIG_FRAME_POINTER
- unsigned long *bp;
+ unsigned long *bp, *orig_sp;
struct pt_regs *regs;
#else
unsigned long *sp;
diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index ea7b7f9..1c93984 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -6,6 +6,36 @@

#define FRAME_HEADER_SIZE (sizeof(long) * 2)

+static void unwind_dump(struct unwind_state *state, unsigned long *sp)
+{
+ static bool dumped_before = false;
+ bool prev_zero, zero = false;
+ unsigned long word;
+
+ if (dumped_before)
+ return;
+
+ dumped_before = true;
+
+ printk_deferred("unwind stack type:%d next_sp:%p mask:%lx graph_idx:%d\n",
+ state->stack_info.type, state->stack_info.next_sp,
+ state->stack_mask, state->graph_idx);
+
+ for (sp = state->orig_sp; sp < state->stack_info.end; sp++) {
+ prev_zero = zero;
+ word = READ_ONCE_NOCHECK(*sp);
+ zero = word == 0;
+
+ if (zero) {
+ if (!prev_zero)
+ printk_deferred("%p: %016x ...\n", sp, 0);
+ continue;
+ }
+
+ printk_deferred("%p: %016lx (%pB)\n", sp, word, (void *)word);
+ }
+}
+
unsigned long unwind_get_return_address(struct unwind_state *state)
{
unsigned long addr;
@@ -25,6 +55,7 @@ unsigned long unwind_get_return_address(struct unwind_state *state)
"WARNING: unrecognized kernel stack return address %p at %p in %s:%d\n",
(void *)addr, addr_p, state->task->comm,
state->task->pid);
+ unwind_dump(state, addr_p);
return 0;
}

@@ -67,6 +98,7 @@ static bool update_stack_state(struct unwind_state *state, void *addr,
size_t len)
{
struct stack_info *info = &state->stack_info;
+ enum stack_type orig_type = info->type;

/*
* If addr isn't on the current stack, switch to the next one.
@@ -80,6 +112,9 @@ static bool update_stack_state(struct unwind_state *state, void *addr,
&state->stack_mask))
return false;

+ if (!state->orig_sp || info->type != orig_type)
+ state->orig_sp = addr;
+
return true;
}

@@ -178,11 +213,13 @@ bool unwind_next_frame(struct unwind_state *state)
"WARNING: kernel stack regs at %p in %s:%d has bad 'bp' value %p\n",
state->regs, state->task->comm,
state->task->pid, next_frame);
+ unwind_dump(state, (unsigned long *)state->regs);
} else {
printk_deferred_once(KERN_WARNING
"WARNING: kernel stack frame pointer at %p in %s:%d has bad value %p\n",
state->bp, state->task->comm,
state->task->pid, next_frame);
+ unwind_dump(state, state->bp);
}
the_end:
state->stack_info.type = STACK_TYPE_UNKNOWN;

2016-12-12 17:50:30

by Borislav Petkov

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 09:45:42AM -0600, Josh Poimboeuf wrote:
> Can you try with this?

That's alot of stack:

---
[ 0.000000] WARNING: kernel stack frame pointer at ffffffffbce03f40 in swapper:0 has bad value (null)
[ 0.000000] unwind stack type:0 next_sp: (null) mask:2 graph_idx:0
[ 0.000000] ffffffffbce03bc0: ffffffffbce03c40 (init_thread_union+0x3c40/0x4000)
[ 0.000000] ffffffffbce03bc8: ffffffffbc027643 (__save_stack_trace+0x73/0xd0)
[ 0.000000] ffffffffbce03bd0: 0000000000000000 ...
[ 0.000000] ffffffffbce03bd8: ffffffffbce00000 (__end_rodata+0x59000/0x59000)
[ 0.000000] ffffffffbce03be0: ffffffffbce04000 (init_thread_union+0x4000/0x4000)
[ 0.000000] ffffffffbce03be8: 0000000000000000 ...
[ 0.000000] ffffffffbce03bf0: 0000000000000002 (0x2)
[ 0.000000] ffffffffbce03bf8: ffffffffbce0e540 (root_mountflags+0x40/0x40)
[ 0.000000] ffffffffbce03c00: 0000000000000000 ...
[ 0.000000] ffffffffbce03c08: ffffffffbce03f40 (init_thread_union+0x3f40/0x4000)
[ 0.000000] ffffffffbce03c10: ffffffffbce03bc0 (init_thread_union+0x3bc0/0x4000)
[ 0.000000] ffffffffbce03c18: 0000000000000000 ...
[ 0.000000] ffffffffbce03c20: ffffffffbd7b4278 (lock_classes+0x158/0x35fe60)
[ 0.000000] ffffffffbce03c28: ffffffffbce0e540 (root_mountflags+0x40/0x40)
[ 0.000000] ffffffffbce03c30: ffffffffbcec9080 (primary_crng+0x60/0x80)
[ 0.000000] ffffffffbce03c38: 000000000000000c (0xc)
[ 0.000000] ffffffffbce03c40: ffffffffbce03c50 (init_thread_union+0x3c50/0x4000)
[ 0.000000] ffffffffbce03c48: ffffffffbc0276bb (save_stack_trace+0x1b/0x20)
[ 0.000000] ffffffffbce03c50: ffffffffbce03c68 (init_thread_union+0x3c68/0x4000)
[ 0.000000] ffffffffbce03c58: ffffffffbc0bf60b (save_trace+0x3b/0xc0)
[ 0.000000] ffffffffbce03c60: 000000000000000c (0xc)
[ 0.000000] ffffffffbce03c68: ffffffffbce03cb0 (init_thread_union+0x3cb0/0x4000)
[ 0.000000] ffffffffbce03c70: ffffffffbc0c1d78 (mark_lock+0x188/0x610)
[ 0.000000] ffffffffbce03c78: 0000000000000000 ...
[ 0.000000] ffffffffbce03c80: ffffffffbcec9080 (primary_crng+0x60/0x80)
[ 0.000000] ffffffffbce03c88: ffffffffbce0ed08 (init_task+0x7c8/0x2340)
[ 0.000000] ffffffffbce03c90: 0000000000000000 ...
[ 0.000000] ffffffffbce03c98: ffffffffbcec9080 (primary_crng+0x60/0x80)
[ 0.000000] ffffffffbce03ca0: ffffffffbce0e540 (root_mountflags+0x40/0x40)
[ 0.000000] ffffffffbce03ca8: 0000000000000001 (0x1)
[ 0.000000] ffffffffbce03cb0: ffffffffbce03d68 (init_thread_union+0x3d68/0x4000)
[ 0.000000] ffffffffbce03cb8: ffffffffbc0c2a6f (__lock_acquire+0x2ff/0x1760)
[ 0.000000] ffffffffbce03cc0: 0000000000000000 ...
[ 0.000000] ffffffffbce03cd0: 000000000000002d (0x2d)
[ 0.000000] ffffffffbce03cd8: ffffffff00000001 (0xffffffff00000001)
[ 0.000000] ffffffffbce03ce0: ffffffffbd7b4120 (classhash_table+0x8000/0x8000)
[ 0.000000] ffffffffbce03ce8: 0000000000000000 ...
[ 0.000000] ffffffffbce03cf0: ffffffffbce0ed08 (init_task+0x7c8/0x2340)
[ 0.000000] ffffffffbce03cf8: 0000000000000000 ...
[ 0.000000] ffffffffbce03d10: ffffffffbce03e0c (init_thread_union+0x3e0c/0x4000)
[ 0.000000] ffffffffbce03d18: ffffffffffffffff (0xffffffffffffffff)
[ 0.000000] ffffffffbce03d20: 00000000431fc205 (0x431fc205)
[ 0.000000] ffffffffbce03d28: ffffffffbcc881f4 (.LC9+0x10e9/0x1606)
[ 0.000000] ffffffffbce03d30: ffffffffbcc881f0 (.LC9+0x10e5/0x1606)
[ 0.000000] ffffffffbce03d38: ffffffffbce03d90 (init_thread_union+0x3d90/0x4000)
[ 0.000000] ffffffffbce03d40: 0000000000000096 (0x96)
[ 0.000000] ffffffffbce03d48: 0000000000000000 ...
[ 0.000000] ffffffffbce03d58: 0000000000000001 (0x1)
[ 0.000000] ffffffffbce03d60: 0000000000000000 ...
[ 0.000000] ffffffffbce03d68: ffffffffbce03dd8 (init_thread_union+0x3dd8/0x4000)
[ 0.000000] ffffffffbce03d70: ffffffffbc0c4537 (lock_acquire+0x107/0x210)
[ 0.000000] ffffffffbce03d78: 0000000000000000 ...
[ 0.000000] ffffffffbce03d80: ffffffffbc498ac0 (_extract_crng+0x40/0xb0)
[ 0.000000] ffffffffbce03d88: 0000000000000000 ...
[ 0.000000] ffffffffbce03d90: ffffffff00000000 (0xffffffff00000000)
[ 0.000000] ffffffffbce03d98: 0000000000000096 (0x96)
[ 0.000000] ffffffffbce03da0: 00000000bc3c8673 (0xbc3c8673)
[ 0.000000] ffffffffbce03da8: ffffffffbcec9080 (primary_crng+0x60/0x80)
[ 0.000000] ffffffffbce03db0: ffffffffbcec9068 (primary_crng+0x48/0x80)
[ 0.000000] ffffffffbce03db8: 0000000000000096 (0x96)
[ 0.000000] ffffffffbce03dc0: 0000000000000008 (0x8)
[ 0.000000] ffffffffbce03dc8: ffffffffbce03e50 (init_thread_union+0x3e50/0x4000)
[ 0.000000] ffffffffbce03dd0: ffffffff0000ffff (0xffffffff0000ffff)
[ 0.000000] ffffffffbce03dd8: ffffffffbce03e00 (init_thread_union+0x3e00/0x4000)
[ 0.000000] ffffffffbce03de0: ffffffffbc873466 (_raw_spin_lock_irqsave+0x46/0x60)
[ 0.000000] ffffffffbce03de8: ffffffffbc498ac0 (_extract_crng+0x40/0xb0)
[ 0.000000] ffffffffbce03df0: ffffffffbcec9020 (input_pool+0xc0/0xc0)
[ 0.000000] ffffffffbce03df8: ffffffffbcec9068 (primary_crng+0x48/0x80)
[ 0.000000] ffffffffbce03e00: ffffffffbce03e30 (init_thread_union+0x3e30/0x4000)
[ 0.000000] ffffffffbce03e08: ffffffffbc498ac0 (_extract_crng+0x40/0xb0)
[ 0.000000] ffffffffbce03e10: ffffffffbce03ec8 (init_thread_union+0x3ec8/0x4000)
[ 0.000000] ffffffffbce03e18: ffffffffbce03e50 (init_thread_union+0x3e50/0x4000)
[ 0.000000] ffffffffbce03e20: 0000000000000008 (0x8)
[ 0.000000] ffffffffbce03e28: ffffffffbcf67b66 (start_kernel+0x2b/0x3d9)
[ 0.000000] ffffffffbce03e30: ffffffffbce03e40 (init_thread_union+0x3e40/0x4000)
[ 0.000000] ffffffffbce03e38: ffffffffbc498b6f (extract_crng+0x3f/0x50)
[ 0.000000] ffffffffbce03e40: ffffffffbce03eb8 (init_thread_union+0x3eb8/0x4000)
[ 0.000000] ffffffffbce03e48: ffffffffbc49a6f5 (get_random_bytes+0x95/0x260)
[ 0.000000] ffffffffbce03e50: ffffffffbcf67155 (early_idt_handler_common+0x35/0x4c)
[ 0.000000] ffffffffbce03e58: ffffffff0000ffff (0xffffffff0000ffff)
[ 0.000000] ffffffffbce03e60: fffffff8ffffffff (0xfffffff8ffffffff)
[ 0.000000] ffffffffbce03e68: 0000000000055000 (0x55000)
[ 0.000000] ffffffffbce03e70: bcf68e0000107117 (0xbcf68e0000107117)
[ 0.000000] ffffffffbce03e78: ffffffffbce03f00 (init_thread_union+0x3f00/0x4000)
[ 0.000000] ffffffffbce03e80: ffffffffbcb00ef0 (_fw_end+0x4000/0x4000)
[ 0.000000] ffffffffbce03e88: 0000000000004400 (0x4400)
[ 0.000000] ffffffffbce03e90: 0000000000000000 ...
[ 0.000000] ffffffffbce03e98: bcf68e0000107117 (0xbcf68e0000107117)
[ 0.000000] ffffffffbce03ea0: bcf68e0000107117 (0xbcf68e0000107117)
[ 0.000000] ffffffffbce03ea8: 0000000000055000 (0x55000)
[ 0.000000] ffffffffbce03eb0: fffffff8ffffffff (0xfffffff8ffffffff)
[ 0.000000] ffffffffbce03eb8: ffffffffbce03ef0 (init_thread_union+0x3ef0/0x4000)
[ 0.000000] ffffffffbce03ec0: ffffffffbcf67b66 (start_kernel+0x2b/0x3d9)
[ 0.000000] ffffffffbce03ec8: 0000000000000000 ...
[ 0.000000] ffffffffbce03ed0: bcf68e0000107117 (0xbcf68e0000107117)
[ 0.000000] ffffffffbce03ed8: bcf68e0000107117 (0xbcf68e0000107117)
[ 0.000000] ffffffffbce03ee0: 0000000000055000 (0x55000)
[ 0.000000] ffffffffbce03ee8: fffffff8ffffffff (0xfffffff8ffffffff)
[ 0.000000] ffffffffbce03ef0: ffffffffbce03f00 (init_thread_union+0x3f00/0x4000)
[ 0.000000] ffffffffbce03ef8: ffffffffbcf67286 (x86_64_start_reservations+0x24/0x26)
[ 0.000000] ffffffffbce03f00: ffffffffbce03f40 (init_thread_union+0x3f40/0x4000)
[ 0.000000] ffffffffbce03f08: ffffffffbcf6740a (x86_64_start_kernel+0x182/0x193)
[ 0.000000] ffffffffbce03f10: 0000000000000000 ...
[ 0.000000] ffffffffbce03f18: ffffffffbce03f58 (init_thread_union+0x3f58/0x4000)
[ 0.000000] ffffffffbce03f20: 00000000000000f0 (0xf0)
[ 0.000000] ffffffffbce03f28: 00000000ac41fb60 (0xac41fb60)
[ 0.000000] ffffffffbce03f30: 00000000ac6c8364 (0xac6c8364)
[ 0.000000] ffffffffbce03f38: 00000000ac767740 (0xac767740)
[ 0.000000] ffffffffbce03f40: 0000000000000000 ...
[ 0.000000] ffffffffbce03f48: ffffffffbc0001b5 (start_cpu+0x5/0x14)
[ 0.000000] ffffffffbce03f50: ffffffffbc0001b5 (start_cpu+0x5/0x14)
[ 0.000000] ffffffffbce03f58: 0000000000000000 ...
[ 0.000000] Linux version 4.9.0-rc8+ (root@gondor) (gcc version 6.2.0 20161109 (Debian 6.2.0-13) ) #2 SMP PREEMPT Mon Dec 12 18:36:48 CET 2016
...

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-12-12 18:10:28

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 06:50:23PM +0100, Borislav Petkov wrote:
> [ 0.000000] ffffffffbce03f40: 0000000000000000 ...
> [ 0.000000] ffffffffbce03f48: ffffffffbc0001b5 (start_cpu+0x5/0x14)
> [ 0.000000] ffffffffbce03f50: ffffffffbc0001b5 (start_cpu+0x5/0x14)
> [ 0.000000] ffffffffbce03f58: 0000000000000000 ...
> [ 0.000000] Linux version 4.9.0-rc8+ (root@gondor) (gcc version 6.2.0 20161109 (Debian 6.2.0-13) ) #2 SMP PREEMPT Mon Dec 12 18:36:48 CET 2016

Something funky is going on in the head code. It looks like start_cpu
is getting executed twice somehow. Need to do some more head
scratching...

--
Josh

2016-12-12 21:16:30

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 12:10:25PM -0600, Josh Poimboeuf wrote:
> On Mon, Dec 12, 2016 at 06:50:23PM +0100, Borislav Petkov wrote:
> > [ 0.000000] ffffffffbce03f40: 0000000000000000 ...
> > [ 0.000000] ffffffffbce03f48: ffffffffbc0001b5 (start_cpu+0x5/0x14)
> > [ 0.000000] ffffffffbce03f50: ffffffffbc0001b5 (start_cpu+0x5/0x14)
> > [ 0.000000] ffffffffbce03f58: 0000000000000000 ...
> > [ 0.000000] Linux version 4.9.0-rc8+ (root@gondor) (gcc version 6.2.0 20161109 (Debian 6.2.0-13) ) #2 SMP PREEMPT Mon Dec 12 18:36:48 CET 2016
>
> Something funky is going on in the head code. It looks like start_cpu
> is getting executed twice somehow. Need to do some more head
> scratching...

I still can't figure out what could cause this, nor can I recreate it.

Andy, any idea? I'm trying to figure out why a stack trace of the
initial task, early in start_kernel(), would show start_cpu() on the
stack *twice*. The start_cpu() entry on the stack at ffffffffbce03f50
is right where it's supposed to be. But then there's another
start_cpu() entry at 0xffffffffbce03f48 which is pointed to by the frame
pointer chain. I can't figure out where that one came from and why the
stack is offset by a word, compared to all the other idle task stacks
I've seen.

Boris, what kind of CPU is it?

--
Josh

2016-12-12 21:34:53

by Borislav Petkov

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 03:16:27PM -0600, Josh Poimboeuf wrote:
> I still can't figure out what could cause this, nor can I recreate it.

Want my .config?

> Andy, any idea? I'm trying to figure out why a stack trace of the
> initial task, early in start_kernel(), would show start_cpu() on the
> stack *twice*. The start_cpu() entry on the stack at ffffffffbce03f50
> is right where it's supposed to be. But then there's another
> start_cpu() entry at 0xffffffffbce03f48 which is pointed to by the frame
> pointer chain. I can't figure out where that one came from and why the
> stack is offset by a word, compared to all the other idle task stacks
> I've seen.

Btw, why do you have:

call 1f # put return address on stack for unwinder

there in start_cpu() instead of

push $start_cpu

or so? That CALL looks strange there. If you want to put the return
address, just push start_cpu's address and that's it.

Or am I missing something?

> Boris, what kind of CPU is it?

[ 0.169371] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz (family: 0x6, model: 0x2d, stepping: 0x7)

Sandybridge. The model 0x2d corresponds to that INTEL_FAM6_SANDYBRIDGE_X
define in intel-family.h.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-12-12 22:11:50

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 10:34:46PM +0100, Borislav Petkov wrote:
> On Mon, Dec 12, 2016 at 03:16:27PM -0600, Josh Poimboeuf wrote:
> > I still can't figure out what could cause this, nor can I recreate it.
>
> Want my .config?

Yes, please.

> > Andy, any idea? I'm trying to figure out why a stack trace of the
> > initial task, early in start_kernel(), would show start_cpu() on the
> > stack *twice*. The start_cpu() entry on the stack at ffffffffbce03f50
> > is right where it's supposed to be. But then there's another
> > start_cpu() entry at 0xffffffffbce03f48 which is pointed to by the frame
> > pointer chain. I can't figure out where that one came from and why the
> > stack is offset by a word, compared to all the other idle task stacks
> > I've seen.
>
> Btw, why do you have:
>
> call 1f # put return address on stack for unwinder
>
> there in start_cpu() instead of
>
> push $start_cpu
>
> or so? That CALL looks strange there. If you want to put the return
> address, just push start_cpu's address and that's it.
>
> Or am I missing something?

Yeah, it's kind of obtuse.

The problem with "push $start_cpu" is that it will show up on the stack
trace as:

secondary_startup_64+0x90/0x90

instead of what you would expect:

start_cpu+0x0/0x14

That's because the printk '%pB' modifier is smart enough to know that
the beginning of a function isn't a valid function call return address.
The only way such an address could end up on the stack would be if the
previous function made a tail call. So it shows the end of the previous
function instead.

That said, the code could probably be made a little clearer by changing
"call 1f" to "push $1f" and then move the '1' label to after the lretq
instruction, like:

pushq $1f # put return address on stack for unwinder
xorq %rbp, %rbp # clear frame pointer
movq initial_code(%rip), %rax
pushq $__KERNEL_CS # set correct cs
pushq %rax # target address in negative space
lretq
1:
ENDPROC(start_cpu)

That shows:

start_cpu+0x14/0x14

Which is more accurate anyway. I'll make a patch.

--
Josh

2016-12-12 22:34:06

by Borislav Petkov

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
> Yes, please.

Attached.

> That said, the code could probably be made a little clearer by changing
> "call 1f" to "push $1f" and then move the '1' label to after the lretq
> instruction, like:
>
> pushq $1f # put return address on stack for unwinder
> xorq %rbp, %rbp # clear frame pointer
> movq initial_code(%rip), %rax
> pushq $__KERNEL_CS # set correct cs
> pushq %rax # target address in negative space
> lretq
> 1:
> ENDPROC(start_cpu)
>
> That shows:
>
> start_cpu+0x14/0x14
>
> Which is more accurate anyway. I'll make a patch.

Yap, makes more sense. I'll run it tomorrow.

FWIW, I tried

pushq $1f

before without moving the 1: label but the thing still fired.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Attachments:
(No filename) (855.00 B)
config.gz (25.40 kB)
Download all attachments

2016-12-12 23:05:15

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote:
> On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
> > Yes, please.
>
> Attached.

Thanks, I was able to recreate. Will take a look tomorrow.

--
Josh

2016-12-13 14:35:15

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Mon, Dec 12, 2016 at 05:05:11PM -0600, Josh Poimboeuf wrote:
> On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote:
> > On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
> > > Yes, please.
> >
> > Attached.
>
> Thanks, I was able to recreate. Will take a look tomorrow.

Figured it out. Your config has CONFIG_PARAVIRT=n, which convinces gcc
to create the following preamble for x86_64_start_kernel():

0000000000000124 <x86_64_start_kernel>:
124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
12d: 41 ff 72 f8 pushq -0x8(%r10)
131: 55 push %rbp
132: 48 89 e5 mov %rsp,%rbp

It's an unusual pattern which aligns rsp (though in this case it's
already aligned) and saves the start_cpu() return address again on the
stack before storing the frame pointer.

The unwinder assumes the last stack frame header is at a certain offset,
but the above code breaks that assumption. I still need to think about
the best way to fix it.

--
Josh

2016-12-13 16:56:19

by Andy Lutomirski

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Tue, Dec 13, 2016 at 6:34 AM, Josh Poimboeuf <[email protected]> wrote:
> On Mon, Dec 12, 2016 at 05:05:11PM -0600, Josh Poimboeuf wrote:
>> On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote:
>> > On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
>> > > Yes, please.
>> >
>> > Attached.
>>
>> Thanks, I was able to recreate. Will take a look tomorrow.
>
> Figured it out. Your config has CONFIG_PARAVIRT=n, which convinces gcc
> to create the following preamble for x86_64_start_kernel():
>
> 0000000000000124 <x86_64_start_kernel>:
> 124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
> 129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
> 12d: 41 ff 72 f8 pushq -0x8(%r10)
> 131: 55 push %rbp
> 132: 48 89 e5 mov %rsp,%rbp
>
> It's an unusual pattern which aligns rsp (though in this case it's
> already aligned) and saves the start_cpu() return address again on the
> stack before storing the frame pointer.

Um, what? I can reproduce it -- I get:

0000000000000124 <x86_64_start_kernel>:
124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
12d: 41 ff 72 f8 pushq -0x8(%r10)
131: 55 push %rbp
132: 48 89 e5 mov %rsp,%rbp
135: 41 57 push %r15
137: 41 56 push %r14
139: 41 55 push %r13
13b: 41 54 push %r12
13d: 41 52 push %r10
13f: 53 push %rbx
140: 48 83 ec 10 sub $0x10,%rsp

...

and the epilog looks like:

29c: 58 pop %rax
29d: 5a pop %rdx
29e: 5b pop %rbx
29f: 41 5a pop %r10
2a1: 41 5c pop %r12
2a3: 41 5d pop %r13
2a5: 41 5e pop %r14
2a7: 41 5f pop %r15
2a9: 5d pop %rbp
2aa: 49 8d 62 f8 lea -0x8(%r10),%rsp
2ae: c3 retq

This is, I think, *terrible* code. It makes it entirely impossible
for the CPU to look through the retq until the instruction right
before it finishes because there's no way the CPU can tell what rsp is
until the instruction right before the retq. And it's saving and
restoring an entire extra register (r10) instead of just using rbp for
this purpose. *And* the extra copy of the return address seems
totally useless except for unwinding.

This does indeed depend on CONFIG_PARAVIRT, but I'm not seeing what
changes. Presumably something related to what happens in the
function?

I want to file a GCC bug, though. This code sucks.

>
> The unwinder assumes the last stack frame header is at a certain offset,
> but the above code breaks that assumption. I still need to think about
> the best way to fix it.

Have a dummy written-in-asm top-of-the-stack function? Or recognize
the end by the final saved RBP?

--Andy

2016-12-13 17:20:44

by Borislav Petkov

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Tue, Dec 13, 2016 at 08:55:53AM -0800, Andy Lutomirski wrote:
> I want to file a GCC bug, though. This code sucks.

Looks like it started doing that with gcc-5. gcc-4.9 output looks fine here:

.globl x86_64_start_kernel
.type x86_64_start_kernel, @function
x86_64_start_kernel:
.LFB2013:
.loc 1 134 0
.cfi_startproc
pushq %rbp #
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp #,
.cfi_def_cfa_register 6
pushq %rbx #
.cfi_offset 3, -24
.loc 1 134 0
movq %rdi, %rbx # real_mode_data, real_mode_data
.LBB157:
.LBB158:
.LBB159:
.LBB160:
.LBB161:
.loc 3 69 0
#APP
# 69 "./arch/x86/include/asm/special_insns.h" 1
mov %cr4,%rax # val

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-12-13 17:27:36

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

On Tue, Dec 13, 2016 at 08:55:53AM -0800, Andy Lutomirski wrote:
> On Tue, Dec 13, 2016 at 6:34 AM, Josh Poimboeuf <[email protected]> wrote:
> > On Mon, Dec 12, 2016 at 05:05:11PM -0600, Josh Poimboeuf wrote:
> >> On Mon, Dec 12, 2016 at 11:33:54PM +0100, Borislav Petkov wrote:
> >> > On Mon, Dec 12, 2016 at 04:11:47PM -0600, Josh Poimboeuf wrote:
> >> > > Yes, please.
> >> >
> >> > Attached.
> >>
> >> Thanks, I was able to recreate. Will take a look tomorrow.
> >
> > Figured it out. Your config has CONFIG_PARAVIRT=n, which convinces gcc
> > to create the following preamble for x86_64_start_kernel():
> >
> > 0000000000000124 <x86_64_start_kernel>:
> > 124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
> > 129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
> > 12d: 41 ff 72 f8 pushq -0x8(%r10)
> > 131: 55 push %rbp
> > 132: 48 89 e5 mov %rsp,%rbp
> >
> > It's an unusual pattern which aligns rsp (though in this case it's
> > already aligned) and saves the start_cpu() return address again on the
> > stack before storing the frame pointer.
>
> Um, what? I can reproduce it -- I get:
>
> 0000000000000124 <x86_64_start_kernel>:
> 124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
> 129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
> 12d: 41 ff 72 f8 pushq -0x8(%r10)
> 131: 55 push %rbp
> 132: 48 89 e5 mov %rsp,%rbp
> 135: 41 57 push %r15
> 137: 41 56 push %r14
> 139: 41 55 push %r13
> 13b: 41 54 push %r12
> 13d: 41 52 push %r10
> 13f: 53 push %rbx
> 140: 48 83 ec 10 sub $0x10,%rsp
>
> ...
>
> and the epilog looks like:
>
> 29c: 58 pop %rax
> 29d: 5a pop %rdx
> 29e: 5b pop %rbx
> 29f: 41 5a pop %r10
> 2a1: 41 5c pop %r12
> 2a3: 41 5d pop %r13
> 2a5: 41 5e pop %r14
> 2a7: 41 5f pop %r15
> 2a9: 5d pop %rbp
> 2aa: 49 8d 62 f8 lea -0x8(%r10),%rsp
> 2ae: c3 retq
>
> This is, I think, *terrible* code. It makes it entirely impossible
> for the CPU to look through the retq until the instruction right
> before it finishes because there's no way the CPU can tell what rsp is
> until the instruction right before the retq. And it's saving and
> restoring an entire extra register (r10) instead of just using rbp for
> this purpose. *And* the extra copy of the return address seems
> totally useless except for unwinding.
>
> This does indeed depend on CONFIG_PARAVIRT, but I'm not seeing what
> changes. Presumably something related to what happens in the
> function?
>
> I want to file a GCC bug, though. This code sucks.

Yeah, I can't figure out why it's doing that. I've seen it align the
stack before, but it was always *after* setting up the frame pointer and
pushing the registers on the stack. I have no idea what triggered it to
do it this way, but it would interesting to know if we can turn it off
somehow.

> > The unwinder assumes the last stack frame header is at a certain offset,
> > but the above code breaks that assumption. I still need to think about
> > the best way to fix it.
>
> Have a dummy written-in-asm top-of-the-stack function? Or recognize
> the end by the final saved RBP?

Assuming this issue could theoretically show up in *any* function called
by entry or head code, I think the least pervasive change would be to
just adjust the "last frame" check in the unwinder, aka
is_last_task_frame()), to check at the "aligned" off-by-one-word offset
in addition to the normal offset.

--
Josh