Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752717AbdCFHCq (ORCPT ); Mon, 6 Mar 2017 02:02:46 -0500 Received: from mail-ua0-f196.google.com ([209.85.217.196]:35875 "EHLO mail-ua0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752703AbdCFHBl (ORCPT ); Mon, 6 Mar 2017 02:01:41 -0500 MIME-Version: 1.0 X-Originating-IP: [118.200.139.147] In-Reply-To: <20170227154727.lbfvtizqzzybekjz@treble> References: <20170227154727.lbfvtizqzzybekjz@treble> From: Daniel J Blueman Date: Mon, 6 Mar 2017 14:52:01 +0800 Message-ID: Subject: Re: stack frame unwindind KASAN errors To: Josh Poimboeuf Cc: Linux Kernel , Ingo Molnar Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5878 Lines: 153 On 27 February 2017 at 23:47, Josh Poimboeuf wrote: > On Mon, Feb 27, 2017 at 12:49:59PM +0800, Daniel J Blueman wrote: >> On 4.9.13 with KASAN enabled [1], we see a number of stack unwinding >> errors reported [2,3]. >> >> This seems to occur at half of boots. >> >> Let me know for further debug info/patch testing and thanks, >> Daniel >> >> [1] https://quora.org/config >> [2] https://quora.org/dmesg.txt > > Hi Daniel, > > Can you try the following patch? It's a backport of the following > upstream commit: > > 09ae68dd0a8d ("x86/unwind: Disable KASAN checks for non-current tasks") > > If it fixes it then I'll submit it for 4.9 stable. > > --- > > From: Josh Poimboeuf > Subject: [PATCH] x86/unwind: Disable KASAN checks for non-current tasks > > There are a handful of callers to save_stack_trace_tsk() and > show_stack() which try to unwind the stack of a task other than current. > In such cases, it's remotely possible that the task is running on one > CPU while the unwinder is reading its stack from another CPU, causing > the unwinder to see stack corruption. > > These cases seem to be mostly harmless. The unwinder has checks which > prevent it from following bad pointers beyond the bounds of the stack. > So it's not really a bug as long as the caller understands that > unwinding another task will not always succeed. > > In such cases, it's possible that the unwinder may read a KASAN-poisoned > region of the stack. Account for that by using READ_ONCE_NOCHECK() when > reading the stack of another task. > > Use READ_ONCE() when reading the stack of the current task, since KASAN > warnings can still be useful for finding bugs in that case. > > Reported-by: Dmitry Vyukov > Signed-off-by: Josh Poimboeuf > Cc: Andy Lutomirski > Cc: Andy Lutomirski > Cc: Borislav Petkov > Cc: Brian Gerst > Cc: Dave Jones > Cc: Denys Vlasenko > Cc: H. Peter Anvin > Cc: Linus Torvalds > Cc: Miroslav Benes > Cc: Peter Zijlstra > Cc: Thomas Gleixner > Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com > Signed-off-by: Ingo Molnar > --- > arch/x86/include/asm/stacktrace.h | 5 ++++- > arch/x86/kernel/unwind_frame.c | 20 ++++++++++++++++++-- > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h > index 37f2e0b..4141ead 100644 > --- a/arch/x86/include/asm/stacktrace.h > +++ b/arch/x86/include/asm/stacktrace.h > @@ -55,13 +55,16 @@ extern int kstack_depth_to_print; > static inline unsigned long * > get_frame_pointer(struct task_struct *task, struct pt_regs *regs) > { > + struct inactive_task_frame *frame; > + > if (regs) > return (unsigned long *)regs->bp; > > if (task == current) > return __builtin_frame_address(0); > > - return (unsigned long *)((struct inactive_task_frame *)task->thread.sp)->bp; > + frame = (struct inactive_task_frame *)task->thread.sp; > + return (unsigned long *)READ_ONCE_NOCHECK(frame->bp); > } > #else > static inline unsigned long * > diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c > index a2456d4..caff129 100644 > --- a/arch/x86/kernel/unwind_frame.c > +++ b/arch/x86/kernel/unwind_frame.c > @@ -6,6 +6,21 @@ > > #define FRAME_HEADER_SIZE (sizeof(long) * 2) > > +/* > + * This disables KASAN checking when reading a value from another task's stack, > + * since the other task could be running on another CPU and could have poisoned > + * the stack in the meantime. > + */ > +#define READ_ONCE_TASK_STACK(task, x) \ > +({ \ > + unsigned long val; \ > + if (task == current) \ > + val = READ_ONCE(x); \ > + else \ > + val = READ_ONCE_NOCHECK(x); \ > + val; \ > +}) > + > unsigned long unwind_get_return_address(struct unwind_state *state) > { > unsigned long addr; > @@ -14,7 +29,8 @@ unsigned long unwind_get_return_address(struct unwind_state *state) > if (unwind_done(state)) > return 0; > > - addr = ftrace_graph_ret_addr(state->task, &state->graph_idx, *addr_p, > + addr = READ_ONCE_TASK_STACK(state->task, *addr_p); > + addr = ftrace_graph_ret_addr(state->task, &state->graph_idx, addr, > addr_p); > > return __kernel_text_address(addr) ? addr : 0; > @@ -48,7 +64,7 @@ bool unwind_next_frame(struct unwind_state *state) > if (unwind_done(state)) > return false; > > - next_bp = (unsigned long *)*state->bp; > + next_bp = (unsigned long *)READ_ONCE_TASK_STACK(state->task, *state->bp); > > /* make sure the next frame's data is accessible */ > if (!update_stack_state(state, next_bp, FRAME_HEADER_SIZE)) Thanks Josh! With this patch, the KASAN warning still occurs, but at unwind_get_return_address+0x1d3/0x130 instead; the rest of the trace is identical. (gdb) list *(unwind_get_return_address+0x1d3) 0xffffffff8112bca3 is in unwind_get_return_address (./include/linux/compiler.h:243). 238 }) 239 240 static __always_inline 241 void __read_once_size(const volatile void *p, void *res, int size) 242 { 243 __READ_ONCE_SIZE; Thanks, Daniel -- Daniel J Blueman