Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761458AbcLPNIN (ORCPT ); Fri, 16 Dec 2016 08:08:13 -0500 Received: from mx2.suse.de ([195.135.220.15]:37971 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757541AbcLPNHq (ORCPT ); Fri, 16 Dec 2016 08:07:46 -0500 Date: Fri, 16 Dec 2016 14:07:39 +0100 From: Petr Mladek To: Josh Poimboeuf Cc: Jessica Yu , Jiri Kosina , Miroslav Benes , linux-kernel@vger.kernel.org, live-patching@vger.kernel.org, Michael Ellerman , Heiko Carstens , x86@kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, Vojtech Pavlik , Jiri Slaby , Chris J Arges , Andy Lutomirski , Ingo Molnar , Peter Zijlstra Subject: Re: [PATCH v3 01/15] stacktrace/x86: add function for detecting reliable stack traces Message-ID: <20161216130739.GB393@pathway.suse.cz> References: <0315b36c08c104d56a4b43537fb300d200418996.1481220077.git.jpoimboe@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0315b36c08c104d56a4b43537fb300d200418996.1481220077.git.jpoimboe@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3059 Lines: 104 On Thu 2016-12-08 12:08:26, Josh Poimboeuf wrote: > For live patching and possibly other use cases, a stack trace is only > useful if it can be assured that it's completely reliable. Add a new > save_stack_trace_tsk_reliable() function to achieve that. > > Scenarios which indicate that a stack trace may be unreliable: > > - running task It seems that this has to be enforced by save_stack_trace_tsk_reliable() caller. It should be mentioned in the function description. > - interrupt stack I guess that it is detected by saved regs on the stack. And it covers also dynamic changes like kprobes. Do I get it correctly, please? What about ftrace? Is ftrace without regs safe and detected? > - preemption I wonder if some very active kthreads might almost always be preempted using irq in preemptive kernel. Then they block the conversion with the non-reliable stacks. Have you noticed such problems, please? > - corrupted stack data > - stack grows the wrong way This is detected in unwind_next_frame() and passed via state->error. Am I right? > - stack walk doesn't reach the bottom > - user didn't provide a large enough entries array > > Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can > determine at build time whether the function is implemented. > > diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c > index 0653788..3e0cf5e 100644 > --- a/arch/x86/kernel/stacktrace.c > +++ b/arch/x86/kernel/stacktrace.c > @@ -74,6 +74,64 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace) > } > EXPORT_SYMBOL_GPL(save_stack_trace_tsk); > > +#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE > +static int __save_stack_trace_reliable(struct stack_trace *trace, > + struct task_struct *task) > +{ > + struct unwind_state state; > + struct pt_regs *regs; > + unsigned long addr; > + > + for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state); > + unwind_next_frame(&state)) { > + > + regs = unwind_get_entry_regs(&state); > + if (regs) { > + /* > + * Preemption and page faults on the stack can make > + * frame pointers unreliable. > + */ > + if (!user_mode(regs)) > + return -1; By other words, it we find regs on the stack, it almost always mean a non-reliable stack. The only exception is when we are in the userspace mode. Do I get it correctly, please? > + > + /* > + * This frame contains the (user mode) pt_regs at the > + * end of the stack. Finish the unwind. > + */ > + unwind_next_frame(&state); > + break; > + } > + > + addr = unwind_get_return_address(&state); > + if (!addr || save_stack_address(trace, addr, false)) > + return -1; > + } > + > + if (!unwind_done(&state) || unwind_error(&state)) > + return -1; > + > + if (trace->nr_entries < trace->max_entries) > + trace->entries[trace->nr_entries++] = ULONG_MAX; > + > + return 0; > +} Great work! I am surprised that it looks so straightforward. I still have to think and investigate it more. But it looks very promissing. Best Regards, Petr