Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756541AbcDGOq7 (ORCPT ); Thu, 7 Apr 2016 10:46:59 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60751 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756076AbcDGOq6 (ORCPT ); Thu, 7 Apr 2016 10:46:58 -0400 Date: Thu, 7 Apr 2016 09:46:55 -0500 From: Josh Poimboeuf To: Petr Mladek Cc: Jiri Kosina , Jessica Yu , Miroslav Benes , linux-kernel@vger.kernel.org, live-patching@vger.kernel.org, Vojtech Pavlik Subject: Re: [RFC PATCH v1.9 07/14] x86/stacktrace: add function for detecting reliable stack traces Message-ID: <20160407144655.kf7ujfctdlvnwx6s@treble.redhat.com> References: <1f8c648ed8b8eb49a75f5a6cacf8b7ca76f44fa9.1458933243.git.jpoimboe@redhat.com> <20160407115552.GB27670@pathway.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160407115552.GB27670@pathway.suse.cz> User-Agent: Mutt/1.5.23.1 (2014-03-12) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 07 Apr 2016 14:46:57 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4060 Lines: 102 On Thu, Apr 07, 2016 at 01:55:52PM +0200, Petr Mladek wrote: > On Fri 2016-03-25 14:34:54, Josh Poimboeuf wrote: > > For live patching and possibly other use cases, a stack trace is only > > useful if you can be assured that it's completely reliable. Add a new > > save_stack_trace_tsk_reliable() function to achieve that. > > > > Scenarios which indicate that a stack strace may be unreliable: > > > > - interrupt stacks > > - preemption > > - corrupted stack data > > - newly forked tasks > > - running tasks > > - the user didn't provide a large enough entries array > > > > Also add a config option so arch-independent code can determine at build > > time whether the function is implemented. > > > > diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c > > index 3b10518..9c68bfc 100644 > > --- a/arch/x86/kernel/dumpstack.c > > +++ b/arch/x86/kernel/dumpstack.c > > @@ -145,6 +145,42 @@ int print_context_stack_bp(struct thread_info *tinfo, > > } > > EXPORT_SYMBOL_GPL(print_context_stack_bp); > > > > +int print_context_stack_reliable(struct thread_info *tinfo, > > + unsigned long *stack, unsigned long *bp, > > + const struct stacktrace_ops *ops, > > + void *data, unsigned long *end, int *graph) > > +{ > > + struct stack_frame *frame = (struct stack_frame *)*bp; > > + struct stack_frame *last_frame = frame; > > + unsigned long *ret_addr = &frame->return_address; > > + > > + if (test_ti_thread_flag(tinfo, TIF_FORK)) > > + return -EINVAL; > > Why exactly is a stack of a forked task unreliable, please? > > There was some discussion about the possible stack state and the patch > state after forking, see > http://thread.gmane.org/gmane.linux.kernel/2184163/focus=2191057 > > Anyway, I think that the stack should be ready for use when the process > is visible in the task list. It means that it should be reliable. To be honest I don't remember exactly why I added this check. I think it's related to the fact that newly forked tasks don't yet have a stack. I should have definitely added a comment. I'll need to revisit it. > > + while (valid_stack_ptr(tinfo, ret_addr, sizeof(*ret_addr), end)) { > > + unsigned long addr = *ret_addr; > > + > > + if (frame <= last_frame || !__kernel_text_address(addr) || > > + in_preempt_schedule_irq(addr)) > > I wonder how exactly this works :-) > > First, __kernel_text_address() returns true also for dynamically generated > ftrace handlers, see is_ftrace_trampoline(). Do we have a guarantee > that these functions generate a valid stack frame? We might want to > ignore these here. This is a good point. I think the ftrace code does the right thing. But we don't necessarily have a way to guarantee that. Maybe we should consider it unreliable. > Second, if I get it correctly, in_preempt_schedule_irq() works because > this functions is called only for tasks that are _not_ running. > It means that we must be exactly at > > preempt_schedule_irq() > __schedule() > context_switch() > switch_to() > > It means that preempt_schedule_irq() must be on the stack if at > least one of the other functions is not inlined. > > As Jiri Kosina explained to me. We check it because it is > called on exit from an interrupt handler. The interrupt might > came at any time, for example, right before a function saves > the stack frame. This is why it makes the stack unreliable. > > If I get it correctly, this is the only location when the > running process might get rescheduled from irq context. Other > possibilities keeps the process running and the stack is > marked unreliable elsewhere. Right. I guess that needs a better comment too. > Well, I wonder if we should be more suspicious and make > sure that only the regular process stack is used. Notice the save_stack_stack_reliable() function, which is called by dump_trace() when the task is running on an interrupt or exception stack. It returns -EINVAL, so the stack gets marked unreliable. Does that address your concern, or did you mean something else? -- Josh