Date: Thu, 7 Apr 2016 13:55:52 +0200
From: Petr Mladek <pmladek@suse.com>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>, Jessica Yu <jeyu@redhat.com>,
        Miroslav Benes <mbenes@suse.cz>, linux-kernel@vger.kernel.org,
        live-patching@vger.kernel.org, Vojtech Pavlik <vojtech@suse.com>
Subject: Re: [RFC PATCH v1.9 07/14] x86/stacktrace: add function for
 detecting reliable stack traces
Message-ID: <20160407115552.GB27670@pathway.suse.cz>
References: <cover.1458933243.git.jpoimboe@redhat.com>
 <1f8c648ed8b8eb49a75f5a6cacf8b7ca76f44fa9.1458933243.git.jpoimboe@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1f8c648ed8b8eb49a75f5a6cacf8b7ca76f44fa9.1458933243.git.jpoimboe@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3156
Lines: 88

On Fri 2016-03-25 14:34:54, Josh Poimboeuf wrote:
> For live patching and possibly other use cases, a stack trace is only
> useful if you can be assured that it's completely reliable.  Add a new
> save_stack_trace_tsk_reliable() function to achieve that.
> 
> Scenarios which indicate that a stack strace may be unreliable:
> 
> - interrupt stacks
> - preemption
> - corrupted stack data
> - newly forked tasks
> - running tasks
> - the user didn't provide a large enough entries array
> 
> Also add a config option so arch-independent code can determine at build
> time whether the function is implemented.
> 
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index 3b10518..9c68bfc 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -145,6 +145,42 @@ int print_context_stack_bp(struct thread_info *tinfo,
>  }
>  EXPORT_SYMBOL_GPL(print_context_stack_bp);
>  
> +int print_context_stack_reliable(struct thread_info *tinfo,
> +				 unsigned long *stack, unsigned long *bp,
> +				 const struct stacktrace_ops *ops,
> +				 void *data, unsigned long *end, int *graph)
> +{
> +	struct stack_frame *frame = (struct stack_frame *)*bp;
> +	struct stack_frame *last_frame = frame;
> +	unsigned long *ret_addr = &frame->return_address;
> +
> +	if (test_ti_thread_flag(tinfo, TIF_FORK))
> +		return -EINVAL;

Why exactly is a stack of a forked task unreliable, please?

There was some discussion about the possible stack state and the patch
state after forking, see
http://thread.gmane.org/gmane.linux.kernel/2184163/focus=2191057

Anyway, I think that the stack should be ready for use when the process
is visible in the task list. It means that it should be reliable.


> +	while (valid_stack_ptr(tinfo, ret_addr, sizeof(*ret_addr), end)) {
> +		unsigned long addr = *ret_addr;
> +
> +		if (frame <= last_frame || !__kernel_text_address(addr) ||
> +		    in_preempt_schedule_irq(addr))

I wonder how exactly this works :-)

First, __kernel_text_address() returns true also for dynamically generated
ftrace handlers, see is_ftrace_trampoline(). Do we have a guarantee
that these functions generate a valid stack frame? We might want to
ignore these here.


Second, if I get it correctly, in_preempt_schedule_irq() works because
this functions is called only for tasks that are _not_ running.
It means that we must be exactly at

     preempt_schedule_irq()
       __schedule()
	  context_switch()
	    switch_to()

It means that preempt_schedule_irq() must be on the stack if at
least one of the other functions is not inlined.

As Jiri Kosina explained to me. We check it because it is
called on exit from an interrupt handler. The interrupt might
came at any time, for example, right before a function saves
the stack frame. This is why it makes the stack unreliable.

If I get it correctly, this is the only location when the
running process might get rescheduled from irq context. Other
possibilities keeps the process running and the stack is
marked unreliable elsewhere.

Well, I wonder if we should be more suspicious and make
sure that only the regular process stack is used.

Best Regards,
Petr