Date: Tue, 29 Nov 2016 11:17:25 -0600
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
        Vince Weaver <vincent.weaver@maine.edu>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Ingo Molnar <mingo@redhat.com>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        "dvyukov@google.com" <dvyukov@google.com>, pmladek@suse.com
Subject: Re: perf: fuzzer BUG: KASAN: stack-out-of-bounds in __unwind_start
Message-ID: <20161129171725.dql7evlzqiit63a3@treble>
References: <alpine.DEB.2.20.1611241229180.25241@macbook-air>
 <20161128215411.fkis7bbimjy4v4j7@treble>
 <20161129004021.GL3924@linux.vnet.ibm.com>
 <20161129055241.6dy2dt4q4ptazk2s@treble>
 <20161129091650.GA3092@twins.programming.kicks-ass.net>
 <20161129140734.GQ3924@linux.vnet.ibm.com>
 <20161129150917.tk5xkl7teveybaxa@treble>
 <20161129165152.GV3924@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20161129165152.GV3924@linux.vnet.ibm.com>
User-Agent: Mutt/1.6.0.1 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5674
Lines: 139

On Tue, Nov 29, 2016 at 08:51:52AM -0800, Paul E. McKenney wrote:
> On Tue, Nov 29, 2016 at 09:09:17AM -0600, Josh Poimboeuf wrote:
> > On Tue, Nov 29, 2016 at 06:07:34AM -0800, Paul E. McKenney wrote:
> > > On Tue, Nov 29, 2016 at 10:16:50AM +0100, Peter Zijlstra wrote:
> > > > On Mon, Nov 28, 2016 at 11:52:41PM -0600, Josh Poimboeuf wrote:
> > > > > > We used to do that, but the resulting NMIs were problematic on some
> > > > > > platforms.  Perhaps things have gotten better?
> > > > > 
> > > > > Did a little digging on git blame and found the following commit (which
> > > > > seems to be the cause of the KASAN warning and missing stack dump):
> > > > > 
> > > > >   bc1dce514e9b ("rcu: Don't use NMIs to dump other CPUs' stacks")
> > > > > 
> > > > > I presume this commit is still needed because of the NMI printk deadlock
> > > > > issues which were discussed at Kernel Summit.  I guess those issues need
> > > > > to be sorted out before the above commit can be reverted.
> > > > 
> > > > so printk should more or less work from NMI, esp. after:
> > > > 
> > > >   42a0bb3f7138 ("printk/nmi: generic solution for safe printk in NMI")
> > > 
> > > And of course bc1dce514e9b doesn't revert cleanly, but see hand reversion
> > > below.  Also, 42a0bb3f7138's commit log calls out MN10300 and Xtensa as
> > > needing more work.  Has that happened?
> > 
> > Petr M, any idea?
> 
> My Not-yet-signed-off-by is due to this concern, FWIW.

I think Petr's replies have addressed that now.

> > > But I really like the fact that RCU CPU stall warnings dump only those
> > > stacks that are likely to be involved, and the patch below goes back
> > > to dumping everyone.  Shouldn't be that hard to fix, though...
> > 
> > There's a new trigger_single_cpu_backtrace() function which can be used
> > for that.
> 
> Even better, thank you!  Killed an hour or so of coding, but I must
> confess that it was a mercy killing.  ;-)

Ha :-)

> Much nicer (but completely untested) patch below.

The kernel/rcu/tree.h changes seem intended for another patch?

Otherwise:

  Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

Also, I think this will fix the KASAN warnings reported by Vince, so you
might add:

  Reported-by: Vince Weaver <vincent.weaver@maine.edu>
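
For anyone following along, here is a rough userspace sketch of the
"try NMI first, fall back to a remote dump" pattern used in the patch
quoted below.  It is only an illustration: the stub_* functions, the
nmi_supported[] table, the counters and the NR_CPUS value are made up
here and merely stand in for the kernel helpers.  The only thing taken
from the kernel is that trigger_single_cpu_backtrace() returns false
when no NMI backtrace could be triggered.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8	/* assumption: small fixed CPU count for the sketch */

/* Hypothetical table: which CPUs can take an NMI backtrace request. */
static bool nmi_supported[NR_CPUS] = {
	true, true, false, true, true, false, true, true
};

/* Counters so the behaviour can be checked; not part of the kernel. */
static int nmi_dumps, fallback_dumps;

/* Stand-in for trigger_single_cpu_backtrace(): true if the NMI was sent. */
static bool stub_trigger_single_cpu_backtrace(int cpu)
{
	if (!nmi_supported[cpu])
		return false;
	nmi_dumps++;
	printf("cpu %d: backtrace via NMI\n", cpu);
	return true;
}

/* Stand-in for dump_cpu_task(): remote dump of the task on that CPU. */
static void stub_dump_cpu_task(int cpu)
{
	fallback_dumps++;
	printf("cpu %d: remote stack dump (fallback)\n", cpu);
}

/* Dump only the CPUs whose bit is still set in qsmask. */
static void dump_stalled_cpus(unsigned long qsmask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (qsmask & (1UL << cpu))
			if (!stub_trigger_single_cpu_backtrace(cpu))
				stub_dump_cpu_task(cpu);
}
```

Calling dump_stalled_cpus(0x26UL) visits only CPUs 1, 2 and 5.  With
the table above, CPU 1 is dumped through the NMI stub and CPUs 2 and 5
go through the fallback stub; CPUs whose bit is clear are never touched.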

> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit d3515ee46e0cff880170e48a05e8f2791b507758
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Tue Nov 29 05:49:06 2016 -0800
> 
>     rcu: Once again use NMI-based stack traces in stall warnings
>     
>     This commit is for all intents and purposes a revert of bc1dce514e9b
>     ("rcu: Don't use NMIs to dump other CPUs' stacks").  The reason to suppose
>     that this can now safely be reverted is the presence of 42a0bb3f7138
>     ("printk/nmi: generic solution for safe printk in NMI"), which is said
>     to have made NMI-based stack dumps safe.
>     
>     However, this reversion keeps one nice property of bc1dce514e9b
>     ("rcu: Don't use NMIs to dump other CPUs' stacks"), namely that
>     only those CPUs blocking the grace period are dumped.  The new
>     trigger_single_cpu_backtrace() is used to make this happen, as
>     suggested by Josh Poimboeuf.
>     
>     Not-yet-signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>     Cc: Petr Mladek <pmladek@suse.com>
>     Cc: Josh Poimboeuf <jpoimboe@redhat.com>
>     Cc: Peter Zijlstra <peterz@infradead.org>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 91a68e4e6671..ba0e4825be9d 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1396,7 +1396,10 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
>  }
>  
>  /*
> - * Dump stacks of all tasks running on stalled CPUs.
> + * Dump stacks of all tasks running on stalled CPUs.  First try using
> + * NMIs, but fall back to manual remote stack tracing on architectures
> + * that don't support NMI-based stack dumps.  The NMI-triggered stack
> + * traces are more accurate because they are printed by the target CPU.
>   */
>  static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
>  {
> @@ -1406,11 +1409,10 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
>  
>  	rcu_for_each_leaf_node(rsp, rnp) {
>  		raw_spin_lock_irqsave_rcu_node(rnp, flags);
> -		if (rnp->qsmask != 0) {
> -			for_each_leaf_node_possible_cpu(rnp, cpu)
> -				if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
> +		for_each_leaf_node_possible_cpu(rnp, cpu)
> +			if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
> +				if (!trigger_single_cpu_backtrace(cpu))
>  					dump_cpu_task(cpu);
> -		}
>  		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>  	}
>  }
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 7dcdd59d894c..c0a4bf8f1ed0 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -691,18 +691,6 @@ static inline void rcu_nocb_q_lengths(struct rcu_data *rdp, long *ql, long *qll)
>  #endif /* #ifdef CONFIG_RCU_TRACE */
>  
>  /*
> - * Place this after a lock-acquisition primitive to guarantee that
> - * an UNLOCK+LOCK pair act as a full barrier.  This guarantee applies
> - * if the UNLOCK and LOCK are executed by the same CPU or if the
> - * UNLOCK and LOCK operate on the same lock variable.
> - */
> -#ifdef CONFIG_PPC
> -#define smp_mb__after_unlock_lock()	smp_mb()  /* Full ordering for lock. */
> -#else /* #ifdef CONFIG_PPC */
> -#define smp_mb__after_unlock_lock()	do { } while (0)
> -#endif /* #else #ifdef CONFIG_PPC */
> -
> -/*
>   * Wrappers for the rcu_node::lock acquire and release.
>   *
>   * Because the rcu_nodes form a tree, the tree traversal locking will observe
>