Date: Tue, 16 Jun 2015 05:27:33 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>,
        LKML <linux-kernel@vger.kernel.org>, rostedt@goodmis.org
Subject: Re: call_rcu from trace_preempt
Message-ID: <20150616122733.GG3913@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <557F509D.2000509@plumgrid.com>
 <20150615230702.GB3913@linux.vnet.ibm.com>
 <557F7764.5060707@plumgrid.com>
 <20150616021458.GE3913@linux.vnet.ibm.com>
 <557FB7E1.6080004@plumgrid.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <557FB7E1.6080004@plumgrid.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3364
Lines: 85

On Mon, Jun 15, 2015 at 10:45:05PM -0700, Alexei Starovoitov wrote:
> On 6/15/15 7:14 PM, Paul E. McKenney wrote:
> >
> >Why do you believe that it is better to fix it within call_rcu()?
> 
> found it:
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8cf7304b2867..a3be09d482ae 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -935,9 +935,9 @@ bool notrace rcu_is_watching(void)
>  {
>         bool ret;
> 
> -       preempt_disable();
> +       preempt_disable_notrace();
>         ret = __rcu_is_watching();
> -       preempt_enable();
> +       preempt_enable_notrace();
>         return ret;
>  }
> 
> the rcu_is_watching() and __rcu_is_watching() are already marked
> notrace, so imo it's a good 'fix'.
> What was happening is that the above preempt_enable() was triggering
> a recursive call_rcu() that was indeed messing up the 'rdp' that was
> prepared by __call_rcu(), before __call_rcu_core() could use it.

> btw, also noticed that local_irq_save done by note_gp_changes
> is partially redundant. In __call_rcu_core path the irqs are
> already disabled.

But you said earlier that nothing happened when interrupts were
disabled.  And interrupts are disabled across the call to
rcu_is_watching() in __call_rcu_core().  So why did those calls
to preempt_disable() and preempt_enable() cause trouble?

That said, the patch looks inoffensive to me.  Adding Steven for his
trace expertise.

Still, I do need to understand what was really happening.  Did interrupts
get enabled somehow?  Or is your code that ignores calls when interrupts
are disabled incomplete in some way?  Something else?
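
To make the scenario described above concrete, here is a minimal userspace
sketch of it, under stated assumptions.  Every demo_* name is hypothetical
and none of this is kernel code: demo_call_rcu() stands in for the
__call_rcu()/__call_rcu_core() path, demo_preempt_enable() for a traced
preempt_enable() whose hook runs a program that itself calls call_rcu(),
and demo_preempt_enable_notrace() for the variant used in the patch.  The
sketch does not model interrupt state at all, so it illustrates the claimed
recursion but says nothing about how it could occur with interrupts
disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU callback state; NOT the kernel's rcu_data. */
struct demo_rdp {
	int qlen;        /* callbacks queued so far */
	bool in_update;  /* true while an enqueue is only half done */
	int clobbered;   /* enqueues that started while another was half done */
};

static struct demo_rdp demo_rdp;
static bool demo_tracer_armed;  /* models a program attached to the preempt trace hook */
static bool demo_use_notrace;   /* selects which preempt_enable flavor is used */
static int demo_depth;          /* limits the demo to one level of recursion */

static void demo_call_rcu(void);

/* Models the trace hook: the attached program frees something via call_rcu. */
static void demo_trace_preempt_on(void)
{
	if (!demo_tracer_armed || demo_depth > 0)
		return;
	demo_depth++;
	demo_call_rcu();  /* reentrant call into the enqueue path */
	demo_depth--;
}

/* Traced flavor: fires the hook.  The notrace flavor does not. */
static void demo_preempt_enable(void) { demo_trace_preempt_on(); }
static void demo_preempt_enable_notrace(void) { }

/* Models rcu_is_watching(): harmless apart from the preempt_enable flavor. */
static bool demo_rcu_is_watching(void)
{
	if (demo_use_notrace)
		demo_preempt_enable_notrace();
	else
		demo_preempt_enable();
	return true;
}

/* Models the enqueue path: state is inconsistent across demo_rcu_is_watching(). */
static void demo_call_rcu(void)
{
	if (demo_rdp.in_update)
		demo_rdp.clobbered++;  /* we entered in the middle of another enqueue */
	demo_rdp.in_update = true;
	demo_rdp.qlen++;
	(void)demo_rcu_is_watching();
	demo_rdp.in_update = false;
}

/* Runs one top-level enqueue and reports how many enqueues saw half-done state. */
int demo_run(bool use_notrace)
{
	demo_rdp = (struct demo_rdp){ 0 };
	demo_tracer_armed = true;
	demo_use_notrace = use_notrace;
	demo_depth = 0;
	demo_call_rcu();
	assert(demo_depth == 0);
	return demo_rdp.clobbered;
}
```

With the traced flavor, demo_run(false) reports one enqueue that started
while another was half done.  With the notrace flavor, demo_run(true)
reports none, because the hook never fires.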

> >Perhaps you are self-deadlocking within __call_rcu_core().  If you have
> >not already done so, please try running with CONFIG_PROVE_LOCKING=y.
> 
> yes, I had CONFIG_PROVE_LOCKING on.

Good!  ;-)

> >I suspect that your problem may range quite a bit further than just
> >call_rcu().  For example, in your stack trace, you have a recursive
> >call to debug_object_activate(), which might not be such a good thing.
> 
> nope :) recursive debug_object_activate() is fine.
> with the above 'fix' the trace.patch is now passing.
> 
> Why am I digging into all of this? Well, to find out when
> it's safe to finally do call_rcu. If I use the deferred kfree
> approach in bpf maps, I need to know when it's safe to finally
> call_rcu, and that's not an easy answer.

Given that reentrant calls to call_rcu() and/or kfree_rcu() were not
in any way considered during design and implementation, it is not a
surprise that the answer is not easy.  The reason I need to understand
what your code does in interrupt-disabled situations is to work out
whether or not it makes sense to agree to support reentrancy in call_rcu()
and kfree_rcu().
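
For reference, one way to picture the "deferred kfree" idea mentioned above
is the userspace sketch below.  It is only an illustration under stated
assumptions, not a proposal for the kernel: all sketch_* names are
hypothetical, sketch_submit() stands in for a non-reentrant call_rcu(),
and the single global flag and plain linked list stand in for what would
need to be per-CPU, interrupt-safe state in real code.  A free request that
arrives while a submission is already in progress is parked on a pending
list and submitted later from a context known to be safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element to be freed after a grace period. */
struct sketch_elem {
	struct sketch_elem *next;
};

static struct sketch_elem *sketch_pending;       /* parked, not yet submitted */
static struct sketch_elem *sketch_probe_victim;  /* freed by a probe during submit */
static bool sketch_busy;                         /* true inside sketch_submit() */
int sketch_submitted;                            /* elements handed to "call_rcu" */
int sketch_deferred;                             /* elements parked for later */

static void sketch_free_elem(struct sketch_elem *e);

/* Stand-in for a non-reentrant call_rcu(); a probe may fire in the middle. */
static void sketch_submit(struct sketch_elem *e)
{
	(void)e;  /* a real implementation would queue e for a grace period */
	sketch_busy = true;
	sketch_submitted++;
	if (sketch_probe_victim) {
		/* Models a probe inside the call_rcu path freeing another element. */
		struct sketch_elem *v = sketch_probe_victim;

		sketch_probe_victim = NULL;
		sketch_free_elem(v);
	}
	sketch_busy = false;
}

/* Entry point used by callers: submit now if safe, otherwise park. */
static void sketch_free_elem(struct sketch_elem *e)
{
	if (sketch_busy) {
		e->next = sketch_pending;
		sketch_pending = e;
		sketch_deferred++;
		return;
	}
	sketch_submit(e);
}

/* Called from a safe context: submit everything that was parked. */
int sketch_drain(void)
{
	int n = 0;

	while (sketch_pending) {
		struct sketch_elem *e = sketch_pending;

		sketch_pending = e->next;
		sketch_submit(e);
		n++;
	}
	return n;
}

/* One reentrant free followed by a drain; returns the number drained. */
int sketch_run(void)
{
	static struct sketch_elem a, b;

	sketch_pending = NULL;
	sketch_busy = false;
	sketch_submitted = 0;
	sketch_deferred = 0;
	sketch_probe_victim = &b;

	sketch_free_elem(&a);  /* the probe frees b while a is being submitted */
	assert(sketch_submitted == 1 && sketch_deferred == 1);
	return sketch_drain();
}
```

The hard part, and the one this sketch does not answer, is deciding when
the drain itself is safe to run.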

> kprobes potentially can be placed in any part of the call_rcu stack,
> so things can get messy quickly. So it helps to understand the call_rcu
> logic well enough to come up with a good solution.

Indeed, I do have some concerns about that sort of thing, as it is not
at all clear that designing call_rcu() and kfree_rcu() for unrestricted
reentrancy is a win.

							Thanx, Paul
