2020-02-12 21:14:29

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

To facilitate tracers that need RCU, add some helpers to wrap the
magic required.

The problem is that we can call into tracers (trace events and
function tracing) while RCU isn't watching and this can happen from
any context, including NMI.

It is this latter that is causing most of the trouble; we must make
sure in_nmi() returns true before we land in anything tracing,
otherwise we cannot recover.

These helpers are macros because of header-hell; they're placed here
because of the proximity to nmi_{enter,exit{().

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
include/linux/hardirq.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -89,4 +89,52 @@ extern void irq_exit(void);
arch_nmi_exit(); \
} while (0)

+/*
+ * Tracing vs RCU
+ * --------------
+ *
+ * tracepoints and function-tracing can happen when RCU isn't watching (idle,
+ * or early IRQ/NMI entry).
+ *
+ * When it happens during idle or early during IRQ entry, tracing will have
+ * to inform RCU that it ought to pay attention, this is done by calling
+ * rcu_irq_enter_irqsave().
+ *
+ * On NMI entry, we must be very careful that tracing only happens after we've
+ * incremented preempt_count(), otherwise we cannot tell we're in NMI and take
+ * the special path.
+ */
+
+#define __TR_IRQ 1
+#define __TR_NMI 2
+
+#define trace_rcu_enter() \
+({ \
+ unsigned long state = 0; \
+ if (!rcu_is_watching()) { \
+ if (in_nmi()) { \
+ state = __TR_NMI; \
+ rcu_nmi_enter(); \
+ } else { \
+ state = __TR_IRQ; \
+ rcu_irq_enter_irqsave(); \
+ } \
+ } \
+ state; \
+})
+
+#define trace_rcu_exit(state) \
+do { \
+ switch (state) { \
+ case __TR_IRQ: \
+ rcu_irq_exit_irqsave(); \
+ break; \
+ case __TR_NMI: \
+ rcu_nmi_exit(); \
+ break; \
+ default: \
+ break; \
+ } \
+} while (0)
+
#endif /* LINUX_HARDIRQ_H */



2020-02-12 23:21:25

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:
> To facilitate tracers that need RCU, add some helpers to wrap the
> magic required.
>
> The problem is that we can call into tracers (trace events and
> function tracing) while RCU isn't watching and this can happen from
> any context, including NMI.
>
> It is this latter that is causing most of the trouble; we must make
> sure in_nmi() returns true before we land in anything tracing,
> otherwise we cannot recover.
>
> These helpers are macros because of header-hell; they're placed here
> because of the proximity to nmi_{enter,exit{().
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> include/linux/hardirq.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -89,4 +89,52 @@ extern void irq_exit(void);
> arch_nmi_exit(); \
> } while (0)
>
> +/*
> + * Tracing vs RCU
> + * --------------
> + *
> + * tracepoints and function-tracing can happen when RCU isn't watching (idle,
> + * or early IRQ/NMI entry).
> + *
> + * When it happens during idle or early during IRQ entry, tracing will have
> + * to inform RCU that it ought to pay attention, this is done by calling
> + * rcu_irq_enter_irqsave().
> + *
> + * On NMI entry, we must be very careful that tracing only happens after we've
> + * incremented preempt_count(), otherwise we cannot tell we're in NMI and take
> + * the special path.
> + */
> +
> +#define __TR_IRQ 1
> +#define __TR_NMI 2
> +
> +#define trace_rcu_enter() \
> +({ \
> + unsigned long state = 0; \
> + if (!rcu_is_watching()) { \
> + if (in_nmi()) { \
> + state = __TR_NMI; \
> + rcu_nmi_enter(); \
> + } else { \
> + state = __TR_IRQ; \
> + rcu_irq_enter_irqsave(); \

I think this can be simplified. You don't need to rely on in_nmi() here. I
believe for NMI's, you can just call rcu_irq_enter_irqsave() and that should
be sufficient to get RCU watching. Paul can correct me if I'm wrong, but I am
pretty sure that would work.

In fact, I think a better naming for rcu_irq_enter_irqsave() pair could be
(in the first patch):

rcu_ensure_watching_begin();
rcu_ensure_watching_end();

thanks,

- Joel



> + } \
> + } \
> + state; \
> +})
> +
> +#define trace_rcu_exit(state) \
> +do { \
> + switch (state) { \
> + case __TR_IRQ: \
> + rcu_irq_exit_irqsave(); \
> + break; \
> + case __TR_NMI: \
> + rcu_nmi_exit(); \
> + break; \
> + default: \
> + break; \
> + } \
> +} while (0)
> +
> #endif /* LINUX_HARDIRQ_H */
>
>

2020-02-12 23:27:40

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:
> To facilitate tracers that need RCU, add some helpers to wrap the
> magic required.
>
> The problem is that we can call into tracers (trace events and
> function tracing) while RCU isn't watching and this can happen from
> any context, including NMI.
>
> It is this latter that is causing most of the trouble; we must make
> sure in_nmi() returns true before we land in anything tracing,
> otherwise we cannot recover.
>
> These helpers are macros because of header-hell; they're placed here
> because of the proximity to nmi_{enter,exit{().
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> include/linux/hardirq.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -89,4 +89,52 @@ extern void irq_exit(void);
> arch_nmi_exit(); \
> } while (0)
>
> +/*
> + * Tracing vs RCU
> + * --------------
> + *
> + * tracepoints and function-tracing can happen when RCU isn't watching (idle,
> + * or early IRQ/NMI entry).
> + *
> + * When it happens during idle or early during IRQ entry, tracing will have
> + * to inform RCU that it ought to pay attention, this is done by calling
> + * rcu_irq_enter_irqsave().
> + *
> + * On NMI entry, we must be very careful that tracing only happens after we've
> + * incremented preempt_count(), otherwise we cannot tell we're in NMI and take
> + * the special path.
> + */
> +
> +#define __TR_IRQ 1
> +#define __TR_NMI 2
> +
> +#define trace_rcu_enter() \
> +({ \
> + unsigned long state = 0; \
> + if (!rcu_is_watching()) { \
> + if (in_nmi()) { \
> + state = __TR_NMI; \
> + rcu_nmi_enter(); \
> + } else { \
> + state = __TR_IRQ; \
> + rcu_irq_enter_irqsave(); \

Since rcu_irq_enter_irqsave can be called from a tracer context, should those
be marked with notrace as well? AFAICS, there's no notrace marking on them.

thanks,

- Joel


> + } \
> + } \
> + state; \
> +})
> +
> +#define trace_rcu_exit(state) \
> +do { \
> + switch (state) { \
> + case __TR_IRQ: \
> + rcu_irq_exit_irqsave(); \
> + break; \
> + case __TR_NMI: \
> + rcu_nmi_exit(); \
> + break; \
> + default: \
> + break; \
> + } \
> +} while (0)
> +
> #endif /* LINUX_HARDIRQ_H */
>
>

2020-02-13 08:28:18

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Wed, Feb 12, 2020 at 06:20:05PM -0500, Joel Fernandes wrote:
> On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:

> > +#define trace_rcu_enter() \
> > +({ \
> > + unsigned long state = 0; \
> > + if (!rcu_is_watching()) { \
> > + if (in_nmi()) { \
> > + state = __TR_NMI; \
> > + rcu_nmi_enter(); \
> > + } else { \
> > + state = __TR_IRQ; \
> > + rcu_irq_enter_irqsave(); \
>
> I think this can be simplified. You don't need to rely on in_nmi() here. I
> believe for NMI's, you can just call rcu_irq_enter_irqsave() and that should
> be sufficient to get RCU watching. Paul can correct me if I'm wrong, but I am
> pretty sure that would work.
>
> In fact, I think a better naming for rcu_irq_enter_irqsave() pair could be
> (in the first patch):
>
> rcu_ensure_watching_begin();
> rcu_ensure_watching_end();

So I hadn't looked deeply into rcu_irq_enter(), it seems to call
rcu_nmi_enter_common(), but with @irq=true.

What exactly is the purpose of that @irq argument, and how much will it
hurt to lie there? Will it come apart if we have @irq != !in_nmi()
for example?

There is a comment in there that says ->dynticks_nmi_nesting ought to be
odd only if we're in NMI. The only place that seems to care is
rcu_nmi_exit_common(), and that does indeed do something different for
IRQs vs NMIs.

So I don't think we can blindly unify this. But perhaps Paul sees a way?

2020-02-13 08:29:06

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Wed, Feb 12, 2020 at 06:27:02PM -0500, Joel Fernandes wrote:
> On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:

> > +#define trace_rcu_enter() \
> > +({ \
> > + unsigned long state = 0; \
> > + if (!rcu_is_watching()) { \
> > + if (in_nmi()) { \
> > + state = __TR_NMI; \
> > + rcu_nmi_enter(); \
> > + } else { \
> > + state = __TR_IRQ; \
> > + rcu_irq_enter_irqsave(); \
>
> Since rcu_irq_enter_irqsave can be called from a tracer context, should those
> be marked with notrace as well? AFAICS, there's no notrace marking on them.

It should work, these functions are re-entrant (as are IRQs / NMIs) and
Steve wants to be able to trace RCU itself.

2020-02-13 13:33:06

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 09:27:16AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 12, 2020 at 06:20:05PM -0500, Joel Fernandes wrote:
> > On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:
>
> > > +#define trace_rcu_enter() \
> > > +({ \
> > > + unsigned long state = 0; \
> > > + if (!rcu_is_watching()) { \
> > > + if (in_nmi()) { \
> > > + state = __TR_NMI; \
> > > + rcu_nmi_enter(); \
> > > + } else { \
> > > + state = __TR_IRQ; \
> > > + rcu_irq_enter_irqsave(); \
> >
> > I think this can be simplified. You don't need to rely on in_nmi() here. I
> > believe for NMI's, you can just call rcu_irq_enter_irqsave() and that should
> > be sufficient to get RCU watching. Paul can correct me if I'm wrong, but I am
> > pretty sure that would work.
> >
> > In fact, I think a better naming for rcu_irq_enter_irqsave() pair could be
> > (in the first patch):
> >
> > rcu_ensure_watching_begin();
> > rcu_ensure_watching_end();
>
> So I hadn't looked deeply into rcu_irq_enter(), it seems to call
> rcu_nmi_enter_common(), but with @irq=true.
>
> What exactly is the purpose of that @irq argument, and how much will it
> hurt to lie there? Will it come apart if we have @irq != !in_nmi()
> for example?

At least rcu_dynticks_task_exit() and rcu_cleanup_after_idle() seem to be
safe regardless of IRQ or NMI. If they are safe either way, we should
probably get look into removing @irq. But I'm not fully sure and looking
forward to Paul's thought on that.. I would love for us to simplify that as
well if possible.

> There is a comment in there that says ->dynticks_nmi_nesting ought to be
> odd only if we're in NMI. The only place that seems to care is
> rcu_nmi_exit_common(), and that does indeed do something different for
> IRQs vs NMIs.

I know a bit about the counters. I had previously unified it and it passed
RCU torture testing etc. (Didn't get merge as Paul wanted it done slightly
done differently) : https://lore.kernel.org/patchwork/patch/1120022/ . I am
planning to get back to finishing that soon.

About the comment on dynticks_nmi_nesting and counters, you mean this comment
right? The odd value of '1' is just to catch the outer most handler. We need
to enter/exit the EQS (extended QS) state only from the outermost handler. I
believe that would work regardless whether it is NMI and IRQ that are
nesting. If IRQs could nest within other IRQs in Linux, then that could also
very well have used the odd/ even trick.

/*
* If idle from RCU viewpoint, atomically increment ->dynticks
* to mark non-idle and increment ->dynticks_nmi_nesting by one.
* Otherwise, increment ->dynticks_nmi_nesting by two. This means
* if ->dynticks_nmi_nesting is equal to one, we are guaranteed
* to be in the outermost NMI handler that interrupted an RCU-idle
* period (observation due to Andy Lutomirski).
*/

thanks,

- Joel

2020-02-13 13:52:30

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 09:27:16AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 12, 2020 at 06:20:05PM -0500, Joel Fernandes wrote:
> > On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:
>
> > > +#define trace_rcu_enter() \
> > > +({ \
> > > + unsigned long state = 0; \
> > > + if (!rcu_is_watching()) { \
> > > + if (in_nmi()) { \
> > > + state = __TR_NMI; \
> > > + rcu_nmi_enter(); \
> > > + } else { \
> > > + state = __TR_IRQ; \
> > > + rcu_irq_enter_irqsave(); \
> >
> > I think this can be simplified. You don't need to rely on in_nmi() here. I
> > believe for NMI's, you can just call rcu_irq_enter_irqsave() and that should
> > be sufficient to get RCU watching. Paul can correct me if I'm wrong, but I am
> > pretty sure that would work.
> >
> > In fact, I think a better naming for rcu_irq_enter_irqsave() pair could be
> > (in the first patch):
> >
> > rcu_ensure_watching_begin();
> > rcu_ensure_watching_end();
>
> So I hadn't looked deeply into rcu_irq_enter(), it seems to call
> rcu_nmi_enter_common(), but with @irq=true.
>
> What exactly is the purpose of that @irq argument, and how much will it
> hurt to lie there? Will it come apart if we have @irq != !in_nmi()
> for example?
>
> There is a comment in there that says ->dynticks_nmi_nesting ought to be
> odd only if we're in NMI. The only place that seems to care is
> rcu_nmi_exit_common(), and that does indeed do something different for
> IRQs vs NMIs.
>
> So I don't think we can blindly unify this. But perhaps Paul sees a way?

The reason for the irq argument is to avoid invoking
rcu_prepare_for_idle() and rcu_dynticks_task_enter() from NMI context
from rcu_nmi_exit_common(). Similarly, we need to avoid invoking
rcu_dynticks_task_exit() and rcu_cleanup_after_idle() from NMI context
from rcu_nmi_enter_common().

It might well be that I could make these functions be NMI-safe, but
rcu_prepare_for_idle() in particular would be a bit ugly at best.
So, before looking into that, I have a question. Given these proposed
changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
to just use in_nmi()?

Thanx, Paul

2020-02-13 16:42:32

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 05:51:38AM -0800, Paul E. McKenney wrote:

> The reason for the irq argument is to avoid invoking
> rcu_prepare_for_idle() and rcu_dynticks_task_enter() from NMI context
> from rcu_nmi_exit_common(). Similarly, we need to avoid invoking
> rcu_dynticks_task_exit() and rcu_cleanup_after_idle() from NMI context
> from rcu_nmi_enter_common().

Aaah, I see. I didn't grep hard enough earlier today (I only found
stubs). Yes, those take locks, we mustn't call them from NMI context.

> It might well be that I could make these functions be NMI-safe, but
> rcu_prepare_for_idle() in particular would be a bit ugly at best.
> So, before looking into that, I have a question. Given these proposed
> changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> to just use in_nmi()?

That _should_ already be the case today. That is, if we end up in a
tracer and in_nmi() is unreliable we're already screwed anyway.

2020-02-13 18:41:13

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 09:28:14AM +0100, Peter Zijlstra wrote:
> On Wed, Feb 12, 2020 at 06:27:02PM -0500, Joel Fernandes wrote:
> > On Wed, Feb 12, 2020 at 10:01:42PM +0100, Peter Zijlstra wrote:
>
> > > +#define trace_rcu_enter() \
> > > +({ \
> > > + unsigned long state = 0; \
> > > + if (!rcu_is_watching()) { \
> > > + if (in_nmi()) { \
> > > + state = __TR_NMI; \
> > > + rcu_nmi_enter(); \
> > > + } else { \
> > > + state = __TR_IRQ; \
> > > + rcu_irq_enter_irqsave(); \
> >
> > Since rcu_irq_enter_irqsave can be called from a tracer context, should those
> > be marked with notrace as well? AFAICS, there's no notrace marking on them.
>
> It should work, these functions are re-entrant (as are IRQs / NMIs) and
> Steve wants to be able to trace RCU itself.

Hoping there are no recursion scenarios possible, but that sounds good to me. thanks,

- Joel

2020-02-13 18:56:38

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 05:40:31PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 13, 2020 at 05:51:38AM -0800, Paul E. McKenney wrote:
>
> > The reason for the irq argument is to avoid invoking
> > rcu_prepare_for_idle() and rcu_dynticks_task_enter() from NMI context
> > from rcu_nmi_exit_common(). Similarly, we need to avoid invoking
> > rcu_dynticks_task_exit() and rcu_cleanup_after_idle() from NMI context
> > from rcu_nmi_enter_common().
>
> Aaah, I see. I didn't grep hard enough earlier today (I only found
> stubs). Yes, those take locks, we mustn't call them from NMI context.

Been there, done that...

> > It might well be that I could make these functions be NMI-safe, but
> > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > So, before looking into that, I have a question. Given these proposed
> > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > to just use in_nmi()?
>
> That _should_ already be the case today. That is, if we end up in a
> tracer and in_nmi() is unreliable we're already screwed anyway.

So something like this, then? This is untested, probably doesn't even
build, and could use some careful review from both Peter and Steve,
at least. As in the below is the second version of the patch, the first
having been missing a couple of important "!" characters.

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1f5fdf7..f783572 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -623,16 +623,18 @@ void rcu_user_enter(void)
}
#endif /* CONFIG_NO_HZ_FULL */

-/*
+/**
+ * rcu_nmi_exit - inform RCU of exit from NMI context
+ *
* If we are returning from the outermost NMI handler that interrupted an
* RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
* to let the RCU grace-period handling know that the CPU is back to
* being RCU-idle.
*
- * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_exit(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_exit_common(bool irq)
+static __always_inline void rcu_nmi_exit(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

@@ -660,27 +662,16 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */

- if (irq)
+ if (!in_nmi())
rcu_prepare_for_idle();

rcu_dynticks_eqs_enter();

- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_enter();
}

/**
- * rcu_nmi_exit - inform RCU of exit from NMI context
- *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_nmi_exit(void)
-{
- rcu_nmi_exit_common(false);
-}
-
-/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
*
* Exit from an interrupt handler, which might possibly result in entering
@@ -702,7 +693,7 @@ void rcu_nmi_exit(void)
void rcu_irq_exit(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_exit_common(true);
+ rcu_nmi_exit();
}

/*
@@ -786,7 +777,7 @@ void rcu_user_exit(void)
#endif /* CONFIG_NO_HZ_FULL */

/**
- * rcu_nmi_enter_common - inform RCU of entry to NMI context
+ * rcu_nmi_enter - inform RCU of entry to NMI context
* @irq: Is this call from rcu_irq_enter?
*
* If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
@@ -795,10 +786,10 @@ void rcu_user_exit(void)
* long as the nesting level does not overflow an int. (You will probably
* run out of stack space first.)
*
- * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_enter(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_enter_common(bool irq)
+static __always_inline void rcu_nmi_enter(void)
{
long incby = 2;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
@@ -816,12 +807,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
*/
if (rcu_dynticks_curr_cpu_in_eqs()) {

- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_exit();

rcu_dynticks_eqs_exit();

- if (irq)
+ if (!in_nmi())
rcu_cleanup_after_idle();

incby = 1;
@@ -844,14 +835,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
rdp->dynticks_nmi_nesting + incby);
barrier();
}
-
-/**
- * rcu_nmi_enter - inform RCU of entry to NMI context
- */
-void rcu_nmi_enter(void)
-{
- rcu_nmi_enter_common(false);
-}
NOKPROBE_SYMBOL(rcu_nmi_enter);

/**
@@ -879,7 +862,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
void rcu_irq_enter(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_enter_common(true);
+ rcu_nmi_enter();
}

/*

2020-02-13 20:59:11

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
[...]
> > > It might well be that I could make these functions be NMI-safe, but
> > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > So, before looking into that, I have a question. Given these proposed
> > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > to just use in_nmi()?
> >
> > That _should_ already be the case today. That is, if we end up in a
> > tracer and in_nmi() is unreliable we're already screwed anyway.
>
> So something like this, then? This is untested, probably doesn't even
> build, and could use some careful review from both Peter and Steve,
> at least. As in the below is the second version of the patch, the first
> having been missing a couple of important "!" characters.

I removed the static from rcu_nmi_enter()/exit() as it is called from
outside, that makes it build now. Updated below is Paul's diff. I also added
NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
asymmetric.

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d91c9156fab2e..bbcc7767f18ee 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -614,16 +614,18 @@ void rcu_user_enter(void)
}
#endif /* CONFIG_NO_HZ_FULL */

-/*
+/**
+ * rcu_nmi_exit - inform RCU of exit from NMI context
+ *
* If we are returning from the outermost NMI handler that interrupted an
* RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
* to let the RCU grace-period handling know that the CPU is back to
* being RCU-idle.
*
- * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_exit(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_exit_common(bool irq)
+__always_inline void rcu_nmi_exit(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

@@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */

- if (irq)
+ if (!in_nmi())
rcu_prepare_for_idle();

rcu_dynticks_eqs_enter();

- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_enter();
}
-
-/**
- * rcu_nmi_exit - inform RCU of exit from NMI context
- *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_nmi_exit(void)
-{
- rcu_nmi_exit_common(false);
-}
+NOKPROBE_SYMBOL(rcu_nmi_exit);

/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
@@ -693,7 +685,7 @@ void rcu_nmi_exit(void)
void rcu_irq_exit(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_exit_common(true);
+ rcu_nmi_exit();
}

/*
@@ -777,7 +769,7 @@ void rcu_user_exit(void)
#endif /* CONFIG_NO_HZ_FULL */

/**
- * rcu_nmi_enter_common - inform RCU of entry to NMI context
+ * rcu_nmi_enter - inform RCU of entry to NMI context
* @irq: Is this call from rcu_irq_enter?
*
* If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
@@ -786,10 +778,10 @@ void rcu_user_exit(void)
* long as the nesting level does not overflow an int. (You will probably
* run out of stack space first.)
*
- * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_enter(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_enter_common(bool irq)
+__always_inline void rcu_nmi_enter(void)
{
long incby = 2;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
@@ -807,12 +799,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
*/
if (rcu_dynticks_curr_cpu_in_eqs()) {

- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_exit();

rcu_dynticks_eqs_exit();

- if (irq)
+ if (!in_nmi())
rcu_cleanup_after_idle();

incby = 1;
@@ -834,14 +826,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
rdp->dynticks_nmi_nesting + incby);
barrier();
}
-
-/**
- * rcu_nmi_enter - inform RCU of entry to NMI context
- */
-void rcu_nmi_enter(void)
-{
- rcu_nmi_enter_common(false);
-}
NOKPROBE_SYMBOL(rcu_nmi_enter);

/**
@@ -869,7 +853,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
void rcu_irq_enter(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_enter_common(true);
+ rcu_nmi_enter();
}

/*

2020-02-13 20:59:57

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> [...]
> > > > It might well be that I could make these functions be NMI-safe, but
> > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > So, before looking into that, I have a question. Given these proposed
> > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > to just use in_nmi()?
> > >
> > > That _should_ already be the case today. That is, if we end up in a
> > > tracer and in_nmi() is unreliable we're already screwed anyway.
> >
> > So something like this, then? This is untested, probably doesn't even
> > build, and could use some careful review from both Peter and Steve,
> > at least. As in the below is the second version of the patch, the first
> > having been missing a couple of important "!" characters.
>
> I removed the static from rcu_nmi_enter()/exit() as it is called from
> outside, that makes it build now. Updated below is Paul's diff. I also added
> NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> asymmetric.

My compiler complained about the static and the __always_inline, so I
fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
to rcu_nmi_exit(). What bad thing happens if we leave this on only
rcu_nmi_enter()?

Thanx, Paul

> ---8<-----------------------
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index d91c9156fab2e..bbcc7767f18ee 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -614,16 +614,18 @@ void rcu_user_enter(void)
> }
> #endif /* CONFIG_NO_HZ_FULL */
>
> -/*
> +/**
> + * rcu_nmi_exit - inform RCU of exit from NMI context
> + *
> * If we are returning from the outermost NMI handler that interrupted an
> * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
> * to let the RCU grace-period handling know that the CPU is back to
> * being RCU-idle.
> *
> - * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
> + * If you add or remove a call to rcu_nmi_exit(), be sure to test
> * with CONFIG_RCU_EQS_DEBUG=y.
> */
> -static __always_inline void rcu_nmi_exit_common(bool irq)
> +__always_inline void rcu_nmi_exit(void)
> {
> struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>
> @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>
> - if (irq)
> + if (!in_nmi())
> rcu_prepare_for_idle();
>
> rcu_dynticks_eqs_enter();
>
> - if (irq)
> + if (!in_nmi())
> rcu_dynticks_task_enter();
> }
> -
> -/**
> - * rcu_nmi_exit - inform RCU of exit from NMI context
> - *
> - * If you add or remove a call to rcu_nmi_exit(), be sure to test
> - * with CONFIG_RCU_EQS_DEBUG=y.
> - */
> -void rcu_nmi_exit(void)
> -{
> - rcu_nmi_exit_common(false);
> -}
> +NOKPROBE_SYMBOL(rcu_nmi_exit);
>
> /**
> * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> @@ -693,7 +685,7 @@ void rcu_nmi_exit(void)
> void rcu_irq_exit(void)
> {
> lockdep_assert_irqs_disabled();
> - rcu_nmi_exit_common(true);
> + rcu_nmi_exit();
> }
>
> /*
> @@ -777,7 +769,7 @@ void rcu_user_exit(void)
> #endif /* CONFIG_NO_HZ_FULL */
>
> /**
> - * rcu_nmi_enter_common - inform RCU of entry to NMI context
> + * rcu_nmi_enter - inform RCU of entry to NMI context
> * @irq: Is this call from rcu_irq_enter?
> *
> * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
> @@ -786,10 +778,10 @@ void rcu_user_exit(void)
> * long as the nesting level does not overflow an int. (You will probably
> * run out of stack space first.)
> *
> - * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
> + * If you add or remove a call to rcu_nmi_enter(), be sure to test
> * with CONFIG_RCU_EQS_DEBUG=y.
> */
> -static __always_inline void rcu_nmi_enter_common(bool irq)
> +__always_inline void rcu_nmi_enter(void)
> {
> long incby = 2;
> struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> @@ -807,12 +799,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> */
> if (rcu_dynticks_curr_cpu_in_eqs()) {
>
> - if (irq)
> + if (!in_nmi())
> rcu_dynticks_task_exit();
>
> rcu_dynticks_eqs_exit();
>
> - if (irq)
> + if (!in_nmi())
> rcu_cleanup_after_idle();
>
> incby = 1;
> @@ -834,14 +826,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> rdp->dynticks_nmi_nesting + incby);
> barrier();
> }
> -
> -/**
> - * rcu_nmi_enter - inform RCU of entry to NMI context
> - */
> -void rcu_nmi_enter(void)
> -{
> - rcu_nmi_enter_common(false);
> -}
> NOKPROBE_SYMBOL(rcu_nmi_enter);
>
> /**
> @@ -869,7 +853,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
> void rcu_irq_enter(void)
> {
> lockdep_assert_irqs_disabled();
> - rcu_nmi_enter_common(true);
> + rcu_nmi_enter();
> }
>
> /*

2020-02-13 21:46:53

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > [...]
> > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > So, before looking into that, I have a question. Given these proposed
> > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > to just use in_nmi()?
> > > >
> > > > That _should_ already be the case today. That is, if we end up in a
> > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > >
> > > So something like this, then? This is untested, probably doesn't even
> > > build, and could use some careful review from both Peter and Steve,
> > > at least. As in the below is the second version of the patch, the first
> > > having been missing a couple of important "!" characters.
> >
> > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > outside, that makes it build now. Updated below is Paul's diff. I also added
> > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > asymmetric.
>
> My compiler complained about the static and the __always_inline, so I
> fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> to rcu_nmi_exit(). What bad thing happens if we leave this on only
> rcu_nmi_enter()?

It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
allowing it on exit (from a code reading standpoint) so my reaction was to
add it to both, but we could probably keep that as a separate
patch/discussion since it is slightly unrelated to the patch.. Sorry to
confuse the topic.

thanks,

- Joel


> Thanx, Paul
>
> > ---8<-----------------------
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index d91c9156fab2e..bbcc7767f18ee 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -614,16 +614,18 @@ void rcu_user_enter(void)
> > }
> > #endif /* CONFIG_NO_HZ_FULL */
> >
> > -/*
> > +/**
> > + * rcu_nmi_exit - inform RCU of exit from NMI context
> > + *
> > * If we are returning from the outermost NMI handler that interrupted an
> > * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
> > * to let the RCU grace-period handling know that the CPU is back to
> > * being RCU-idle.
> > *
> > - * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
> > + * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > * with CONFIG_RCU_EQS_DEBUG=y.
> > */
> > -static __always_inline void rcu_nmi_exit_common(bool irq)
> > +__always_inline void rcu_nmi_exit(void)
> > {
> > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> >
> > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_prepare_for_idle();
> >
> > rcu_dynticks_eqs_enter();
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_dynticks_task_enter();
> > }
> > -
> > -/**
> > - * rcu_nmi_exit - inform RCU of exit from NMI context
> > - *
> > - * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > - * with CONFIG_RCU_EQS_DEBUG=y.
> > - */
> > -void rcu_nmi_exit(void)
> > -{
> > - rcu_nmi_exit_common(false);
> > -}
> > +NOKPROBE_SYMBOL(rcu_nmi_exit);
> >
> > /**
> > * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> > @@ -693,7 +685,7 @@ void rcu_nmi_exit(void)
> > void rcu_irq_exit(void)
> > {
> > lockdep_assert_irqs_disabled();
> > - rcu_nmi_exit_common(true);
> > + rcu_nmi_exit();
> > }
> >
> > /*
> > @@ -777,7 +769,7 @@ void rcu_user_exit(void)
> > #endif /* CONFIG_NO_HZ_FULL */
> >
> > /**
> > - * rcu_nmi_enter_common - inform RCU of entry to NMI context
> > + * rcu_nmi_enter - inform RCU of entry to NMI context
> > * @irq: Is this call from rcu_irq_enter?
> > *
> > * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
> > @@ -786,10 +778,10 @@ void rcu_user_exit(void)
> > * long as the nesting level does not overflow an int. (You will probably
> > * run out of stack space first.)
> > *
> > - * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
> > + * If you add or remove a call to rcu_nmi_enter(), be sure to test
> > * with CONFIG_RCU_EQS_DEBUG=y.
> > */
> > -static __always_inline void rcu_nmi_enter_common(bool irq)
> > +__always_inline void rcu_nmi_enter(void)
> > {
> > long incby = 2;
> > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > @@ -807,12 +799,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > */
> > if (rcu_dynticks_curr_cpu_in_eqs()) {
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_dynticks_task_exit();
> >
> > rcu_dynticks_eqs_exit();
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_cleanup_after_idle();
> >
> > incby = 1;
> > @@ -834,14 +826,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > rdp->dynticks_nmi_nesting + incby);
> > barrier();
> > }
> > -
> > -/**
> > - * rcu_nmi_enter - inform RCU of entry to NMI context
> > - */
> > -void rcu_nmi_enter(void)
> > -{
> > - rcu_nmi_enter_common(false);
> > -}
> > NOKPROBE_SYMBOL(rcu_nmi_enter);
> >
> > /**
> > @@ -869,7 +853,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
> > void rcu_irq_enter(void)
> > {
> > lockdep_assert_irqs_disabled();
> > - rcu_nmi_enter_common(true);
> > + rcu_nmi_enter();
> > }
> >
> > /*

2020-02-13 21:48:51

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

[ Added Masami ]

On Thu, 13 Feb 2020 16:19:30 -0500
Joel Fernandes <[email protected]> wrote:

> On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > [...]
> > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > to just use in_nmi()?
> > > > >
> > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > >
> > > > So something like this, then? This is untested, probably doesn't even
> > > > build, and could use some careful review from both Peter and Steve,
> > > > at least. As in the below is the second version of the patch, the first
> > > > having been missing a couple of important "!" characters.
> > >
> > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > asymmetric.
> >
> > My compiler complained about the static and the __always_inline, so I
> > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > rcu_nmi_enter()?
>
> It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> allowing it on exit (from a code reading standpoint) so my reaction was to
> add it to both, but we could probably keep that as a separate
> patch/discussion since it is slightly unrelated to the patch.. Sorry to
> confuse the topic.
>

rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
kprobe_int3_handler()")

The issue was that we must not allow anything in do_int3() call kprobe
code before kprobe_int3_handler() is called. Because ist_enter() (in
do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
nothing to do with it being RCU nor NMI, but because it was simply
called in do_int3().

Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
to why rcu_nmi_enter() would probably be useful, like below:

-- Steve

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1694a6b57ad8..e2c9e3e2f480 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -846,6 +846,12 @@ void rcu_nmi_enter(void)
{
rcu_nmi_enter_common(false);
}
+/*
+ * On x86, All functions in do_int3() must be marked NOKPROBE before
+ * kprobe_int3_handler() is called. ist_enter() which is called in do_int3()
+ * before kprobe_int3_handle() happens to call rcu_nmi_enter() in which case
+ * rcu_nmi_enter() must be marked NOKRPOBE.
+ */
NOKPROBE_SYMBOL(rcu_nmi_enter);

/**

2020-02-13 21:51:33

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 04:19:30PM -0500, Joel Fernandes wrote:
> On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > [...]
> > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > to just use in_nmi()?
> > > > >
> > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > >
> > > > So something like this, then? This is untested, probably doesn't even
> > > > build, and could use some careful review from both Peter and Steve,
> > > > at least. As in the below is the second version of the patch, the first
> > > > having been missing a couple of important "!" characters.
> > >
> > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > asymmetric.
> >
> > My compiler complained about the static and the __always_inline, so I
> > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > rcu_nmi_enter()?
>
> It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> allowing it on exit (from a code reading standpoint) so my reaction was to
> add it to both, but we could probably keep that as a separate
> patch/discussion since it is slightly unrelated to the patch.. Sorry to
> confuse the topic.

Actually and perhaps unusually, I was not being sarcastic, but was instead
asking a serious question. Is the current code correct? Should the
current NOKPROBE_SYMBOL() be removed? Should the other NOKPROBE_SYMBOL()
be added? Something else? And either way, why?

Thanx, Paul

> thanks,
>
> - Joel
>
>
> > Thanx, Paul
> >
> > > ---8<-----------------------
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index d91c9156fab2e..bbcc7767f18ee 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -614,16 +614,18 @@ void rcu_user_enter(void)
> > > }
> > > #endif /* CONFIG_NO_HZ_FULL */
> > >
> > > -/*
> > > +/**
> > > + * rcu_nmi_exit - inform RCU of exit from NMI context
> > > + *
> > > * If we are returning from the outermost NMI handler that interrupted an
> > > * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
> > > * to let the RCU grace-period handling know that the CPU is back to
> > > * being RCU-idle.
> > > *
> > > - * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
> > > + * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > > * with CONFIG_RCU_EQS_DEBUG=y.
> > > */
> > > -static __always_inline void rcu_nmi_exit_common(bool irq)
> > > +__always_inline void rcu_nmi_exit(void)
> > > {
> > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > >
> > > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_prepare_for_idle();
> > >
> > > rcu_dynticks_eqs_enter();
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_dynticks_task_enter();
> > > }
> > > -
> > > -/**
> > > - * rcu_nmi_exit - inform RCU of exit from NMI context
> > > - *
> > > - * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > > - * with CONFIG_RCU_EQS_DEBUG=y.
> > > - */
> > > -void rcu_nmi_exit(void)
> > > -{
> > > - rcu_nmi_exit_common(false);
> > > -}
> > > +NOKPROBE_SYMBOL(rcu_nmi_exit);
> > >
> > > /**
> > > * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> > > @@ -693,7 +685,7 @@ void rcu_nmi_exit(void)
> > > void rcu_irq_exit(void)
> > > {
> > > lockdep_assert_irqs_disabled();
> > > - rcu_nmi_exit_common(true);
> > > + rcu_nmi_exit();
> > > }
> > >
> > > /*
> > > @@ -777,7 +769,7 @@ void rcu_user_exit(void)
> > > #endif /* CONFIG_NO_HZ_FULL */
> > >
> > > /**
> > > - * rcu_nmi_enter_common - inform RCU of entry to NMI context
> > > + * rcu_nmi_enter - inform RCU of entry to NMI context
> > > * @irq: Is this call from rcu_irq_enter?
> > > *
> > > * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
> > > @@ -786,10 +778,10 @@ void rcu_user_exit(void)
> > > * long as the nesting level does not overflow an int. (You will probably
> > > * run out of stack space first.)
> > > *
> > > - * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
> > > + * If you add or remove a call to rcu_nmi_enter(), be sure to test
> > > * with CONFIG_RCU_EQS_DEBUG=y.
> > > */
> > > -static __always_inline void rcu_nmi_enter_common(bool irq)
> > > +__always_inline void rcu_nmi_enter(void)
> > > {
> > > long incby = 2;
> > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > > @@ -807,12 +799,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > > */
> > > if (rcu_dynticks_curr_cpu_in_eqs()) {
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_dynticks_task_exit();
> > >
> > > rcu_dynticks_eqs_exit();
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_cleanup_after_idle();
> > >
> > > incby = 1;
> > > @@ -834,14 +826,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > > rdp->dynticks_nmi_nesting + incby);
> > > barrier();
> > > }
> > > -
> > > -/**
> > > - * rcu_nmi_enter - inform RCU of entry to NMI context
> > > - */
> > > -void rcu_nmi_enter(void)
> > > -{
> > > - rcu_nmi_enter_common(false);
> > > -}
> > > NOKPROBE_SYMBOL(rcu_nmi_enter);
> > >
> > > /**
> > > @@ -869,7 +853,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
> > > void rcu_irq_enter(void)
> > > {
> > > lockdep_assert_irqs_disabled();
> > > - rcu_nmi_enter_common(true);
> > > + rcu_nmi_enter();
> > > }
> > >
> > > /*

2020-02-13 21:53:18

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> [ Added Masami ]
>
> On Thu, 13 Feb 2020 16:19:30 -0500
> Joel Fernandes <[email protected]> wrote:
>
> > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > [...]
> > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > to just use in_nmi()?
> > > > > >
> > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > >
> > > > > So something like this, then? This is untested, probably doesn't even
> > > > > build, and could use some careful review from both Peter and Steve,
> > > > > at least. As in the below is the second version of the patch, the first
> > > > > having been missing a couple of important "!" characters.
> > > >
> > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > asymmetric.
> > >
> > > My compiler complained about the static and the __always_inline, so I
> > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > rcu_nmi_enter()?
> >
> > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > allowing it on exit (from a code reading standpoint) so my reaction was to
> > add it to both, but we could probably keep that as a separate
> > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > confuse the topic.
> >
>
> rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> kprobe_int3_handler()")
>
> The issue was that we must not allow anything in do_int3() call kprobe
> code before kprobe_int3_handler() is called. Because ist_enter() (in
> do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> nothing to do with it being RCU nor NMI, but because it was simply
> called in do_int3().
>
> Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> to why rcu_nmi_enter() would probably be useful, like below:

Thank you, Steve! Could I please have your Signed-off-by for this?

Thanx, Paul

> -- Steve
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1694a6b57ad8..e2c9e3e2f480 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -846,6 +846,12 @@ void rcu_nmi_enter(void)
> {
> rcu_nmi_enter_common(false);
> }
> +/*
> + * On x86, All functions in do_int3() must be marked NOKPROBE before
> + * kprobe_int3_handler() is called. ist_enter() which is called in do_int3()
> + * before kprobe_int3_handle() happens to call rcu_nmi_enter() in which case
> + * rcu_nmi_enter() must be marked NOKRPOBE.
> + */
> NOKPROBE_SYMBOL(rcu_nmi_enter);
>
> /**

2020-02-13 22:05:24

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, 13 Feb 2020 13:50:04 -0800
"Paul E. McKenney" <[email protected]> wrote:

> On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > [ Added Masami ]
> >
> > On Thu, 13 Feb 2020 16:19:30 -0500
> > Joel Fernandes <[email protected]> wrote:
> >
> > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > [...]
> > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > to just use in_nmi()?
> > > > > > >
> > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > >
> > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > having been missing a couple of important "!" characters.
> > > > >
> > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > asymmetric.
> > > >
> > > > My compiler complained about the static and the __always_inline, so I
> > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > rcu_nmi_enter()?
> > >
> > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > add it to both, but we could probably keep that as a separate
> > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > confuse the topic.
> > >
> >
> > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > kprobe_int3_handler()")
> >
> > The issue was that we must not allow anything in do_int3() call kprobe
> > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > nothing to do with it being RCU nor NMI, but because it was simply
> > called in do_int3().
> >
> > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > to why rcu_nmi_enter() would probably be useful, like below:
>
> Thank you, Steve! Could I please have your Signed-off-by for this?

Sure, but it was untested ;-)

Signed-off-by: Steven Rostedt (VMware) <[email protected]>

I'd like a Reviewed-by from Masami though.

-- Steve


2020-02-13 22:40:04

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 05:04:51PM -0500, Steven Rostedt wrote:
> On Thu, 13 Feb 2020 13:50:04 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > > [ Added Masami ]
> > >
> > > On Thu, 13 Feb 2020 16:19:30 -0500
> > > Joel Fernandes <[email protected]> wrote:
> > >
> > > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > > [...]
> > > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > > to just use in_nmi()?
> > > > > > > >
> > > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > > >
> > > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > > having been missing a couple of important "!" characters.
> > > > > >
> > > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > > asymmetric.
> > > > >
> > > > > My compiler complained about the static and the __always_inline, so I
> > > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > > rcu_nmi_enter()?
> > > >
> > > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > > add it to both, but we could probably keep that as a separate
> > > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > > confuse the topic.
> > > >
> > >
> > > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > > kprobe_int3_handler()")
> > >
> > > The issue was that we must not allow anything in do_int3() call kprobe
> > > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > > nothing to do with it being RCU nor NMI, but because it was simply
> > > called in do_int3().
> > >
> > > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > > to why rcu_nmi_enter() would probably be useful, like below:
> >
> > Thank you, Steve! Could I please have your Signed-off-by for this?
>
> Sure, but it was untested ;-)

No problem! I will fire up rcutorture on it. ;-)

But experience indicates that you cannot even make a joke around here.
There is probably already someone out there somewhere building a
comment-checker based on deep semantic analysis and machine learning. :-/

> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
>
> I'd like a Reviewed-by from Masami though.

Sounds good! Masami, would you be willing to review?

Thanx, Paul

2020-02-13 22:59:15

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 01:48:59PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 13, 2020 at 04:19:30PM -0500, Joel Fernandes wrote:
> > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > [...]
> > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > to just use in_nmi()?
> > > > > >
> > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > >
> > > > > So something like this, then? This is untested, probably doesn't even
> > > > > build, and could use some careful review from both Peter and Steve,
> > > > > at least. As in the below is the second version of the patch, the first
> > > > > having been missing a couple of important "!" characters.
> > > >
> > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > asymmetric.
> > >
> > > My compiler complained about the static and the __always_inline, so I
> > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > rcu_nmi_enter()?
> >
> > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > allowing it on exit (from a code reading standpoint) so my reaction was to
> > add it to both, but we could probably keep that as a separate
> > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > confuse the topic.
>
> Actually and perhaps unusually, I was not being sarcastic, but was instead
> asking a serious question. Is the current code correct? Should the
> current NOKPROBE_SYMBOL() be removed? Should the other NOKPROBE_SYMBOL()
> be added? Something else? And either way, why?

Oh ok, it was a fair question. Seems Steve nailed it, only the
rcu_nmi_enter() needs NOKPROBE, although as you mentioned in the other
thread, it would be good to get Masami's eyes on it since he introduced the
NOKPROBE.

thanks,

- Joel


> Thanx, Paul
>
> > thanks,
> >
> > - Joel
> >
> >
> > > Thanx, Paul
> > >
> > > > ---8<-----------------------
> > > >
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index d91c9156fab2e..bbcc7767f18ee 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -614,16 +614,18 @@ void rcu_user_enter(void)
> > > > }
> > > > #endif /* CONFIG_NO_HZ_FULL */
> > > >
> > > > -/*
> > > > +/**
> > > > + * rcu_nmi_exit - inform RCU of exit from NMI context
> > > > + *
> > > > * If we are returning from the outermost NMI handler that interrupted an
> > > > * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
> > > > * to let the RCU grace-period handling know that the CPU is back to
> > > > * being RCU-idle.
> > > > *
> > > > - * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
> > > > + * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > > > * with CONFIG_RCU_EQS_DEBUG=y.
> > > > */
> > > > -static __always_inline void rcu_nmi_exit_common(bool irq)
> > > > +__always_inline void rcu_nmi_exit(void)
> > > > {
> > > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > > >
> > > > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > > > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > > > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_prepare_for_idle();
> > > >
> > > > rcu_dynticks_eqs_enter();
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_dynticks_task_enter();
> > > > }
> > > > -
> > > > -/**
> > > > - * rcu_nmi_exit - inform RCU of exit from NMI context
> > > > - *
> > > > - * If you add or remove a call to rcu_nmi_exit(), be sure to test
> > > > - * with CONFIG_RCU_EQS_DEBUG=y.
> > > > - */
> > > > -void rcu_nmi_exit(void)
> > > > -{
> > > > - rcu_nmi_exit_common(false);
> > > > -}
> > > > +NOKPROBE_SYMBOL(rcu_nmi_exit);
> > > >
> > > > /**
> > > > * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
> > > > @@ -693,7 +685,7 @@ void rcu_nmi_exit(void)
> > > > void rcu_irq_exit(void)
> > > > {
> > > > lockdep_assert_irqs_disabled();
> > > > - rcu_nmi_exit_common(true);
> > > > + rcu_nmi_exit();
> > > > }
> > > >
> > > > /*
> > > > @@ -777,7 +769,7 @@ void rcu_user_exit(void)
> > > > #endif /* CONFIG_NO_HZ_FULL */
> > > >
> > > > /**
> > > > - * rcu_nmi_enter_common - inform RCU of entry to NMI context
> > > > + * rcu_nmi_enter - inform RCU of entry to NMI context
> > > > * @irq: Is this call from rcu_irq_enter?
> > > > *
> > > > * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
> > > > @@ -786,10 +778,10 @@ void rcu_user_exit(void)
> > > > * long as the nesting level does not overflow an int. (You will probably
> > > > * run out of stack space first.)
> > > > *
> > > > - * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
> > > > + * If you add or remove a call to rcu_nmi_enter(), be sure to test
> > > > * with CONFIG_RCU_EQS_DEBUG=y.
> > > > */
> > > > -static __always_inline void rcu_nmi_enter_common(bool irq)
> > > > +__always_inline void rcu_nmi_enter(void)
> > > > {
> > > > long incby = 2;
> > > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > > > @@ -807,12 +799,12 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > > > */
> > > > if (rcu_dynticks_curr_cpu_in_eqs()) {
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_dynticks_task_exit();
> > > >
> > > > rcu_dynticks_eqs_exit();
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_cleanup_after_idle();
> > > >
> > > > incby = 1;
> > > > @@ -834,14 +826,6 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
> > > > rdp->dynticks_nmi_nesting + incby);
> > > > barrier();
> > > > }
> > > > -
> > > > -/**
> > > > - * rcu_nmi_enter - inform RCU of entry to NMI context
> > > > - */
> > > > -void rcu_nmi_enter(void)
> > > > -{
> > > > - rcu_nmi_enter_common(false);
> > > > -}
> > > > NOKPROBE_SYMBOL(rcu_nmi_enter);
> > > >
> > > > /**
> > > > @@ -869,7 +853,7 @@ NOKPROBE_SYMBOL(rcu_nmi_enter);
> > > > void rcu_irq_enter(void)
> > > > {
> > > > lockdep_assert_irqs_disabled();
> > > > - rcu_nmi_enter_common(true);
> > > > + rcu_nmi_enter();
> > > > }
> > > >
> > > > /*

2020-02-13 23:56:25

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, 13 Feb 2020 17:58:53 -0500
Joel Fernandes <[email protected]> wrote:

> Oh ok, it was a fair question. Seems Steve nailed it, only the
> rcu_nmi_enter() needs NOKPROBE, although as you mentioned in the other
> thread, it would be good to get Masami's eyes on it since he introduced the
> NOKPROBE.

Note, I did go and verify that the issue still exists, and the NOKPROBE
looks to still be needed. But as you stated, because Masami added the
NOKPROBE I'd feel more comfortable with him looking over my explanation.

-- Steve

2020-02-14 06:19:45

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, 13 Feb 2020 14:39:18 -0800
"Paul E. McKenney" <[email protected]> wrote:

> On Thu, Feb 13, 2020 at 05:04:51PM -0500, Steven Rostedt wrote:
> > On Thu, 13 Feb 2020 13:50:04 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > > > [ Added Masami ]
> > > >
> > > > On Thu, 13 Feb 2020 16:19:30 -0500
> > > > Joel Fernandes <[email protected]> wrote:
> > > >
> > > > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > > > [...]
> > > > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > > > to just use in_nmi()?
> > > > > > > > >
> > > > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > > > >
> > > > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > > > having been missing a couple of important "!" characters.
> > > > > > >
> > > > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > > > asymmetric.
> > > > > >
> > > > > > My compiler complained about the static and the __always_inline, so I
> > > > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > > > rcu_nmi_enter()?
> > > > >
> > > > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > > > add it to both, but we could probably keep that as a separate
> > > > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > > > confuse the topic.
> > > > >
> > > >
> > > > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > > > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > > > kprobe_int3_handler()")
> > > >
> > > > The issue was that we must not allow anything in do_int3() call kprobe
> > > > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > > > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > > > nothing to do with it being RCU nor NMI, but because it was simply
> > > > called in do_int3().
> > > >
> > > > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > > > to why rcu_nmi_enter() would probably be useful, like below:
> > >
> > > Thank you, Steve! Could I please have your Signed-off-by for this?
> >
> > Sure, but it was untested ;-)
>
> No problem! I will fire up rcutorture on it. ;-)
>
> But experience indicates that you cannot even make a joke around here.
> There is probably already someone out there somewhere building a
> comment-checker based on deep semantic analysis and machine learning. :-/
>
> > Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> >
> > I'd like a Reviewed-by from Masami though.
>
> Sounds good! Masami, would you be willing to review?

Yes, the functions before calling kprobe_int3_handler() must not
be kprobed. It can cause an infinite recursive int3 trapping.

Reviewed-by: Masami Hiramatsu <[email protected]>

Thank you!

--
Masami Hiramatsu <[email protected]>

2020-02-15 17:14:02

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Fri, Feb 14, 2020 at 03:19:06PM +0900, Masami Hiramatsu wrote:
> On Thu, 13 Feb 2020 14:39:18 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Thu, Feb 13, 2020 at 05:04:51PM -0500, Steven Rostedt wrote:
> > > On Thu, 13 Feb 2020 13:50:04 -0800
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > > > > [ Added Masami ]
> > > > >
> > > > > On Thu, 13 Feb 2020 16:19:30 -0500
> > > > > Joel Fernandes <[email protected]> wrote:
> > > > >
> > > > > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > > > > [...]
> > > > > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > > > > to just use in_nmi()?
> > > > > > > > > >
> > > > > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > > > > >
> > > > > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > > > > having been missing a couple of important "!" characters.
> > > > > > > >
> > > > > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > > > > asymmetric.
> > > > > > >
> > > > > > > My compiler complained about the static and the __always_inline, so I
> > > > > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > > > > rcu_nmi_enter()?
> > > > > >
> > > > > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > > > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > > > > add it to both, but we could probably keep that as a separate
> > > > > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > > > > confuse the topic.
> > > > > >
> > > > >
> > > > > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > > > > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > > > > kprobe_int3_handler()")
> > > > >
> > > > > The issue was that we must not allow anything in do_int3() call kprobe
> > > > > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > > > > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > > > > nothing to do with it being RCU nor NMI, but because it was simply
> > > > > called in do_int3().
> > > > >
> > > > > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > > > > to why rcu_nmi_enter() would probably be useful, like below:
> > > >
> > > > Thank you, Steve! Could I please have your Signed-off-by for this?
> > >
> > > Sure, but it was untested ;-)
> >
> > No problem! I will fire up rcutorture on it. ;-)
> >
> > But experience indicates that you cannot even make a joke around here.
> > There is probably already someone out there somewhere building a
> > comment-checker based on deep semantic analysis and machine learning. :-/
> >
> > > Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> > >
> > > I'd like a Reviewed-by from Masami though.
> >
> > Sounds good! Masami, would you be willing to review?
>
> Yes, the functions before calling kprobe_int3_handler() must not
> be kprobed. It can cause an infinite recursive int3 trapping.
>
> Reviewed-by: Masami Hiramatsu <[email protected]>

Thank you both!

Like this?

Thanx, Paul

------------------------------------------------------------------------

commit 1817fdc8f4e4bd18c76305c9b937fb0dccbb1583
Author: Steven Rostedt <[email protected]>
Date: Sat Feb 15 06:54:50 2020 -0800

rcu: Provide comment for NOKPROBE() on rcu_nmi_enter()

The rcu_nmi_enter() function was marked NOKPROBE() by commit
c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
kprobe_int3_handler()") because the do_int3() call kprobe code must
not be invoked before kprobe_int3_handler() is called. It turns out
that ist_enter() (in do_int3()) calls rcu_nmi_enter(), hence the
marking NOKPROBE() being added to rcu_nmi_enter().

This commit therefore adds a comment documenting this line of reasoning.

Signed-off-by: Steven Rostedt <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 132b53e..4a885af 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -835,6 +835,12 @@ void rcu_nmi_enter(void)
rdp->dynticks_nmi_nesting + incby);
barrier();
}
+/*
+ * On x86, All functions in do_int3() must be marked NOKPROBE before
+ * kprobe_int3_handler() is called. ist_enter() which is called in do_int3()
+ * before kprobe_int3_handle() happens to call rcu_nmi_enter() which means
+ * that rcu_nmi_enter() must be marked NOKRPOBE.
+ */
NOKPROBE_SYMBOL(rcu_nmi_enter);

/**

2020-02-17 08:56:02

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Sat, 15 Feb 2020 06:59:34 -0800
"Paul E. McKenney" <[email protected]> wrote:

> On Fri, Feb 14, 2020 at 03:19:06PM +0900, Masami Hiramatsu wrote:
> > On Thu, 13 Feb 2020 14:39:18 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > On Thu, Feb 13, 2020 at 05:04:51PM -0500, Steven Rostedt wrote:
> > > > On Thu, 13 Feb 2020 13:50:04 -0800
> > > > "Paul E. McKenney" <[email protected]> wrote:
> > > >
> > > > > On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > > > > > [ Added Masami ]
> > > > > >
> > > > > > On Thu, 13 Feb 2020 16:19:30 -0500
> > > > > > Joel Fernandes <[email protected]> wrote:
> > > > > >
> > > > > > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > > > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > > > > > [...]
> > > > > > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > > > > > to just use in_nmi()?
> > > > > > > > > > >
> > > > > > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > > > > > >
> > > > > > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > > > > > having been missing a couple of important "!" characters.
> > > > > > > > >
> > > > > > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > > > > > asymmetric.
> > > > > > > >
> > > > > > > > My compiler complained about the static and the __always_inline, so I
> > > > > > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > > > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > > > > > rcu_nmi_enter()?
> > > > > > >
> > > > > > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > > > > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > > > > > add it to both, but we could probably keep that as a separate
> > > > > > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > > > > > confuse the topic.
> > > > > > >
> > > > > >
> > > > > > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > > > > > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > > > > > kprobe_int3_handler()")
> > > > > >
> > > > > > The issue was that we must not allow anything in do_int3() call kprobe
> > > > > > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > > > > > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > > > > > nothing to do with it being RCU nor NMI, but because it was simply
> > > > > > called in do_int3().
> > > > > >
> > > > > > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > > > > > to why rcu_nmi_enter() would probably be useful, like below:
> > > > >
> > > > > Thank you, Steve! Could I please have your Signed-off-by for this?
> > > >
> > > > Sure, but it was untested ;-)
> > >
> > > No problem! I will fire up rcutorture on it. ;-)
> > >
> > > But experience indicates that you cannot even make a joke around here.
> > > There is probably already someone out there somewhere building a
> > > comment-checker based on deep semantic analysis and machine learning. :-/
> > >
> > > > Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> > > >
> > > > I'd like a Reviewed-by from Masami though.
> > >
> > > Sounds good! Masami, would you be willing to review?
> >
> > Yes, the functions before calling kprobe_int3_handler() must not
> > be kprobed. It can cause an infinite recursive int3 trapping.
> >
> > Reviewed-by: Masami Hiramatsu <[email protected]>
>
> Thank you both!
>
> Like this?
>
> Thanx, Paul
>

This is good to me.
BTW, if you consider the x86 specific code is in the generic file,
we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
(Sorry, I've hit this idea right now)

Thank you,

> ------------------------------------------------------------------------
>
> commit 1817fdc8f4e4bd18c76305c9b937fb0dccbb1583
> Author: Steven Rostedt <[email protected]>
> Date: Sat Feb 15 06:54:50 2020 -0800
>
> rcu: Provide comment for NOKPROBE() on rcu_nmi_enter()
>
> The rcu_nmi_enter() function was marked NOKPROBE() by commit
> c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> kprobe_int3_handler()") because the do_int3() call kprobe code must
> not be invoked before kprobe_int3_handler() is called. It turns out
> that ist_enter() (in do_int3()) calls rcu_nmi_enter(), hence the
> marking NOKPROBE() being added to rcu_nmi_enter().
>
> This commit therefore adds a comment documenting this line of reasoning.
>
> Signed-off-by: Steven Rostedt <[email protected]>
> Reviewed-by: Masami Hiramatsu <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 132b53e..4a885af 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -835,6 +835,12 @@ void rcu_nmi_enter(void)
> rdp->dynticks_nmi_nesting + incby);
> barrier();
> }
> +/*
> + * On x86, All functions in do_int3() must be marked NOKPROBE before
> + * kprobe_int3_handler() is called. ist_enter() which is called in do_int3()
> + * before kprobe_int3_handle() happens to call rcu_nmi_enter() which means
> + * that rcu_nmi_enter() must be marked NOKRPOBE.
> + */
> NOKPROBE_SYMBOL(rcu_nmi_enter);
>
> /**


--
Masami Hiramatsu <[email protected]>

2020-02-17 16:32:06

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Mon, Feb 17, 2020 at 05:55:19PM +0900, Masami Hiramatsu wrote:
> On Sat, 15 Feb 2020 06:59:34 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Fri, Feb 14, 2020 at 03:19:06PM +0900, Masami Hiramatsu wrote:
> > > On Thu, 13 Feb 2020 14:39:18 -0800
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > On Thu, Feb 13, 2020 at 05:04:51PM -0500, Steven Rostedt wrote:
> > > > > On Thu, 13 Feb 2020 13:50:04 -0800
> > > > > "Paul E. McKenney" <[email protected]> wrote:
> > > > >
> > > > > > On Thu, Feb 13, 2020 at 04:38:25PM -0500, Steven Rostedt wrote:
> > > > > > > [ Added Masami ]
> > > > > > >
> > > > > > > On Thu, 13 Feb 2020 16:19:30 -0500
> > > > > > > Joel Fernandes <[email protected]> wrote:
> > > > > > >
> > > > > > > > On Thu, Feb 13, 2020 at 12:54:42PM -0800, Paul E. McKenney wrote:
> > > > > > > > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > > > > > > > > > On Thu, Feb 13, 2020 at 10:56:12AM -0800, Paul E. McKenney wrote:
> > > > > > > > > > [...]
> > > > > > > > > > > > > It might well be that I could make these functions be NMI-safe, but
> > > > > > > > > > > > > rcu_prepare_for_idle() in particular would be a bit ugly at best.
> > > > > > > > > > > > > So, before looking into that, I have a question. Given these proposed
> > > > > > > > > > > > > changes, will rcu_nmi_exit_common() and rcu_nmi_enter_common() be able
> > > > > > > > > > > > > to just use in_nmi()?
> > > > > > > > > > > >
> > > > > > > > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > > > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > > > > > > > > > >
> > > > > > > > > > > So something like this, then? This is untested, probably doesn't even
> > > > > > > > > > > build, and could use some careful review from both Peter and Steve,
> > > > > > > > > > > at least. As in the below is the second version of the patch, the first
> > > > > > > > > > > having been missing a couple of important "!" characters.
> > > > > > > > > >
> > > > > > > > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > > > > > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > > > > > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > > > > > > > asymmetric.
> > > > > > > > >
> > > > > > > > > My compiler complained about the static and the __always_inline, so I
> > > > > > > > > fixed those. But please help me out on adding the NOKPROBE_SYMBOL()
> > > > > > > > > to rcu_nmi_exit(). What bad thing happens if we leave this on only
> > > > > > > > > rcu_nmi_enter()?
> > > > > > > >
> > > > > > > > It seemed odd to me we were not allowing kprobe on the rcu_nmi_enter() but
> > > > > > > > allowing it on exit (from a code reading standpoint) so my reaction was to
> > > > > > > > add it to both, but we could probably keep that as a separate
> > > > > > > > patch/discussion since it is slightly unrelated to the patch.. Sorry to
> > > > > > > > confuse the topic.
> > > > > > > >
> > > > > > >
> > > > > > > rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit
> > > > > > > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > > > > > > kprobe_int3_handler()")
> > > > > > >
> > > > > > > The issue was that we must not allow anything in do_int3() call kprobe
> > > > > > > code before kprobe_int3_handler() is called. Because ist_enter() (in
> > > > > > > do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> > > > > > > nothing to do with it being RCU nor NMI, but because it was simply
> > > > > > > called in do_int3().
> > > > > > >
> > > > > > > Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> > > > > > > to why rcu_nmi_enter() would probably be useful, like below:
> > > > > >
> > > > > > Thank you, Steve! Could I please have your Signed-off-by for this?
> > > > >
> > > > > Sure, but it was untested ;-)
> > > >
> > > > No problem! I will fire up rcutorture on it. ;-)
> > > >
> > > > But experience indicates that you cannot even make a joke around here.
> > > > There is probably already someone out there somewhere building a
> > > > comment-checker based on deep semantic analysis and machine learning. :-/
> > > >
> > > > > Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> > > > >
> > > > > I'd like a Reviewed-by from Masami though.
> > > >
> > > > Sounds good! Masami, would you be willing to review?
> > >
> > > Yes, the functions before calling kprobe_int3_handler() must not
> > > be kprobed. It can cause an infinite recursive int3 trapping.
> > >
> > > Reviewed-by: Masami Hiramatsu <[email protected]>
> >
> > Thank you both!
> >
> > Like this?
> >
> > Thanx, Paul
> >
>
> This is good to me.

Thank you for looking it over! (I already have your

> BTW, if you consider the x86 specific code is in the generic file,
> we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> (Sorry, I've hit this idea right now)

Might this affect other architectures with NMIs and probe-like things?
If so, it might make sense to leave it where it is.

Thanx, Paul

> Thank you,
>
> > ------------------------------------------------------------------------
> >
> > commit 1817fdc8f4e4bd18c76305c9b937fb0dccbb1583
> > Author: Steven Rostedt <[email protected]>
> > Date: Sat Feb 15 06:54:50 2020 -0800
> >
> > rcu: Provide comment for NOKPROBE() on rcu_nmi_enter()
> >
> > The rcu_nmi_enter() function was marked NOKPROBE() by commit
> > c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> > kprobe_int3_handler()") because the do_int3() call kprobe code must
> > not be invoked before kprobe_int3_handler() is called. It turns out
> > that ist_enter() (in do_int3()) calls rcu_nmi_enter(), hence the
> > marking NOKPROBE() being added to rcu_nmi_enter().
> >
> > This commit therefore adds a comment documenting this line of reasoning.
> >
> > Signed-off-by: Steven Rostedt <[email protected]>
> > Reviewed-by: Masami Hiramatsu <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 132b53e..4a885af 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -835,6 +835,12 @@ void rcu_nmi_enter(void)
> > rdp->dynticks_nmi_nesting + incby);
> > barrier();
> > }
> > +/*
> > + * On x86, All functions in do_int3() must be marked NOKPROBE before
> > + * kprobe_int3_handler() is called. ist_enter() which is called in do_int3()
> > + * before kprobe_int3_handle() happens to call rcu_nmi_enter() which means
> > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > + */
> > NOKPROBE_SYMBOL(rcu_nmi_enter);
> >
> > /**
>
>
> --
> Masami Hiramatsu <[email protected]>

2020-02-18 04:35:11

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Mon, 17 Feb 2020 08:31:12 -0800
"Paul E. McKenney" <[email protected]> wrote:
>
> > BTW, if you consider the x86 specific code is in the generic file,
> > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > (Sorry, I've hit this idea right now)
>
> Might this affect other architectures with NMIs and probe-like things?
> If so, it might make sense to leave it where it is.

Yes, git grep shows that arm64 is using rcu_nmi_enter() in
debug_exception_enter().
OK, let's keep it, but maybe it is good to update the comment for
arm64 too. What about following?

+/*
+ * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
+ * marked NOKPROBE before kprobes handler is called.
+ * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
+ * before kprobes handle happens to call rcu_nmi_enter() which means
+ * that rcu_nmi_enter() must be marked NOKRPOBE.
+ */

Thank you,

--
Masami Hiramatsu <[email protected]>

2020-02-18 16:13:11

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 01:33:35PM +0900, Masami Hiramatsu wrote:
> On Mon, 17 Feb 2020 08:31:12 -0800
> "Paul E. McKenney" <[email protected]> wrote:
> >
> > > BTW, if you consider the x86 specific code is in the generic file,
> > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > (Sorry, I've hit this idea right now)
> >
> > Might this affect other architectures with NMIs and probe-like things?
> > If so, it might make sense to leave it where it is.
>
> Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> debug_exception_enter().
> OK, let's keep it, but maybe it is good to update the comment for
> arm64 too. What about following?
>
> +/*
> + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> + * marked NOKPROBE before kprobes handler is called.
> + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> + * before kprobes handle happens to call rcu_nmi_enter() which means
> + * that rcu_nmi_enter() must be marked NOKRPOBE.
> + */

Would it work to describe the general problem, then give x86 details
as a specific example, as follows?

/*
* On some architectures, certain exceptions prohibit use of kprobes until
* the exception code path reaches a certain point. For example, on x86 all
* functions called by do_int3() must be marked NOKPROBE. However, once
* kprobe_int3_handler() is called, kprobing is permitted. Specifically,
* ist_enter() is called in do_int3() before kprobe_int3_handle().
* Furthermore, ist_enter() calls rcu_nmi_enter(), which means that
* rcu_nmi_enter() must be marked NOKRPOBE.
*/

That way, I don't feel like I need to update the commment each time
a new architecture adds this capability. ;-)

Thanx, Paul

2020-02-18 16:17:18

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

----- On Feb 18, 2020, at 11:12 AM, paulmck [email protected] wrote:

> On Tue, Feb 18, 2020 at 01:33:35PM +0900, Masami Hiramatsu wrote:
>> On Mon, 17 Feb 2020 08:31:12 -0800
>> "Paul E. McKenney" <[email protected]> wrote:
>> >
>> > > BTW, if you consider the x86 specific code is in the generic file,
>> > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
>> > > (Sorry, I've hit this idea right now)
>> >
>> > Might this affect other architectures with NMIs and probe-like things?
>> > If so, it might make sense to leave it where it is.
>>
>> Yes, git grep shows that arm64 is using rcu_nmi_enter() in
>> debug_exception_enter().
>> OK, let's keep it, but maybe it is good to update the comment for
>> arm64 too. What about following?
>>
>> +/*
>> + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
>> + * marked NOKPROBE before kprobes handler is called.
>> + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
>> + * before kprobes handle happens to call rcu_nmi_enter() which means
>> + * that rcu_nmi_enter() must be marked NOKRPOBE.
>> + */
>
> Would it work to describe the general problem, then give x86 details
> as a specific example, as follows?
>
> /*
> * On some architectures, certain exceptions prohibit use of kprobes until
> * the exception code path reaches a certain point. For example, on x86 all
> * functions called by do_int3() must be marked NOKPROBE. However, once
> * kprobe_int3_handler() is called, kprobing is permitted. Specifically,
> * ist_enter() is called in do_int3() before kprobe_int3_handle().
> * Furthermore, ist_enter() calls rcu_nmi_enter(), which means that
> * rcu_nmi_enter() must be marked NOKRPOBE.
> */
>
> That way, I don't feel like I need to update the commment each time
> a new architecture adds this capability. ;-)

I suspect this kind of comment would belong in a new
Documentation/features/kprobes/annotations.txt or something
similar. You might want to look at other "features" files to see
the expected layout.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2020-02-18 16:35:47

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, 18 Feb 2020 08:12:31 -0800
"Paul E. McKenney" <[email protected]> wrote:

> Would it work to describe the general problem, then give x86 details
> as a specific example, as follows?
>
> /*
> * On some architectures, certain exceptions prohibit use of kprobes until
> * the exception code path reaches a certain point. For example, on x86 all
> * functions called by do_int3() must be marked NOKPROBE. However, once
> * kprobe_int3_handler() is called, kprobing is permitted. Specifically,
> * ist_enter() is called in do_int3() before kprobe_int3_handle().
> * Furthermore, ist_enter() calls rcu_nmi_enter(), which means that
> * rcu_nmi_enter() must be marked NOKRPOBE.
> */
>
> That way, I don't feel like I need to update the commment each time
> a new architecture adds this capability. ;-)

I don't think this is going to be an issue for other archs, as they
don't have an IST.

-- Steve

2020-02-18 17:47:36

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, 18 Feb 2020 13:33:35 +0900
Masami Hiramatsu <[email protected]> wrote:

> On Mon, 17 Feb 2020 08:31:12 -0800
> "Paul E. McKenney" <[email protected]> wrote:
> >
> > > BTW, if you consider the x86 specific code is in the generic file,
> > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > (Sorry, I've hit this idea right now)
> >
> > Might this affect other architectures with NMIs and probe-like things?
> > If so, it might make sense to leave it where it is.
>
> Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> debug_exception_enter().
> OK, let's keep it, but maybe it is good to update the comment for
> arm64 too. What about following?
>
> +/*
> + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> + * marked NOKPROBE before kprobes handler is called.
> + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> + * before kprobes handle happens to call rcu_nmi_enter() which means
> + * that rcu_nmi_enter() must be marked NOKRPOBE.
> + */
>

Ah, why don't we just say...

/*
* All functions called in the breakpoint trap handler (e.g. do_int3()
* on x86), must not allow kprobes until the kprobe breakpoint handler
* is called, otherwise it can cause an infinite recursion.
* On some archs, rcu_nmi_enter() is called in the breakpoint handler
* before the kprobe breakpoint handler is called, thus it must be
* marked as NOKPROBE.
*/

And that way we don't make this an arch specific comment.

-- Steve

2020-02-18 20:08:20

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:

> > > That _should_ already be the case today. That is, if we end up in a
> > > tracer and in_nmi() is unreliable we're already screwed anyway.

> I removed the static from rcu_nmi_enter()/exit() as it is called from
> outside, that makes it build now. Updated below is Paul's diff. I also added
> NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> asymmetric.

> +__always_inline void rcu_nmi_exit(void)
> {
> struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>
> @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>
> - if (irq)
> + if (!in_nmi())
> rcu_prepare_for_idle();
>
> rcu_dynticks_eqs_enter();
>
> - if (irq)
> + if (!in_nmi())
> rcu_dynticks_task_enter();
> }

Boris and me have been going over the #MC code (and finding loads of
'interesting' code) and ran into ist_enter(), whish has the following
code:

/*
* We might have interrupted pretty much anything. In
* fact, if we're a machine check, we can even interrupt
* NMI processing. We don't want in_nmi() to return true,
* but we need to notify RCU.
*/
rcu_nmi_enter();


Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
happen while we're holding all sorts of locks. This must be an NMI-like
context.


2020-02-18 20:18:10

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 08:58:31PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
>
> > > > That _should_ already be the case today. That is, if we end up in a
> > > > tracer and in_nmi() is unreliable we're already screwed anyway.
>
> > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > outside, that makes it build now. Updated below is Paul's diff. I also added
> > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > asymmetric.
>
> > +__always_inline void rcu_nmi_exit(void)
> > {
> > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> >
> > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_prepare_for_idle();
> >
> > rcu_dynticks_eqs_enter();
> >
> > - if (irq)
> > + if (!in_nmi())
> > rcu_dynticks_task_enter();
> > }
>
> Boris and me have been going over the #MC code (and finding loads of
> 'interesting' code) and ran into ist_enter(), whish has the following
> code:
>
> /*
> * We might have interrupted pretty much anything. In
> * fact, if we're a machine check, we can even interrupt
> * NMI processing. We don't want in_nmi() to return true,
> * but we need to notify RCU.
> */
> rcu_nmi_enter();
>
>
> Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
> happen while we're holding all sorts of locks. This must be an NMI-like
> context.

Ouch! Looks like I need to hold off on getting rid of the "irq"
parameters if in_nmi() isn't going to be accurate.

Thanx, Paul

2020-02-18 20:18:24

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> On Tue, 18 Feb 2020 13:33:35 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
> > On Mon, 17 Feb 2020 08:31:12 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > (Sorry, I've hit this idea right now)
> > >
> > > Might this affect other architectures with NMIs and probe-like things?
> > > If so, it might make sense to leave it where it is.
> >
> > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > debug_exception_enter().
> > OK, let's keep it, but maybe it is good to update the comment for
> > arm64 too. What about following?
> >
> > +/*
> > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > + * marked NOKPROBE before kprobes handler is called.
> > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > + */
> >
>
> Ah, why don't we just say...
>
> /*
> * All functions called in the breakpoint trap handler (e.g. do_int3()
> * on x86), must not allow kprobes until the kprobe breakpoint handler
> * is called, otherwise it can cause an infinite recursion.
> * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> * before the kprobe breakpoint handler is called, thus it must be
> * marked as NOKPROBE.
> */
>
> And that way we don't make this an arch specific comment.

That looks good to me. Masami, does this work for you?

Thanx, Paul

2020-02-18 20:41:13

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 12:17:28PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2020 at 08:58:31PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> >
> > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> >
> > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > asymmetric.
> >
> > > +__always_inline void rcu_nmi_exit(void)
> > > {
> > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > >
> > > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_prepare_for_idle();
> > >
> > > rcu_dynticks_eqs_enter();
> > >
> > > - if (irq)
> > > + if (!in_nmi())
> > > rcu_dynticks_task_enter();
> > > }
> >
> > Boris and me have been going over the #MC code (and finding loads of
> > 'interesting' code) and ran into ist_enter(), whish has the following
> > code:
> >
> > /*
> > * We might have interrupted pretty much anything. In
> > * fact, if we're a machine check, we can even interrupt
> > * NMI processing. We don't want in_nmi() to return true,
> > * but we need to notify RCU.
> > */
> > rcu_nmi_enter();
> >
> >
> > Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
> > happen while we're holding all sorts of locks. This must be an NMI-like
> > context.
>
> Ouch! Looks like I need to hold off on getting rid of the "irq"
> parameters if in_nmi() isn't going to be accurate.

I'm currently trying to twist my brain around all this, because I
suspect it's all completely broken one way or another.

But yes, we definitely need to fix this before your patch goes in.

2020-02-18 21:40:24

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 09:40:21PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 18, 2020 at 12:17:28PM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 18, 2020 at 08:58:31PM +0100, Peter Zijlstra wrote:
> > > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> > >
> > > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> > >
> > > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > > asymmetric.
> > >
> > > > +__always_inline void rcu_nmi_exit(void)
> > > > {
> > > > struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > > >
> > > > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > > > trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > > > WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_prepare_for_idle();
> > > >
> > > > rcu_dynticks_eqs_enter();
> > > >
> > > > - if (irq)
> > > > + if (!in_nmi())
> > > > rcu_dynticks_task_enter();
> > > > }
> > >
> > > Boris and me have been going over the #MC code (and finding loads of
> > > 'interesting' code) and ran into ist_enter(), whish has the following
> > > code:
> > >
> > > /*
> > > * We might have interrupted pretty much anything. In
> > > * fact, if we're a machine check, we can even interrupt
> > > * NMI processing. We don't want in_nmi() to return true,
> > > * but we need to notify RCU.
> > > */
> > > rcu_nmi_enter();
> > >
> > >
> > > Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
> > > happen while we're holding all sorts of locks. This must be an NMI-like
> > > context.
> >
> > Ouch! Looks like I need to hold off on getting rid of the "irq"
> > parameters if in_nmi() isn't going to be accurate.
>
> I'm currently trying to twist my brain around all this, because I
> suspect it's all completely broken one way or another.
>
> But yes, we definitely need to fix this before your patch goes in.

OK. I will drop it later today, but leave tag in_nmi.2020.02.18a
pointing at it for future reference.

Thanx, Paul

2020-02-19 02:45:54

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, 18 Feb 2020 12:18:06 -0800
"Paul E. McKenney" <[email protected]> wrote:

> On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> > On Tue, 18 Feb 2020 13:33:35 +0900
> > Masami Hiramatsu <[email protected]> wrote:
> >
> > > On Mon, 17 Feb 2020 08:31:12 -0800
> > > "Paul E. McKenney" <[email protected]> wrote:
> > > >
> > > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > > (Sorry, I've hit this idea right now)
> > > >
> > > > Might this affect other architectures with NMIs and probe-like things?
> > > > If so, it might make sense to leave it where it is.
> > >
> > > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > > debug_exception_enter().
> > > OK, let's keep it, but maybe it is good to update the comment for
> > > arm64 too. What about following?
> > >
> > > +/*
> > > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > > + * marked NOKPROBE before kprobes handler is called.
> > > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > > + */
> > >
> >
> > Ah, why don't we just say...
> >
> > /*
> > * All functions called in the breakpoint trap handler (e.g. do_int3()
> > * on x86), must not allow kprobes until the kprobe breakpoint handler
> > * is called, otherwise it can cause an infinite recursion.
> > * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> > * before the kprobe breakpoint handler is called, thus it must be
> > * marked as NOKPROBE.
> > */
> >
> > And that way we don't make this an arch specific comment.
>
> That looks good to me. Masami, does this work for you?

Yes, that looks good to me too :)

Thank you!


--
Masami Hiramatsu <[email protected]>

2020-02-19 10:00:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Tue, Feb 18, 2020 at 01:39:25PM -0800, Paul E. McKenney wrote:
> OK. I will drop it later today, but leave tag in_nmi.2020.02.18a
> pointing at it for future reference.

Thanks, I've picked it up in my series.

2020-02-19 12:47:43

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Wed, Feb 19, 2020 at 10:57:43AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 18, 2020 at 01:39:25PM -0800, Paul E. McKenney wrote:
> > OK. I will drop it later today, but leave tag in_nmi.2020.02.18a
> > pointing at it for future reference.
>
> Thanks, I've picked it up in my series.

Very good, thank you!

Thanx, Paul

2020-03-06 00:43:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

Steven Rostedt <[email protected]> writes:
> rcu_nmi_enter() was marked NOKPROBE or other reasons. See commit

Very well said: OR other reasons. I assume you meant 'for' but ...

> c13324a505c77 ("x86/kprobes: Prohibit probing on functions before
> kprobe_int3_handler()")
>
> The issue was that we must not allow anything in do_int3() call kprobe
> code before kprobe_int3_handler() is called. Because ist_enter() (in
> do_int3()) calls rcu_nmi_enter() it had to be marked NOKPROBE. It had
> nothing to do with it being RCU nor NMI, but because it was simply
> called in do_int3().
>
> Thus, there's no reason to make rcu_nmi_exit() NOKPROBE. But a commont
> to why rcu_nmi_enter() would probably be useful, like below:

... this is really wrong.

While the int3 issue was the reason why it was marked NOKPROBE, fact is
that aside of int3 problem (which is probably true for any other
architecture using breakpoints for patching) any probe _before_ RCU is
watching and _after_ RCU stopped watching is broken. Same applies for
BPF and tracepoints which call into BPF or other nonsense.

Can we please stop claiming that instrumentation can touch anything it
wants and just admit that anything outside RCU covered regions is
off-limits for instrumentation? Same applies for entry code and critical
exceptions/traps.

There is a reason why the tracer can't trace itself and there are very
valid reasons to limit the instrumentation ability in other places.

It's nice to be able to see into 'everything' but for 99.9999% of the
cases the coverage of these things is absolutely irrelevant.

Yes I know, "Correctness first" is this old school thing for grumpy old
greybeards who are still stuck in the 90's. "Features first" is the new
mantra. I deal with that every day by mopping up the mess it creates.

Thanks,

tglx

2020-03-06 18:04:00

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

Hi,

On Wed, 19 Feb 2020 11:45:10 +0900
Masami Hiramatsu <[email protected]> wrote:

> On Tue, 18 Feb 2020 12:18:06 -0800
> "Paul E. McKenney" <[email protected]> wrote:
>
> > On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> > > On Tue, 18 Feb 2020 13:33:35 +0900
> > > Masami Hiramatsu <[email protected]> wrote:
> > >
> > > > On Mon, 17 Feb 2020 08:31:12 -0800
> > > > "Paul E. McKenney" <[email protected]> wrote:
> > > > >
> > > > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > > > (Sorry, I've hit this idea right now)
> > > > >
> > > > > Might this affect other architectures with NMIs and probe-like things?
> > > > > If so, it might make sense to leave it where it is.
> > > >
> > > > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > > > debug_exception_enter().
> > > > OK, let's keep it, but maybe it is good to update the comment for
> > > > arm64 too. What about following?
> > > >
> > > > +/*
> > > > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > > > + * marked NOKPROBE before kprobes handler is called.
> > > > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > > > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > > > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > > > + */
> > > >
> > >
> > > Ah, why don't we just say...
> > >
> > > /*
> > > * All functions called in the breakpoint trap handler (e.g. do_int3()
> > > * on x86), must not allow kprobes until the kprobe breakpoint handler
> > > * is called, otherwise it can cause an infinite recursion.
> > > * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> > > * before the kprobe breakpoint handler is called, thus it must be
> > > * marked as NOKPROBE.
> > > */
> > >
> > > And that way we don't make this an arch specific comment.
> >
> > That looks good to me. Masami, does this work for you?
>
> Yes, that looks good to me too :)

Oops, I'm guilty!
Sorry *rcu_nmi_exit()* also must be NOKPROBE, since even if we could catch
a recursive kprobe call, we can only skip the kprobe handler, but we must
exit from do_int3() and hit rcu_nmi_exit() again!

[45235.497591] Unrecoverable kprobe detected.
[45235.501400] Dumping kprobe:
[45235.502433] Name: (null)
[45235.502433] Offset: 0
[45235.502433] Address: rcu_nmi_exit+0x0/0x290
[45235.504044] ------------[ cut here ]------------
[45235.504855] kernel BUG at arch/x86/kernel/kprobes/core.c:646!
[45235.505816] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[45235.506615] CPU: 7 PID: 143 Comm: sh Not tainted 5.6.0-rc3+ #143
[45235.507662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[45235.509764] RIP: 0010:reenter_kprobe.cold+0x14/0x16
[45235.510630] Code: 48 8b 75 10 48 c7 c7 f0 70 0e 82 48 8b 56 28 e8 22 91 08 00 0f 0b 48 c7 c7 20 71 0e 82 e8 14 91 08 00 48 89 ef e8 23 ee 0f 00 <0f> 0b 48 89 ee 48 c7 c7 48 71 0e 82 e8 fb 90 08 00 e9 c3 fc ff ff
[45235.513948] RSP: 0018:ffffc90000347bf8 EFLAGS: 00010046
[45235.514906] RAX: 0000000000000036 RBX: 0000000000017f20 RCX: 0000000000000000
[45235.516109] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[45235.517278] RBP: ffff88807c9820c0 R08: 0000000000000000 R09: 0000000000000001
[45235.518415] R10: 0000000000000000 R11: ffff88807c9d1f18 R12: ffff88807d9d7f20
[45235.519609] R13: ffffc90000347c68 R14: ffffffff810e8a60 R15: ffffffff810e8a61
[45235.520787] FS: 0000000001d9a8c0(0000) GS:ffff88807d9c0000(0000) knlGS:0000000000000000
[45235.522198] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[45235.523172] CR2: 0000000001da9000 CR3: 000000007a880000 CR4: 00000000000006a0
[45235.524288] Call Trace:
[45235.524825] kprobe_int3_handler+0x74/0x150
[45235.525627] do_int3+0x36/0xf0
[45235.526244] int3+0x42/0x50
[45235.526767] RIP: 0010:rcu_nmi_exit+0x1/0x290
[45235.527551] Code: a2 0d 82 be c2 01 00 00 48 c7 c7 d5 44 0f 82 c6 05 e7 ac 24 01 01 e8 1f ba fd ff eb b8 66 66 2e 0f 1f 84 00 00 00 00 00 90 cc <57> 41 56 41 55 41 54 55 48 c7 c5 40 c2 02 00 53 48 89 eb e8 77 75
[45235.530898] RSP: 0018:ffffc90000347d40 EFLAGS: 00000046
[45235.531816] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[45235.533001] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
[45235.534252] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[45235.535516] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[45235.536759] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[45235.537945] ? ist_exit+0xe/0x20
[45235.538593] ? ist_exit+0xe/0x20
[45235.539239] ? rcu_nmi_exit+0x1/0x290
[45235.541182] int3+0x42/0x50
[45235.541687] RIP: 0010:0xffffffffa000005a
[45235.542363] Code: 2e 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 1f 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 20 16 13 e1 00 00 00 00 00 00 00 00 <41> 57 e9 01 8a 0e e1 00 00 00 00 00 00 00 00 41 57 e9 f2 22 26 e1
[45235.545628] RSP: 0018:ffffc90000347e20 EFLAGS: 00000146
[45235.546596] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[45235.547989] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
[45235.550183] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88807d2aa000
[45235.551591] R10: 0000000000000a4c R11: ffff88807bfec600 R12: 0000000000000000
[45235.552893] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[45235.554633] ? ist_exit+0xe/0x20
[45235.555537] ? ist_exit+0xe/0x20
[45235.556565] ? rcu_nmi_exit+0x1/0x290
[45235.557909] ? int3+0x42/0x50
[45235.559156] ? 0xffffffffa0000069
[45235.560547] ? vfs_read+0x1/0x150
[45235.561522] ? ksys_read+0x60/0xe0
[45235.562458] ? do_syscall_64+0x4b/0x1e0
[45235.563404] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[45235.564705] Modules linked in:
[45235.565556] ---[ end trace 870af8724dba9ac8 ]---

So all functions called from do_int3() must be NOKPROBE.

Thank you,

--
Masami Hiramatsu <[email protected]>

2020-03-06 18:48:05

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Sat, Mar 07, 2020 at 03:01:49AM +0900, Masami Hiramatsu wrote:
> Hi,
>
> On Wed, 19 Feb 2020 11:45:10 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
> > On Tue, 18 Feb 2020 12:18:06 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> > > > On Tue, 18 Feb 2020 13:33:35 +0900
> > > > Masami Hiramatsu <[email protected]> wrote:
> > > >
> > > > > On Mon, 17 Feb 2020 08:31:12 -0800
> > > > > "Paul E. McKenney" <[email protected]> wrote:
> > > > > >
> > > > > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > > > > (Sorry, I've hit this idea right now)
> > > > > >
> > > > > > Might this affect other architectures with NMIs and probe-like things?
> > > > > > If so, it might make sense to leave it where it is.
> > > > >
> > > > > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > > > > debug_exception_enter().
> > > > > OK, let's keep it, but maybe it is good to update the comment for
> > > > > arm64 too. What about following?
> > > > >
> > > > > +/*
> > > > > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > > > > + * marked NOKPROBE before kprobes handler is called.
> > > > > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > > > > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > > > > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > > > > + */
> > > > >
> > > >
> > > > Ah, why don't we just say...
> > > >
> > > > /*
> > > > * All functions called in the breakpoint trap handler (e.g. do_int3()
> > > > * on x86), must not allow kprobes until the kprobe breakpoint handler
> > > > * is called, otherwise it can cause an infinite recursion.
> > > > * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> > > > * before the kprobe breakpoint handler is called, thus it must be
> > > > * marked as NOKPROBE.
> > > > */
> > > >
> > > > And that way we don't make this an arch specific comment.
> > >
> > > That looks good to me. Masami, does this work for you?
> >
> > Yes, that looks good to me too :)

Aha! So then I'm glad I brought it up ;-) OCDs pay off these days :-D

thanks,

- Joel


> Oops, I'm guilty!
> Sorry *rcu_nmi_exit()* also must be NOKPROBE, since even if we could catch
> a recursive kprobe call, we can only skip the kprobe handler, but we must
> exit from do_int3() and hit rcu_nmi_exit() again!
>
> [45235.497591] Unrecoverable kprobe detected.
> [45235.501400] Dumping kprobe:
> [45235.502433] Name: (null)
> [45235.502433] Offset: 0
> [45235.502433] Address: rcu_nmi_exit+0x0/0x290
> [45235.504044] ------------[ cut here ]------------
> [45235.504855] kernel BUG at arch/x86/kernel/kprobes/core.c:646!
> [45235.505816] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> [45235.506615] CPU: 7 PID: 143 Comm: sh Not tainted 5.6.0-rc3+ #143
> [45235.507662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [45235.509764] RIP: 0010:reenter_kprobe.cold+0x14/0x16
> [45235.510630] Code: 48 8b 75 10 48 c7 c7 f0 70 0e 82 48 8b 56 28 e8 22 91 08 00 0f 0b 48 c7 c7 20 71 0e 82 e8 14 91 08 00 48 89 ef e8 23 ee 0f 00 <0f> 0b 48 89 ee 48 c7 c7 48 71 0e 82 e8 fb 90 08 00 e9 c3 fc ff ff
> [45235.513948] RSP: 0018:ffffc90000347bf8 EFLAGS: 00010046
> [45235.514906] RAX: 0000000000000036 RBX: 0000000000017f20 RCX: 0000000000000000
> [45235.516109] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> [45235.517278] RBP: ffff88807c9820c0 R08: 0000000000000000 R09: 0000000000000001
> [45235.518415] R10: 0000000000000000 R11: ffff88807c9d1f18 R12: ffff88807d9d7f20
> [45235.519609] R13: ffffc90000347c68 R14: ffffffff810e8a60 R15: ffffffff810e8a61
> [45235.520787] FS: 0000000001d9a8c0(0000) GS:ffff88807d9c0000(0000) knlGS:0000000000000000
> [45235.522198] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [45235.523172] CR2: 0000000001da9000 CR3: 000000007a880000 CR4: 00000000000006a0
> [45235.524288] Call Trace:
> [45235.524825] kprobe_int3_handler+0x74/0x150
> [45235.525627] do_int3+0x36/0xf0
> [45235.526244] int3+0x42/0x50
> [45235.526767] RIP: 0010:rcu_nmi_exit+0x1/0x290
> [45235.527551] Code: a2 0d 82 be c2 01 00 00 48 c7 c7 d5 44 0f 82 c6 05 e7 ac 24 01 01 e8 1f ba fd ff eb b8 66 66 2e 0f 1f 84 00 00 00 00 00 90 cc <57> 41 56 41 55 41 54 55 48 c7 c5 40 c2 02 00 53 48 89 eb e8 77 75
> [45235.530898] RSP: 0018:ffffc90000347d40 EFLAGS: 00000046
> [45235.531816] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [45235.533001] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> [45235.534252] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [45235.535516] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [45235.536759] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [45235.537945] ? ist_exit+0xe/0x20
> [45235.538593] ? ist_exit+0xe/0x20
> [45235.539239] ? rcu_nmi_exit+0x1/0x290
> [45235.541182] int3+0x42/0x50
> [45235.541687] RIP: 0010:0xffffffffa000005a
> [45235.542363] Code: 2e 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 1f 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 20 16 13 e1 00 00 00 00 00 00 00 00 <41> 57 e9 01 8a 0e e1 00 00 00 00 00 00 00 00 41 57 e9 f2 22 26 e1
> [45235.545628] RSP: 0018:ffffc90000347e20 EFLAGS: 00000146
> [45235.546596] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [45235.547989] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> [45235.550183] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88807d2aa000
> [45235.551591] R10: 0000000000000a4c R11: ffff88807bfec600 R12: 0000000000000000
> [45235.552893] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [45235.554633] ? ist_exit+0xe/0x20
> [45235.555537] ? ist_exit+0xe/0x20
> [45235.556565] ? rcu_nmi_exit+0x1/0x290
> [45235.557909] ? int3+0x42/0x50
> [45235.559156] ? 0xffffffffa0000069
> [45235.560547] ? vfs_read+0x1/0x150
> [45235.561522] ? ksys_read+0x60/0xe0
> [45235.562458] ? do_syscall_64+0x4b/0x1e0
> [45235.563404] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [45235.564705] Modules linked in:
> [45235.565556] ---[ end trace 870af8724dba9ac8 ]---
>
> So all functions called from do_int3() must be NOKPROBE.
>
> Thank you,
>
> --
> Masami Hiramatsu <[email protected]>

2020-03-06 19:12:34

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Sat, Mar 07, 2020 at 03:01:49AM +0900, Masami Hiramatsu wrote:
> Hi,
>
> On Wed, 19 Feb 2020 11:45:10 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
> > On Tue, 18 Feb 2020 12:18:06 -0800
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > > On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> > > > On Tue, 18 Feb 2020 13:33:35 +0900
> > > > Masami Hiramatsu <[email protected]> wrote:
> > > >
> > > > > On Mon, 17 Feb 2020 08:31:12 -0800
> > > > > "Paul E. McKenney" <[email protected]> wrote:
> > > > > >
> > > > > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > > > > (Sorry, I've hit this idea right now)
> > > > > >
> > > > > > Might this affect other architectures with NMIs and probe-like things?
> > > > > > If so, it might make sense to leave it where it is.
> > > > >
> > > > > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > > > > debug_exception_enter().
> > > > > OK, let's keep it, but maybe it is good to update the comment for
> > > > > arm64 too. What about following?
> > > > >
> > > > > +/*
> > > > > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > > > > + * marked NOKPROBE before kprobes handler is called.
> > > > > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > > > > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > > > > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > > > > + */
> > > > >
> > > >
> > > > Ah, why don't we just say...
> > > >
> > > > /*
> > > > * All functions called in the breakpoint trap handler (e.g. do_int3()
> > > > * on x86), must not allow kprobes until the kprobe breakpoint handler
> > > > * is called, otherwise it can cause an infinite recursion.
> > > > * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> > > > * before the kprobe breakpoint handler is called, thus it must be
> > > > * marked as NOKPROBE.
> > > > */
> > > >
> > > > And that way we don't make this an arch specific comment.
> > >
> > > That looks good to me. Masami, does this work for you?
> >
> > Yes, that looks good to me too :)
>
> Oops, I'm guilty!
> Sorry *rcu_nmi_exit()* also must be NOKPROBE, since even if we could catch
> a recursive kprobe call, we can only skip the kprobe handler, but we must
> exit from do_int3() and hit rcu_nmi_exit() again!
>
> [45235.497591] Unrecoverable kprobe detected.
> [45235.501400] Dumping kprobe:
> [45235.502433] Name: (null)
> [45235.502433] Offset: 0
> [45235.502433] Address: rcu_nmi_exit+0x0/0x290
> [45235.504044] ------------[ cut here ]------------
> [45235.504855] kernel BUG at arch/x86/kernel/kprobes/core.c:646!
> [45235.505816] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> [45235.506615] CPU: 7 PID: 143 Comm: sh Not tainted 5.6.0-rc3+ #143
> [45235.507662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [45235.509764] RIP: 0010:reenter_kprobe.cold+0x14/0x16
> [45235.510630] Code: 48 8b 75 10 48 c7 c7 f0 70 0e 82 48 8b 56 28 e8 22 91 08 00 0f 0b 48 c7 c7 20 71 0e 82 e8 14 91 08 00 48 89 ef e8 23 ee 0f 00 <0f> 0b 48 89 ee 48 c7 c7 48 71 0e 82 e8 fb 90 08 00 e9 c3 fc ff ff
> [45235.513948] RSP: 0018:ffffc90000347bf8 EFLAGS: 00010046
> [45235.514906] RAX: 0000000000000036 RBX: 0000000000017f20 RCX: 0000000000000000
> [45235.516109] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> [45235.517278] RBP: ffff88807c9820c0 R08: 0000000000000000 R09: 0000000000000001
> [45235.518415] R10: 0000000000000000 R11: ffff88807c9d1f18 R12: ffff88807d9d7f20
> [45235.519609] R13: ffffc90000347c68 R14: ffffffff810e8a60 R15: ffffffff810e8a61
> [45235.520787] FS: 0000000001d9a8c0(0000) GS:ffff88807d9c0000(0000) knlGS:0000000000000000
> [45235.522198] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [45235.523172] CR2: 0000000001da9000 CR3: 000000007a880000 CR4: 00000000000006a0
> [45235.524288] Call Trace:
> [45235.524825] kprobe_int3_handler+0x74/0x150
> [45235.525627] do_int3+0x36/0xf0
> [45235.526244] int3+0x42/0x50
> [45235.526767] RIP: 0010:rcu_nmi_exit+0x1/0x290
> [45235.527551] Code: a2 0d 82 be c2 01 00 00 48 c7 c7 d5 44 0f 82 c6 05 e7 ac 24 01 01 e8 1f ba fd ff eb b8 66 66 2e 0f 1f 84 00 00 00 00 00 90 cc <57> 41 56 41 55 41 54 55 48 c7 c5 40 c2 02 00 53 48 89 eb e8 77 75
> [45235.530898] RSP: 0018:ffffc90000347d40 EFLAGS: 00000046
> [45235.531816] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [45235.533001] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> [45235.534252] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [45235.535516] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [45235.536759] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [45235.537945] ? ist_exit+0xe/0x20
> [45235.538593] ? ist_exit+0xe/0x20
> [45235.539239] ? rcu_nmi_exit+0x1/0x290
> [45235.541182] int3+0x42/0x50
> [45235.541687] RIP: 0010:0xffffffffa000005a
> [45235.542363] Code: 2e 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 1f 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 20 16 13 e1 00 00 00 00 00 00 00 00 <41> 57 e9 01 8a 0e e1 00 00 00 00 00 00 00 00 41 57 e9 f2 22 26 e1
> [45235.545628] RSP: 0018:ffffc90000347e20 EFLAGS: 00000146
> [45235.546596] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [45235.547989] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> [45235.550183] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88807d2aa000
> [45235.551591] R10: 0000000000000a4c R11: ffff88807bfec600 R12: 0000000000000000
> [45235.552893] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [45235.554633] ? ist_exit+0xe/0x20
> [45235.555537] ? ist_exit+0xe/0x20
> [45235.556565] ? rcu_nmi_exit+0x1/0x290
> [45235.557909] ? int3+0x42/0x50
> [45235.559156] ? 0xffffffffa0000069
> [45235.560547] ? vfs_read+0x1/0x150
> [45235.561522] ? ksys_read+0x60/0xe0
> [45235.562458] ? do_syscall_64+0x4b/0x1e0
> [45235.563404] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [45235.564705] Modules linked in:
> [45235.565556] ---[ end trace 870af8724dba9ac8 ]---
>
> So all functions called from do_int3() must be NOKPROBE.

Makes sense, I am curious though why you thought before that the breakpoint
recursion was not possible *after* kprobe_int3_handler(). In the below thread
you said it is to be marked only for functions *before*
kprobe_int3_handler(). Was there a reason?
https://lore.kernel.org/lkml/154998793571.31052.11301258949601150994.stgit@devbox/

thanks,

- Joel


>
> Thank you,
>
> --
> Masami Hiramatsu <[email protected]>

2020-03-07 01:59:24

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v2 3/9] rcu,tracing: Create trace_rcu_{enter,exit}()

On Fri, 6 Mar 2020 14:11:51 -0500
Joel Fernandes <[email protected]> wrote:

> On Sat, Mar 07, 2020 at 03:01:49AM +0900, Masami Hiramatsu wrote:
> > Hi,
> >
> > On Wed, 19 Feb 2020 11:45:10 +0900
> > Masami Hiramatsu <[email protected]> wrote:
> >
> > > On Tue, 18 Feb 2020 12:18:06 -0800
> > > "Paul E. McKenney" <[email protected]> wrote:
> > >
> > > > On Tue, Feb 18, 2020 at 12:46:09PM -0500, Steven Rostedt wrote:
> > > > > On Tue, 18 Feb 2020 13:33:35 +0900
> > > > > Masami Hiramatsu <[email protected]> wrote:
> > > > >
> > > > > > On Mon, 17 Feb 2020 08:31:12 -0800
> > > > > > "Paul E. McKenney" <[email protected]> wrote:
> > > > > > >
> > > > > > > > BTW, if you consider the x86 specific code is in the generic file,
> > > > > > > > we can move NOKPROBE_SYMBOL() in arch/x86/kernel/traps.c.
> > > > > > > > (Sorry, I've hit this idea right now)
> > > > > > >
> > > > > > > Might this affect other architectures with NMIs and probe-like things?
> > > > > > > If so, it might make sense to leave it where it is.
> > > > > >
> > > > > > Yes, git grep shows that arm64 is using rcu_nmi_enter() in
> > > > > > debug_exception_enter().
> > > > > > OK, let's keep it, but maybe it is good to update the comment for
> > > > > > arm64 too. What about following?
> > > > > >
> > > > > > +/*
> > > > > > + * All functions in do_int3() on x86, do_debug_exception() on arm64 must be
> > > > > > + * marked NOKPROBE before kprobes handler is called.
> > > > > > + * ist_enter() on x86 and debug_exception_enter() on arm64 which is called
> > > > > > + * before kprobes handle happens to call rcu_nmi_enter() which means
> > > > > > + * that rcu_nmi_enter() must be marked NOKRPOBE.
> > > > > > + */
> > > > > >
> > > > >
> > > > > Ah, why don't we just say...
> > > > >
> > > > > /*
> > > > > * All functions called in the breakpoint trap handler (e.g. do_int3()
> > > > > * on x86), must not allow kprobes until the kprobe breakpoint handler
> > > > > * is called, otherwise it can cause an infinite recursion.
> > > > > * On some archs, rcu_nmi_enter() is called in the breakpoint handler
> > > > > * before the kprobe breakpoint handler is called, thus it must be
> > > > > * marked as NOKPROBE.
> > > > > */
> > > > >
> > > > > And that way we don't make this an arch specific comment.
> > > >
> > > > That looks good to me. Masami, does this work for you?
> > >
> > > Yes, that looks good to me too :)
> >
> > Oops, I'm guilty!
> > Sorry *rcu_nmi_exit()* also must be NOKPROBE, since even if we could catch
> > a recursive kprobe call, we can only skip the kprobe handler, but we must
> > exit from do_int3() and hit rcu_nmi_exit() again!
> >
> > [45235.497591] Unrecoverable kprobe detected.
> > [45235.501400] Dumping kprobe:
> > [45235.502433] Name: (null)
> > [45235.502433] Offset: 0
> > [45235.502433] Address: rcu_nmi_exit+0x0/0x290
> > [45235.504044] ------------[ cut here ]------------
> > [45235.504855] kernel BUG at arch/x86/kernel/kprobes/core.c:646!
> > [45235.505816] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> > [45235.506615] CPU: 7 PID: 143 Comm: sh Not tainted 5.6.0-rc3+ #143
> > [45235.507662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [45235.509764] RIP: 0010:reenter_kprobe.cold+0x14/0x16
> > [45235.510630] Code: 48 8b 75 10 48 c7 c7 f0 70 0e 82 48 8b 56 28 e8 22 91 08 00 0f 0b 48 c7 c7 20 71 0e 82 e8 14 91 08 00 48 89 ef e8 23 ee 0f 00 <0f> 0b 48 89 ee 48 c7 c7 48 71 0e 82 e8 fb 90 08 00 e9 c3 fc ff ff
> > [45235.513948] RSP: 0018:ffffc90000347bf8 EFLAGS: 00010046
> > [45235.514906] RAX: 0000000000000036 RBX: 0000000000017f20 RCX: 0000000000000000
> > [45235.516109] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> > [45235.517278] RBP: ffff88807c9820c0 R08: 0000000000000000 R09: 0000000000000001
> > [45235.518415] R10: 0000000000000000 R11: ffff88807c9d1f18 R12: ffff88807d9d7f20
> > [45235.519609] R13: ffffc90000347c68 R14: ffffffff810e8a60 R15: ffffffff810e8a61
> > [45235.520787] FS: 0000000001d9a8c0(0000) GS:ffff88807d9c0000(0000) knlGS:0000000000000000
> > [45235.522198] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [45235.523172] CR2: 0000000001da9000 CR3: 000000007a880000 CR4: 00000000000006a0
> > [45235.524288] Call Trace:
> > [45235.524825] kprobe_int3_handler+0x74/0x150
> > [45235.525627] do_int3+0x36/0xf0
> > [45235.526244] int3+0x42/0x50
> > [45235.526767] RIP: 0010:rcu_nmi_exit+0x1/0x290
> > [45235.527551] Code: a2 0d 82 be c2 01 00 00 48 c7 c7 d5 44 0f 82 c6 05 e7 ac 24 01 01 e8 1f ba fd ff eb b8 66 66 2e 0f 1f 84 00 00 00 00 00 90 cc <57> 41 56 41 55 41 54 55 48 c7 c5 40 c2 02 00 53 48 89 eb e8 77 75
> > [45235.530898] RSP: 0018:ffffc90000347d40 EFLAGS: 00000046
> > [45235.531816] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > [45235.533001] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> > [45235.534252] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> > [45235.535516] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > [45235.536759] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > [45235.537945] ? ist_exit+0xe/0x20
> > [45235.538593] ? ist_exit+0xe/0x20
> > [45235.539239] ? rcu_nmi_exit+0x1/0x290
> > [45235.541182] int3+0x42/0x50
> > [45235.541687] RIP: 0010:0xffffffffa000005a
> > [45235.542363] Code: 2e 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 1f 16 13 e1 00 00 00 00 00 00 00 00 89 f8 e9 20 16 13 e1 00 00 00 00 00 00 00 00 <41> 57 e9 01 8a 0e e1 00 00 00 00 00 00 00 00 41 57 e9 f2 22 26 e1
> > [45235.545628] RSP: 0018:ffffc90000347e20 EFLAGS: 00000146
> > [45235.546596] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > [45235.547989] RDX: 0000000000000001 RSI: ffffffff8101e1fe RDI: ffffffff8101e1fe
> > [45235.550183] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88807d2aa000
> > [45235.551591] R10: 0000000000000a4c R11: ffff88807bfec600 R12: 0000000000000000
> > [45235.552893] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > [45235.554633] ? ist_exit+0xe/0x20
> > [45235.555537] ? ist_exit+0xe/0x20
> > [45235.556565] ? rcu_nmi_exit+0x1/0x290
> > [45235.557909] ? int3+0x42/0x50
> > [45235.559156] ? 0xffffffffa0000069
> > [45235.560547] ? vfs_read+0x1/0x150
> > [45235.561522] ? ksys_read+0x60/0xe0
> > [45235.562458] ? do_syscall_64+0x4b/0x1e0
> > [45235.563404] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [45235.564705] Modules linked in:
> > [45235.565556] ---[ end trace 870af8724dba9ac8 ]---
> >
> > So all functions called from do_int3() must be NOKPROBE.
>
> Makes sense, I am curious though why you thought before that the breakpoint
> recursion was not possible *after* kprobe_int3_handler(). In the below thread
> you said it is to be marked only for functions *before*
> kprobe_int3_handler(). Was there a reason?
> https://lore.kernel.org/lkml/154998793571.31052.11301258949601150994.stgit@devbox/

Sorry, that was my simple mistake. When I tested the rcu_nmi_exit(), I forgot
to disable optprobe. So whole the do_int3() was not hit... (I had to think
deeper why the hit count of the probe was not incremented at that point.)

This time, I set 0 to the /proc/sys/debug/kprobes_optimization, and confirmed
this bug.

Thank you,

--
Masami Hiramatsu <[email protected]>