2023-03-14 19:05:01

by Steven Rostedt

Subject: [for-linus][PATCH 5/5] tracing: Make tracepoint lockdep check actually test something

From: "Steven Rostedt (Google)" <[email protected]>

A while ago, the trace events had the following:

rcu_read_lock_sched_notrace();
rcu_dereference_sched(...);
rcu_read_unlock_sched_notrace();

If the tracepoint was enabled, it could trigger RCU issues if called in
the wrong place, but the resulting warning was only triggered if lockdep
was enabled. If the tracepoint was never enabled while lockdep was on,
the bug would not be caught. To handle this, the above sequence was run
whenever lockdep was enabled, regardless of whether the tracepoint was
enabled or not (although the always-enabled code didn't really do
anything, it would still trigger a warning).

But a lot has changed since that lockdep code was added. One is, that
sequence no longer triggers any warning. Another is, the tracepoint when
enabled doesn't even do that sequence anymore.

The main check we care about today is whether RCU is "watching" or not.
So if lockdep is enabled, always check rcu_is_watching() and trigger a
warning if it returns false (tracepoints require RCU to be watching).

Note, that old sequence did add a bit of overhead when lockdep was enabled,
and with the latest kernel updates, would cause the system to slow down
enough to trigger kernel "stalled" warnings.

Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Joel Fernandes <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Signed-off-by: Steven Rostedt (Google) <[email protected]>
---
include/linux/tracepoint.h | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index fa1004fcf810..2083f2d2f05b 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -231,12 +231,11 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * not add unwanted padding between the beginning of the section and the
  * structure. Force alignment to the same alignment as the section start.
  *
- * When lockdep is enabled, we make sure to always do the RCU portions of
- * the tracepoint code, regardless of whether tracing is on. However,
- * don't check if the condition is false, due to interaction with idle
- * instrumentation. This lets us find RCU issues triggered with tracepoints
- * even when this tracepoint is off. This code has no purpose other than
- * poking RCU a bit.
+ * When lockdep is enabled, we make sure to always test if RCU is
+ * "watching" regardless if the tracepoint is enabled or not. Tracepoints
+ * require RCU to be active, and it should always warn at the tracepoint
+ * site if it is not watching, as it will need to be active when the
+ * tracepoint is enabled.
  */
 #define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
 	extern int __traceiter_##name(data_proto);			\
@@ -249,9 +248,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 				TP_ARGS(args),				\
 				TP_CONDITION(cond), 0);			\
 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
-			rcu_read_lock_sched_notrace();			\
-			rcu_dereference_sched(__tracepoint_##name.funcs);\
-			rcu_read_unlock_sched_notrace();		\
+			WARN_ON_ONCE(!rcu_is_watching());		\
 		}							\
 	}								\
 	__DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args),		\
--
2.39.1


2023-03-14 21:08:38

by Paul E. McKenney

Subject: Re: [for-linus][PATCH 5/5] tracing: Make tracepoint lockdep check actually test something

On Tue, Mar 14, 2023 at 03:02:41PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <[email protected]>
>
> A while ago, the trace events had the following:
>
> rcu_read_lock_sched_notrace();
> rcu_dereference_sched(...);
> rcu_read_unlock_sched_notrace();
>
> If the tracepoint was enabled, it could trigger RCU issues if called in
> the wrong place, but the resulting warning was only triggered if lockdep
> was enabled. If the tracepoint was never enabled while lockdep was on,
> the bug would not be caught. To handle this, the above sequence was run
> whenever lockdep was enabled, regardless of whether the tracepoint was
> enabled or not (although the always-enabled code didn't really do
> anything, it would still trigger a warning).
>
> But a lot has changed since that lockdep code was added. One is, that
> sequence no longer triggers any warning. Another is, the tracepoint when
> enabled doesn't even do that sequence anymore.
>
> The main check we care about today is whether RCU is "watching" or not.
> So if lockdep is enabled, always check rcu_is_watching() and trigger a
> warning if it returns false (tracepoints require RCU to be watching).
>
> Note, that old sequence did add a bit of overhead when lockdep was enabled,
> and with the latest kernel updates, would cause the system to slow down
> enough to trigger kernel "stalled" warnings.
>
> Link: http://lore.kernel.org/lkml/[email protected]
> Link: http://lore.kernel.org/lkml/[email protected]
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
>
> Cc: [email protected]
> Cc: Masami Hiramatsu <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: "Paul E. McKenney" <[email protected]>
> Cc: Mathieu Desnoyers <[email protected]>
> Cc: Joel Fernandes <[email protected]>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
> Signed-off-by: Steven Rostedt (Google) <[email protected]>

Acked-by: Paul E. McKenney <[email protected]>

2023-03-14 21:49:57

by Steven Rostedt

Subject: Re: [for-linus][PATCH 5/5] tracing: Make tracepoint lockdep check actually test something

On Tue, 14 Mar 2023 14:08:28 -0700
"Paul E. McKenney" <[email protected]> wrote:

> On Tue, Mar 14, 2023 at 03:02:41PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" <[email protected]>
> >
> > A while ago, the trace events had the following:
> >
> > rcu_read_lock_sched_notrace();
> > rcu_dereference_sched(...);
> > rcu_read_unlock_sched_notrace();
> >
> > If the tracepoint was enabled, it could trigger RCU issues if called in
> > the wrong place, but the resulting warning was only triggered if lockdep
> > was enabled. If the tracepoint was never enabled while lockdep was on,
> > the bug would not be caught. To handle this, the above sequence was run
> > whenever lockdep was enabled, regardless of whether the tracepoint was
> > enabled or not (although the always-enabled code didn't really do
> > anything, it would still trigger a warning).
> >
> > But a lot has changed since that lockdep code was added. One is, that
> > sequence no longer triggers any warning. Another is, the tracepoint when
> > enabled doesn't even do that sequence anymore.
> >
> > The main check we care about today is whether RCU is "watching" or not.
> > So if lockdep is enabled, always check rcu_is_watching() and trigger a
> > warning if it returns false (tracepoints require RCU to be watching).
> >
> > Note, that old sequence did add a bit of overhead when lockdep was enabled,
> > and with the latest kernel updates, would cause the system to slow down
> > enough to trigger kernel "stalled" warnings.
> >
> > Link: http://lore.kernel.org/lkml/[email protected]
> > Link: http://lore.kernel.org/lkml/[email protected]
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
> >
> > Cc: [email protected]
> > Cc: Masami Hiramatsu <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: "Paul E. McKenney" <[email protected]>
> > Cc: Mathieu Desnoyers <[email protected]>
> > Cc: Joel Fernandes <[email protected]>
> > Acked-by: Peter Zijlstra (Intel) <[email protected]>
> > Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
> > Signed-off-by: Steven Rostedt (Google) <[email protected]>
>
> Acked-by: Paul E. McKenney <[email protected]>
>

Thanks Paul!

-- Steve

2023-03-14 23:04:04

by Joel Fernandes

Subject: Re: [for-linus][PATCH 5/5] tracing: Make tracepoint lockdep check actually test something

On Tue, Mar 14, 2023 at 3:03 PM Steven Rostedt <[email protected]> wrote:
>
> From: "Steven Rostedt (Google)" <[email protected]>
>
> A while ago, the trace events had the following:
>
> rcu_read_lock_sched_notrace();
> rcu_dereference_sched(...);
> rcu_read_unlock_sched_notrace();
>
> If the tracepoint was enabled, it could trigger RCU issues if called in
> the wrong place, but the resulting warning was only triggered if lockdep
> was enabled. If the tracepoint was never enabled while lockdep was on,
> the bug would not be caught. To handle this, the above sequence was run
> whenever lockdep was enabled, regardless of whether the tracepoint was
> enabled or not (although the always-enabled code didn't really do
> anything, it would still trigger a warning).
>
> But a lot has changed since that lockdep code was added. One is, that
> sequence no longer triggers any warning. Another is, the tracepoint when
> enabled doesn't even do that sequence anymore.

I agree with the change but I am confused by the commit message a bit
due to "Another is, the tracepoint when enabled doesn't even do that
sequence anymore.".

Whether the tracepoint was enabled or disabled, it is always doing the
old sequence because we were skipping the tracepoint's static key test
before running the sequence. Right?

So how was it not doing the old sequence before?

Other than that,
Reviewed-by: Joel Fernandes (Google) <[email protected]>

- Joel


2023-03-14 23:07:57

by Joel Fernandes

Subject: Re: [for-linus][PATCH 5/5] tracing: Make tracepoint lockdep check actually test something

On Tue, Mar 14, 2023 at 7:03 PM Joel Fernandes <[email protected]> wrote:
>
> On Tue, Mar 14, 2023 at 3:03 PM Steven Rostedt <[email protected]> wrote:
> >
> > From: "Steven Rostedt (Google)" <[email protected]>
> >
> > A while ago, the trace events had the following:
> >
> > rcu_read_lock_sched_notrace();
> > rcu_dereference_sched(...);
> > rcu_read_unlock_sched_notrace();
> >
> > If the tracepoint was enabled, it could trigger RCU issues if called in
> > the wrong place, but the resulting warning was only triggered if lockdep
> > was enabled. If the tracepoint was never enabled while lockdep was on,
> > the bug would not be caught. To handle this, the above sequence was run
> > whenever lockdep was enabled, regardless of whether the tracepoint was
> > enabled or not (although the always-enabled code didn't really do
> > anything, it would still trigger a warning).
> >
> > But a lot has changed since that lockdep code was added. One is, that
> > sequence no longer triggers any warning. Another is, the tracepoint when
> > enabled doesn't even do that sequence anymore.
>
> I agree with the change but I am confused by the commit message a bit
> due to "Another is, the tracepoint when enabled doesn't even do that
> sequence anymore.".
>
> Whether the tracepoint was enabled or disabled, it is always doing the
> old sequence because we were skipping the tracepoint's static key test
> before running the sequence. Right?
>
> So how was it not doing the old sequence before?

Ah I see, you meant "It was doing a dummy de-ref", not that "it was
_not_ doing anything". ;-)

So it is good then, but perhaps (optionally) describe the old code as a
dummy RCU deref that was supposed to trigger a warning. ;-)

- Joel

